Pattern

License: 
BSD
Description: 

Pattern is a web mining module for Python. It bundles tools for data retrieval (Google, Twitter and Wikipedia access, a web spider, an HTML parser), text analysis (a rule-based shallow parser, a WordNet interface, a syntactic and semantic n-gram search algorithm, and tf-idf, cosine similarity and LSA metrics) and data visualization (graph networks).
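
To give a feel for the tf-idf and cosine similarity metrics mentioned in the description, here is a minimal standalone sketch in plain Python. It does not use Pattern's own API: the function names `tf_idf` and `cosine_similarity`, the whitespace tokenization and the sample sentences are all assumptions made for illustration, and Pattern's implementation may weight terms differently.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (illustrative, not Pattern's API)."""
    n = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # Relative term frequency times log inverse document frequency.
        vectors.append({t: (c / total) * math.log(n / df[t]) for t, c in counts.items()})
    return vectors

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical sample documents, tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the cat chased the mouse".split(),
    "stock markets fell sharply".split(),
]
vectors = tf_idf(docs)
print(cosine_similarity(vectors[0], vectors[1]))  # shares "the" and "cat", so above zero
print(cosine_similarity(vectors[0], vectors[2]))  # no shared terms
```

The first two documents share terms and so score above zero, while the third has no vocabulary in common with the first and scores exactly zero.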

Extensive documentation is available online.