Package net.nutch.analysis.lang

Text document language identifier.


Class Summary
HTMLLanguageParser Adds metadata identifying the language of a document, if found.
LanguageIdentifier  
LanguageQueryFilter Handles "lang:" query clauses, causing them to search the "lang" field indexed by LanguageIdentifier.
NGramProfile Runs an n-gram analysis over submitted text; the results can be used for automatic language identification.
 

Package net.nutch.analysis.lang Description

Language profiles are built from the Europarl parallel corpus (http://www.isi.edu/~koehn/europarl/).
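
As an illustration of how an n-gram language profile can be built and compared, the following is a minimal sketch and not the code of NGramProfile or LanguageIdentifier. The class name NGramSketch, its method names, the "_" word padding, and the rank-based "out-of-place" distance (in the style of Cavnar and Trenkle) are all assumptions made for this example; the actual classes in this package may work differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of n-gram profiling; not part of net.nutch.analysis.lang. */
public class NGramSketch {

    /** Counts character n-grams of lengths minLen..maxLen in each word of the text. */
    public static Map<String, Integer> countNGrams(String text, int minLen, int maxLen) {
        Map<String, Integer> counts = new HashMap<>();
        // Split on anything that is not a letter; lowercase to fold case.
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (word.isEmpty()) {
                continue;
            }
            // Assumption: pad with '_' so word boundaries show up in the n-grams.
            String padded = "_" + word + "_";
            for (int n = minLen; n <= maxLen; n++) {
                for (int i = 0; i + n <= padded.length(); i++) {
                    counts.merge(padded.substring(i, i + n), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    /** Orders n-grams by descending count (ties alphabetical) and keeps the top entries. */
    public static List<String> rank(Map<String, Integer> counts, int limit) {
        List<String> grams = new ArrayList<>(counts.keySet());
        grams.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(grams.subList(0, Math.min(limit, grams.size())));
    }

    /**
     * Out-of-place distance: for each n-gram in the document ranking, add the
     * difference between its rank there and its rank in the language profile.
     * N-grams missing from the profile cost the profile's full length.
     */
    public static int distance(List<String> profile, List<String> document) {
        int total = 0;
        for (int i = 0; i < document.size(); i++) {
            int j = profile.indexOf(document.get(i));
            total += (j < 0) ? profile.size() : Math.abs(i - j);
        }
        return total;
    }
}
```

Under these assumptions, a language profile would be rank(countNGrams(trainingText, 1, 4), 300) computed once per language from its training text, and a document would be assigned to the language whose profile gives the smallest distance to the document's own ranking. The n-gram lengths 1 to 4 and the limit of 300 are illustrative values, not settings taken from this package.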



Copyright © 2004 The Nutch Organization.