net.nutch.analysis.lang
Class LanguageIdentifier

java.lang.Object
  extended by net.nutch.analysis.lang.LanguageIdentifier
All Implemented Interfaces:
IndexingFilter

public class LanguageIdentifier
extends Object
implements IndexingFilter

Author:
Sami Siren

Field Summary
static Logger LOG
           
 
Fields inherited from interface net.nutch.indexer.IndexingFilter
X_POINT_ID
 
Constructor Summary
LanguageIdentifier()
           
 
Method Summary
 Document filter(Document doc, Parse parse, FetcherOutput fo)
          Adds fields or otherwise modifies the document that will be indexed for a parse.
static LanguageIdentifier getInstance()
          Returns a handle to the singleton instance.
 String identify(InputStream is)
          Identifies the language of the content read from an input stream.
 String identify(String text)
          Identifies the language of the submitted content.
 String identify(StringBuffer text)
           
static void main(String[] args)
          Main method, used for testing.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final Logger LOG
Constructor Detail

LanguageIdentifier

public LanguageIdentifier()
Method Detail

getInstance

public static LanguageIdentifier getInstance()
Returns a handle to the singleton instance.


main

public static void main(String[] args)
Main method, used for testing.

Parameters:
args -

identify

public String identify(String text)
Identifies the language of the submitted content.

Parameters:
text - the text of the document
Returns:
the ISO code of the language (en, fi, sv, ...), or null if unknown
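
Because identify returns null when the language is unknown, callers generally need a null check before using the result. The following is a minimal sketch of that pattern, not code taken from Nutch. TextLanguageIdentifier, IdentifyUsage, languageOrDefault, and the "unknown" placeholder are hypothetical names introduced here so the example compiles without Nutch on the classpath, and the toy identifier only stands in for the real detection logic.

```java
// Hypothetical stand-in that mirrors the identify(String) signature documented above.
interface TextLanguageIdentifier {
    String identify(String text); // ISO code, or null if unknown
}

final class IdentifyUsage {

    // Assumption: a placeholder value chosen for this sketch only.
    static final String UNKNOWN = "unknown";

    // Returns the identified ISO code, or the placeholder when identify returns null.
    static String languageOrDefault(TextLanguageIdentifier identifier, String text) {
        String code = identifier.identify(text);
        return code != null ? code : UNKNOWN;
    }

    public static void main(String[] args) {
        // Toy identifier for demonstration; it is not how LanguageIdentifier works.
        TextLanguageIdentifier toy = text -> text.contains(" the ") ? "en" : null;

        System.out.println(languageOrDefault(toy, "where is the station")); // prints "en"
        System.out.println(languageOrDefault(toy, "12345"));                // prints "unknown"
    }
}
```

With Nutch available, a lambda such as text -> LanguageIdentifier.getInstance().identify(text) should fit in place of the toy identifier, since it matches the signature documented above.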

identify

public String identify(StringBuffer text)

identify

public String identify(InputStream is)
                throws IOException
Identifies the language of the content read from an input stream.

Parameters:
is -
Returns:
Throws:
IOException
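
Since this overload declares IOException, a caller has to handle or propagate it and is responsible for the stream it opens. The sketch below illustrates one way to do that with try-with-resources. StreamLanguageIdentifier, StreamIdentifyUsage, identifyBytes, and toyIdentify are hypothetical names, not part of Nutch, and the UTF-8 decoding in the toy identifier is an assumption of this example only. The source does not state which encoding the real method expects or whether it closes the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in that mirrors the identify(InputStream) signature documented above.
interface StreamLanguageIdentifier {
    String identify(InputStream is) throws IOException;
}

final class StreamIdentifyUsage {

    // Wraps the bytes in a stream, closes it when done, and propagates IOException.
    static String identifyBytes(StreamLanguageIdentifier identifier, byte[] content)
            throws IOException {
        try (InputStream in = new ByteArrayInputStream(content)) {
            return identifier.identify(in);
        }
    }

    // Toy identifier for demonstration only: drains the stream and looks for one English word.
    static String toyIdentify(InputStream is) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        int read;
        while ((read = is.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        String text = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        return text.contains(" the ") ? "en" : null;
    }

    public static void main(String[] args) throws IOException {
        byte[] page = "where is the station".getBytes(StandardCharsets.UTF_8);
        System.out.println(identifyBytes(StreamIdentifyUsage::toyIdentify, page)); // prints "en"
    }
}
```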

filter

public Document filter(Document doc,
                       Parse parse,
                       FetcherOutput fo)
                throws IndexingException
Description copied from interface: IndexingFilter
Adds fields or otherwise modifies the document that will be indexed for a parse.

Specified by:
filter in interface IndexingFilter
Throws:
IndexingException
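
To illustrate what "adds fields" can mean for a language-identifying filter, here is a simplified sketch that is not the Nutch implementation. A Map stands in for Document, a plain String stands in for the text obtained from Parse, and a Function stands in for identify(String). The class name LanguageFieldSketch and the field name "lang" are hypothetical; the source does not say which field this filter writes or how it treats an unknown language.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

final class LanguageFieldSketch {

    // Hypothetical field name chosen for this sketch.
    static final String LANGUAGE_FIELD = "lang";

    // Adds the identified language to the document and returns the same document.
    // Assumption: when the language is unknown (null), the document is left unchanged.
    static Map<String, String> filter(Map<String, String> doc,
                                      String parsedText,
                                      Function<String, String> identify) {
        String code = identify.apply(parsedText);
        if (code != null) {
            doc.put(LANGUAGE_FIELD, code);
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("url", "http://example.com/");

        // Toy identifier that always answers "fi", for demonstration only.
        filter(doc, "some Finnish text", text -> "fi");

        System.out.println(doc); // prints {url=http://example.com/, lang=fi}
    }
}
```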


Copyright © 2004 The Nutch Organization.