net.nutch.parse.msword
Class WordExtractor

java.lang.Object
  extended by net.nutch.parse.msword.WordExtractor

public class WordExtractor
extends Object

This class extracts the text from a Word 6.0/95/97/2000/XP document.

Author:
Ryan Ackley

Constructor Summary
WordExtractor()
          Creates a new WordExtractor.
 
Method Summary
 String extractText(InputStream in)
          Gets the text from a Word document.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

WordExtractor

public WordExtractor()
Creates a new WordExtractor.

Method Detail

extractText

public String extractText(InputStream in)
                   throws Exception
Gets the text from a Word document.

Parameters:
in - The InputStream representing the Word file.
Throws:
Exception
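
Example (a sketch, not part of the Nutch API): because extractText declares the broad "throws Exception", callers may want to close the stream themselves and narrow the failure to a single exception type. The TextExtractor interface, the extract helper and the class name below are hypothetical stand-ins that only mirror the documented signature, so the sketch compiles without Nutch on the classpath. Whether extractText closes the stream itself is not stated here, so the helper closes it defensively.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class WordExtractorUsageSketch {

    // Hypothetical stand-in with the same signature as the documented
    // WordExtractor.extractText(InputStream); it is not part of Nutch.
    interface TextExtractor {
        String extractText(InputStream in) throws Exception;
    }

    // Runs the extractor, always closes the stream, and narrows the broad
    // "throws Exception" to an IOException that keeps the original cause.
    static String extract(TextExtractor extractor, InputStream in) throws IOException {
        try (InputStream stream = in) {
            return extractor.extractText(stream);
        } catch (IOException e) {
            throw e;
        } catch (Exception e) {
            throw new IOException("Text extraction failed", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // With Nutch on the classpath, the extractor could instead be:
        //   TextExtractor real = new WordExtractor()::extractText;
        // The fake below just decodes the bytes; it does not parse Word files.
        TextExtractor fake = s -> new String(s.readAllBytes(), StandardCharsets.UTF_8);
        InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(extract(fake, in));
    }
}
```

In real use the InputStream would typically come from a file or a fetched document rather than an in-memory byte array.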


Copyright © 2004 The Nutch Organization.