org.creativecommons.nutch
Class CCParseFilter

java.lang.Object
  extended by org.creativecommons.nutch.CCParseFilter
All Implemented Interfaces:
HtmlParseFilter

public class CCParseFilter
extends Object
implements HtmlParseFilter

Adds metadata identifying the Creative Commons license that a page uses, if any.
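
As an illustration of the kind of DOM walk this class relies on (looking for RDF in comments and licenses in anchors), the following is a minimal sketch using only the JDK's org.w3c.dom API. It is not the actual CCParseFilter.Walker: the class name LicenseWalkSketch, the LICENSE_PREFIX constant, and the matching rules (an href prefix test and a substring test for "<rdf:RDF") are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Comment;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical stand-in for illustration; not the actual CCParseFilter.Walker. */
public class LicenseWalkSketch {

    /** Assumed URL prefix for license links; the real filter's rules may differ. */
    public static final String LICENSE_PREFIX = "http://creativecommons.org/licenses/";

    /** Depth-first walk that collects license hrefs from anchors and RDF-bearing comments. */
    public static void walk(Node node, List<String> licenseUrls, List<String> rdfComments) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element element = (Element) node;
            if ("a".equalsIgnoreCase(element.getTagName())) {
                // getAttribute returns "" when the attribute is absent
                String href = element.getAttribute("href");
                if (href.startsWith(LICENSE_PREFIX)) {
                    licenseUrls.add(href);
                }
            }
        } else if (node.getNodeType() == Node.COMMENT_NODE) {
            String text = ((Comment) node).getData();
            if (text.contains("<rdf:RDF")) {
                rdfComments.add(text.trim());
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, licenseUrls, rdfComments);
        }
    }

    /** Builds a small in-memory fragment: one license link, one other link, one RDF comment. */
    public static DocumentFragment sampleFragment() throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.newDocument();
        DocumentFragment fragment = document.createDocumentFragment();
        Element body = document.createElement("body");
        fragment.appendChild(body);

        Element licenseLink = document.createElement("a");
        licenseLink.setAttribute("href", LICENSE_PREFIX + "by/2.0/");
        licenseLink.setTextContent("Some rights reserved");
        body.appendChild(licenseLink);

        Element otherLink = document.createElement("a");
        otherLink.setAttribute("href", "http://example.com/");
        body.appendChild(otherLink);

        body.appendChild(document.createComment(" <rdf:RDF></rdf:RDF> "));
        return fragment;
    }

    public static void main(String[] args) throws Exception {
        List<String> licenseUrls = new ArrayList<>();
        List<String> rdfComments = new ArrayList<>();
        walk(sampleFragment(), licenseUrls, rdfComments);
        System.out.println("licenses: " + licenseUrls);
        System.out.println("rdf comments: " + rdfComments.size());
    }
}
```

The sketch only collects candidates. Parsing the RDF and deciding which license applies to the page are left out.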


Nested Class Summary
static class CCParseFilter.Walker
          Walks DOM tree, looking for RDF in comments and licenses in anchors.
 
Field Summary
static Logger LOG
           
 
Fields inherited from interface net.nutch.parse.HtmlParseFilter
X_POINT_ID
 
Constructor Summary
CCParseFilter()
           
 
Method Summary
 Parse filter(Content content, Parse parse, DocumentFragment doc)
          Adds metadata or otherwise modifies a parse of an HTML document, given the DOM tree of a page.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final Logger LOG

Constructor Detail

CCParseFilter

public CCParseFilter()

Method Detail

filter

public Parse filter(Content content,
                    Parse parse,
                    DocumentFragment doc)
             throws ParseException
Adds metadata or otherwise modifies a parse of an HTML document, given the DOM tree of a page.

Specified by:
filter in interface HtmlParseFilter
Throws:
ParseException
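
The contract described for filter (take a parse and the page's DOM tree, return the parse with any metadata added) can be sketched without the Nutch classes. Everything named here is a hypothetical stand-in: SketchParse replaces Parse, the Content argument and ParseException are omitted, and the "License-Url" key and URL prefix are assumptions made for illustration rather than values confirmed by this page.

```java
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical stand-in types; none of this is the Nutch API. */
public class FilterContractSketch {

    /** Stand-in for a parse result that carries metadata. */
    public static class SketchParse {
        public final Properties metadata = new Properties();
    }

    /** Assumed metadata key and URL prefix, chosen for illustration only. */
    public static final String LICENSE_KEY = "License-Url";
    public static final String LICENSE_PREFIX = "http://creativecommons.org/licenses/";

    /** Returns the same parse, with a license entry added only when the DOM holds a license link. */
    public static SketchParse filter(SketchParse parse, DocumentFragment doc) {
        String url = findLicense(doc);
        if (url != null) {
            parse.metadata.setProperty(LICENSE_KEY, url);
        }
        return parse;
    }

    /** Depth-first search for the first anchor whose href starts with the assumed prefix. */
    static String findLicense(Node node) {
        if (node instanceof Element) {
            Element element = (Element) node;
            String href = element.getAttribute("href");
            if ("a".equalsIgnoreCase(element.getTagName()) && href.startsWith(LICENSE_PREFIX)) {
                return href;
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            String found = findLicense(child);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    /** Builds a fragment holding a single anchor with the given href. */
    public static DocumentFragment fragmentWithLink(String href) throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        DocumentFragment fragment = document.createDocumentFragment();
        Element anchor = document.createElement("a");
        anchor.setAttribute("href", href);
        fragment.appendChild(anchor);
        return fragment;
    }

    public static void main(String[] args) throws Exception {
        SketchParse licensed = filter(new SketchParse(), fragmentWithLink(LICENSE_PREFIX + "by-sa/2.0/"));
        SketchParse plain = filter(new SketchParse(), fragmentWithLink("http://example.com/"));
        System.out.println(licensed.metadata.getProperty(LICENSE_KEY));
        System.out.println(plain.metadata.getProperty(LICENSE_KEY));
    }
}
```

Returning the parse unchanged when nothing is found mirrors the "if any" in the class description. A page without a license link passes through with no metadata added.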


Copyright © 2004 The Nutch Organization.