java.lang.Object
  net.nutch.db.WebDBReader
WebDBReader implements the read-only operations for accessing the web database. The write operations are in WebDBWriter.
| Constructor Summary | |
| WebDBReader(File dir) | Open a web db reader for the named directory. |
| Method Summary | | |
| void | close() | Shut down the reader. |
| Link[] | getLinks(MD5Hash md5) | Get all the links from the given MD5 hash. |
| Link[] | getLinks(UTF8 url) | Get all the hyperlinks that link TO the indicated URL. |
| Page | getPage(String url) | Get the Page with the given URL from the pagedb. |
| Page[] | getPages(MD5Hash md5) | Get Pages from the pagedb according to their content hash. |
| Enumeration | links() | Return all the links, by target URL. |
| static void | main(String[] argv) | WebDBReader.main() provides utilities for looking through the contents of the webdb. |
| long | numLinks() | Return the number of links in the db. |
| long | numPages() | Return the number of pages in the db. |
| boolean | pageExists(MD5Hash md5) | Test whether a certain piece of content is in the database, without returning the Page(s). |
| Enumeration | pages() | Iterate through all the Pages, sorted by URL. |
| Enumeration | pagesByMD5() | Iterate through all the Pages, sorted by MD5. |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
public WebDBReader(File dir)
    throws IOException, FileNotFoundException
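Since both the constructor and close() can throw IOException, a caller will usually want to close the reader in a finally block. The following is a minimal sketch of that pattern, not code from Nutch: `PageCounter` and `FakeReader` are hypothetical stand-ins that mirror only the documented `numPages()` and `close()` signatures, so the example runs without Nutch on the classpath or a web db on disk.

```java
import java.io.IOException;

class ReaderLifecycleSketch {
    // Hypothetical stand-in for the two WebDBReader methods used here.
    interface PageCounter {
        long numPages();
        void close() throws IOException;
    }

    // In-memory fake so the sketch runs without a real web db.
    static final class FakeReader implements PageCounter {
        private final long pages;
        boolean closed = false;

        FakeReader(long pages) {
            this.pages = pages;
        }

        public long numPages() {
            return pages;
        }

        public void close() {
            closed = true;
        }
    }

    // Reads the page count and always closes the reader, even on failure.
    static long countAndClose(PageCounter reader) throws IOException {
        try {
            return reader.numPages();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // With Nutch available, the reader would instead come from
        // new WebDBReader(dir), where dir is the web db directory.
        FakeReader reader = new FakeReader(42);
        // prints: 42 pages, closed=true
        System.out.println(countAndClose(reader) + " pages, closed=" + reader.closed);
    }
}
```

The try/finally form is used rather than try-with-resources because nothing on this page indicates that WebDBReader implements java.io.Closeable.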
| Method Detail |
public void close()
    throws IOException
Specified by: close in interface IWebDBReader
Throws: IOException

public Page getPage(String url)
    throws IOException
Specified by: getPage in interface IWebDBReader
Throws: IOException

public Page[] getPages(MD5Hash md5)
    throws IOException
Specified by: getPages in interface IWebDBReader
Throws: IOException

public boolean pageExists(MD5Hash md5)
    throws IOException
Specified by: pageExists in interface IWebDBReader
Throws: IOException

public Enumeration pages()
    throws IOException
Specified by: pages in interface IWebDBReader
Throws: IOException

public Enumeration pagesByMD5()
    throws IOException
Specified by: pagesByMD5 in interface IWebDBReader
Throws: IOException

public long numPages()
Specified by: numPages in interface IWebDBReader
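pages() and pagesByMD5() return a raw java.util.Enumeration, so callers walk it with hasMoreElements() and nextElement(). The sketch below shows that loop under stated assumptions: the enumeration is built from plain strings with Collections.enumeration as a stand-in for reader.pages(), and the `drain` helper is hypothetical. With the real class, each element would presumably be cast to Page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class PageEnumerationSketch {
    // Copies every remaining element of an Enumeration into a list,
    // preserving the order in which the Enumeration yields them.
    static List<Object> drain(Enumeration<?> e) {
        List<Object> out = new ArrayList<Object>();
        while (e.hasMoreElements()) {
            out.add(e.nextElement());
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for reader.pages(): URLs as plain strings, already sorted.
        Enumeration<String> pages = Collections.enumeration(
                Arrays.asList("http://a.example/", "http://b.example/"));
        for (Object page : drain(pages)) {
            System.out.println(page);
        }
    }
}
```

Draining into a list is only reasonable for small databases. For a large web db, processing each element inside the while loop avoids holding every Page in memory.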
public Link[] getLinks(UTF8 url)
    throws IOException
Specified by: getLinks in interface IWebDBReader
Throws: IOException

public Link[] getLinks(MD5Hash md5)
    throws IOException
Specified by: getLinks in interface IWebDBReader
Throws: IOException

public Enumeration links()
Specified by: links in interface IWebDBReader

public long numLinks()
Specified by: numLinks in interface IWebDBReader
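The two getLinks overloads look up links in opposite directions: by URL for links pointing TO a page, and by MD5 hash for links coming from a piece of content (read here as outgoing links, which is an interpretation of the summary text). The following in-memory sketch illustrates the difference. Everything in it is hypothetical: `SketchLink`, `linksTo`, and `linksFrom` are stand-ins, and plain strings replace the UTF8 and MD5Hash types.

```java
import java.util.ArrayList;
import java.util.List;

class LinkLookupSketch {
    // Hypothetical stand-in for a link: source content hash and target URL.
    static final class SketchLink {
        final String fromHash;
        final String toUrl;

        SketchLink(String fromHash, String toUrl) {
            this.fromHash = fromHash;
            this.toUrl = toUrl;
        }
    }

    private final List<SketchLink> links = new ArrayList<SketchLink>();

    void add(String fromHash, String toUrl) {
        links.add(new SketchLink(fromHash, toUrl));
    }

    // Mirrors getLinks(UTF8 url): links pointing TO the given URL.
    List<SketchLink> linksTo(String url) {
        List<SketchLink> out = new ArrayList<SketchLink>();
        for (SketchLink link : links) {
            if (link.toUrl.equals(url)) {
                out.add(link);
            }
        }
        return out;
    }

    // Mirrors getLinks(MD5Hash md5): links FROM the content with this hash.
    List<SketchLink> linksFrom(String hash) {
        List<SketchLink> out = new ArrayList<SketchLink>();
        for (SketchLink link : links) {
            if (link.fromHash.equals(hash)) {
                out.add(link);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        LinkLookupSketch db = new LinkLookupSketch();
        db.add("hashA", "http://b.example/");
        db.add("hashA", "http://c.example/");
        db.add("hashB", "http://c.example/");
        System.out.println(db.linksFrom("hashA").size());           // prints: 2
        System.out.println(db.linksTo("http://b.example/").size()); // prints: 1
    }
}
```

The linear scans keep the sketch short. They say nothing about how the real web db stores or indexes its links.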
public static void main(String[] argv)
    throws FileNotFoundException, IOException
Throws: FileNotFoundException, IOException