net.nutch.tools
Class FetchListTool

java.lang.Object
  extended by net.nutch.tools.FetchListTool

public class FetchListTool
extends Object

This class takes an IWebDBReader, computes a relevant subset of its pages, and emits that subset as a fetchlist.

Author:
Mike Cafarella

Nested Class Summary
static class FetchListTool.SortableScore
          SortableScore is simply a Float that is WritableComparable.
 
Field Summary
static Logger LOG
           
 
Constructor Summary
FetchListTool(File dbDir, boolean refetchOnly, boolean anchorOptimize, float cutoffScore, int seed)
          FetchListTool takes a page db and emits a RECNO-based subset of it.
 
Method Summary
 void emitFetchList(File segmentDir, long topN, long curTime)
          Writes the fetchlist to a BDB at the indicated filename.
 void emitMultipleLists(File dir, int numLists, long topN, long curTime)
          Writes several fetchlists so that fetching can be spread across several machines.
static void main(String[] argv)
          Generates a fetchlist from the pagedb and linkdb.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final Logger LOG

Constructor Detail

FetchListTool

public FetchListTool(File dbDir,
                     boolean refetchOnly,
                     boolean anchorOptimize,
                     float cutoffScore,
                     int seed)
              throws IOException,
                     FileNotFoundException
FetchListTool takes a page db and emits a RECNO-based subset of it.

Method Detail

emitMultipleLists

public void emitMultipleLists(File dir,
                              int numLists,
                              long topN,
                              long curTime)
                       throws IOException
Writes several fetchlists so that fetching can be spread across several machines.

Throws:
IOException
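
The description above does not say how entries are divided among the numLists fetchlists. The following is a minimal sketch of one plausible policy, a round-robin split, and is an assumption rather than a statement of what FetchListTool actually does. The class FetchListSplitSketch and the method splitRoundRobin are hypothetical names used only for illustration; they are not part of net.nutch.tools.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of net.nutch.tools.
class FetchListSplitSketch {

    // Distributes urls across numLists lists in round-robin order.
    // Round-robin is an assumed policy, chosen here because it keeps
    // the lists within one entry of each other in size.
    static List<List<String>> splitRoundRobin(List<String> urls, int numLists) {
        if (numLists <= 0) {
            throw new IllegalArgumentException("numLists must be positive");
        }
        List<List<String>> lists = new ArrayList<>();
        for (int i = 0; i < numLists; i++) {
            lists.add(new ArrayList<>());
        }
        for (int i = 0; i < urls.size(); i++) {
            lists.get(i % numLists).add(urls.get(i));
        }
        return lists;
    }

    public static void main(String[] args) {
        List<String> urls = List.of(
                "http://a.example/", "http://b.example/", "http://c.example/");
        // Each inner list stands in for one fetchlist handed to one machine.
        System.out.println(splitRoundRobin(urls, 2));
    }
}
```

With a split like this, each machine can fetch its own list independently of the others.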

emitFetchList

public void emitFetchList(File segmentDir,
                          long topN,
                          long curTime)
                   throws IOException
Writes the fetchlist to a BDB at the indicated filename.

Throws:
IOException
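
The signatures on this page expose a cutoffScore (constructor) and a topN (emit methods), but the page does not describe how they interact. The sketch below shows one plausible reading, assumed here for illustration only: drop pages scoring below the cutoff, then keep the topN highest-scoring pages. TopNSelectionSketch, ScoredPage, and selectTopN are hypothetical names and are not part of the Nutch API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not part of net.nutch.tools.
class TopNSelectionSketch {

    // Minimal stand-in for a page with a score.
    static final class ScoredPage {
        final String url;
        final float score;

        ScoredPage(String url, float score) {
            this.url = url;
            this.score = score;
        }
    }

    // Keeps pages with score >= cutoffScore, then returns at most topN of
    // them, highest score first. This ordering is an assumption.
    static List<ScoredPage> selectTopN(List<ScoredPage> pages, float cutoffScore, long topN) {
        List<ScoredPage> kept = new ArrayList<>();
        for (ScoredPage p : pages) {
            if (p.score >= cutoffScore) {
                kept.add(p);
            }
        }
        kept.sort(Comparator.comparingDouble((ScoredPage p) -> p.score).reversed());
        int limit = (int) Math.max(0, Math.min(topN, kept.size()));
        return new ArrayList<>(kept.subList(0, limit));
    }

    public static void main(String[] args) {
        List<ScoredPage> pages = List.of(
                new ScoredPage("http://a.example/", 0.9f),
                new ScoredPage("http://b.example/", 0.1f),
                new ScoredPage("http://c.example/", 0.5f));
        for (ScoredPage p : selectTopN(pages, 0.3f, 2)) {
            System.out.println(p.url + " " + p.score);
        }
    }
}
```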

main

public static void main(String[] argv)
                 throws IOException,
                        FileNotFoundException
Generates a fetchlist from the pagedb and linkdb.

Throws:
IOException
FileNotFoundException


Copyright © 2004 The Nutch Organization.