net.nutch.fs
Class FSNamesystem

java.lang.Object
  extended by net.nutch.fs.FSNamesystem
All Implemented Interfaces:
FSConstants

public class FSNamesystem
extends Object
implements FSConstants

The FSNamesystem tracks several important tables:
1) valid fsname --> blocklist (kept on disk, logged)
2) Set of all valid blocks (inverse of #1)
3) block --> machinelist (kept in memory, rebuilt dynamically from reports)
4) machine --> blocklist (inverse of #3)
5) LRU cache of updated-heartbeat machines
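
As an illustration only, the five tables could be modeled with standard collections as sketched below. The class name, field names, and the use of String and Long in place of UTF8 and Block are assumptions made for this example; they are not the actual FSNamesystem fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in for the five tables; not the real FSNamesystem layout.
class NamesystemTablesSketch {
    // 1) file name -> ordered block list (the real system keeps this on disk and logs it)
    final Map<String, List<Long>> fileToBlocks = new TreeMap<>();
    // 2) set of all valid blocks, derived from table 1
    final Set<Long> validBlocks = new HashSet<>();
    // 3) block -> machines holding a replica (in memory, rebuilt from reports)
    final Map<Long, Set<String>> blockToMachines = new HashMap<>();
    // 4) machine -> blocks it holds
    final Map<String, Set<Long>> machineToBlocks = new HashMap<>();
    // 5) access-ordered map of machine -> last heartbeat time, usable as an LRU structure
    final LinkedHashMap<String, Long> heartbeats = new LinkedHashMap<>(16, 0.75f, true);

    // Registers a file and keeps table 2 consistent with table 1.
    void addFile(String name, List<Long> blocks) {
        fileToBlocks.put(name, new ArrayList<>(blocks));
        validBlocks.addAll(blocks);
    }

    // Records one replica and keeps tables 3 and 4 consistent with each other.
    void addReplica(long block, String machine) {
        blockToMachines.computeIfAbsent(block, b -> new HashSet<>()).add(machine);
        machineToBlocks.computeIfAbsent(machine, m -> new HashSet<>()).add(block);
    }

    public static void main(String[] args) {
        NamesystemTablesSketch t = new NamesystemTablesSketch();
        t.addFile("/crawl/segment0", Arrays.asList(1L, 2L));
        t.addReplica(1L, "node-a");
        System.out.println(t.machineToBlocks);
    }
}
```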


Field Summary
 
Fields inherited from interface net.nutch.fs.FSConstants
BLOCK_SIZE, BLOCKREPORT_INTERVAL, CHUNKED_ENCODING, DATANODE_STARTUP_PERIOD, FILE_COMPLETE_FAILED, FILE_COMPLETE_ONGOING, FILE_COMPLETE_SUCCESS, HEARTBEAT_INTERVAL, OP_ACK, OP_BLOCKRECEIVED, OP_BLOCKREPORT, OP_CLIENT_ADDBLOCK, OP_CLIENT_ADDBLOCK_ACK, OP_CLIENT_COMPLETEFILE, OP_CLIENT_COMPLETEFILE_ACK, OP_CLIENT_DELETE, OP_CLIENT_DELETE_ACK, OP_CLIENT_LISTING, OP_CLIENT_LISTING_ACK, OP_CLIENT_OPEN, OP_CLIENT_OPEN_ACK, OP_CLIENT_RENAMETO, OP_CLIENT_RENAMETO_ACK, OP_CLIENT_STARTFILE, OP_CLIENT_STARTFILE_ACK, OP_CLIENT_TRYAGAIN, OP_ERROR, OP_FAILURE, OP_HEARTBEAT, OP_INVALIDATE_BLOCKS, OP_READ_BLOCK, OP_TRANSFERBLOCKS, OP_TRANSFERDATA, OP_WRITE_BLOCK, RUNLENGTH_ENCODING, SYSTEM_STARTUP_PERIOD
 
Constructor Summary
FSNamesystem(File dir)
          dir is where the filesystem directory state is stored
 
Method Summary
 void blockReceived(Block block, UTF8 name)
          The given node is reporting that it received a certain block.
 void close()
           
 int completeFile(UTF8 src)
          Finalize the created file and make it world-accessible.
 boolean delete(UTF8 src)
          Remove the indicated filename from the namespace.
 Object[] getAdditionalBlock(UTF8 src)
          The client would like to obtain an additional block for the indicated filename (which is being written to).
 Object[] getListing(UTF8 src)
          Get a listing of all files at 'src'.
 void gotHeartbeat(UTF8 name, UTF8 machineName, int port, long capacity, long remaining)
          The given node has reported in.
 Object[] open(UTF8 src)
          The client wants to open the given filename.
 Object[] pendingTransfers(DatanodeInfo srcNode)
          Return a list of Block/DatanodeInfo sets, indicating where various Blocks should be copied as soon as possible.
 void processReport(Block[] newReport, UTF8 name)
          The given node is reporting all its blocks.
 Block[] recentlyInvalidBlocks(UTF8 name)
          Return with a list of Blocks that should be invalidated at the given node.
 boolean renameTo(UTF8 src, UTF8 dst)
          Change the indicated filename.
 Object[] startFile(UTF8 src)
          The client would like to create a new block for the indicated filename.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FSNamesystem

public FSNamesystem(File dir)
             throws IOException
dir is where the filesystem directory state is stored

Method Detail

close

public void close()

open

public Object[] open(UTF8 src)
The client wants to open the given filename. Return a list of (block, machineArray) pairs. The sequence of unique blocks in the list indicates all the blocks that make up the file. The client should choose one of the machines from the machineArray at random.
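
A minimal sketch of how a client might consume such a result, picking one machine at random for each block. The LocatedBlock pair type, the String machine names, and the chooseReaders helper are hypothetical; the real method returns an untyped Object[].

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class OpenResultSketch {
    // Hypothetical typed view of one (block, machineArray) pair.
    static final class LocatedBlock {
        final long blockId;
        final String[] machines;

        LocatedBlock(long blockId, String... machines) {
            this.blockId = blockId;
            this.machines = machines;
        }
    }

    // For each block, in list order, pick one replica holder at random.
    // Assumes every block has at least one machine.
    static List<String> chooseReaders(List<LocatedBlock> blocks, Random rnd) {
        List<String> chosen = new ArrayList<>();
        for (LocatedBlock lb : blocks) {
            chosen.add(lb.machines[rnd.nextInt(lb.machines.length)]);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<LocatedBlock> file = new ArrayList<>();
        file.add(new LocatedBlock(1L, "node-a", "node-b"));
        file.add(new LocatedBlock(2L, "node-c"));
        System.out.println(chooseReaders(file, new Random()));
    }
}
```

Choosing at random spreads read load across the machines that hold a copy of each block.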


startFile

public Object[] startFile(UTF8 src)
The client would like to create a new block for the indicated filename. Return an array that consists of the block, plus a set of machines. The first on this list should be where the client writes data. Subsequent items in the list must be provided in the connection to the first datanode.


getAdditionalBlock

public Object[] getAdditionalBlock(UTF8 src)
The client would like to obtain an additional block for the indicated filename (which is being written to). Return an array that consists of the block, plus a set of machines. The first on this list should be where the client writes data. Subsequent items in the list must be provided in the connection to the first datanode. Make sure the previous blocks have been reported by datanodes and are replicated. Will return an empty two-element array if we want the client to "try again later".
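
The handling of a startFile or getAdditionalBlock result could look roughly like the sketch below. It rests on two assumptions that the description does not confirm: that an "empty" array means a length-2 array whose slots are both null, and that machines can be represented as plain strings. All helper names are hypothetical.

```java
import java.util.Arrays;

class AddBlockResultSketch {
    // True when the result is the "try again later" signal.
    // Assumption: "empty" means length 2 with both slots null.
    static boolean isTryAgain(Object[] result) {
        return result.length == 2 && result[0] == null && result[1] == null;
    }

    // The first target is the datanode the client writes to directly.
    static String firstTarget(String[] targets) {
        return targets[0];
    }

    // The remaining targets are handed to the first datanode on connect.
    static String[] downstream(String[] targets) {
        return Arrays.copyOfRange(targets, 1, targets.length);
    }

    public static void main(String[] args) {
        String[] targets = {"node-a", "node-b", "node-c"};
        Object[] result = {Long.valueOf(42L), targets};
        if (isTryAgain(result)) {
            System.out.println("back off and retry later");
        } else {
            System.out.println("write to " + firstTarget(targets)
                    + ", forward to " + Arrays.toString(downstream(targets)));
        }
    }
}
```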


completeFile

public int completeFile(UTF8 src)
Finalize the created file and make it world-accessible. The FSNamesystem will already know the blocks that make up the file. Before we return, we make sure that all the file's blocks have been reported by datanodes and are replicated correctly.


renameTo

public boolean renameTo(UTF8 src,
                        UTF8 dst)
Change the indicated filename.


getListing

public Object[] getListing(UTF8 src)
Get a listing of all files at 'src'. The Object[] array exists so we can return file attributes (soon to be implemented).


delete

public boolean delete(UTF8 src)
Remove the indicated filename from the namespace. This may invalidate some blocks that make up the file.


gotHeartbeat

public void gotHeartbeat(UTF8 name,
                         UTF8 machineName,
                         int port,
                         long capacity,
                         long remaining)
The given node has reported in. This method should:
1) Record the heartbeat, so the datanode isn't timed out
2) Adjust usage stats for future block allocation
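
The two responsibilities of gotHeartbeat can be sketched as follows. This is an illustration under stated assumptions: the NodeStats record, the explicit clock argument, the timeout check, and the roomiest() allocation helper are all hypothetical and not part of the documented API.

```java
import java.util.HashMap;
import java.util.Map;

class HeartbeatSketch {
    // Hypothetical per-datanode record.
    static final class NodeStats {
        long capacity;
        long remaining;
        long lastSeenMillis;
    }

    private final Map<String, NodeStats> nodes = new HashMap<>();

    void gotHeartbeat(String name, long capacity, long remaining, long nowMillis) {
        NodeStats s = nodes.computeIfAbsent(name, n -> new NodeStats());
        s.lastSeenMillis = nowMillis; // 1) record the heartbeat
        s.capacity = capacity;        // 2) refresh usage stats
        s.remaining = remaining;
    }

    // A node counts as timed out if it is unknown or has been silent too long.
    boolean isExpired(String name, long nowMillis, long timeoutMillis) {
        NodeStats s = nodes.get(name);
        return s == null || nowMillis - s.lastSeenMillis > timeoutMillis;
    }

    // One possible allocation policy: the node with the most free space, or null.
    String roomiest() {
        String best = null;
        long bestRemaining = -1L;
        for (Map.Entry<String, NodeStats> e : nodes.entrySet()) {
            if (e.getValue().remaining > bestRemaining) {
                bestRemaining = e.getValue().remaining;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        HeartbeatSketch hb = new HeartbeatSketch();
        hb.gotHeartbeat("node-a", 100L, 40L, System.currentTimeMillis());
        System.out.println(hb.roomiest());
    }
}
```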


processReport

public void processReport(Block[] newReport,
                          UTF8 name)
The given node is reporting all its blocks. Use this info to update the (machine-->blocklist) and (block-->machinelist) tables.
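
One way to keep the two tables consistent when a full report arrives is to treat the report as the node's complete block set and drop anything it no longer lists. The sketch below assumes that interpretation; the class, the long block ids, and the String node names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class BlockReportSketch {
    final Map<String, Set<Long>> machineToBlocks = new HashMap<>();
    final Map<Long, Set<String>> blockToMachines = new HashMap<>();

    void processReport(long[] newReport, String machine) {
        Set<Long> reported = new HashSet<>();
        for (long b : newReport) {
            reported.add(b);
        }
        Set<Long> previous =
                machineToBlocks.getOrDefault(machine, Collections.<Long>emptySet());

        // Blocks the node held before but no longer reports lose this holder.
        for (Long b : previous) {
            if (!reported.contains(b)) {
                Set<String> holders = blockToMachines.get(b);
                if (holders != null) {
                    holders.remove(machine);
                    if (holders.isEmpty()) {
                        blockToMachines.remove(b);
                    }
                }
            }
        }
        // Every reported block gains (or keeps) this holder.
        for (Long b : reported) {
            blockToMachines.computeIfAbsent(b, k -> new HashSet<>()).add(machine);
        }
        // The report replaces the node's previous block list.
        machineToBlocks.put(machine, reported);
    }

    public static void main(String[] args) {
        BlockReportSketch ns = new BlockReportSketch();
        ns.processReport(new long[] {1L, 2L}, "node-a");
        ns.processReport(new long[] {2L, 3L}, "node-a");
        System.out.println(ns.blockToMachines.keySet());
    }
}
```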


blockReceived

public void blockReceived(Block block,
                          UTF8 name)
The given node is reporting that it received a certain block.


recentlyInvalidBlocks

public Block[] recentlyInvalidBlocks(UTF8 name)
Return with a list of Blocks that should be invalidated at the given node. Done in response to a file delete, which eliminates a number of blocks from the universe.
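
The link between a file delete and per-node invalidation could be sketched as a queue of doomed blocks for each datanode. The queue, the deleteBlocks helper, and the choice to drain the queue on each call are assumptions made for illustration; the description above does not say how the real class tracks this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class InvalidationSketch {
    final Map<Long, Set<String>> blockToMachines = new HashMap<>();
    // Hypothetical per-node queue of blocks awaiting invalidation.
    private final Map<String, List<Long>> pendingInvalidations = new HashMap<>();

    // Called when a delete removes these blocks from the namespace.
    void deleteBlocks(List<Long> blocks) {
        for (Long b : blocks) {
            Set<String> holders = blockToMachines.remove(b);
            if (holders == null) {
                continue; // no node ever reported this block
            }
            for (String machine : holders) {
                pendingInvalidations
                        .computeIfAbsent(machine, k -> new ArrayList<>())
                        .add(b);
            }
        }
    }

    // Assumption: the queue is drained, so each block is returned once per node.
    List<Long> recentlyInvalidBlocks(String machine) {
        List<Long> out = pendingInvalidations.remove(machine);
        return out == null ? new ArrayList<Long>() : out;
    }

    public static void main(String[] args) {
        InvalidationSketch ns = new InvalidationSketch();
        ns.blockToMachines.put(7L, new HashSet<>(Arrays.asList("node-a", "node-b")));
        ns.deleteBlocks(Arrays.asList(7L));
        System.out.println(ns.recentlyInvalidBlocks("node-a"));
    }
}
```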


pendingTransfers

public Object[] pendingTransfers(DatanodeInfo srcNode)
Return a list of Block/DatanodeInfo sets, indicating where various Blocks should be copied as soon as possible. The returned array consists of two objects: the first element is an array of Blocks; the second element is a 2D array of DatanodeInfo objects, identifying the target sequence for the Block at the corresponding index.
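
The parallel-array layout of the result can be unpacked as in the sketch below. For a self-contained example, long[] stands in for Block[] and String[][] for DatanodeInfo[][]; the describe helper is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class PendingTransfersSketch {
    // result[0]: array of blocks; result[1]: 2D array whose row i lists
    // the target sequence for the block at index i.
    static List<String> describe(Object[] result) {
        long[] blocks = (long[]) result[0];
        String[][] targets = (String[][]) result[1];
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < blocks.length; i++) {
            lines.add("block " + blocks[i] + " -> " + String.join(", ", targets[i]));
        }
        return lines;
    }

    public static void main(String[] args) {
        Object[] result = {
            new long[] {7L, 9L},
            new String[][] {{"node-b", "node-c"}, {"node-d"}}
        };
        for (String line : describe(result)) {
            System.out.println(line);
        }
    }
}
```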



Copyright © 2004 The Nutch Organization.