public static class FetcherJob.FetcherMapper extends org.apache.gora.mapreduce.GoraMapper<String,WebPage,IntWritable,FetchEntry>
Mapper class for Fetcher.
This class reads the random integer written by GeneratorJob as its key, while outputting the actual key and value arguments through a FetchEntry instance.
This approach (combined with the use of PartitionUrlByHost) ensures that Fetcher remains polite while also randomizing the key order. If one host has a huge number of URLs in your table while other hosts do not, FetcherReducer will not be stuck on one host but will process URLs from other hosts as well.
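The idea described above can be illustrated without Hadoop or Gora. The following is a minimal sketch, not the Nutch implementation: `RandomKeySketch`, `Keyed`, `hostOf`, `partitionFor`, and `orderByRandomKey` are hypothetical names, the random key is generated locally rather than read from GeneratorJob output, and the host-hash partitioning is only an assumed stand-in for what PartitionUrlByHost does.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

class RandomKeySketch {

    /** Hypothetical stand-in for the (random key, FetchEntry) pair emitted by the mapper. */
    static final class Keyed {
        final int randomKey;
        final String url;

        Keyed(int randomKey, String url) {
            this.randomKey = randomKey;
            this.url = url;
        }
    }

    /** Extracts the host part of a URL. */
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    /** Assumed partitioning rule: URLs of the same host always map to the same partition. */
    static int partitionFor(String url, int numPartitions) {
        return (hostOf(url).hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Tags each URL with a random integer key and returns the URLs sorted by that key. */
    static List<String> orderByRandomKey(List<String> urls, long seed) {
        Random random = new Random(seed);
        List<Keyed> keyed = new ArrayList<>();
        for (String url : urls) {
            keyed.add(new Keyed(random.nextInt(), url));
        }
        // Sorting by the random key stands in for the shuffle/sort phase.
        keyed.sort(Comparator.comparingInt(k -> k.randomKey));
        List<String> ordered = new ArrayList<>();
        for (Keyed k : keyed) {
            ordered.add(k.url);
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
                "http://big.example/1", "http://big.example/2",
                "http://big.example/3", "http://big.example/4",
                "http://small.example/1", "http://tiny.example/1");
        for (String url : orderByRandomKey(urls, 42L)) {
            System.out.println("partition " + partitionFor(url, 2) + "  " + url);
        }
    }
}
```

Because the sort key is random rather than the URL itself, URLs of a large host are no longer grouped together in processing order, while the host-based partition still keeps each host on a single consumer so that per-host politeness can be enforced there.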
Nested classes/interfaces inherited from class Mapper: Mapper.Context
Constructor and Description |
---|
FetcherJob.FetcherMapper() |
Modifier and Type | Method and Description |
---|---|
protected void | map(String key, WebPage page, Mapper.Context context) |
protected void | setup(Mapper.Context context) |
protected void setup(Mapper.Context context)
Overrides: setup in class Mapper<String,WebPage,IntWritable,FetchEntry>
protected void map(String key, WebPage page, Mapper.Context context) throws IOException, InterruptedException
Overrides: map in class Mapper<String,WebPage,IntWritable,FetchEntry>
Throws: IOException, InterruptedException
Copyright © 2015 The Apache Software Foundation