public class WebTableReader extends NutchTool implements Tool
Modifier and Type | Class and Description |
---|---|
static class | WebTableReader.WebTableRegexMapper: Filters the entries from the table based on a regex. |
static class | WebTableReader.WebTableStatCombiner |
static class | WebTableReader.WebTableStatMapper |
static class | WebTableReader.WebTableStatReducer |
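
The regex filtering described for WebTableReader.WebTableRegexMapper can be illustrated with a minimal, standalone sketch. This is not the Nutch implementation: the class `RegexFilterSketch`, the method `filterByRegex`, and the sample URLs are hypothetical, the table is modeled as a plain in-memory map, and it is an assumption that an entry is kept only when its key matches the regex in full.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class RegexFilterSketch {

    // Hypothetical helper (not part of Nutch): keeps only the entries
    // whose key matches the regex in full. Full-match semantics are an
    // assumption made for this sketch.
    public static Map<String, String> filterByRegex(Map<String, String> rows, String regex) {
        Pattern pattern = Pattern.compile(regex);
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> row : rows.entrySet()) {
            if (pattern.matcher(row.getKey()).matches()) {
                kept.put(row.getKey(), row.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample rows standing in for table entries keyed by URL.
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("http://example.org/a", "page a");
        rows.put("http://example.com/b", "page b");
        rows.put("http://example.org/c", "page c");

        // Print the keys that survive the filter.
        System.out.println(filterByRegex(rows, "http://example\\.org/.*").keySet());
    }
}
```

A full match (`matches()`) rather than a partial one (`find()`) is used here so that a pattern such as `http://example\.org/.*` has to describe the whole key; whether the real mapper behaves the same way should be checked against the Nutch source.
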

Modifier and Type | Field and Description |
---|---|
static org.slf4j.Logger | LOG |

Fields inherited from class NutchTool: currentJob, currentJobNum, numJobs, results, status

Constructor and Description |
---|
WebTableReader() |

Modifier and Type | Method and Description |
---|---|
static void | main(String[] args) |
void | processDumpJob(String output, Configuration config, String regex, boolean content, boolean headers, boolean links, boolean text) |
void | processStatJob(boolean sort) |
Map<String,Object> | run(Map<String,Object> args): Runs the tool, using a map of arguments. |
int | run(String[] args) |

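Methods inherited from class NutchTool: getProgress, getStatus, killJob, stopJob
Methods inherited from class Configured: getConf, setConf
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Configurable: getConf, setConf
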
public void processDumpJob(String output, Configuration config, String regex, boolean content, boolean headers, boolean links, boolean text) throws IOException, ClassNotFoundException, InterruptedException
Copyright © 2015 The Apache Software Foundation