Package | Description |
---|---|
org.apache.nutch.api.impl | Implementations of REST API interfaces. |
org.apache.nutch.crawl | Crawl control code and tools to run the crawler. |
org.apache.nutch.fetcher | The Nutch robot. |
org.apache.nutch.indexer | Index content, configure and run indexing and cleaning jobs to add, update, and delete documents from an index. |
org.apache.nutch.parse | The Parse interface and related classes. |

Modifier and Type | Method and Description |
---|---|
NutchTool | JobFactory.createToolByClassName(String className, Configuration conf) |
NutchTool | JobFactory.createToolByType(JobManager.JobType type, Configuration conf) |
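
The signature of `createToolByClassName` suggests that a tool is instantiated from a class name supplied at run time. The sketch below illustrates that general pattern with plain Java reflection. It is not the Nutch implementation: `ToolFactorySketch`, `Tool`, and `EchoTool` are hypothetical stand-ins invented for this example, and the real method also takes a Hadoop `Configuration`, which is omitted here to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

public class ToolFactorySketch {

    /** Hypothetical stand-in for a tool type; NOT the real NutchTool. */
    public interface Tool {
        Map<String, Object> run(Map<String, Object> args) throws Exception;
    }

    /** Hypothetical tool that copies its arguments and tags the result. */
    public static class EchoTool implements Tool {
        @Override
        public Map<String, Object> run(Map<String, Object> args) {
            Map<String, Object> result = new HashMap<>(args);
            result.put("tool", "echo");
            return result;
        }
    }

    /**
     * Loads the named class, checks that it implements Tool, and
     * instantiates it through its public no-argument constructor.
     * Assumption: every tool class exposes such a constructor.
     */
    public static Tool createToolByClassName(String className)
            throws ReflectiveOperationException {
        Class<?> raw = Class.forName(className);
        // asSubclass throws ClassCastException if the class is not a Tool.
        Class<? extends Tool> toolClass = raw.asSubclass(Tool.class);
        return toolClass.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Tool tool = createToolByClassName(EchoTool.class.getName());
        System.out.println(tool.run(new HashMap<>()));
    }
}
```

Checking the loaded class with `asSubclass` before instantiating it means a wrong class name fails with a clear `ClassCastException` rather than at a later, unrelated call site.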

Constructor and Description |
---|
JobWorker(JobConfig jobConfig, Configuration conf, NutchTool tool) |

Modifier and Type | Class and Description |
---|---|
class | DbUpdaterJob |
class | GeneratorJob |
class | InjectorJob: This class takes a flat file of URLs and adds them to the set of pages to be crawled. |
class | WebTableReader: Displays information about the entries of the webtable. |

Modifier and Type | Class and Description |
---|---|
class | FetcherJob: Multi-threaded fetcher. |

Modifier and Type | Class and Description |
---|---|
class | CleaningJob |
class | IndexingJob |

Modifier and Type | Class and Description |
---|---|
class | ParserJob |

Copyright © 2015 The Apache Software Foundation