public class CCIndexingFilter extends Object implements IndexingFilter
Modifier and Type | Field and Description |
---|---|
static String | FIELD The name of the document field we use. |
static org.slf4j.Logger | LOG |

X_POINT_ID
Constructor and Description |
---|
CCIndexingFilter() |
Modifier and Type | Method and Description |
---|---|
void | addUrlFeatures(NutchDocument doc, String urlString) Add the features represented by a license URL. |
NutchDocument | filter(NutchDocument doc, String url, WebPage page) Adds fields or otherwise modifies the document that will be indexed for a parse. |
Configuration | getConf() |
Collection<WebPage.Field> | getFields() |
void | setConf(Configuration conf) |
public static final org.slf4j.Logger LOG
public static String FIELD
public void addUrlFeatures(NutchDocument doc, String urlString)
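The following is a minimal sketch of what "the features represented by a license URL" could look like; it is not the code of this class. It assumes, without confirmation from this page, that features are the path components of the license URL split on `/` and `-`, with the leading component (for example `licenses`) discarded. The class name `LicenseUrlFeatures` and the method `featuresOf` are hypothetical, and the sketch returns a list instead of writing to a NutchDocument.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper, not part of Nutch: derives feature tokens from a license URL.
final class LicenseUrlFeatures {

    // Assumption: features are the URL path components, split on '/' and '-',
    // with the first component (e.g. "licenses") thrown away.
    static List<String> featuresOf(String urlString) {
        List<String> features = new ArrayList<>();
        String path = URI.create(urlString).getPath();
        if (path == null) {
            return features; // opaque URIs have no path
        }
        StringTokenizer tokens = new StringTokenizer(path, "/-");
        if (tokens.hasMoreTokens()) {
            tokens.nextToken(); // skip the leading path component
        }
        while (tokens.hasMoreTokens()) {
            features.add(tokens.nextToken());
        }
        return features;
    }

    public static void main(String[] args) {
        // prints [by, nc, sa, 2.0]
        System.out.println(featuresOf("http://creativecommons.org/licenses/by-nc-sa/2.0/"));
    }
}
```

In an indexing filter, each returned token would presumably be added to the document under the field named by FIELD; that step is omitted here because it depends on the NutchDocument API.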
public void setConf(Configuration conf)
Specified by: setConf in interface Configurable
public Configuration getConf()
Specified by: getConf in interface Configurable
public Collection<WebPage.Field> getFields()
Specified by: getFields in interface FieldPluggable
public NutchDocument filter(NutchDocument doc, String url, WebPage page) throws IndexingException
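Description copied from interface: IndexingFilter
Adds fields or otherwise modifies the document that will be indexed for a parse.
Specified by: filter in interface IndexingFilter
Parameters:
doc - document instance for collecting fields
url - page url
Throws:
IndexingException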
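The filter contract described above (take a document, add or change fields, return it) can be illustrated with a small self-contained sketch. Everything in it is a stand-in: `Doc` is a hypothetical replacement for NutchDocument, `DocFilter` is a simplified two-argument version of the filter method that omits the WebPage parameter and IndexingException, and `HOST_FILTER` is an invented example filter. None of these names exist in Nutch.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: mirrors the shape of an indexing filter without using Nutch types.
final class FilterContractSketch {

    // Hypothetical stand-in for NutchDocument: field name -> list of values.
    static final class Doc {
        final Map<String, List<String>> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    // Hypothetical, simplified filter contract: modify the document and return it.
    interface DocFilter {
        Doc filter(Doc doc, String url);
    }

    // Example filter that adds the URL's host as a "host" field.
    static final DocFilter HOST_FILTER = (doc, url) -> {
        doc.add("host", URI.create(url).getHost());
        return doc;
    };

    public static void main(String[] args) {
        Doc out = HOST_FILTER.filter(new Doc(), "http://example.org/page");
        // prints {host=[example.org]}
        System.out.println(out.fields);
    }
}
```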
Copyright © 2015 The Apache Software Foundation