public class ScoringFilters extends Configured implements ScoringFilter
Creates and caches ScoringFilter implementing plugins.
Fields inherited from interface ScoringFilter: X_POINT_ID
Constructor and Description |
---|
ScoringFilters(Configuration conf) |
Modifier and Type | Method and Description |
---|---|
void | distributeScoreToOutlinks(String fromUrl, WebPage row, Collection<ScoreDatum> scoreData, int allCount): Distribute score value from the current page to all its outlinked pages. |
float | generatorSortValue(String url, WebPage row, float initSort): Calculate a sort value for Generate. |
Collection<WebPage.Field> | getFields() |
float | indexerScore(String url, NutchDocument doc, WebPage row, float initScore): This method calculates a Lucene document boost. |
void | initialScore(String url, WebPage row): Calculate a new initial score, used when adding newly discovered pages. |
void | injectedScore(String url, WebPage row): Calculate a new initial score, used when injecting new pages. |
void | updateScore(String url, WebPage row, List<ScoreDatum> inlinkedScoreData): This method calculates a new score during table update, based on the values contributed by inlinked pages. |
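Methods inherited from class Configured: getConf, setConf
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Configurable: getConf, setConf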
public ScoringFilters(Configuration conf)
public float generatorSortValue(String url, WebPage row, float initSort) throws ScoringFilterException
Specified by: generatorSortValue in interface ScoringFilter
Parameters:
url - url of the page
initSort - initial sort value, or a value from previous filters in chain
Throws:
ScoringFilterException
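The description of initSort implies that filters are applied one after another, each receiving the value produced by the one before it. The following is a minimal sketch of that chaining idea only. SortValueFilter and SortValueChainSketch are hypothetical stand-ins written for this illustration; they are not Nutch types, and how ScoringFilters actually loads and orders its plugins is not shown on this page.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a scoring plugin; NOT the real Nutch ScoringFilter interface.
interface SortValueFilter {
    float generatorSortValue(String url, float initSort);
}

final class SortValueChainSketch {
    // Assumption: filters run in list order, and each filter's result
    // becomes the initSort argument of the next filter.
    static float run(List<SortValueFilter> filters, String url, float initSort) {
        float sort = initSort;
        for (SortValueFilter filter : filters) {
            sort = filter.generatorSortValue(url, sort);
        }
        return sort;
    }

    public static void main(String[] args) {
        List<SortValueFilter> filters = Arrays.<SortValueFilter>asList(
            (url, s) -> s * 2.0f,   // first filter doubles the value
            (url, s) -> s + 1.0f);  // second filter adds one to the doubled value
        System.out.println(run(filters, "http://example.com/", 1.0f)); // prints 3.0
    }
}
```

With an empty filter list the sketch returns initSort unchanged, which is the natural base case for a chain of this kind.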
public void initialScore(String url, WebPage row) throws ScoringFilterException
Specified by: initialScore in interface ScoringFilter
Parameters:
url - url of the page
Throws:
ScoringFilterException
public void injectedScore(String url, WebPage row) throws ScoringFilterException
Specified by: injectedScore in interface ScoringFilter
Parameters:
url - url of the page
row - new page. Filters will modify it in-place.
Throws:
ScoringFilterException
public void distributeScoreToOutlinks(String fromUrl, WebPage row, Collection<ScoreDatum> scoreData, int allCount) throws ScoringFilterException
Description copied from interface: ScoringFilter
Distribute score value from the current page to all its outlinked pages.
Specified by: distributeScoreToOutlinks in interface ScoringFilter
Parameters:
fromUrl - url of the source page
scoreData - A collection of ScoreDatum objects, one for every outlink. These ScoreDatum objects will be passed to updateScore(String, WebPage, List) for every outlinked URL.
allCount - number of all collected outlinks from the source page
Throws:
ScoringFilterException
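To make the roles of scoreData and allCount concrete, here is a minimal sketch of one possible distribution policy: an even split of the source page's score over all collected outlinks. The policy itself is an assumption chosen for illustration, and OutlinkShare and EvenSplitSketch are hypothetical stand-ins, not the Nutch ScoreDatum class or any shipped scoring plugin.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for ScoreDatum: one score contribution aimed at one outlink.
final class OutlinkShare {
    final String toUrl;
    float score;

    OutlinkShare(String toUrl) {
        this.toUrl = toUrl;
    }
}

final class EvenSplitSketch {
    // Assumed policy: divide pageScore by allCount (all collected outlinks),
    // even when scoreData holds only a subset of those outlinks.
    static void distribute(float pageScore, Collection<OutlinkShare> scoreData, int allCount) {
        if (allCount <= 0) {
            return; // nothing to distribute to
        }
        float share = pageScore / allCount;
        for (OutlinkShare datum : scoreData) {
            datum.score = share; // modified in place, like the filters described above
        }
    }

    public static void main(String[] args) {
        List<OutlinkShare> scoreData = new ArrayList<>();
        scoreData.add(new OutlinkShare("http://example.com/a"));
        scoreData.add(new OutlinkShare("http://example.com/b"));
        distribute(1.0f, scoreData, 4); // 4 outlinks collected, 2 kept in scoreData
        System.out.println(scoreData.get(0).score + " " + scoreData.get(1).score); // prints 0.25 0.25
    }
}
```

Dividing by allCount rather than by the size of scoreData means that outlinks dropped before this call still count toward the split; whether a real filter should behave that way is a design decision for that filter.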
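public void updateScore(String url, WebPage row, List<ScoreDatum> inlinkedScoreData) throws ScoringFilterException
Description copied from interface: ScoringFilter
This method calculates a new score during table update, based on the values contributed by inlinked pages.
Specified by: updateScore in interface ScoringFilter
Parameters:
url - url of the page
Throws:
ScoringFilterException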
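As an illustration of a score "based on the values contributed by inlinked pages", the sketch below adds the inlink contributions to the page's current score. The additive rule is an assumption made for this example, and InlinkSumSketch is a hypothetical class that uses plain floats in place of WebPage and ScoreDatum.

```java
import java.util.Arrays;
import java.util.List;

final class InlinkSumSketch {
    // Assumed rule: new score = current score + sum of all inlink contributions.
    static float updatedScore(float currentScore, List<Float> inlinkedContributions) {
        float sum = 0.0f;
        for (float contribution : inlinkedContributions) {
            sum += contribution;
        }
        return currentScore + sum;
    }

    public static void main(String[] args) {
        List<Float> contributions = Arrays.asList(0.25f, 0.25f);
        System.out.println(updatedScore(0.5f, contributions)); // prints 1.0
    }
}
```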
public float indexerScore(String url, NutchDocument doc, WebPage row, float initScore) throws ScoringFilterException
Description copied from interface: ScoringFilter
This method calculates a Lucene document boost.
Specified by: indexerScore in interface ScoringFilter
Parameters:
url - url of the page
doc - document. NOTE: this already contains all information collected by indexing filters. Implementations may modify this instance, in order to store/remove some information.
initScore - initial boost value for the Lucene document.
Throws:
ScoringFilterException
public Collection<WebPage.Field> getFields()
Specified by: getFields in interface FieldPluggable
Copyright © 2015 The Apache Software Foundation