This processor queries Solr and writes results to FlowFiles. It can be used at the beginning of a dataflow or at a later stage. Solr results can be written to FlowFiles as Solr XML or by using record functions (supporting CSV, JSON, etc.). Additionally, facets and stats can be retrieved; they are written to FlowFiles as JSON and sent to designated relationships.
The processor can be configured to retrieve either only the top results or full result sets. Note, however, that this processor is not designed to export large result sets from Solr. If it is configured to return full result sets, the configured number of rows per request is used as the batch size, and the processor iteratively increases the start parameter, returning the results of each request in one FlowFile. The processor stops iterating through results as soon as the start parameter exceeds 10000. To export large result sets, consider using the GetSolr processor instead. In principle, it is also possible to embed this processor in a dataflow that iterates through results by means of the attribute solr.cursor.mark, which is added to FlowFiles for each request. Note that using Solr's cursor mark requires queries to fulfil several preconditions (see the Solr documentation on deep paging for details).
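The paging behaviour described above can be illustrated with a small Python sketch. It is not the processor's implementation: the function name paging_offsets is hypothetical, and the sketch assumes that a request with start equal to 10000 is still issued, which is one reading of "exceeds 10000".

```python
from typing import Iterator

# Upper limit for the start parameter, as stated in the text above.
UPPER_LIMIT_START = 10000


def paging_offsets(total_hits: int, rows: int,
                   upper_limit: int = UPPER_LIMIT_START) -> Iterator[int]:
    """Yield the start offset of each request for a non-empty result set.

    Hypothetical helper: 'rows' plays the role of the configured number of
    rows per request (the batch size); each yielded offset corresponds to
    one request and therefore to one FlowFile.
    """
    if rows <= 0:
        raise ValueError("rows must be positive")
    start = 0
    # Stop once all hits are covered or start exceeds the upper limit.
    while start < total_hits and start <= upper_limit:
        yield start
        start += rows


if __name__ == "__main__":
    # 25 hits with 10 rows per request need three requests.
    print(list(paging_offsets(total_hits=25, rows=10)))
```

Under these assumptions, a query matching 100000 documents with 5000 rows per request would stop after the request starting at 10000, which is why GetSolr or cursor-based iteration is the better fit for large exports.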
The most common Solr parameters can be defined via processor properties. Other parameters have to be set via dynamic properties.
Parameters that can be set multiple times (e.g. fq, facet.field, stats.field) also have to be defined as dynamic properties. If such a parameter must be set multiple times with different values, the property names can follow the naming convention name.number, where name is the parameter name and number is a unique number. Repeating parameters will be sorted by their property name.
Example: Defining the fq parameter multiple times
Property Name | Property Value |
---|---|
fq.1 | field1:value1 |
fq.2 | field2:value2 |
fq.3 | field3:value3 |
This definition will be appended to the Solr URL as follows: fq=field1:value1&fq=field2:value2&fq=field3:value3
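The name.number convention can be sketched in Python as follows. This is an illustration rather than the processor's code: the function to_query_string is hypothetical, and the sketch assumes that property names are sorted as plain strings (so fq.10 would sort before fq.2) and that colons are left unencoded to match the example above.

```python
import re
from typing import Mapping
from urllib.parse import urlencode

# Matches the "name.number" convention, e.g. "fq.1" or "facet.field.2".
_NUMBERED = re.compile(r"^(?P<name>.+)\.(?P<number>\d+)$")


def to_query_string(dynamic_properties: Mapping[str, str]) -> str:
    """Turn dynamic properties into a Solr query string (hypothetical helper).

    Properties are processed in lexicographic order of their names
    (an assumption); a trailing ".number" suffix is stripped so that
    the parameter is repeated under its real name.
    """
    params = []
    for prop_name in sorted(dynamic_properties):
        match = _NUMBERED.match(prop_name)
        param_name = match.group("name") if match else prop_name
        params.append((param_name, dynamic_properties[prop_name]))
    # Keep ":" readable, as in the example URL fragment above.
    return urlencode(params, safe=":")


if __name__ == "__main__":
    props = {
        "fq.2": "field2:value2",
        "fq.1": "field1:value1",
        "fq.3": "field3:value3",
    }
    print(to_query_string(props))
```

The same helper handles any other repeatable parameter written with a numeric suffix, since it only strips the suffix and keeps the remaining name unchanged.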
Facets and stats can be activated by setting the respective Solr parameters as dynamic properties. Example:
Property Name | Property Value |
---|---|
facet | true |
facet.field | fieldname |
stats | true |
stats.field | fieldname |
Multiple fields for facets or stats can be defined in the same way as described for multiple filter queries:
Property Name | Property Value |
---|---|
facet | true |
facet.field.1 | firstField |
facet.field.2 | secondField |
This definition will be appended to the Solr URL as follows: facet=true&facet.field=firstField&facet.field=secondField