public class RegexpQueryHandler extends java.lang.Object implements CustomQueryHandler
This implementation will filter out more wildcard queries than TermFilteredPresearcher, at the expense of longer document build times. Which one is more performant will depend on the type and number of queries registered in the Monitor, and the size of documents to be monitored. Profiling is recommended.
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_MAX_TOKEN_SIZE - The default maximum length of an input token before wildcard tokens are generated |
static java.lang.String | DEFAULT_NGRAM_SUFFIX - The default suffix with which to mark ngrams |
static java.lang.String | DEFAULT_WILDCARD_TOKEN - The default token to emit if a term is longer than the maximum token size |
private java.util.Set<java.lang.String> | excludedFields |
private int | maxTokenSize |
private java.lang.String | ngramSuffix |
private java.lang.String | wildcardToken |
private BytesRef | wildcardTokenBytes |
Constructor and Description |
---|
RegexpQueryHandler() - Creates a new RegexpQueryHandler using default settings |
RegexpQueryHandler(int maxTokenSize) - Creates a new RegexpQueryHandler with a maximum token size |
RegexpQueryHandler(java.lang.String ngramSuffix, int maxTokenSize, java.lang.String wildcardToken, java.util.Set<java.lang.String> excludedFields) - Creates a new RegexpQueryHandler |
Modifier and Type | Method and Description |
---|---|
QueryTree | handleQuery(Query q, TermWeightor termWeightor) - Builds a QueryTree node from a query |
private static java.lang.String | parseOutRegexp(java.lang.String rep) |
private static java.lang.String | selectLongestSubstring(java.lang.String regexp) |
TokenStream | wrapTermStream(java.lang.String field, TokenStream ts) - Adds additional processing to the TokenStream over a document's terms index |
public static final java.lang.String DEFAULT_NGRAM_SUFFIX
public static final int DEFAULT_MAX_TOKEN_SIZE
public static final java.lang.String DEFAULT_WILDCARD_TOKEN
private final java.lang.String ngramSuffix
private final java.lang.String wildcardToken
private final BytesRef wildcardTokenBytes
private final int maxTokenSize
private final java.util.Set<java.lang.String> excludedFields
public RegexpQueryHandler(java.lang.String ngramSuffix, int maxTokenSize, java.lang.String wildcardToken, java.util.Set<java.lang.String> excludedFields)
ngramSuffix - the suffix with which to mark ngrams
maxTokenSize - the maximum length of an input token before WILDCARD tokens are generated
wildcardToken - the token to emit if a token is longer than maxTokenSize in length
excludedFields - a Set of fields to ignore when generating ngrams
public RegexpQueryHandler()
public RegexpQueryHandler(int maxTokenSize)
maxTokenSize - the maximum length of an input token before WILDCARD tokens are generated
public TokenStream wrapTermStream(java.lang.String field, TokenStream ts)
Description copied from interface CustomQueryHandler: adds additional processing to the TokenStream over a document's terms index.
Specified by: wrapTermStream in interface CustomQueryHandler
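To make the roles of ngramSuffix, maxTokenSize, wildcardToken and excludedFields more concrete, the following is a minimal sketch of how a single document token might be expanded. It is not the library's implementation: the class name, the `expand` method, the constant values, and the choice to emit every contiguous substring are all assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class NgramSketch {
    // Hypothetical values chosen for illustration; not the library's defaults.
    static final String NGRAM_SUFFIX = "XX";
    static final String WILDCARD_TOKEN = "__WILDCARD__";
    static final int MAX_TOKEN_SIZE = 8;

    /** Returns the terms a single document token might expand to (assumed behaviour). */
    static Set<String> expand(String field, String token, Set<String> excludedFields) {
        Set<String> out = new LinkedHashSet<>();
        out.add(token); // the original token is kept as-is
        if (excludedFields.contains(field)) {
            return out; // excluded fields get no ngrams
        }
        if (token.length() > MAX_TOKEN_SIZE) {
            out.add(WILDCARD_TOKEN); // too long: emit one catch-all token instead of ngrams
            return out;
        }
        // Emit every contiguous substring, marked with the suffix so that
        // ngrams cannot be confused with ordinary terms.
        for (int start = 0; start < token.length(); start++) {
            for (int end = start + 1; end <= token.length(); end++) {
                out.add(token.substring(start, end) + NGRAM_SUFFIX);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(expand("body", "cat", Set.of()));
        System.out.println(expand("body", "supercalifragilistic", Set.of()));
        System.out.println(expand("id", "cat", Set.of("id")));
    }
}
```

Under these assumptions a token of length n produces up to n(n+1)/2 ngrams, which is one way to picture why document build times grow and why a cap such as maxTokenSize is useful.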
public QueryTree handleQuery(Query q, TermWeightor termWeightor)
Description copied from interface CustomQueryHandler: builds a QueryTree node from a query.
Specified by: handleQuery in interface CustomQueryHandler
private static java.lang.String parseOutRegexp(java.lang.String rep)
private static java.lang.String selectLongestSubstring(java.lang.String regexp)
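The name selectLongestSubstring suggests picking the longest literal fragment of a regular expression to use as an indexable term. The sketch below shows one conservative way to do that; it is an assumption about the general technique, not the body of the private method. The class and method names are hypothetical, and the sketch only models the `.`, `*`, `+` and `?` operators, returning an empty string for any other regexp syntax.

```java
class LongestSubstringSketch {
    /**
     * Returns the longest run of characters that must appear literally in any
     * match of the given regexp, or "" if none can be determined.
     */
    static String longestLiteralRun(String regexp) {
        // Bail out on syntax this sketch does not model (alternation, groups,
        // character classes, escapes, anchors).
        for (char c : regexp.toCharArray()) {
            if ("|()[]{}\\^$".indexOf(c) >= 0) {
                return "";
            }
        }
        String best = "";
        StringBuilder run = new StringBuilder();
        for (char c : regexp.toCharArray()) {
            if (c == '?' || c == '*') {
                // The preceding character may be absent, so drop it from the run.
                if (run.length() > 0) {
                    run.setLength(run.length() - 1);
                }
                best = longer(best, run);
                run.setLength(0);
            } else if (c == '.' || c == '+') {
                // The run ends here; whatever was collected is still required.
                best = longer(best, run);
                run.setLength(0);
            } else {
                run.append(c);
            }
        }
        return longer(best, run);
    }

    private static String longer(String best, StringBuilder run) {
        return run.length() > best.length() ? run.toString() : best;
    }

    public static void main(String[] args) {
        System.out.println(longestLiteralRun("hel.*world")); // world
        System.out.println(longestLiteralRun("colou?r"));    // colo
    }
}
```

A handler built along these lines could append the ngram suffix to the selected fragment and look it up among the suffixed ngrams generated from document tokens; when no fragment can be extracted, it would need some fallback, which this sketch leaves open.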