www.openlinksw.com
docs.openlinksw.com

Book Home

Contents
Preface

Free Text Search

Basic Concepts
Creating Free Text Indexes
Querying Free Text Indexes
CONTAINS predicate Comments Text Expression Syntax
Text Triggers
Generated Tables and Internals
Removing A Text Index
Removing A Text Trigger
Internationalization & Unicode
Performance
Free Text Functions

18.3. Querying Free Text Indexes

18.3.1. CONTAINS predicate

Returns TRUE if a free text indexed column matches a text expression.

Syntax

contains_pred:
	contains (column, expr, opt_or_value ....)

opt_or_value:
	  DESCENDING
	  | START_ID ',' scalar_exp
	  | END_ID ',' scalar_exp
	  | SCORE_LIMIT ',' scalar_exp
	  | RANGES ',' variable
	  | OFFBAND column

variable: IDENTIFIER

The column must refer to a column for which there exists a free text index. The expr must be a narrow or wide string expression whose syntax matches the rules in 'Text Query Syntax'. The START_ID is the first allowed document ID to be selected by the expression in its traversal order, e.g. least or equal for ascending and greatest or equal for descending. END_ID is the last allowed ID in the traversal order. For descending order the START_ID must be >= END_ID for hits to be able to exist. For ascending order the START_ID must be <= END_ID for hits to be able to exist.

DESCENDING specifies that the search will produce the hit with the greatest ID first, as defined by integer or composite collation. RANGES specifies that the query variable following the RANGES keyword will be bound to the word position ranges of the hits of the expression inside the document. The variable is in scope inside the enclosing SELECT statement.

SCORE_LIMIT specifies a minimum score that hits must have or exceed to be considered matches of the predicate.

OFFBAND specifies that the following column will be retrieved from the free text index instead of the actual table. For this to be possible the column must have been declared as offband with the CLUSTERED WITH option of the CREATE TEXT INDEX statement.

See Also:

contains()

The XCONTAINS Predicate.


18.3.2. Comments

Order - If the select statement containing the contains predicate does not specify an exact match of the primary key of the table having the contains predicate, then the contains predicate will be the 'driving' condition, meaning that rows come in ascending or descending order of the free text document ID.

The DESCENDING keyword specifies the descending order of the free text index document ID and has nothing to do with a possible ORDER BY of the enclosing statement. Even if there is an ORDER BY in the enclosing statement the DESCENDING keyword of contains has an effect in the interpretation of the STRT_ID and END_ID contains options.

If there is a full equality match of the primary key of the table, this will be the driving predicate and contains will only be used to check if the text expression matches the single row identified by the full match of the primary key.

The contains predicate may not appear outside of a select statement and may only reference a column for which a free text index has been declared. The first argument must be a column for which there is such an index. The text expression may be variable and computed, although it must be constant during the evaluation of the select statement containing it.

The contains predicate must be a part of the top level AND of the WHERE clause of the containing select. It may not for example be a term of an OR predicate in the select but can be AND'ed with an OR expression.


18.3.3. Text Expression Syntax

expr ::= proximity_expr
	expr AND expr
	| expr OR expr
	| expr AND NOT  expr
	| '(' expr ')'

word_expr ::=
	  word
	| '"' phrase '"'

proximity_expr ::=
	  word_expr
	| proximity_expr NEAR word_expr

word ::=
	<word char>*

phrase ::=
	  word
	| phrase <whitespace> word

word_char ::=  alphanumeric characters, '*',  ISO Latin accented characters.

A word is a sequence of word characters. A phrase is a sequence of words separated by white spaces and enclosed in double quotes. If a word contains a wildcard character it must be quoted with double quotes.

Note:

An expression may not consist of all negative terms, e.g. (not a) and (not b) is not a valid expression but 'c and not a and not b' is a valid expression.

Note that the NEAR connective may not be used between AND'ed or OR'ed terms. It can be used to combine words or phrases.

Querying Free Text Indexed Columns
select count (*) from docs
where contains (text, '"virtual database"')

returns the count of documents with one or more occurrences of "virtual" immediately followed by "database".

'performance and (tuning or optimization)'

specifies documents with performance and either 'tuning' or optimization' in any respective positions.

'graphics and not (graphics near user near interface)'

matches documents with the word graphics more than 100 words away from 'user' or 'interface'.

'"sql interfac*"'

matches documents with SQL followed by a word beginning with 'interfac'.

'"dragon*" and not "once upon a time"'

matches documents with words beginning with 'dragon' and not containing the phrase 'once upon a time'.