<type>tsvector</type> representation
of a document — the original text need only be retrieved
when the document has been selected for display to a user.
We therefore often speak of the <type>tsvector</type> as being the
document, but of course it is only a compact representation of
the full document.
</para>
</sect2>
<sect2 id="textsearch-matching">
<title>Basic Text Matching</title>
<para>
Full text searching in <productname>PostgreSQL</productname> is based on
the match operator <literal>@@</literal>, which returns
<literal>true</literal> if a <type>tsvector</type>
(document) matches a <type>tsquery</type> (query).
It doesn't matter which data type is written first:
<programlisting>
SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector @@ 'cat &amp; rat'::tsquery;
 ?column?
----------
 t

SELECT 'fat &amp; cow'::tsquery @@ 'a fat cat sat on a mat and ate a fat rat'::tsvector;
 ?column?
----------
 f
</programlisting>
</para>
<para>
As the above example suggests, a <type>tsquery</type> is not just raw
text, any more than a <type>tsvector</type> is. A <type>tsquery</type>
contains search terms, which must be already-normalized lexemes, and
may combine multiple terms using AND, OR, NOT, and FOLLOWED BY operators.
(For syntax details see <xref linkend="datatype-tsquery"/>.) There are
functions <function>to_tsquery</function>, <function>plainto_tsquery</function>,
and <function>phraseto_tsquery</function>
that are helpful in converting user-written text into a proper
<type>tsquery</type>, primarily by normalizing words appearing in
the text. Similarly, <function>to_tsvector</function> is used to parse and
normalize a document string. So in practice a text search match would
look more like this:
<programlisting>
SELECT to_tsvector('fat cats ate fat rats') @@ to_tsquery('fat &amp; rat');
 ?column?
----------
 t
</programlisting>
Observe that this match would not succeed if written as
<programlisting>
SELECT 'fat cats ate fat rats'::tsvector @@ to_tsquery('fat &amp; rat');
 ?column?
----------
 f
</programlisting>
since here no normalization of the word <literal>rats</literal> will occur.
The elements of a <type>tsvector</type> are lexemes, which are assumed
already normalized, so <literal>rats</literal> does not match <literal>rat</literal>.
</para>
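<para>
To see why the normalized form matches, it can help to inspect what the two
conversion functions produce. The following is an illustrative sketch that
assumes the built-in <literal>english</literal> configuration, named explicitly
here so that the result does not depend on the session's default configuration:
<programlisting>
SELECT to_tsvector('english', 'fat cats ate fat rats');
            to_tsvector
-----------------------------------
 'ate':3 'cat':2 'fat':1,4 'rat':5

SELECT to_tsquery('english', 'fat &amp; rats');
  to_tsquery
---------------
 'fat' &amp; 'rat'
</programlisting>
Both the document word <literal>rats</literal> and the query word
<literal>rats</literal> are reduced to the lexeme <literal>rat</literal>,
which is why the match succeeds once both sides are normalized.
</para>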
<para>
The <literal>@@</literal> operator also
supports <type>text</type> input, allowing explicit conversion of a text
string to <type>tsvector</type> or <type>tsquery</type> to be skipped
in simple cases. The variants available are:
<programlisting>
tsvector @@ tsquery
tsquery @@ tsvector
text @@ tsquery
text @@ text
</programlisting>
</para>
<para>
The first two of these we saw already.
In the form <type>text</type> <literal>@@</literal> <type>tsquery</type>,
writing <literal>x</literal> for the left operand and <literal>y</literal>
for the right, the match is equivalent to <literal>to_tsvector(x) @@ y</literal>.
Likewise, the form <type>text</type> <literal>@@</literal> <type>text</type>
is equivalent to <literal>to_tsvector(x) @@ plainto_tsquery(y)</literal>.
</para>
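<para>
As an illustration of these equivalences, the three queries below should
behave alike. This is a sketch that assumes the default text search
configuration is an English one; the explicit <type>text</type> casts are
included only to make the chosen operator variant unambiguous:
<programlisting>
-- text @@ tsquery: the left operand is passed through to_tsvector
SELECT 'fat cats ate fat rats'::text @@ to_tsquery('fat &amp; rat');

-- text @@ text: the right operand is passed through plainto_tsquery
SELECT 'fat cats ate fat rats'::text @@ 'fat rat'::text;

-- the fully explicit spelling of the previous query
SELECT to_tsvector('fat cats ate fat rats') @@ plainto_tsquery('fat rat');
</programlisting>
Under that assumption, each query normalizes <literal>rats</literal> to
<literal>rat</literal> on the document side and therefore reports a match.
</para>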
<para>
Within a <type>tsquery</type>, the <literal>&amp;</literal> (AND) operator
specifies that both its arguments must appear in the document to have a
match. Similarly, the <literal>|</literal> (OR) operator specifies that
at least one of its arguments must appear, while the <literal>!</literal> (NOT)
operator specifies that its argument must <emphasis>not</emphasis> appear in
order to have a match.
For example, the query <literal>fat &amp; ! rat</literal> matches documents that
contain <literal>fat</literal> but not <literal>rat</literal>.
</para>
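<para>
The behavior of these operators can be checked directly. The queries below
are a small sketch using made-up documents that are cast straight to
<type>tsvector</type>, so the words are taken as already-normalized lexemes
and no text search configuration is involved:
<programlisting>
-- fat is present and rat is absent, so the query matches
SELECT 'fat cat'::tsvector @@ 'fat &amp; ! rat'::tsquery;
 ?column?
----------
 t

-- rat is present, so the NOT term rejects the document
SELECT 'fat rat'::tsvector @@ 'fat &amp; ! rat'::tsquery;
 ?column?
----------
 f

-- only one of the OR arguments needs to appear
SELECT 'fat cat'::tsvector @@ 'cow | cat'::tsquery;
 ?column?
----------
 t
</programlisting>
</para>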
<para>
Searching for phrases is possible with the help of
the <literal>&lt;-&gt;</literal> (FOLLOWED BY) <type>tsquery</type> operator, which
matches only if its arguments have matches that are adjacent and in the
given order.