postgresql

4th chunk of `doc/src/sgml/textsearch.sgml`
 <type>tsvector</type> representation
    of a document &mdash; the original text need only be retrieved
    when the document has been selected for display to a user.
    We therefore often speak of the <type>tsvector</type> as being the
    document, but of course it is only a compact representation of
    the full document.
   </para>
  </sect2>

  <sect2 id="textsearch-matching">
   <title>Basic Text Matching</title>

   <para>
    Full text searching in <productname>PostgreSQL</productname> is based on
    the match operator <literal>@@</literal>, which returns
    <literal>true</literal> if a <type>tsvector</type>
    (document) matches a <type>tsquery</type> (query).
    It doesn't matter which data type is written first:

<programlisting>
SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector @@ 'cat &amp; rat'::tsquery;
 ?column?
----------
 t

SELECT 'fat &amp; cow'::tsquery @@ 'a fat cat sat on a mat and ate a fat rat'::tsvector;
 ?column?
----------
 f
</programlisting>
   </para>

   <para>
    As the above example suggests, a <type>tsquery</type> is not just raw
    text, any more than a <type>tsvector</type> is.  A <type>tsquery</type>
    contains search terms, which must be already-normalized lexemes, and
    may combine multiple terms using AND, OR, NOT, and FOLLOWED BY operators.
    (For syntax details see <xref linkend="datatype-tsquery"/>.)  The
    functions <function>to_tsquery</function>, <function>plainto_tsquery</function>,
    and <function>phraseto_tsquery</function>
    convert user-written text into a proper
    <type>tsquery</type>, primarily by normalizing the words appearing in
    the text.  Similarly, <function>to_tsvector</function> is used to parse and
    normalize a document string.  So in practice a text search match would
    look more like this:

<programlisting>
SELECT to_tsvector('fat cats ate fat rats') @@ to_tsquery('fat &amp; rat');
 ?column?
----------
 t
</programlisting>

    Observe that this match would not succeed if written as

<programlisting>
SELECT 'fat cats ate fat rats'::tsvector @@ to_tsquery('fat &amp; rat');
 ?column?
----------
 f
</programlisting>

    because casting a string directly to <type>tsvector</type> performs no
    normalization of the word <literal>rats</literal>.
    The elements of a <type>tsvector</type> are lexemes, which are assumed
    to be already normalized, so <literal>rats</literal> does not match <literal>rat</literal>.
   </para>
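
   <para>
    As an illustrative sketch of the normalization described above, the
    following queries show the lexemes each conversion produces.  The
    explicit <literal>english</literal> configuration argument is an
    assumption made here so that the results do not depend on
    <varname>default_text_search_config</varname>:

<programlisting>
SELECT to_tsvector('english', 'fat cats ate fat rats');
            to_tsvector
-----------------------------------
 'ate':3 'cat':2 'fat':1,4 'rat':5

SELECT 'fat cats ate fat rats'::tsvector;
         tsvector
---------------------------
 'ate' 'cats' 'fat' 'rats'

SELECT to_tsquery('english', 'fat &amp; rats');
  to_tsquery
---------------
 'fat' &amp; 'rat'

SELECT plainto_tsquery('english', 'fat rats');
 plainto_tsquery
-----------------
 'fat' &amp; 'rat'

SELECT phraseto_tsquery('english', 'fat rats');
 phraseto_tsquery
------------------
 'fat' &lt;-&gt; 'rat'
</programlisting>

    Only <function>to_tsvector</function> reduces <literal>cats</literal> and
    <literal>rats</literal> to the lexemes <literal>cat</literal> and
    <literal>rat</literal>; the plain cast stores the words exactly as written.
   </para>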

   <para>
    The <literal>@@</literal> operator also
    supports <type>text</type> input, so that explicit conversion of a text
    string to <type>tsvector</type> or <type>tsquery</type> can be skipped
    in simple cases.  The variants available are:

<programlisting>
tsvector @@ tsquery
tsquery  @@ tsvector
text @@ tsquery
text @@ text
</programlisting>
   </para>

   <para>
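    The first two of these appear in the examples above.
    The form <type>text</type> <literal>@@</literal> <type>tsquery</type>
    is equivalent to <literal>to_tsvector(x) @@ y</literal>, where
    <literal>x</literal> is the <type>text</type> operand and
    <literal>y</literal> is the <type>tsquery</type> operand.
    The form <type>text</type> <literal>@@</literal> <type>text</type>
    is equivalent to <literal>to_tsvector(x) @@ plainto_tsquery(y)</literal>,
    with <literal>x</literal> and <literal>y</literal> the left and right operands.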
   </para>
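
   <para>
    A brief sketch of the two <type>text</type> forms follows.  It assumes
    that <varname>default_text_search_config</varname> selects an English
    configuration, because these forms give no way to name a configuration;
    the explicit casts to <type>text</type> are included only to make the
    chosen operator variant unambiguous:

<programlisting>
SELECT 'fat cats ate fat rats'::text @@ to_tsquery('fat &amp; rat');
 ?column?
----------
 t

SELECT 'fat cats ate fat rats'::text @@ 'fat rats'::text;
 ?column?
----------
 t

SELECT to_tsvector('fat cats ate fat rats') @@ plainto_tsquery('fat rats');
 ?column?
----------
 t
</programlisting>

    The last query spells out the conversions that the
    <type>text</type> <literal>@@</literal> <type>text</type> form performs
    implicitly.
   </para>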

   <para>
    Within a <type>tsquery</type>, the <literal>&amp;</literal> (AND) operator
    specifies that both its arguments must appear in the document to have a
    match.  Similarly, the <literal>|</literal> (OR) operator specifies that
    at least one of its arguments must appear, while the <literal>!</literal> (NOT)
    operator specifies that its argument must <emphasis>not</emphasis> appear in
    order to have a match.
    For example, the query <literal>fat &amp; ! rat</literal> matches documents that
    contain <literal>fat</literal> but not <literal>rat</literal>.
   </para>
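
   <para>
    The three Boolean operators can be sketched against a single sample
    document.  The document text and the explicit <literal>english</literal>
    configuration are assumptions chosen for illustration:

<programlisting>
SELECT to_tsvector('english', 'fat cats ate fat rats') @@ to_tsquery('english', 'fat &amp; cow');
 ?column?
----------
 f

SELECT to_tsvector('english', 'fat cats ate fat rats') @@ to_tsquery('english', 'cow | rat');
 ?column?
----------
 t

SELECT to_tsvector('english', 'fat cats ate fat rats') @@ to_tsquery('english', 'fat &amp; ! rat');
 ?column?
----------
 f

SELECT to_tsvector('english', 'fat cats ate fat rats') @@ to_tsquery('english', 'fat &amp; ! cow');
 ?column?
----------
 t
</programlisting>
   </para>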

   <para>
    Searching for phrases is possible with the help of
    the <literal>&lt;-&gt;</literal> (FOLLOWED BY) <type>tsquery</type> operator, which
    matches only if its arguments have matches that are adjacent and in the
    given order.
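
    As a minimal sketch of this behavior (the sample document and the explicit
    <literal>english</literal> configuration are assumptions chosen for
    illustration), the lexeme <literal>fat</literal> occurs at positions 1 and 4
    and <literal>rat</literal> at position 5, so only the first query below
    finds an adjacent pair in the requested order:

<programlisting>
SELECT to_tsvector('english', 'fat cats ate fat rats') @@ to_tsquery('english', 'fat &lt;-&gt; rat');
 ?column?
----------
 t

SELECT to_tsvector('english', 'fat cats ate fat rats') @@ to_tsquery('english', 'rat &lt;-&gt; fat');
 ?column?
----------
 f
</programlisting>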

Title: Basic Text Matching in PostgreSQL
Summary
PostgreSQL's full-text search relies on the @@ operator, which checks if a tsvector (document) matches a tsquery (query). A tsquery contains normalized search terms combined with AND, OR, NOT, and FOLLOWED BY operators. Functions like to_tsquery, plainto_tsquery, phraseto_tsquery, and to_tsvector are used to convert text into tsquery and tsvector formats. The @@ operator also supports text input, implicitly converting it to tsvector or tsquery. The & (AND), | (OR), and ! (NOT) operators are used to combine search terms. The <-> (FOLLOWED BY) operator searches for phrases with adjacent and ordered matches.