Home Explore Blog CI



postgresql

7th chunk of `doc/src/sgml/textsearch.sgml`
379f3235f3f5f21085d57fa9b51c3f71464bcc0283b4d72c0000000100000fa3
 objects.
    <productname>PostgreSQL</productname>'s text search facility provides
    four types of configuration-related database objects:
   </para>

  <itemizedlist  spacing="compact" mark="bullet">
   <listitem>
    <para>
     <firstterm>Text search parsers</firstterm> break documents into tokens
     and classify each token (for example, as words or numbers).
    </para>
   </listitem>

   <listitem>
    <para>
     <firstterm>Text search dictionaries</firstterm> convert tokens to normalized
     form and reject stop words.
    </para>
   </listitem>

   <listitem>
    <para>
     <firstterm>Text search templates</firstterm> provide the functions underlying
     dictionaries.  (A dictionary simply specifies a template and a set
     of parameters for the template.)
    </para>
   </listitem>

   <listitem>
    <para>
     <firstterm>Text search configurations</firstterm> select a parser and a set
     of dictionaries to use to normalize the tokens produced by the parser.
    </para>
   </listitem>
  </itemizedlist>

   <para>
    Text search parsers and templates are built from low-level C functions;
    therefore it requires C programming ability to develop new ones, and
    superuser privileges to install one into a database.  (There are examples
    of add-on parsers and templates in the <filename>contrib/</filename> area of the
    <productname>PostgreSQL</productname> distribution.)  Since dictionaries and
    configurations just parameterize and connect together some underlying
    parsers and templates, no special privilege is needed to create a new
    dictionary or configuration.  Examples of creating custom dictionaries and
    configurations appear later in this chapter.
   </para>

  </sect2>

 </sect1>

 <sect1 id="textsearch-tables">
  <title>Tables and Indexes</title>

  <para>
   The examples in the previous section illustrated full text matching using
   simple constant strings.  This section shows how to search table data,
   optionally using indexes.
  </para>

  <sect2 id="textsearch-tables-search">
   <title>Searching a Table</title>

   <para>
    It is possible to do a full text search without an index.  A simple query
    to print the <structname>title</structname> of each row that contains the word
    <literal>friend</literal> in its <structfield>body</structfield> field is:

<programlisting>
SELECT title
FROM pgweb
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');
</programlisting>

    This will also find related words such as <literal>friends</literal>
    and <literal>friendly</literal>, since all these are reduced to the same
    normalized lexeme.
   </para>

   <para>
    The query above specifies that the <literal>english</literal> configuration
    is to be used to parse and normalize the strings.  Alternatively we
    could omit the configuration parameters:

<programlisting>
SELECT title
FROM pgweb
WHERE to_tsvector(body) @@ to_tsquery('friend');
</programlisting>

    This query will use the configuration set by <xref
    linkend="guc-default-text-search-config"/>.
   </para>

   <para>
    A more complex example is to
    select the ten most recent documents that contain <literal>create</literal> and
    <literal>table</literal> in the <structname>title</structname> or <structname>body</structname>:

<programlisting>
SELECT title
FROM pgweb
WHERE to_tsvector(title || ' ' || body) @@ to_tsquery('create &amp; table')
ORDER BY last_mod_date DESC
LIMIT 10;
</programlisting>

    For clarity we omitted the <function>coalesce</function> function calls
    which would be needed to find rows that contain <literal>NULL</literal>
    in one of the two fields.
   </para>

   <para>
    Although these queries will work without an index, most applications
    will find this approach too slow, except perhaps for occasional ad-hoc
    searches.  Practical use of text searching usually requires creating
    an index.
   </para>

  </sect2>

  <sect2 id="textsearch-tables-index">

Title: Text Search Components and Table Searching in PostgreSQL
Summary
PostgreSQL's text search uses parsers, dictionaries, templates, and configurations. Parsers and templates require C programming and superuser privileges to create. Dictionaries and configurations can be created without special privileges. Searching tables involves using to_tsvector and to_tsquery functions with or without specifying a configuration. Indexing is necessary for practical use.