Home Explore Blog CI



postgresql

8th chunk of `doc/src/sgml/textsearch.sgml`
a3d36c9a1dd686863ebc6259b5ab20880fb5f43d2be58bad0000000100000fa0
 use the configuration set by <xref
    linkend="guc-default-text-search-config"/>.
   </para>

   <para>
    A more complex example is to
    select the ten most recent documents that contain <literal>create</literal> and
    <literal>table</literal> in the <structname>title</structname> or <structname>body</structname>:

<programlisting>
SELECT title
FROM pgweb
WHERE to_tsvector(title || ' ' || body) @@ to_tsquery('create &amp; table')
ORDER BY last_mod_date DESC
LIMIT 10;
</programlisting>

    For clarity we omitted the <function>coalesce</function> function calls
    which would be needed to find rows that contain <literal>NULL</literal>
    in one of the two fields.
   </para>

   <para>
    Although these queries will work without an index, most applications
    will find this approach too slow, except perhaps for occasional ad-hoc
    searches.  Practical use of text searching usually requires creating
    an index.
   </para>

  </sect2>

  <sect2 id="textsearch-tables-index">
   <title>Creating Indexes</title>

   <para>
    We can create a <acronym>GIN</acronym> index (<xref
    linkend="textsearch-indexes"/>) to speed up text searches:

<programlisting>
CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector('english', body));
</programlisting>

    Notice that the 2-argument version of <function>to_tsvector</function> is
    used.  Only text search functions that specify a configuration name can
    be used in expression indexes (<xref linkend="indexes-expressional"/>).
    This is because the index contents must be unaffected by <xref
    linkend="guc-default-text-search-config"/>.  If they were affected, the
    index contents might be inconsistent because different entries could
    contain <type>tsvector</type>s that were created with different text search
    configurations, and there would be no way to guess which was which.  It
    would be impossible to dump and restore such an index correctly.
   </para>

   <para>
    Because the two-argument version of <function>to_tsvector</function> was
    used in the index above, only a query reference that uses the 2-argument
    version of <function>to_tsvector</function> with the same configuration
    name will use that index.  That is, <literal>WHERE
    to_tsvector('english', body) @@ 'a &amp; b'</literal> can use the index,
    but <literal>WHERE to_tsvector(body) @@ 'a &amp; b'</literal> cannot.
    This ensures that an index will be used only with the same configuration
    used to create the index entries.
   </para>

  <para>
    It is possible to set up more complex expression indexes wherein the
    configuration name is specified by another column, e.g.:

<programlisting>
CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector(config_name, body));
</programlisting>

    where <literal>config_name</literal> is a column in the <literal>pgweb</literal>
    table.  This allows mixed configurations in the same index while
    recording which configuration was used for each index entry.  This
    would be useful, for example, if the document collection contained
    documents in different languages.  Again,
    queries that are meant to use the index must be phrased to match, e.g.,
    <literal>WHERE to_tsvector(config_name, body) @@ 'a &amp; b'</literal>.
   </para>

   <para>
    Indexes can even concatenate columns:

<programlisting>
CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector('english', title || ' ' || body));
</programlisting>
   </para>

   <para>
    Another approach is to create a separate <type>tsvector</type> column
    to hold the output of <function>to_tsvector</function>.  To keep this
    column automatically up to date with its source data, use a stored
    generated column.  This example is a
    concatenation of <literal>title</literal> and <literal>body</literal>,
    using <function>coalesce</function> to ensure that one field will still be
    indexed when the other is <literal>NULL</literal>:

<programlisting>
ALTER TABLE pgweb

Title: Creating Text Search Indexes in PostgreSQL
Summary
Creating GIN indexes speeds up text searches. Expression indexes must use the 2-argument version of to_tsvector, specifying the configuration. Indexes can be more complex, using a configuration name from another column or concatenating columns. Alternatively, a separate tsvector column can be maintained automatically using a stored generated column.