postgresql

43rd chunk of `doc/src/sgml/textsearch.sgml`
         |
 asciiword | Word, all ASCII | it    | {english_stem} | english_stem | {}
 blank     | Space symbols   |       | {}             |              |
 asciiword | Word, all ASCII | ate   | {english_stem} | english_stem | {ate}
 blank     | Space symbols   |       | {}             |              |
 asciiword | Word, all ASCII | a     | {english_stem} | english_stem | {}
 blank     | Space symbols   |       | {}             |              |
 asciiword | Word, all ASCII | fat   | {english_stem} | english_stem | {fat}
 blank     | Space symbols   |       | {}             |              |
 asciiword | Word, all ASCII | rats  | {english_stem} | english_stem | {rat}
</screen>
  </para>

  <para>
   For a more extensive demonstration, we
   first create a <literal>public.english</literal> configuration and
   an Ispell dictionary for the English language:
  </para>

<programlisting>
CREATE TEXT SEARCH CONFIGURATION public.english ( COPY = pg_catalog.english );

CREATE TEXT SEARCH DICTIONARY english_ispell (
    TEMPLATE = ispell,
    DictFile = english,
    AffFile = english,
    StopWords = english
);

ALTER TEXT SEARCH CONFIGURATION public.english
   ALTER MAPPING FOR asciiword WITH english_ispell, english_stem;
</programlisting>

<screen>
SELECT * FROM ts_debug('public.english', 'The Brightest supernovaes');
   alias   |   description   |    token    |         dictionaries          |   dictionary   |   lexemes
-----------+-----------------+-------------+-------------------------------+----------------+-------------
 asciiword | Word, all ASCII | The         | {english_ispell,english_stem} | english_ispell | {}
 blank     | Space symbols   |             | {}                            |                |
 asciiword | Word, all ASCII | Brightest   | {english_ispell,english_stem} | english_ispell | {bright}
 blank     | Space symbols   |             | {}                            |                |
 asciiword | Word, all ASCII | supernovaes | {english_ispell,english_stem} | english_stem   | {supernova}
</screen>

  <para>
   In this example, the word <literal>Brightest</literal> was recognized by the
   parser as an <literal>ASCII word</literal> (alias <literal>asciiword</literal>).
   For this token type the dictionary list is
   <literal>english_ispell</literal> and
   <literal>english_stem</literal>. The word was recognized by
   <literal>english_ispell</literal>, which reduced it to its base form
   <literal>bright</literal>. The word <literal>supernovaes</literal> is
   unknown to the <literal>english_ispell</literal> dictionary, so it
   was passed to the next dictionary, <literal>english_stem</literal>,
   which recognized it. In fact, <literal>english_stem</literal> is a
   Snowball dictionary, which recognizes everything; that is why it was
   placed at the end of the dictionary list.
  </para>
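
  <para>
   As a rough illustration of this fall-through, each dictionary can also be
   probed on its own with <function>ts_lexize</function>. The sketch below
   assumes that the <literal>english_ispell</literal> dictionary created
   above is installed. A dictionary that does not recognize a token returns
   <literal>NULL</literal>, which is what lets the next dictionary in the
   list take over:
  </para>

<programlisting>
-- Known to the Ispell dictionary, so the chain would stop here.
SELECT ts_lexize('english_ispell', 'Brightest');

-- Unknown to the Ispell dictionary: expected to return NULL ...
SELECT ts_lexize('english_ispell', 'supernovaes');

-- ... so in the configuration the token falls through to the Snowball stemmer.
SELECT ts_lexize('english_stem', 'supernovaes');
</programlisting>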

  <para>
   The word <literal>The</literal> was recognized by the
   <literal>english_ispell</literal> dictionary as a stop word (<xref
   linkend="textsearch-stopwords"/>) and will not be indexed.
   The spaces are discarded too, since the configuration provides no
   dictionaries at all for them.
  </para>
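
  <para>
   The effect on indexing can be checked with
   <function>to_tsvector</function>. This is a minimal sketch that again
   assumes the <literal>public.english</literal> configuration defined
   above. The resulting <type>tsvector</type> should contain lexemes for
   <literal>Brightest</literal> and <literal>supernovaes</literal> only,
   with no entry for the stop word or the spaces:
  </para>

<programlisting>
-- Stop words and blank tokens produce no lexemes in the result.
SELECT to_tsvector('public.english', 'The Brightest supernovaes');
</programlisting>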

  <para>
   You can reduce the width of the output by explicitly specifying which columns
   you want to see:

<screen>
SELECT alias, token, dictionary, lexemes
FROM ts_debug('public.english', 'The Brightest supernovaes');
   alias   |    token    |   dictionary   |   lexemes
-----------+-------------+----------------+-------------
 asciiword | The         | english_ispell | {}
 blank     |             |                |
 asciiword | Brightest   | english_ispell | {bright}
 blank     |             |                |
 asciiword | supernovaes | english_stem   | {supernova}
</screen>
  </para>
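
  <para>
   Because <function>ts_debug</function> returns an ordinary set of rows,
   the output can be narrowed by row as well as by column. As a sketch
   under the same assumptions as above, the following query hides the rows
   for which no dictionary recognized the token, which in this example are
   the <literal>blank</literal> tokens:
  </para>

<programlisting>
-- Keep only tokens that some dictionary recognized (stop words included).
SELECT alias, token, dictionary, lexemes
FROM ts_debug('public.english', 'The Brightest supernovaes')
WHERE dictionary IS NOT NULL;
</programlisting>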

  </sect2>

  <sect2 id="textsearch-parser-testing">
   <title>Parser Testing</title>

  <para>
   The following functions allow direct testing of a text search parser.
  </para>

  <indexterm>
   <primary>ts_parse</primary>
  </indexterm>

<synopsis>
ts_parse(<replaceable

Title: Detailed Example and Explanation of `ts_debug` with Ispell Dictionary
Summary
This section provides a detailed example of using `ts_debug` with a custom `public.english` configuration and an Ispell dictionary. It demonstrates the creation of the configuration and dictionary, shows how words are processed by the dictionaries (including stop word recognition and stemming), and explains how the parser selects and uses different dictionaries in the list. It also provides an example of how to reduce the output width of `ts_debug` by specifying the desired columns. Finally, it transitions into a new section about parser testing and introduces the `ts_parse` function.