postgresql

37th chunk of `doc/src/sgml/textsearch.sgml`
 star''');
 to_tsquery
------------
 'sn'
</screen>

    Notice that <literal>supernova star</literal> matches <literal>supernovae
    stars</literal> in <literal>thesaurus_astro</literal> because we specified
    the <literal>english_stem</literal> stemmer in the thesaurus definition.
    The stemmer removed the <literal>e</literal> and <literal>s</literal>.
   </para>

   <para>
    To index the original phrase as well as the substitute, just include it
    in the right-hand part of the definition:

<screen>
supernovae stars : sn supernovae stars

SELECT plainto_tsquery('supernova star');
       plainto_tsquery
-----------------------------
 'sn' &amp; 'supernova' &amp; 'star'
</screen>
   </para>
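
   <para>
    A session that has already used the thesaurus keeps the old file contents
    in memory, so an edited definition may not take effect immediately.  As a
    minimal sketch, assuming the thesaurus dictionary is named
    <literal>thesaurus_astro</literal> as above and is mapped in the current
    default text search configuration, a <quote>dummy</quote> update forces
    the file to be re-read, after which the query can be repeated:

<programlisting>
-- "dummy" is not a real option; the command changes nothing, but it makes
-- sessions re-read the dictionary's configuration files.
ALTER TEXT SEARCH DICTIONARY thesaurus_astro ( dummy );

-- Repeat the query to confirm that the edited definition is in effect.
SELECT plainto_tsquery('supernova star');
</programlisting>
   </para>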

  </sect3>

  </sect2>

  <sect2 id="textsearch-ispell-dictionary">
   <title><application>Ispell</application> Dictionary</title>

   <para>
    The <application>Ispell</application> dictionary template supports
    <firstterm>morphological dictionaries</firstterm>, which can normalize many
    different linguistic forms of a word into the same lexeme.  For example,
    an English <application>Ispell</application> dictionary can match all declensions and
    conjugations of the search term <literal>bank</literal>, e.g.,
    <literal>banking</literal>, <literal>banked</literal>, <literal>banks</literal>,
    <literal>banks'</literal>, and <literal>bank's</literal>.
   </para>

   <para>
    The standard <productname>PostgreSQL</productname> distribution does
    not include any <application>Ispell</application> configuration files.
    Dictionaries for a large number of languages are available from <ulink
    url="https://www.cs.hmc.edu/~geoff/ispell.html">Ispell</ulink>.
    Also, some more modern dictionary file formats are supported &mdash; <ulink
    url="https://en.wikipedia.org/wiki/MySpell">MySpell</ulink> (OO &lt; 2.0.1)
    and <ulink url="https://hunspell.github.io/">Hunspell</ulink>
    (OO &gt;= 2.0.2).  A large list of dictionaries is available on the <ulink
    url="https://wiki.openoffice.org/wiki/Dictionaries">OpenOffice
    Wiki</ulink>.
   </para>

   <para>
    To create an <application>Ispell</application> dictionary, perform these steps:
   </para>
   <itemizedlist spacing="compact" mark="bullet">
    <listitem>
     <para>
      Download the dictionary configuration files. <productname>OpenOffice</productname>
      extension files have the <filename>.oxt</filename> extension. Extract the
      <filename>.aff</filename> and <filename>.dic</filename> files and change their
      extensions to <filename>.affix</filename> and <filename>.dict</filename>. Some
      dictionary files must also be converted to the UTF-8 encoding, for
      example, for a Norwegian language dictionary:
<programlisting>
iconv -f ISO_8859-1 -t UTF-8 -o nn_no.affix nn_NO.aff
iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
</programlisting>
     </para>
    </listitem>
    <listitem>
     <para>
      Copy the files to the <filename>$SHAREDIR/tsearch_data</filename> directory.
     </para>
    </listitem>
    <listitem>
     <para>
      Load the files into <productname>PostgreSQL</productname> with the following command:
<programlisting>
CREATE TEXT SEARCH DICTIONARY english_hunspell (
    TEMPLATE = ispell,
    DictFile = en_us,
    AffFile = en_us,
    StopWords = english);
</programlisting>
     </para>
    </listitem>
   </itemizedlist>

   <para>
    Here, <literal>DictFile</literal>, <literal>AffFile</literal>, and <literal>StopWords</literal>
    specify the base names of the dictionary, affixes, and stop-words files.
    The stop-words file has the same format explained above for the
    <literal>simple</literal> dictionary type.  The format of the other files is
    not specified here but is available from the above-mentioned web sites.
   </para>
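
   <para>
    As a quick check, and only as a sketch that assumes the
    <literal>english_hunspell</literal> dictionary from the example above was
    created successfully, the stored options can be inspected in the
    <structname>pg_ts_dict</structname> catalog and single words can be passed
    to <function>ts_lexize</function>.  The test words are taken from the
    <literal>bank</literal> example; the results depend on the installed
    dictionary files, so no output is shown:

<programlisting>
-- Show the options recorded for the dictionary.
SELECT dictname, dictinitoption
FROM pg_catalog.pg_ts_dict
WHERE dictname = 'english_hunspell';

-- Normalize single tokens.  A NULL result means the dictionary does not
-- recognize the word; an empty array means it is a stop word.
SELECT ts_lexize('english_hunspell', 'banking');
SELECT ts_lexize('english_hunspell', 'banked');
</programlisting>
   </para>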

   <para>
    <application>Ispell</application> dictionaries usually recognize a limited set of words, so they
    should be followed by another broader dictionary; for
    example, a Snowball dictionary,

Title: Ispell Dictionary Configuration
Summary
This section explains how to configure and use an Ispell dictionary in PostgreSQL for morphological normalization. It highlights that Ispell dictionaries can normalize various linguistic forms of a word into a single lexeme. It notes that PostgreSQL doesn't include Ispell configuration files by default but provides links to resources for downloading dictionaries. The section outlines the steps to create an Ispell dictionary: downloading and converting dictionary files to UTF-8 encoding, copying them to the `$SHAREDIR/tsearch_data` directory, and loading them into PostgreSQL using the `CREATE TEXT SEARCH DICTIONARY` command. It clarifies the roles of `DictFile`, `AffFile`, and `StopWords` parameters, and suggests using a broader dictionary (like Snowball) after the Ispell dictionary due to its limited word recognition.