Listing Text Search Templates and Text Search Limitations

Signed integer
 numhword        | Hyphenated word, letters and digits
 numword         | Word, letters and digits
 protocol        | Protocol head
 sfloat          | Scientific notation
 tag             | XML tag
 uint            | Unsigned integer
 url             | URL
 url_path        | URL path
 version         | Version number
 word            | Word, all letters
(23 rows)
</screen>
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>\dFt<optional>+</optional> <optional>PATTERN</optional></literal></term>
    <listitem>
     <para>
      List text search templates (add <literal>+</literal> for more detail).

<screen>
=&gt; \dFt
                           List of text search templates
   Schema   |   Name    |                        Description
------------+-----------+-----------------------------------------------------------
 pg_catalog | ispell    | ispell dictionary
 pg_catalog | simple    | simple dictionary: just lower case and check for stopword
 pg_catalog | snowball  | snowball stemmer
 pg_catalog | synonym   | synonym dictionary: replace word by its synonym
 pg_catalog | thesaurus | thesaurus dictionary: phrase by phrase substitution
</screen>
     </para>
    </listitem>
   </varlistentry>
  </variablelist>

 </sect1>

 <sect1 id="textsearch-limitations">
  <title>Limitations</title>

  <para>
   The current limitations of <productname>PostgreSQL</productname>'s
   text search features are:

   <itemizedlist spacing="compact" mark="bullet">
    <listitem>
     <para>The length of each lexeme must be less than 2 kilobytes</para>
    </listitem>
    <listitem>
     <para>The length of a <type>tsvector</type> (lexemes + positions) must be
     less than 1 megabyte</para>
    </listitem>
    <listitem>
     <para>The number of lexemes must be less than
     2<superscript>64</superscript></para>
    </listitem>
    <listitem>
     <para>Position values in <type>tsvector</type> must be greater than 0 and
     no more than 16,383</para>
    </listitem>
    <listitem>
     <para>The match distance in a <literal>&lt;<replaceable>N</replaceable>&gt;</literal>
     (FOLLOWED BY) <type>tsquery</type> operator cannot be more than
     16,384</para>
    </listitem>
    <listitem>
     <para>No more than 256 positions per lexeme</para>
    </listitem>
    <listitem>
     <para>The number of nodes (lexemes + operators) in a <type>tsquery</type>
     must be less than 32,768</para>
    </listitem>
   </itemizedlist>
  </para>

  <para>
   For comparison, the <productname>PostgreSQL</productname> 8.1
   documentation contained 10,441 unique words, a total of 335,420 words, and
   the most frequent word <quote>postgresql</quote> was mentioned 6,127 times
   in 655 documents.
  </para>

  <para>
   Another example: the <productname>PostgreSQL</productname> mailing
   list archives contained 910,989 unique words with 57,491,343 lexemes in
   461,020 messages.
  </para>

 </sect1>

</chapter>

The text lists text search templates such as 'ispell', 'simple', 'snowball', 'synonym' and 'thesaurus'. It also describes the limitations of PostgreSQL's text search features, including restrictions on lexeme length, tsvector size, number of lexemes, position values, match distance, positions per lexeme, and nodes in a tsquery. It provides statistics from PostgreSQL documentation and mailing list archives as examples.
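
As an illustration of the per-document limits listed above, the following is a minimal Python sketch that checks a lexeme-to-positions mapping against the documented numbers before it is sent to the server. It is not how PostgreSQL enforces these limits: the function name `check_tsvector_limits`, the input shape, and the size estimate (UTF-8 bytes per lexeme plus an assumed 2 bytes per position) are hypothetical and chosen only for illustration.

```python
# Limits as stated in the Limitations section.
MAX_LEXEME_BYTES = 2 * 1024        # each lexeme must be shorter than 2 kilobytes
MAX_TSVECTOR_BYTES = 1024 * 1024   # lexemes + positions must be under 1 megabyte
MAX_POSITION = 16383               # positions must lie in 1..16383
MAX_POSITIONS_PER_LEXEME = 256     # no more than 256 positions per lexeme


def check_tsvector_limits(lexemes):
    """Return a list of messages describing documented limits that are exceeded.

    `lexemes` maps each lexeme (str) to a list of integer positions.
    The total-size check is a rough, assumed estimate, not the server's
    actual on-disk layout.
    """
    problems = []
    estimated_size = 0
    for lexeme, positions in lexemes.items():
        n_bytes = len(lexeme.encode("utf-8"))
        # Assumption: 2 bytes per stored position, for estimation only.
        estimated_size += n_bytes + 2 * len(positions)
        label = lexeme[:20]  # keep messages short for very long lexemes
        if n_bytes >= MAX_LEXEME_BYTES:
            problems.append(f"{label!r}: lexeme is {n_bytes} bytes")
        if len(positions) > MAX_POSITIONS_PER_LEXEME:
            problems.append(f"{label!r}: {len(positions)} positions")
        out_of_range = [p for p in positions if not 1 <= p <= MAX_POSITION]
        if out_of_range:
            problems.append(f"{label!r}: positions out of range {out_of_range[:5]}")
    if estimated_size >= MAX_TSVECTOR_BYTES:
        problems.append(f"estimated size {estimated_size} bytes")
    return problems
```

A caller could run this over a prepared document and log any messages it returns; an empty list means none of the checked limits is exceeded under the stated assumptions.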