<programlisting>
SELECT * FROM ts_stat('SELECT vector FROM apod')
ORDER BY nentry DESC, ndoc DESC, word
LIMIT 10;
</programlisting>
The same, but counting only word occurrences with weight <literal>A</literal>
or <literal>B</literal>:
<programlisting>
SELECT * FROM ts_stat('SELECT vector FROM apod', 'ab')
ORDER BY nentry DESC, ndoc DESC, word
LIMIT 10;
</programlisting>
</para>
</sect2>
</sect1>
<sect1 id="textsearch-parsers">
<title>Parsers</title>
<para>
Text search parsers are responsible for splitting raw document text
into <firstterm>tokens</firstterm> and identifying each token's type, where
the set of possible types is defined by the parser itself.
Note that a parser does not modify the text at all; it simply
identifies plausible word boundaries. Because of this limited scope,
there is less need for application-specific custom parsers than there is
for custom dictionaries. At present <productname>PostgreSQL</productname>
provides just one built-in parser, which has been found to be useful for a
wide range of applications.
</para>
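<para>
As a minimal sketch of this behavior, the <function>ts_parse</function>
function can be used to run a parser over a string and inspect the result.
The example below assumes the built-in parser, referenced by the name
<literal>default</literal>; the sample string is arbitrary and chosen only
for illustration. Each returned row should hold one token, with its text
unchanged, together with a numeric token type identifier:
<programlisting>
-- Tokenize an arbitrary sample string with the built-in parser.
-- tokid is the numeric token type; token is the unmodified token text.
SELECT tokid, token
FROM ts_parse('default', 'The up-to-date beta1 release');
</programlisting>
Whitespace and punctuation between words are reported as tokens of their
own, since the parser only marks boundaries and discards nothing.
</para>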
<para>
The built-in parser is named <literal>pg_catalog.default</literal>.
It recognizes 23 token types, shown in <xref linkend="textsearch-default-parser"/>.
</para>
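<para>
The list of token types can also be retrieved from a running server rather
than read from the table. The following sketch uses the
<function>ts_token_type</function> function and again assumes the built-in
parser is referenced as <literal>default</literal>:
<programlisting>
-- List every token type the built-in parser can emit:
-- numeric ID, alias, and a short description.
SELECT tokid, alias, description
FROM ts_token_type('default')
ORDER BY tokid;
</programlisting>
The <structfield>alias</structfield> values returned here are the names
used to refer to token types in text search configuration commands.
</para>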
<table id="textsearch-default-parser">
<title>Default Parser's Token Types</title>
<tgroup cols="3">
<colspec colname="col1" colwidth="2*"/>
<colspec colname="col2" colwidth="2*"/>
<colspec colname="col3" colwidth="3*"/>
<thead>
<row>
<entry>Alias</entry>
<entry>Description</entry>
<entry>Example</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>asciiword</literal></entry>
<entry>Word, all ASCII letters</entry>
<entry><literal>elephant</literal></entry>
</row>
<row>
<entry><literal>word</literal></entry>
<entry>Word, all letters</entry>
<entry><literal>mañana</literal></entry>
</row>
<row>
<entry><literal>numword</literal></entry>
<entry>Word, letters and digits</entry>
<entry><literal>beta1</literal></entry>
</row>
<row>
<entry><literal>asciihword</literal></entry>
<entry>Hyphenated word, all ASCII</entry>
<entry><literal>up-to-date</literal></entry>
</row>
<row>
<entry><literal>hword</literal></entry>
<entry>Hyphenated word, all letters</entry>
<entry><literal>lógico-matemática</literal></entry>
</row>
<row>
<entry><literal>numhword</literal></entry>
<entry>Hyphenated word, letters and digits</entry>
<entry><literal>postgresql-beta1</literal></entry>
</row>
<row>
<entry><literal>hword_asciipart</literal></entry>
<entry>Hyphenated word part, all ASCII</entry>
<entry><literal>postgresql</literal> in the context <literal>postgresql-beta1</literal></entry>
</row>
<row>
<entry><literal>hword_part</literal></entry>
<entry>Hyphenated word part, all letters</entry>
<entry><literal>lógico</literal> or <literal>matemática</literal>
in the context <literal>lógico-matemática</literal></entry>
</row>
<row>
<entry><literal>hword_numpart</literal></entry>
<entry>Hyphenated word part, letters and digits</entry>
<entry><literal>beta1</literal> in the context
<literal>postgresql-beta1</literal></entry>
</row>
<row>
<entry><literal>email</literal></entry>
<entry>Email address</entry>
<entry><literal>foo@example.com</literal></entry>
</row>
<row>
<entry><literal>protocol</literal></entry>
<entry>Protocol head</entry>
<entry><literal>http://</literal></entry>
</row>
<row>
<entry><literal>url</literal></entry>
<entry>URL</entry>
<entry><literal>example.com/stuff/index.html</literal></entry>
</row>
<row>
<entry><literal>host</literal></entry>