<!-- doc/src/sgml/pgtrgm.sgml -->
<sect1 id="pgtrgm" xreflabel="pg_trgm">
<title>pg_trgm —
support for similarity of text using trigram matching</title>
<indexterm zone="pgtrgm">
<primary>pg_trgm</primary>
</indexterm>
<para>
The <filename>pg_trgm</filename> module provides functions and operators
for determining the similarity of
alphanumeric text based on trigram matching, as
well as index operator classes that support fast searching for similar
strings.
</para>
<para>
This module is considered <quote>trusted</quote>, that is, it can be
installed by non-superusers who have <literal>CREATE</literal> privilege
on the current database.
</para>
<sect2 id="pgtrgm-concepts">
<title>Trigram (or Trigraph) Concepts</title>
<para>
A trigram is a group of three consecutive characters taken
from a string. We can measure the similarity of two strings by
counting the number of trigrams they share. This simple idea
turns out to be very effective for measuring the similarity of
words in many natural languages.
</para>
<note>
<para>
<filename>pg_trgm</filename> ignores non-word characters
(non-alphanumerics) when extracting trigrams from a string.
Each word is considered to have two spaces
prefixed and one space suffixed when determining the set
of trigrams contained in the string.
For example, the set of trigrams in the string
<quote><literal>cat</literal></quote> is
<quote><literal>  c</literal></quote>,
<quote><literal> ca</literal></quote>,
<quote><literal>cat</literal></quote>, and
<quote><literal>at </literal></quote>.
The set of trigrams in the string
<quote><literal>foo|bar</literal></quote> is
<quote><literal>  f</literal></quote>,
<quote><literal> fo</literal></quote>,
<quote><literal>foo</literal></quote>,
<quote><literal>oo </literal></quote>,
<quote><literal>  b</literal></quote>,
<quote><literal> ba</literal></quote>,
<quote><literal>bar</literal></quote>, and
<quote><literal>ar </literal></quote>.
</para>
</note>
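<para>
As an illustrative sketch, the <function>show_trgm</function> function
listed in <xref linkend="pgtrgm-func-table"/> can be used to inspect the
trigrams extracted from the two example strings. This assumes the
extension has already been installed in the current database, for
instance with <literal>CREATE EXTENSION pg_trgm</literal>. The returned
arrays should contain the trigrams listed above, although the order of
the array elements may differ:
</para>
<programlisting>
-- Trigrams of a single word, including the padding spaces
SELECT show_trgm('cat');

-- The non-alphanumeric character | acts as a word separator,
-- so foo and bar are padded independently
SELECT show_trgm('foo|bar');
</programlisting>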
</sect2>
<sect2 id="pgtrgm-funcs-ops">
<title>Functions and Operators</title>
<para>
The functions provided by the <filename>pg_trgm</filename> module
are shown in <xref linkend="pgtrgm-func-table"/>, and the operators
are shown in <xref linkend="pgtrgm-op-table"/>.
</para>
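<para>
As a brief illustration of the <function>similarity</function> function,
the following queries sketch the two ends of its result range, assuming
the extension is installed in the current database. The strings
<literal>cat</literal> and <literal>dog</literal> have no trigrams in
common, while the second query compares a string with itself:
</para>
<programlisting>
SELECT similarity('cat', 'dog');  -- no shared trigrams: 0
SELECT similarity('cat', 'cat');  -- identical strings: 1
</programlisting>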
<table id="pgtrgm-func-table">
<title><filename>pg_trgm</filename> Functions</title>
<tgroup cols="1">
<thead>
<row>
<entry role="func_table_entry"><para role="func_signature">
Function
</para>
<para>
Description
</para></entry>
</row>
</thead>
<tbody>
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm><primary>similarity</primary></indexterm>
<function>similarity</function> ( <type>text</type>, <type>text</type> )
<returnvalue>real</returnvalue>
</para>
<para>
Returns a number that indicates how similar the two arguments are.
The range of the result is zero (indicating that the two strings are
completely dissimilar) to one (indicating that the two strings are
identical).
</para></entry>
</row>
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm><primary>show_trgm</primary></indexterm>
<function>show_trgm</function> ( <type>text</type> )
<returnvalue>text[]</returnvalue>
</para>
<para>
Returns an array of all the trigrams in the given string.
(In practice this is seldom useful except for debugging.)
</para></entry>
</row>
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm><primary>word_similarity</primary></indexterm>
<function>word_similarity</function> ( <type>text</type>, <type>text</type> )
<returnvalue>real</returnvalue>