


postgresql

4th chunk of `doc/src/sgml/fuzzystrmatch.sgml`
 </indexterm>

<synopsis>
levenshtein(source text, target text, ins_cost int, del_cost int, sub_cost int) returns int
levenshtein(source text, target text) returns int
levenshtein_less_equal(source text, target text, ins_cost int, del_cost int, sub_cost int, max_d int) returns int
levenshtein_less_equal(source text, target text, max_d int) returns int
</synopsis>

  <para>
   Both <literal>source</literal> and <literal>target</literal> can be any
   non-null string, with a maximum of 255 characters.  The cost parameters
   specify how much to charge for a character insertion, deletion, or
   substitution, respectively.  You can omit the cost parameters, as in
   the second version of the function; in that case they all default to 1.
  </para>

  <para>
   <function>levenshtein_less_equal</function> is an accelerated version of the
   Levenshtein function for use when only small distances are of interest.
   If the actual distance is less than or equal to <literal>max_d</literal>,
   then <function>levenshtein_less_equal</function> returns the correct
   distance; otherwise it returns some value greater than <literal>max_d</literal>.
   If <literal>max_d</literal> is negative then the behavior is the same as
   <function>levenshtein</function>.
  </para>
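
  <para>
   As an illustration of the bounded variant, the following sketch filters a
   table for near matches.  The table <literal>customers</literal> and its
   column <literal>name</literal> are hypothetical and not part of this
   module, and the query assumes the module has been installed with
   <literal>CREATE EXTENSION fuzzystrmatch</literal>.
  </para>

<programlisting>
-- Hypothetical table and column, shown for illustration only.
-- Keep rows whose name is within two edits of the search term.
SELECT name
FROM customers
WHERE levenshtein_less_equal(name, 'Thompson', 2) &lt;= 2;
</programlisting>

  <para>
   Since a result greater than <literal>max_d</literal> only indicates that
   the strings are farther apart than the bound, it is safer to compare the
   result against <literal>max_d</literal> than to treat it as the exact
   distance.
  </para>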

  <para>
   Examples:
  </para>

<screen>
test=# SELECT levenshtein('GUMBO', 'GAMBOL');
 levenshtein
-------------
           2
(1 row)

test=# SELECT levenshtein('GUMBO', 'GAMBOL', 2, 1, 1);
 levenshtein
-------------
           3
(1 row)

test=# SELECT levenshtein_less_equal('extensive', 'exhaustive', 2);
 levenshtein_less_equal
------------------------
                      3
(1 row)

test=# SELECT levenshtein_less_equal('extensive', 'exhaustive', 4);
 levenshtein_less_equal
------------------------
                      4
(1 row)
</screen>
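
  <para>
   The results above can be read as follows.  Turning
   <literal>GUMBO</literal> into <literal>GAMBOL</literal> takes one
   substitution (<literal>U</literal> to <literal>A</literal>) and one
   insertion (<literal>L</literal>), which gives 2 with the default costs
   and 1 + 2 = 3 when <literal>ins_cost</literal> is 2.  In the third
   example the returned 3 is not the true distance; it only signals that
   the distance exceeds <literal>max_d</literal> = 2, and the fourth
   example shows the actual distance, 4.  Because the costs need not be
   equal, the result can depend on the argument order, as this sketch
   suggests:
  </para>

<programlisting>
-- Reverse direction: GAMBOL to GUMBO needs one deletion (del_cost = 1)
-- and one substitution (sub_cost = 1), so the expected result is 2
-- rather than the 3 reported for GUMBO to GAMBOL.
SELECT levenshtein('GAMBOL', 'GUMBO', 2, 1, 1);
</programlisting>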
 </sect2>

 <sect2 id="fuzzystrmatch-metaphone">
  <title>Metaphone</title>

  <para>
   Metaphone, like Soundex, is based on the idea of constructing a
   representative code for an input string.  Two strings are then
   deemed similar if they have the same code.
  </para>

  <para>
   This function calculates the metaphone code of an input string:
  </para>

  <indexterm>
   <primary>metaphone</primary>
  </indexterm>

<synopsis>
metaphone(source text, max_output_length int) returns text
</synopsis>

  <para>
   <literal>source</literal> has to be a non-null string with a maximum of
   255 characters.  <literal>max_output_length</literal> sets the maximum
   length of the output metaphone code; a longer code is truncated
   to this length.
  </para>

  <para>
   Example:
  </para>

<screen>
test=# SELECT metaphone('GUMBO', 4);
 metaphone
-----------
 KM
(1 row)
</screen>
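
  <para>
   Since two strings are deemed similar when their codes match, a lookup can
   compare the code of a stored value with the code of the search term.  The
   following is a minimal sketch; the table <literal>customers</literal>,
   its column <literal>name</literal>, and the output length of 8 are
   assumptions chosen for illustration.
  </para>

<programlisting>
-- Hypothetical table and column, shown for illustration only.
-- Both sides use the same max_output_length so that the codes
-- are truncated consistently before being compared.
SELECT name
FROM customers
WHERE metaphone(name, 8) = metaphone('Thompson', 8);
</programlisting>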
 </sect2>

 <sect2 id="fuzzystrmatch-double-metaphone">
  <title>Double Metaphone</title>

  <para>
   The Double Metaphone system computes two <quote>sounds like</quote> strings
   for a given input string &mdash; a <quote>primary</quote> and an
   <quote>alternate</quote>.  In most cases they are the same, but for non-English
   names especially they can be a bit different, depending on pronunciation.
   These functions compute the primary and alternate codes:
  </para>

  <indexterm>
   <primary>dmetaphone</primary>
  </indexterm>

  <indexterm>
   <primary>dmetaphone_alt</primary>
  </indexterm>

<synopsis>
dmetaphone(source text) returns text
dmetaphone_alt(source text) returns text
</synopsis>

  <para>
   There is no length limit on the input strings.
  </para>

  <para>
   Example:
  </para>

<screen>
test=# SELECT dmetaphone('gumbo');
 dmetaphone
------------
 KMP
(1 row)
</screen>
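
  <para>
   One possible way to use both codes is to accept a row when either its
   primary or its alternate code matches that of the search term.  This is
   only a sketch of one matching policy; the table
   <literal>customers</literal> and its column <literal>name</literal> are
   hypothetical.
  </para>

<programlisting>
-- Hypothetical table and column, shown for illustration only.
-- Match on the primary code or on the alternate code.
SELECT name
FROM customers
WHERE dmetaphone(name) = dmetaphone('Thompson')
   OR dmetaphone_alt(name) = dmetaphone_alt('Thompson');
</programlisting>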
 </sect2>

</sect1>

Title: Fuzzy String Matching Functions: Levenshtein, Metaphone, and Double Metaphone
Summary
This section describes three fuzzy string matching methods, with usage and examples for each. Levenshtein calculates the edit distance between two strings. Metaphone constructs a representative code for an input string. Double Metaphone computes primary and alternate 'sounds like' strings for an input string.