Home Explore Blog CI



postgresql

70th chunk of `doc/src/sgml/func.sgml`
78c609d68f1d697ffeb0bf32b93a6f84c952996d9ed22e7a0000000100000fa7
 expressions; see
    <xref linkend="posix-character-entry-escapes-table"/>,
    <xref linkend="posix-class-shorthand-escapes-table"/>, and
    <xref linkend="posix-constraint-escapes-table"/> below.
   </para>

   <para>
    Some examples:
<programlisting>
'abc' SIMILAR TO 'abc'          <lineannotation>true</lineannotation>
'abc' SIMILAR TO 'a'            <lineannotation>false</lineannotation>
'abc' SIMILAR TO '%(b|d)%'      <lineannotation>true</lineannotation>
'abc' SIMILAR TO '(b|c)%'       <lineannotation>false</lineannotation>
'-abc-' SIMILAR TO '%\mabc\M%'  <lineannotation>true</lineannotation>
'xabcy' SIMILAR TO '%\mabc\M%'  <lineannotation>false</lineannotation>
</programlisting>
   </para>

   <para>
    The <function>substring</function> function with three parameters
    provides extraction of a substring that matches an SQL
    regular expression pattern.  The function can be written according
    to standard SQL syntax:
<synopsis>
substring(<replaceable>string</replaceable> similar <replaceable>pattern</replaceable> escape <replaceable>escape-character</replaceable>)
</synopsis>
    or using the now obsolete SQL:1999 syntax:
<synopsis>
substring(<replaceable>string</replaceable> from <replaceable>pattern</replaceable> for <replaceable>escape-character</replaceable>)
</synopsis>
    or as a plain three-argument function:
<synopsis>
substring(<replaceable>string</replaceable>, <replaceable>pattern</replaceable>, <replaceable>escape-character</replaceable>)
</synopsis>
    As with <literal>SIMILAR TO</literal>, the
    specified pattern must match the entire data string, or else the
    function fails and returns null.  To indicate the part of the
    pattern for which the matching data sub-string is of interest,
    the pattern should contain
    two occurrences of the escape character followed by a double quote
    (<literal>"</literal>). <!-- " font-lock sanity -->
    The text matching the portion of the pattern
    between these separators is returned when the match is successful.
   </para>

   <para>
    The escape-double-quote separators actually
    divide <function>substring</function>'s pattern into three independent
    regular expressions; for example, a vertical bar (<literal>|</literal>)
    in any of the three sections affects only that section.  Also, the first
    and third of these regular expressions are defined to match the smallest
    possible amount of text, not the largest, when there is any ambiguity
    about how much of the data string matches which pattern.  (In POSIX
    parlance, the first and third regular expressions are forced to be
    non-greedy.)
   </para>

   <para>
    As an extension to the SQL standard, <productname>PostgreSQL</productname>
    allows there to be just one escape-double-quote separator, in which case
    the third regular expression is taken as empty; or no separators, in which
    case the first and third regular expressions are taken as empty.
   </para>

   <para>
    Some examples, with <literal>#&quot;</literal> delimiting the return string:
<programlisting>
substring('foobar' similar '%#"o_b#"%' escape '#')   <lineannotation>oob</lineannotation>
substring('foobar' similar '#"o_b#"%' escape '#')    <lineannotation>NULL</lineannotation>
</programlisting>
   </para>
  </sect2>

  <sect2 id="functions-posix-regexp">
   <title><acronym>POSIX</acronym> Regular Expressions</title>

   <indexterm zone="functions-posix-regexp">
    <primary>regular expression</primary>
    <seealso>pattern matching</seealso>
   </indexterm>
   <indexterm>
    <primary>substring</primary>
   </indexterm>
   <indexterm>
    <primary>regexp_count</primary>
   </indexterm>
   <indexterm>
    <primary>regexp_instr</primary>
   </indexterm>
   <indexterm>
    <primary>regexp_like</primary>
   </indexterm>
   <indexterm>
    <primary>regexp_match</primary>
   </indexterm>
   <indexterm>
    <primary>regexp_matches</primary>
   </indexterm>
   <indexterm>
    <primary>regexp_replace</primary>

Title: substring Function with Regular Expressions
Summary
This section describes the `substring` function used with SQL regular expressions for extracting substrings. It details the function's syntax, including standard SQL and older SQL:1999 versions, and explains how to use escape characters and double quotes to specify the part of the pattern to extract. The function requires the pattern to match the entire string, returning null otherwise. It also covers PostgreSQL's extension allowing for one or no escape-double-quote separators. Examples are provided to illustrate the function's usage. The section then transitions to discussing POSIX regular expressions.