


postgresql

87th chunk of `doc/src/sgml/func.sgml`
 exactly eight hexadecimal
       digits)
       the character whose hexadecimal value is
       <literal>0x</literal><replaceable>stuvwxyz</replaceable>
       </entry>
       </row>

       <row>
       <entry> <literal>\v</literal> </entry>
       <entry> vertical tab, as in C </entry>
       </row>

       <row>
       <entry> <literal>\x</literal><replaceable>hhh</replaceable> </entry>
       <entry> (where <replaceable>hhh</replaceable> is any sequence of hexadecimal
       digits)
       the character whose hexadecimal value is
       <literal>0x</literal><replaceable>hhh</replaceable>
       (a single character no matter how many hexadecimal digits are used)
       </entry>
       </row>

       <row>
       <entry> <literal>\0</literal> </entry>
       <entry> the character whose value is <literal>0</literal> (the null byte)</entry>
       </row>

       <row>
       <entry> <literal>\</literal><replaceable>xy</replaceable> </entry>
       <entry> (where <replaceable>xy</replaceable> is exactly two octal digits,
       and is not a <firstterm>back reference</firstterm>)
       the character whose octal value is
       <literal>0</literal><replaceable>xy</replaceable> </entry>
       </row>

       <row>
       <entry> <literal>\</literal><replaceable>xyz</replaceable> </entry>
       <entry> (where <replaceable>xyz</replaceable> is exactly three octal digits,
       and is not a <firstterm>back reference</firstterm>)
       the character whose octal value is
       <literal>0</literal><replaceable>xyz</replaceable> </entry>
       </row>
      </tbody>
     </tgroup>
    </table>

   <para>
    Hexadecimal digits are <literal>0</literal>-<literal>9</literal>,
    <literal>a</literal>-<literal>f</literal>, and <literal>A</literal>-<literal>F</literal>.
    Octal digits are <literal>0</literal>-<literal>7</literal>.
   </para>
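
   <para>
    As an illustrative sketch (not taken from the reference tables), the
    queries below spell the ASCII character <literal>A</literal> with the
    hexadecimal, octal, and <literal>\u</literal> escapes.  They assume
    <varname>standard_conforming_strings</varname> is on, so that the
    backslashes reach the regular expression engine unchanged.  The fourth
    query shows why <literal>\x</literal> needs care: it consumes every
    hexadecimal digit that follows it.
   </para>
<programlisting>
SELECT 'A' ~ '^\x41$';      -- hexadecimal 41 is 'A'; expected to be true
SELECT 'A' ~ '^\101$';      -- octal 101 is 'A' (no group 101 exists, so not a back reference); expected to be true
SELECT 'A' ~ '^\u0041$';    -- exactly four hexadecimal digits; expected to be true
SELECT 'Ab' ~ '^\x41b$';    -- read as the single character 0x41b, not 'A' then 'b'; expected to be false
SELECT 'Ab' ~ '^\u0041b$';  -- \u stops after four digits, so 'b' is literal; expected to be true
</programlisting>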

   <para>
    Numeric character-entry escapes specifying values outside the ASCII range
    (0&ndash;127) have meanings dependent on the database encoding.  When the
    encoding is UTF-8, escape values are equivalent to Unicode code points,
    for example <literal>\u1234</literal> means the character <literal>U+1234</literal>.
    For other multibyte encodings, character-entry escapes usually just
    specify the concatenation of the byte values for the character.  If the
    escape value does not correspond to any legal character in the database
    encoding, no error is raised, but the escape will never match any data.
   </para>
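
   <para>
    A minimal sketch of the UTF-8 case described above.  It assumes the
    database encoding is <literal>UTF8</literal> and that
    <varname>standard_conforming_strings</varname> is on; the
    <literal>U&amp;'...'</literal> string constant is used here only to
    produce the subject character <literal>U+1234</literal>.
   </para>
<programlisting>
-- Assumption: UTF8 database.  The left operand is the character U+1234,
-- and the pattern names the same code point with a \u escape.
SELECT U&amp;'\1234' ~ '^\u1234$';   -- expected to be true under these assumptions
</programlisting>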

   <para>
    The character-entry escapes are always taken as ordinary characters.
    For example, <literal>\135</literal> is <literal>]</literal> in ASCII, but
    <literal>\135</literal> does not terminate a bracket expression.
   </para>
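
   <para>
    The bracket-expression behavior can be sketched as follows, again
    assuming <varname>standard_conforming_strings</varname> is on.  Inside
    the brackets, <literal>\135</literal> stands for the ordinary character
    <literal>]</literal> rather than closing the bracket expression.
   </para>
<programlisting>
-- The bracket expression contains one member, the character ']'.
SELECT 'a]b' ~ '[\135]';                       -- expected to be true
SELECT regexp_replace('a]b', '[\135]', '-');   -- expected to return a-b
</programlisting>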

   <table id="posix-class-shorthand-escapes-table">
    <title>Regular Expression Class-Shorthand Escapes</title>

    <tgroup cols="2">
     <thead>
      <row>
       <entry>Escape</entry>
       <entry>Description</entry>
      </row>
     </thead>

      <tbody>
       <row>
       <entry> <literal>\d</literal> </entry>
       <entry> matches any digit, like
        <literal>[[:digit:]]</literal> </entry>
       </row>

       <row>
       <entry> <literal>\s</literal> </entry>
       <entry> matches any whitespace character, like
        <literal>[[:space:]]</literal> </entry>
       </row>

       <row>
       <entry> <literal>\w</literal> </entry>
       <entry> matches any word character, like
        <literal>[[:word:]]</literal> </entry>
       </row>

       <row>
       <entry> <literal>\D</literal> </entry>
       <entry> matches any non-digit, like
        <literal>[^[:digit:]]</literal> </entry>
       </row>

       <row>
       <entry> <literal>\S</literal> </entry>
       <entry> matches any non-whitespace character, like
        <literal>[^[:space:]]</literal> </entry>
       </row>

       <row>
       <entry> <literal>\W</literal> </entry>
       <entry> matches any non-word character, like
        <literal>[^[:word:]]</literal>

Title: Regular Expression Character-Entry Escapes (Continued) and Class-Shorthand Escapes
Summary
The table continues the list of character-entry escapes: `\v` (vertical tab), `\x` followed by hexadecimal digits, `\0` (the null byte), and the octal forms `\xy` and `\xyz`, which apply only when the digits are not a back reference. Hexadecimal digits are 0-9, a-f, and A-F; octal digits are 0-7. Numeric character-entry escapes outside the ASCII range (0-127) depend on the database encoding; in UTF-8 they correspond to Unicode code points. Character-entry escapes are always taken as ordinary characters, so `\135` does not terminate a bracket expression. The section then introduces the regular expression class-shorthand escapes: `\d` (digit), `\s` (whitespace), `\w` (word character), `\D` (non-digit), `\S` (non-whitespace), and `\W` (non-word character).