Regular Expression Character-Entry Escapes (Continued) and Class-Shorthand Escapes

exactly eight hexadecimal digits) the character whose hexadecimal value is <literal>0x</literal><replaceable>stuvwxyz</replaceable> </entry> </row> <row> <entry> <literal>\v</literal> </entry> <entry> vertical tab, as in C </entry> </row> <row> <entry> <literal>\x</literal><replaceable>hhh</replaceable> </entry> <entry> (where <replaceable>hhh</replaceable> is any sequence of hexadecimal digits) the character whose hexadecimal value is <literal>0x</literal><replaceable>hhh</replaceable> (a single character no matter how many hexadecimal digits are used) </entry> </row> <row> <entry> <literal>\0</literal> </entry> <entry> the character whose value is <literal>0</literal> (the null byte)</entry> </row> <row> <entry> <literal>\</literal><replaceable>xy</replaceable> </entry> <entry> (where <replaceable>xy</replaceable> is exactly two octal digits, and is not a <firstterm>back reference</firstterm>) the character whose octal value is <literal>0</literal><replaceable>xy</replaceable> </entry> </row> <row> <entry> <literal>\</literal><replaceable>xyz</replaceable> </entry> <entry> (where <replaceable>xyz</replaceable> is exactly three octal digits, and is not a <firstterm>back reference</firstterm>) the character whose octal value is <literal>0</literal><replaceable>xyz</replaceable> </entry> </row> </tbody> </tgroup> </table> <para> Hexadecimal digits are <literal>0</literal>-<literal>9</literal>, <literal>a</literal>-<literal>f</literal>, and <literal>A</literal>-<literal>F</literal>. Octal digits are <literal>0</literal>-<literal>7</literal>. </para> <para> Numeric character-entry escapes specifying values outside the ASCII range (0–127) have meanings dependent on the database encoding. When the encoding is UTF-8, escape values are equivalent to Unicode code points, for example <literal>\u1234</literal> means the character <literal>U+1234</literal>. 
For other multibyte encodings, character-entry escapes usually just specify the concatenation of the byte values for the character. If the escape value does not correspond to any legal character in the database encoding, no error will be raised, but it will never match any data. </para> <para> The character-entry escapes are always taken as ordinary characters. For example, <literal>\135</literal> is <literal>]</literal> in ASCII, but <literal>\135</literal> does not terminate a bracket expression. </para> <table id="posix-class-shorthand-escapes-table"> <title>Regular Expression Class-Shorthand Escapes</title> <tgroup cols="2"> <thead> <row> <entry>Escape</entry> <entry>Description</entry> </row> </thead> <tbody> <row> <entry> <literal>\d</literal> </entry> <entry> matches any digit, like <literal>[[:digit:]]</literal> </entry> </row> <row> <entry> <literal>\s</literal> </entry> <entry> matches any whitespace character, like <literal>[[:space:]]</literal> </entry> </row> <row> <entry> <literal>\w</literal> </entry> <entry> matches any word character, like <literal>[[:word:]]</literal> </entry> </row> <row> <entry> <literal>\D</literal> </entry> <entry> matches any non-digit, like <literal>[^[:digit:]]</literal> </entry> </row> <row> <entry> <literal>\S</literal> </entry> <entry> matches any non-whitespace character, like <literal>[^[:space:]]</literal> </entry> </row> <row> <entry> <literal>\W</literal> </entry> <entry> matches any non-word character, like <literal>[^[:word:]]</literal>

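The table lists the remaining character-entry escapes: `\v` (vertical tab), `\x` followed by any sequence of hexadecimal digits, `\0` (the null byte), and the octal forms `\xy` and `\xyz`, which apply only when the digits do not form a back reference. Hexadecimal digits are 0-9, a-f, and A-F; octal digits are 0-7. Numeric character-entry escapes outside the ASCII range (0-127) depend on the database encoding: in UTF-8 they correspond to Unicode code points, while in other multibyte encodings they usually specify the concatenated byte values of the character. An escape value that is not a legal character in the database encoding raises no error but never matches any data. Character-entry escapes are always treated as ordinary characters, so `\135` (which is `]` in ASCII) does not terminate a bracket expression. The section then introduces the regular expression class-shorthand escapes: `\d` (digit), `\s` (whitespace), `\w` (word character), `\D` (non-digit), `\S` (non-whitespace), and `\W` (non-word character).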
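
The escapes described above can be tried out with a few queries. This is a minimal sketch, not taken from the original text: it assumes a PostgreSQL session with `standard_conforming_strings` turned on, so that backslashes in ordinary string literals reach the regular expression engine unchanged. The sample strings and column aliases are made up for illustration.

```sql
-- Class-shorthand escapes
SELECT 'room 42' ~ '\d' AS has_digit;                          -- \d acts like [[:digit:]]; true
SELECT regexp_replace('a1 b2', '\D', '', 'g') AS digits_only;  -- strips every non-digit; '12'
SELECT 'foo_bar' ~ '^\w+$' AS all_word_chars;                  -- letters, digits, and underscore only
SELECT 'foo bar' ~ '\s' AS has_whitespace;                     -- \s acts like [[:space:]]

-- Character-entry escapes
SELECT 'A' ~ '^\x41$' AS hex_escape;     -- hexadecimal 41 is the character A
SELECT 'A' ~ '^\101$' AS octal_escape;   -- three octal digits, no back reference possible here

-- \135 is ] in ASCII; per the text above it is taken as an ordinary
-- character and does not close the bracket expression.
SELECT ']' ~ '^[\135]$' AS bracket_member;
```

If `standard_conforming_strings` is off, each backslash would need to be doubled or the pattern written as an `E'...'` string with doubled backslashes, which is why the assumption is stated up front.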