Newline Sensitivity, Compatibility, and Basic Regular Expressions

transformed into a bracket expression containing both cases, e.g., <literal>x</literal> becomes <literal>[xX]</literal>. When it appears inside a bracket expression, all case counterparts of it are added to the bracket expression, e.g., <literal>[x]</literal> becomes <literal>[xX]</literal> and <literal>[^x]</literal> becomes <literal>[^xX]</literal>. </para> <para> If newline-sensitive matching is specified, <literal>.</literal> and bracket expressions using <literal>^</literal> will never match the newline character (so that matches will not cross lines unless the RE explicitly includes a newline) and <literal>^</literal> and <literal>$</literal> will match the empty string after and before a newline respectively, in addition to matching at beginning and end of string respectively. But the ARE escapes <literal>\A</literal> and <literal>\Z</literal> continue to match beginning or end of string <emphasis>only</emphasis>. Also, the character class shorthands <literal>\D</literal> and <literal>\W</literal> will match a newline regardless of this mode. (Before <productname>PostgreSQL</productname> 14, they did not match newlines when in newline-sensitive mode. Write <literal>[^[:digit:]]</literal> or <literal>[^[:word:]]</literal> to get the old behavior.) </para> <para> If partial newline-sensitive matching is specified, this affects <literal>.</literal> and bracket expressions as with newline-sensitive matching, but not <literal>^</literal> and <literal>$</literal>. </para> <para> If inverse partial newline-sensitive matching is specified, this affects <literal>^</literal> and <literal>$</literal> as with newline-sensitive matching, but not <literal>.</literal> and bracket expressions. This isn't very useful but is provided for symmetry. </para> </sect3> <sect3 id="posix-limits-compatibility"> <title>Limits and Compatibility</title> <para> No particular limit is imposed on the length of REs in this implementation. 
However, programs intended to be highly portable should not employ REs longer than 256 bytes, as a POSIX-compliant implementation can refuse to accept such REs. </para> <para> The only feature of AREs that is actually incompatible with POSIX EREs is that <literal>\</literal> does not lose its special significance inside bracket expressions. All other ARE features use syntax which is illegal or has undefined or unspecified effects in POSIX EREs; the <literal>***</literal> syntax of directors likewise is outside the POSIX syntax for both BREs and EREs. </para> <para> Many of the ARE extensions are borrowed from Perl, but some have been changed to clean them up, and a few Perl extensions are not present. Incompatibilities of note include <literal>\b</literal>, <literal>\B</literal>, the lack of special treatment for a trailing newline, the addition of complemented bracket expressions to the things affected by newline-sensitive matching, the restrictions on parentheses and back references in lookahead/lookbehind constraints, and the longest/shortest-match (rather than first-match) matching semantics. </para> </sect3> <sect3 id="posix-basic-regexes"> <title>Basic Regular Expressions</title> <para> BREs differ from EREs in several respects. In BREs, <literal>|</literal>, <literal>+</literal>, and <literal>?</literal> are ordinary characters and there is no equivalent for their functionality. The delimiters for bounds are <literal>\{</literal> and <literal>\}</literal>, with <literal>{</literal> and <literal>}</literal> by themselves ordinary characters. The parentheses for nested subexpressions are <literal>\(</literal> and <literal>\)</literal>, with <literal>(</literal> and <literal>)</literal> by themselves

This section explains newline-sensitive, partial newline-sensitive, and inverse partial newline-sensitive matching in regular expressions. It describes how these modes affect '.' and bracket expressions using '^', as well as the '^' and '$' anchors, and notes that '\D' and '\W' match a newline regardless of mode (since PostgreSQL 14). It also covers limits and compatibility with POSIX, identifying the one ARE feature that is incompatible with POSIX EREs and listing the notable differences from Perl's regex features. The final part introduces Basic Regular Expressions (BREs) and contrasts them with Extended Regular Expressions (EREs), noting differences in metacharacters and delimiters.
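
The newline modes described above can be tried with a few queries. This is a minimal sketch, not part of the original text. It assumes PostgreSQL 14 or later, the default setting standard_conforming_strings = on, and that the RE is parsed as an ARE (the default). The embedded option letters n, p, and w are taken from the embedded-options table elsewhere in the PostgreSQL documentation rather than from this section, and the subject string E'foo\nbar' is purely illustrative.

```sql
-- Subject string: "foo", a newline, then "bar".

-- Default (non-newline-sensitive) matching.
SELECT E'foo\nbar' ~ 'foo.bar';          -- true:  . matches the newline
SELECT E'foo\nbar' ~ '^bar';             -- false: ^ matches only at start of string

-- (?n) newline-sensitive matching.
SELECT E'foo\nbar' ~ '(?n)foo.bar';      -- false: . never matches the newline
SELECT E'foo\nbar' ~ '(?n)foo[^x]bar';   -- false: neither does a complemented bracket expression
SELECT E'foo\nbar' ~ '(?n)^bar';         -- true:  ^ matches just after the newline
SELECT E'foo\nbar' ~ '(?n)\Abar';        -- false: \A still means start of string only

-- (?p) partial newline-sensitive matching: affects . and brackets, not ^ and $.
SELECT E'foo\nbar' ~ '(?p)foo.bar';      -- false
SELECT E'foo\nbar' ~ '(?p)^bar';         -- false

-- (?w) inverse partial newline-sensitive matching: affects ^ and $, not . and brackets.
SELECT E'foo\nbar' ~ '(?w)foo.bar';      -- true
SELECT E'foo\nbar' ~ '(?w)^bar';         -- true

-- \D matches a newline regardless of mode (PostgreSQL 14 and later);
-- the bracket form gives the pre-14 behavior.
SELECT E'\n' ~ '(?n)\D';                 -- true
SELECT E'\n' ~ '(?n)[^[:digit:]]';       -- false
```

The same options can instead be passed as the flags argument of functions such as regexp_match, which avoids embedding them in the pattern.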