postgresql

94th chunk of `doc/src/sgml/func.sgml`
 transformed into a bracket expression containing both cases,
    e.g., <literal>x</literal> becomes <literal>[xX]</literal>.
    When it appears inside a bracket expression, all case counterparts
    of it are added to the bracket expression, e.g.,
    <literal>[x]</literal> becomes <literal>[xX]</literal>
    and <literal>[^x]</literal> becomes <literal>[^xX]</literal>.
   </para>

   <para>
    If newline-sensitive matching is specified, <literal>.</literal>
    and bracket expressions using <literal>^</literal>
    will never match the newline character
    (so that matches will not cross lines unless the RE
    explicitly includes a newline),
    and <literal>^</literal> and <literal>$</literal>
    will match the empty string after and before a newline,
    respectively, in addition to matching at the beginning and end of
    the string.
    But the ARE escapes <literal>\A</literal> and <literal>\Z</literal>
    continue to match beginning or end of string <emphasis>only</emphasis>.
    Also, the character class shorthands <literal>\D</literal>
    and <literal>\W</literal> will match a newline regardless of this mode.
    (Before <productname>PostgreSQL</productname> 14, they did not match
    newlines when in newline-sensitive mode.
    Write <literal>[^[:digit:]]</literal>
    or <literal>[^[:word:]]</literal> to get the old behavior.)
   </para>
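
   <para>
    As a minimal sketch of the behavior described above, the following
    queries compare the default mode with newline-sensitive matching.
    They assume that the embedded option <literal>(?n)</literal> is used
    to request newline-sensitive mode, that
    <varname>standard_conforming_strings</varname> is on, and (for the
    <literal>\D</literal> case) <productname>PostgreSQL</productname> 14
    or later; the sample strings are invented for illustration.
   </para>
<programlisting>
-- Default mode: . matches a newline; ^ and $ match only at the string ends.
SELECT E'foo\nbar' ~ 'foo.bar';              -- true
SELECT E'foo\nbar' ~ '^bar$';                -- false

-- Newline-sensitive mode: . does not match a newline;
-- ^ and $ also match at line boundaries.
SELECT E'foo\nbar' ~ '(?n)foo.bar';          -- false
SELECT E'foo\nbar' ~ '(?n)^bar$';            -- true

-- \A still matches only at the beginning of the string.
SELECT E'foo\nbar' ~ '(?n)\Abar';            -- false

-- \D matches the newline; the complemented bracket expression does not.
SELECT E'a\nb' ~ '(?n)a\Db';                 -- true
SELECT E'a\nb' ~ '(?n)a[^[:digit:]]b';       -- false
</programlisting>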

   <para>
    If partial newline-sensitive matching is specified,
    <literal>.</literal> and bracket expressions are affected
    as with newline-sensitive matching, but <literal>^</literal>
    and <literal>$</literal> are not.
   </para>
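
   <para>
    A short sketch of partial newline-sensitive matching, assuming the
    embedded option <literal>(?p)</literal> selects this mode and that
    <varname>standard_conforming_strings</varname> is on
    (the sample strings are invented for illustration):
   </para>
<programlisting>
-- . and complemented bracket expressions do not match a newline ...
SELECT E'foo\nbar' ~ '(?p)foo.bar';          -- false
SELECT E'foo\nbar' ~ '(?p)foo[^x]bar';       -- false

-- ... but ^ and $ still match only at the string ends.
SELECT E'foo\nbar' ~ '(?p)^bar$';            -- false
</programlisting>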

   <para>
    If inverse partial newline-sensitive matching is specified,
    <literal>^</literal> and <literal>$</literal> are affected
    as with newline-sensitive matching, but <literal>.</literal>
    and bracket expressions are not.
    This mode is rarely useful but is provided for symmetry.
   </para>
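
   <para>
    A short sketch of inverse partial newline-sensitive matching,
    assuming the embedded option <literal>(?w)</literal> selects this
    mode and that <varname>standard_conforming_strings</varname> is on
    (the sample strings are invented for illustration):
   </para>
<programlisting>
-- . still matches a newline ...
SELECT E'foo\nbar' ~ '(?w)foo.bar';          -- true

-- ... while ^ and $ also match at line boundaries.
SELECT E'foo\nbar' ~ '(?w)^bar$';            -- true
</programlisting>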
   </sect3>

   <sect3 id="posix-limits-compatibility">
    <title>Limits and Compatibility</title>

   <para>
    No particular limit is imposed on the length of REs in this
    implementation.  However,
    programs intended to be highly portable should not employ REs longer
    than 256 bytes,
    as a POSIX-compliant implementation can refuse to accept such REs.
   </para>

   <para>
    The only feature of AREs that is actually incompatible with
    POSIX EREs is that <literal>\</literal> does not lose its special
    significance inside bracket expressions.
    All other ARE features use syntax that is illegal or has
    undefined or unspecified effects in POSIX EREs;
    the <literal>***</literal> syntax of directors likewise is outside the POSIX
    syntax for both BREs and EREs.
   </para>
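
   <para>
    The backslash incompatibility can be sketched as follows. These
    queries assume the default ARE flavor, that the embedded option
    <literal>(?e)</literal> switches the rest of the pattern to ERE
    rules, and that <varname>standard_conforming_strings</varname> is on;
    the last two queries use the <literal>***=</literal> director, which
    treats the rest of the pattern as a literal string.
    The sample strings are invented for illustration.
   </para>
<programlisting>
-- ARE: \d keeps its meaning inside a bracket expression.
SELECT 'a1' ~ 'a[\d]';                       -- true

-- ERE: the bracket expression contains a literal backslash and a literal d.
SELECT 'a1' ~ '(?e)a[\d]';                   -- false
SELECT 'ad' ~ '(?e)a[\d]';                   -- true

-- ***= director: the . is matched literally.
SELECT 'a.b' ~ '***=a.b';                    -- true
SELECT 'axb' ~ '***=a.b';                    -- false
</programlisting>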

   <para>
    Many of the ARE extensions are borrowed from Perl, but some have
    been changed to clean them up, and a few Perl extensions are not present.
    Incompatibilities of note include <literal>\b</literal>, <literal>\B</literal>,
    the lack of special treatment for a trailing newline,
    the addition of complemented bracket expressions to the things
    affected by newline-sensitive matching,
    the restrictions on parentheses and back references in lookahead/lookbehind
    constraints, and the longest/shortest-match (rather than first-match)
    matching semantics.
   </para>
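
   <para>
    A few of these differences can be sketched with the queries below.
    They assume the default ARE flavor and that
    <varname>standard_conforming_strings</varname> is on; the
    <literal>\y</literal> word-boundary escape is taken from the ARE
    constraint escapes, and the sample strings are invented for
    illustration.
   </para>
<programlisting>
-- \b is a backspace character in AREs, not a word boundary as in Perl.
SELECT 'a cat sat' ~ '\bcat\b';              -- false
SELECT 'a cat sat' ~ '\ycat\y';              -- true

-- $ gets no special treatment before a trailing newline.
SELECT E'abc\n' ~ 'abc$';                    -- false

-- Alternation takes the longest match; Perl's first-match rule would give foo.
SELECT substring('foobar' from 'foo|foobar');  -- foobar
</programlisting>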
   </sect3>

   <sect3 id="posix-basic-regexes">
    <title>Basic Regular Expressions</title>

   <para>
    BREs differ from EREs in several respects.
    In BREs, <literal>|</literal>, <literal>+</literal>, and <literal>?</literal>
    are ordinary characters and there is no equivalent
    for their functionality.
    The delimiters for bounds are
    <literal>\{</literal> and <literal>\}</literal>,
    with <literal>{</literal> and <literal>}</literal>
    by themselves ordinary characters.
    The parentheses for nested subexpressions are
    <literal>\(</literal> and <literal>\)</literal>,
    with <literal>(</literal> and <literal>)</literal> by themselves

Title: Newline Sensitivity, Compatibility, and Basic Regular Expressions
Summary
This section explains newline-sensitive, partial newline-sensitive, and inverse partial newline-sensitive matching in regular expressions. It describes how these modes affect '.' and bracket expressions using '^', as well as the '^' and '$' anchors, and notes that '\D' and '\W' match a newline regardless of mode since PostgreSQL 14. It also covers limits and compatibility: the 256-byte portability guideline for RE length, the one incompatibility between AREs and POSIX EREs (backslash inside bracket expressions), and the notable differences from Perl's regex features. The final part introduces Basic Regular Expressions (BREs) and contrasts them with Extended Regular Expressions (EREs) in terms of metacharacters and delimiters.