postgresql

97th chunk of `doc/src/sgml/func.sgml`
 <itemizedlist>
      <listitem>
       <para>
        XQuery character class subtraction is not supported.  An example of
        this feature is using the following to match only English
        consonants: <literal>[a-z-[aeiou]]</literal>.
       </para>
      </listitem>
      <listitem>
       <para>
        XQuery character class shorthands <literal>\c</literal>,
        <literal>\C</literal>, <literal>\i</literal>,
        and <literal>\I</literal> are not supported.
       </para>
      </listitem>
      <listitem>
       <para>
        XQuery character class elements
        using <literal>\p{UnicodeProperty}</literal> or the
        inverse <literal>\P{UnicodeProperty}</literal> are not supported.
       </para>
      </listitem>
      <listitem>
       <para>
        POSIX interprets character classes such as <literal>\w</literal>
        (see <xref linkend="posix-class-shorthand-escapes-table"/>)
        according to the prevailing locale (which you can control by
        attaching a <literal>COLLATE</literal> clause to the operator or
        function).  XQuery specifies these classes by reference to Unicode
        character properties, so equivalent behavior is obtained only with
        a locale that follows the Unicode rules.
       </para>
      </listitem>
      <listitem>
       <para>
        The SQL standard (not XQuery itself) attempts to cater for more
        variants of <quote>newline</quote> than POSIX does.  The
        newline-sensitive matching options described above consider only
        ASCII NL (<literal>\n</literal>) to be a newline, but SQL would have
        us treat CR (<literal>\r</literal>), CRLF (<literal>\r\n</literal>)
        (a Windows-style newline), and some Unicode-only characters like
        LINE SEPARATOR (U+2028) as newlines as well.
        Notably, according to SQL, <literal>.</literal> and
        <literal>\s</literal> should count <literal>\r\n</literal> as one
        character, not two.
       </para>
      </listitem>
      <listitem>
       <para>
        Of the character-entry escapes described in
        <xref linkend="posix-character-entry-escapes-table"/>,
        XQuery supports only <literal>\n</literal>, <literal>\r</literal>,
        and <literal>\t</literal>.
       </para>
      </listitem>
      <listitem>
       <para>
        XQuery does not support
        the <literal>[:<replaceable>name</replaceable>:]</literal> syntax
        for character classes within bracket expressions.
       </para>
      </listitem>
      <listitem>
       <para>
        XQuery does not have lookahead or lookbehind constraints,
        nor any of the constraint escapes described in
        <xref linkend="posix-constraint-escapes-table"/>.
       </para>
      </listitem>
      <listitem>
       <para>
        The metasyntax forms described in <xref linkend="posix-metasyntax"/>
        do not exist in XQuery.
       </para>
      </listitem>
      <listitem>
       <para>
        The regular expression flag letters defined by XQuery are
        related to but not the same as the option letters for POSIX
        (<xref linkend="posix-embedded-options-table"/>).  While the
        <literal>i</literal> and <literal>q</literal> options behave the
        same, others do not:
        <itemizedlist>
         <listitem>
          <para>
           XQuery's <literal>s</literal> (allow dot to match newline)
           and <literal>m</literal> (allow <literal>^</literal>
           and <literal>$</literal> to match at newlines) flags provide
           access to the same behaviors as
           POSIX's <literal>n</literal>, <literal>p</literal>
           and <literal>w</literal> flags, but they
           do <emphasis>not</emphasis> match the behavior of
           POSIX's <literal>s</literal> and <literal>m</literal> flags.
           Note in particular that dot-matches-newline is the default
           behavior in POSIX but not XQuery.
          </para>
         </listitem>
         <listitem>

Title: Differences Between POSIX and XQuery Regular Expressions (Continued)
Summary
This section continues the list of differences between POSIX and XQuery regular expressions. XQuery character class subtraction, the shorthands \c, \C, \i, and \I, and \p{UnicodeProperty} elements are not supported. POSIX interprets character classes such as \w according to the prevailing locale, while XQuery defines them by reference to Unicode character properties. The SQL standard (not XQuery itself) recognizes more newline variants than POSIX does. Of the POSIX character-entry escapes, XQuery supports only \n, \r, and \t. XQuery lacks the [:name:] character class syntax within bracket expressions, lookahead and lookbehind constraints, the constraint escapes, and the metasyntax forms. Finally, the flag letters differ: i and q behave the same in both, but XQuery's s and m flags do not match POSIX's s and m flags; in particular, dot-matches-newline is the default behavior in POSIX but not in XQuery.
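
As a hedged illustration of the flag differences, the following Python sketch maps XQuery's `s` and `m` flags to the POSIX option letter that gives the same newline behavior. The function name `posix_option_for_xquery_flags` is hypothetical. The mapping is an assumption based on the usual meaning of the POSIX `n`, `p`, and `w` options in PostgreSQL (newline-sensitive, partial newline-sensitive, and inverse partial newline-sensitive matching), not something stated in the source text.

```python
def posix_option_for_xquery_flags(xquery_flags: str) -> str:
    """Return the POSIX option letter matching XQuery's s/m newline behavior.

    Assumed mapping (illustrative only, not taken from the source text):
      n = dot excludes newline, ^ and $ match at newlines
      p = dot excludes newline, ^ and $ match only at string ends
      w = dot matches newline,  ^ and $ match at newlines
      s = dot matches newline,  ^ and $ match only at string ends (POSIX default)
    """
    dot_matches_newline = "s" in xquery_flags        # XQuery 's' flag
    anchors_match_at_newlines = "m" in xquery_flags  # XQuery 'm' flag

    if dot_matches_newline and anchors_match_at_newlines:
        return "w"
    if dot_matches_newline:
        return "s"
    if anchors_match_at_newlines:
        return "n"
    # XQuery's default: dot does not match newline, anchors only at string ends.
    return "p"
```

Under these assumptions, an XQuery pattern with no flags corresponds to POSIX `p` rather than to the POSIX default, which is the practical consequence of dot-matches-newline being the default in POSIX but not in XQuery.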