6th chunk of `doc/src/sgml/syntax.sgml`
 (<replaceable>o</replaceable> = 0&ndash;7)
        </entry>
        <entry>octal byte value</entry>
       </row>
       <row>
        <entry>
         <literal>\x<replaceable>h</replaceable></literal>,
         <literal>\x<replaceable>hh</replaceable></literal>
         (<replaceable>h</replaceable> = 0&ndash;9, A&ndash;F)
        </entry>
        <entry>hexadecimal byte value</entry>
       </row>
       <row>
        <entry>
         <literal>\u<replaceable>xxxx</replaceable></literal>,
         <literal>\U<replaceable>xxxxxxxx</replaceable></literal>
         (<replaceable>x</replaceable> = 0&ndash;9, A&ndash;F)
        </entry>
        <entry>16-bit or 32-bit hexadecimal Unicode character value</entry>
       </row>
      </tbody>
      </tgroup>
     </table>

    <para>
     Any other
     character following a backslash is taken literally. Thus, to
     include a backslash character, write two backslashes (<literal>\\</literal>).
     Also, a single quote can be included in an escape string by writing
     <literal>\'</literal>, in addition to the normal way of <literal>''</literal>.
    </para>
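
    <para>
     As a minimal sketch of these two rules, the following escape string
     constants contain a backslash and a single quote.  The example
     assumes that <varname>backslash_quote</varname> permits
     <literal>\'</literal> in the current session; that is an assumption
     about the configuration, and the literal values are only
     illustrative:
<programlisting>
-- Assumes backslash_quote allows \' in this session.
SELECT E'C:\\temp';           -- the value is C:\temp
SELECT E'it\'s', E'it''s';    -- both values are it's
</programlisting>
    </para>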

    <para>
     It is your responsibility to ensure that the byte sequences you create,
     especially when using the octal or hexadecimal escapes, form
     valid characters in the server character set encoding.
     A useful alternative is to use Unicode escapes, or the
     separate Unicode escape syntax explained
     in <xref linkend="sql-syntax-strings-uescape"/>; the server
     then checks that the character conversion is possible.
    </para>
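
    <para>
     For illustration, and assuming a server encoding of
     <literal>UTF8</literal> (an assumption made for this sketch, not a
     requirement of the syntax), the character with code point U+00E9
     can be written either as raw bytes or as a Unicode escape:
<programlisting>
-- Assumes the server encoding is UTF8.
SELECT E'\xC3\xA9';   -- the two bytes that encode U+00E9 in UTF-8
SELECT E'\u00E9';     -- the same character, written as a Unicode escape
SELECT E'\xE9';       -- a lone byte 0xE9 is not valid UTF-8, so this
                      -- statement is expected to be rejected
</programlisting>
     With the byte escapes you must know the encoding yourself; with the
     Unicode escape you name only the code point.
    </para>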

    <caution>
    <para>
     If the configuration parameter
     <xref linkend="guc-standard-conforming-strings"/> is <literal>off</literal>,
     then <productname>PostgreSQL</productname> recognizes backslash escapes
     in both regular and escape string constants.  However, as of
     <productname>PostgreSQL</productname> 9.1, the default is <literal>on</literal>, meaning
     that backslash escapes are recognized only in escape string constants.
     This behavior is more standards-compliant, but might break applications
     that rely on the historical behavior, in which backslash escapes
     were always recognized.  As a workaround, you can set this parameter
     to <literal>off</literal>, but it is better to migrate away from using backslash
     escapes.  If you need to use a backslash escape to represent a special
     character, write the string constant with an <literal>E</literal> prefix.
    </para>

    <para>
     In addition to <varname>standard_conforming_strings</varname>, the configuration
     parameters <xref linkend="guc-escape-string-warning"/> and
     <xref linkend="guc-backslash-quote"/> govern treatment of backslashes
     in string constants.
    </para>
    </caution>
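
    <para>
     The difference between regular and escape string constants can be
     observed by comparing string lengths.  This sketch assumes that
     <varname>standard_conforming_strings</varname> is
     <literal>on</literal> in the current session:
<programlisting>
-- Assumes standard_conforming_strings = on.
SELECT length('a\nb');    -- 4: the backslash and the n are ordinary characters
SELECT length(E'a\nb');   -- 3: \n is a single newline character
</programlisting>
    </para>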

    <para>
     The character with code zero cannot appear in a string constant.
    </para>
   </sect3>

   <sect3 id="sql-syntax-strings-uescape">
    <title>String Constants with Unicode Escapes</title>

    <indexterm  zone="sql-syntax-strings-uescape">
     <primary>Unicode escape</primary>
     <secondary>in string constants</secondary>
    </indexterm>

    <para>
     <productname>PostgreSQL</productname> also supports another type
     of escape syntax for strings that allows specifying arbitrary
     Unicode characters by code point.  A Unicode escape string
     constant starts with <literal>U&amp;</literal> (upper or lower case
     letter U followed by ampersand) immediately before the opening
     quote, without any spaces in between, for
     example <literal>U&amp;'foo'</literal>.  (Note that this creates an
     ambiguity with the operator <literal>&amp;</literal>.  Use spaces
     around the operator to avoid this problem.)  Inside the quotes,
     Unicode characters can be specified in escaped form by writing a
     backslash followed by the four-digit hexadecimal code point
     number or alternatively a backslash followed by a plus sign
     followed by a six-digit hexadecimal

Title: More on String Constants with C-Style and Unicode Escapes in PostgreSQL
Summary
This section continues the discussion on string constants with escapes in PostgreSQL. It covers including backslashes and single quotes in escape strings, and the importance of ensuring that byte sequences created with octal or hexadecimal escapes result in valid characters. It also discusses Unicode escapes as an alternative, the impact of the standard_conforming_strings configuration parameter on backslash escape recognition, and mentions escape_string_warning and backslash_quote parameters. Finally, it introduces Unicode escape strings, starting with U&, for specifying Unicode characters by code point.