Home Explore Blog CI



postgresql

19th chunk of `doc/src/sgml/datatype.sgml`
70e5a91a83baa0c0acb1a4db831991d4a533b9ca79ab07ac0000000100000faa
 <title><type>bytea</type> Escape Format</title>

   <para>
    The <quote>escape</quote> format is the traditional
    <productname>PostgreSQL</productname> format for the <type>bytea</type>
    type.  It
    takes the approach of representing a binary string as a sequence
    of ASCII characters, while converting those bytes that cannot be
    represented as an ASCII character into special escape sequences.
    If, from the point of view of the application, representing bytes
    as characters makes sense, then this representation can be
    convenient.  But in practice it is usually confusing because it
    fuzzes up the distinction between binary strings and character
    strings, and also the particular escape mechanism that was chosen is
    somewhat unwieldy.  Therefore, this format should probably be avoided
    for most new applications.
   </para>

   <para>
    When entering <type>bytea</type> values in escape format,
    octets of certain
    values <emphasis>must</emphasis> be escaped, while all octet
    values <emphasis>can</emphasis> be escaped.  In
    general, to escape an octet, convert it into its three-digit
    octal value and precede it by a backslash.
    Backslash itself (octet decimal value 92) can alternatively be represented by
    double backslashes.
    <xref linkend="datatype-binary-sqlesc"/>
    shows the characters that must be escaped, and gives the alternative
    escape sequences where applicable.
   </para>

   <table id="datatype-binary-sqlesc">
    <title><type>bytea</type> Literal Escaped Octets</title>
    <tgroup cols="5">
     <colspec colname="col1" colwidth="1*"/>
     <colspec colname="col2" colwidth="1*"/>
     <colspec colname="col3" colwidth="1*"/>
     <colspec colname="col4" colwidth="1.25*"/>
     <colspec colname="col5" colwidth="1*"/>
     <thead>
      <row>
       <entry>Decimal Octet Value</entry>
       <entry>Description</entry>
       <entry>Escaped Input Representation</entry>
       <entry>Example</entry>
       <entry>Hex Representation</entry>
      </row>
     </thead>

     <tbody>
      <row>
       <entry>0</entry>
       <entry>zero octet</entry>
       <entry><literal>'\000'</literal></entry>
       <entry><literal>'\000'::bytea</literal></entry>
       <entry><literal>\x00</literal></entry>
      </row>

      <row>
       <entry>39</entry>
       <entry>single quote</entry>
       <entry><literal>''''</literal> or <literal>'\047'</literal></entry>
       <entry><literal>''''::bytea</literal></entry>
       <entry><literal>\x27</literal></entry>
      </row>

      <row>
       <entry>92</entry>
       <entry>backslash</entry>
       <entry><literal>'\\'</literal> or <literal>'\134'</literal></entry>
       <entry><literal>'\\'::bytea</literal></entry>
       <entry><literal>\x5c</literal></entry>
      </row>

      <row>
       <entry>0 to 31 and 127 to 255</entry>
       <entry><quote>non-printable</quote> octets</entry>
       <entry><literal>'\<replaceable>xxx'</replaceable></literal> (octal value)</entry>
       <entry><literal>'\001'::bytea</literal></entry>
       <entry><literal>\x01</literal></entry>
      </row>

     </tbody>
    </tgroup>
   </table>

   <para>
    The requirement to escape <emphasis>non-printable</emphasis> octets
    varies depending on locale settings. In some instances you can get away
    with leaving them unescaped.
   </para>

   <para>
    The reason that single quotes must be doubled, as shown
    in <xref linkend="datatype-binary-sqlesc"/>, is that this
    is true for any string literal in an SQL command.  The generic
    string-literal parser consumes the outermost single quotes
    and reduces any pair of single quotes to one data character.
    What the <type>bytea</type> input function sees is just one
    single quote, which it treats as a plain data character.
    However, the <type>bytea</type> input function treats
    backslashes as special, and the other behaviors shown in
    <xref linkend="datatype-binary-sqlesc"/>

Title: Bytea Escape Format in PostgreSQL
Summary
The bytea escape format in PostgreSQL represents binary strings as a sequence of ASCII characters, converting non-representable bytes into special escape sequences. Certain octet values must be escaped, and all octet values can be escaped by converting them to their three-digit octal value and preceding it with a backslash. The format has specific rules for escaping special characters, such as backslashes and single quotes, and its usage is generally discouraged in favor of the hex format due to potential confusion and complexity.