Home Explore Blog CI



postgresql

10th chunk of `doc/src/sgml/ref/copy.sgml`
7f16ceb14e22f5787e6a80a746c6779b5d1717e2e1964d0e0000000100000fae
 specified null string is used in
    place of columns that are null.
    <command>COPY FROM</command> will raise an error if any line of the
    input file contains more or fewer columns than are expected.
   </para>

   <para>
    End of data can be represented by a line containing just
    backslash-period (<literal>\.</literal>).  An end-of-data marker is
    not necessary when reading from a file, since the end of file
    serves perfectly well; in that context this provision exists only for
    backward compatibility.  However, <application>psql</application>
    uses <literal>\.</literal> to terminate a <literal>COPY FROM
    STDIN</literal> operation (that is, reading
    in-line <command>COPY</command> data in an SQL script).  In that
    context the rule is needed to be able to end the operation before the
    end of the script.
   </para>

   <para>
    Backslash characters (<literal>\</literal>) can be used in the
    <command>COPY</command> data to quote data characters that might
    otherwise be taken as row or column delimiters. In particular, the
    following characters <emphasis>must</emphasis> be preceded by a backslash if
    they appear as part of a column value: backslash itself,
    newline, carriage return, and the current delimiter character.
   </para>

   <para>
    The specified null string is sent by <command>COPY TO</command> without
    adding any backslashes; conversely, <command>COPY FROM</command> matches
    the input against the null string before removing backslashes.  Therefore,
    a null string such as <literal>\N</literal> cannot be confused with
    the actual data value <literal>\N</literal> (which would be represented
    as <literal>\\N</literal>).
   </para>

   <para>
    The following special backslash sequences are recognized by
    <command>COPY FROM</command>:

   <informaltable>
    <tgroup cols="2">
     <thead>
      <row>
       <entry>Sequence</entry>
       <entry>Represents</entry>
      </row>
     </thead>

     <tbody>
      <row>
       <entry><literal>\b</literal></entry>
       <entry>Backspace (ASCII 8)</entry>
      </row>
      <row>
       <entry><literal>\f</literal></entry>
       <entry>Form feed (ASCII 12)</entry>
      </row>
      <row>
       <entry><literal>\n</literal></entry>
       <entry>Newline (ASCII 10)</entry>
      </row>
      <row>
       <entry><literal>\r</literal></entry>
       <entry>Carriage return (ASCII 13)</entry>
      </row>
      <row>
       <entry><literal>\t</literal></entry>
       <entry>Tab (ASCII 9)</entry>
      </row>
      <row>
       <entry><literal>\v</literal></entry>
       <entry>Vertical tab (ASCII 11)</entry>
      </row>
      <row>
       <entry><literal>\</literal><replaceable>digits</replaceable></entry>
       <entry>Backslash followed by one to three octal digits specifies
       the byte with that numeric code</entry>
      </row>
      <row>
       <entry><literal>\x</literal><replaceable>digits</replaceable></entry>
       <entry>Backslash <literal>x</literal> followed by one or two hex digits specifies
       the byte with that numeric code</entry>
      </row>
     </tbody>
    </tgroup>
   </informaltable>

    Presently, <command>COPY TO</command> will never emit an octal or
    hex-digits backslash sequence, but it does use the other sequences
    listed above for those control characters.
   </para>

   <para>
    Any other backslashed character that is not mentioned in the above table
    will be taken to represent itself.  However, beware of adding backslashes
    unnecessarily, since that might accidentally produce a string matching the
    end-of-data marker (<literal>\.</literal>) or the null string (<literal>\N</literal> by
    default).  These strings will be recognized before any other backslash
    processing is done.
   </para>

   <para>
    It is strongly recommended that applications generating <command>COPY</command> data convert
    data newlines and carriage returns to the <literal>\n</literal>

Title: COPY Text Format: Delimiters, Special Characters, and Backslash Escaping
Summary
This section details the specifics of the text format used with the COPY command. It describes how delimiters separate columns, the usage of a null string to represent null values, and the importance of the end-of-data marker. It clarifies how backslashes are used to escape special characters like delimiters, newlines, and carriage returns within column values. It also lists the specific backslash escape sequences recognized by COPY FROM, such as \b for backspace, \n for newline, and \x for hexadecimal byte codes, and explains the behavior of COPY TO in relation to these sequences. Finally, it advises applications generating COPY data to convert newlines and carriage returns to \n and \r respectively.