Home Explore Blog CI



postgresql

11th chunk of `doc/src/sgml/ref/copy.sgml`
b958cbff22e16741ec06e4276c794fedd6ec6e3f5b0e2e3d0000000100000faa
 <entry>Backslash <literal>x</literal> followed by one or two hex digits specifies
       the byte with that numeric code</entry>
      </row>
     </tbody>
    </tgroup>
   </informaltable>

    Presently, <command>COPY TO</command> will never emit an octal or
    hex-digits backslash sequence, but it does use the other sequences
    listed above for those control characters.
   </para>

   <para>
    Any other backslashed character that is not mentioned in the above table
    will be taken to represent itself.  However, beware of adding backslashes
    unnecessarily, since that might accidentally produce a string matching the
    end-of-data marker (<literal>\.</literal>) or the null string (<literal>\N</literal> by
    default).  These strings will be recognized before any other backslash
    processing is done.
   </para>

   <para>
    It is strongly recommended that applications generating <command>COPY</command> data convert
    data newlines and carriage returns to the <literal>\n</literal> and
    <literal>\r</literal> sequences respectively.  At present it is
    possible to represent a data carriage return by a backslash and carriage
    return, and to represent a data newline by a backslash and newline.
    However, these representations might not be accepted in future releases.
    They are also highly vulnerable to corruption if the <command>COPY</command> file is
    transferred across different machines (for example, from Unix to Windows
    or vice versa).
   </para>

   <para>
     All backslash sequences are interpreted after encoding conversion.
     The bytes specified with the octal and hex-digit backslash sequences must
     form valid characters in the database encoding.
   </para>

   <para>
    <command>COPY TO</command> will terminate each row with a Unix-style
    newline (<quote><literal>\n</literal></quote>).  Servers running on Microsoft Windows instead
    output carriage return/newline (<quote><literal>\r\n</literal></quote>), but only for
    <command>COPY</command> to a server file; for consistency across platforms,
    <command>COPY TO STDOUT</command> always sends <quote><literal>\n</literal></quote>
    regardless of server platform.
    <command>COPY FROM</command> can handle lines ending with newlines,
    carriage returns, or carriage return/newlines.  To reduce the risk of
    error due to un-backslashed newlines or carriage returns that were
    meant as data, <command>COPY FROM</command> will complain if the line
    endings in the input are not all alike.
   </para>
  </refsect2>

  <refsect2 id="sql-copy-csv-format" xreflabel="CSV Format">
   <title>CSV Format</title>

   <para>
    This format option is used for importing and exporting the Comma-
    Separated Value (<literal>CSV</literal>) file format used by many other
    programs, such as spreadsheets. Instead of the escaping rules used by
    <productname>PostgreSQL</productname>'s standard text format, it
    produces and recognizes the common <literal>CSV</literal> escaping mechanism.
   </para>

   <para>
    The values in each record are separated by the <literal>DELIMITER</literal>
    character. If the value contains the delimiter character, the
    <literal>QUOTE</literal> character, the <literal>NULL</literal> string, a carriage
    return, or line feed character, then the whole value is prefixed and
    suffixed by the <literal>QUOTE</literal> character, and any occurrence
    within the value of a <literal>QUOTE</literal> character or the
    <literal>ESCAPE</literal> character is preceded by the escape character.
    You can also use <literal>FORCE_QUOTE</literal> to force quotes when outputting
    non-<literal>NULL</literal> values in specific columns.
   </para>

   <para>
    The <literal>CSV</literal> format has no standard way to distinguish a
    <literal>NULL</literal> value from an empty string.
    <productname>PostgreSQL</productname>'s <command>COPY</command> handles this by quoting.
    A <literal>NULL</literal>

Title: COPY Text Format: Backslash Sequences, Newlines, and CSV Format Introduction
Summary
This section continues the discussion on the text format for the COPY command, clarifying that unlisted backslashed characters represent themselves, but advises caution to avoid accidental end-of-data markers or null strings. It emphasizes converting newlines and carriage returns to \n and \r and warns against relying on direct backslash representations due to potential corruption across different operating systems. It notes that backslash sequences are interpreted post-encoding conversion, with octal and hex byte codes required to form valid characters. COPY TO uses Unix-style newlines, while COPY FROM handles various line endings but complains if they aren't uniform. The section then introduces the CSV format for compatibility with other programs like spreadsheets, noting its distinct escaping mechanism where values are enclosed in quote characters if they contain delimiters, quote characters, null strings, carriage returns, or line feeds.