Home Explore Blog CI



postgresql

12th chunk of `doc/src/sgml/ref/copy.sgml`
305b03262ceeda654975a8889a6dbecd42f44959b23f219b0000000100000fa6
 <literal>CSV</literal> escaping mechanism.
   </para>

   <para>
    The values in each record are separated by the <literal>DELIMITER</literal>
    character. If the value contains the delimiter character, the
    <literal>QUOTE</literal> character, the <literal>NULL</literal> string, a carriage
    return, or line feed character, then the whole value is prefixed and
    suffixed by the <literal>QUOTE</literal> character, and any occurrence
    within the value of a <literal>QUOTE</literal> character or the
    <literal>ESCAPE</literal> character is preceded by the escape character.
    You can also use <literal>FORCE_QUOTE</literal> to force quotes when outputting
    non-<literal>NULL</literal> values in specific columns.
   </para>

   <para>
    The <literal>CSV</literal> format has no standard way to distinguish a
    <literal>NULL</literal> value from an empty string.
    <productname>PostgreSQL</productname>'s <command>COPY</command> handles this by quoting.
    A <literal>NULL</literal> is output as the <literal>NULL</literal> parameter string
    and is not quoted, while a non-<literal>NULL</literal> value matching the
    <literal>NULL</literal> parameter string is quoted.  For example, with the
    default settings, a <literal>NULL</literal> is written as an unquoted empty
    string, while an empty string data value is written with double quotes
    (<literal>""</literal>). Reading values follows similar rules. You can
    use <literal>FORCE_NOT_NULL</literal> to prevent <literal>NULL</literal> input
    comparisons for specific columns. You can also use
    <literal>FORCE_NULL</literal> to convert quoted null string data values to
    <literal>NULL</literal>.
   </para>

   <para>
    Because backslash is not a special character in the <literal>CSV</literal>
    format, the end-of-data marker used in text mode (<literal>\.</literal>)
    is not normally treated as special when reading <literal>CSV</literal>
    data.  An exception is that <application>psql</application> will terminate
    a <literal>COPY FROM STDIN</literal> operation (that is, reading
    in-line <command>COPY</command> data in an SQL script) at a line containing
    only <literal>\.</literal>, whether it is text or <literal>CSV</literal>
    mode.
   </para>

   <note>
    <para>
     <productname>PostgreSQL</productname> versions before v18 always
     recognized unquoted <literal>\.</literal> as an end-of-data marker,
     even when reading from a separate file.  For compatibility with older
     versions, <command>COPY TO</command> will quote <literal>\.</literal>
     when it's alone on a line, even though this is no longer necessary.
    </para>
   </note>

   <note>
    <para>
     In <literal>CSV</literal> format, all characters are significant. A quoted value
     surrounded by white space, or any characters other than
     <literal>DELIMITER</literal>, will include those characters. This can cause
     errors if you import data from a system that pads <literal>CSV</literal>
     lines with white space out to some fixed width. If such a situation
     arises you might need to preprocess the <literal>CSV</literal> file to remove
     the trailing white space, before importing the data into
     <productname>PostgreSQL</productname>.
    </para>
   </note>

   <note>
    <para>
     <literal>CSV</literal> format will both recognize and produce <literal>CSV</literal> files with quoted
     values containing embedded carriage returns and line feeds. Thus
     the files are not strictly one line per table row like text-format
     files.
    </para>
   </note>

   <note>
    <para>
     Many programs produce strange and occasionally perverse <literal>CSV</literal> files,
     so the file format is more a convention than a standard. Thus you
     might encounter some files that cannot be imported using this
     mechanism, and <command>COPY</command> might produce files that other
     programs cannot process.
    </para>
   </note>

  </refsect2>

Title: COPY CSV Format Details: Delimiters, Quotes, NULL Handling, and Caveats
Summary
This section details the intricacies of the CSV format within PostgreSQL's COPY command. It specifies how delimiters, quote characters, NULL strings, carriage returns, and line feeds are handled through quoting and escaping. It clarifies how PostgreSQL distinguishes NULL values from empty strings by quoting non-NULL values that match the NULL parameter string, and introduces FORCE_NOT_NULL and FORCE_NULL options for precise control over NULL input comparisons and conversions. The section notes that while backslash is not a special character in CSV format, psql will terminate COPY FROM STDIN operations at a line containing only '\.'. It includes notes on whitespace significance, handling of embedded carriage returns and line feeds, and the variability of CSV file conventions, cautioning about potential compatibility issues with other programs.