COPY Text Format: Delimiters, Special Characters, and Backslash Escaping

specified null string is used in place of columns that are null. <command>COPY FROM</command> will raise an error if any line of the input file contains more or fewer columns than are expected. </para> <para> End of data can be represented by a line containing just backslash-period (<literal>\.</literal>). An end-of-data marker is not necessary when reading from a file, since the end of file serves perfectly well; in that context this provision exists only for backward compatibility. However, <application>psql</application> uses <literal>\.</literal> to terminate a <literal>COPY FROM STDIN</literal> operation (that is, reading in-line <command>COPY</command> data in an SQL script). In that context the rule is needed to be able to end the operation before the end of the script. </para> <para> Backslash characters (<literal>\</literal>) can be used in the <command>COPY</command> data to quote data characters that might otherwise be taken as row or column delimiters. In particular, the following characters <emphasis>must</emphasis> be preceded by a backslash if they appear as part of a column value: backslash itself, newline, carriage return, and the current delimiter character. </para> <para> The specified null string is sent by <command>COPY TO</command> without adding any backslashes; conversely, <command>COPY FROM</command> matches the input against the null string before removing backslashes. Therefore, a null string such as <literal>\N</literal> cannot be confused with the actual data value <literal>\N</literal> (which would be represented as <literal>\\N</literal>). </para> <para> The following special backslash sequences are recognized by <command>COPY FROM</command>: <informaltable> <tgroup cols="2"> <thead> <row> <entry>Sequence</entry> <entry>Represents</entry> </row> </thead> <tbody> <row> <entry><literal>\b</literal></entry> <entry>Backspace (ASCII 8)</entry> </row> <row> <entry><literal>\f</literal></entry> <entry>Form feed (ASCII 12)</entry> </row> <row> <entry><literal>\n</literal></entry> <entry>Newline (ASCII 10)</entry> </row> <row> <entry><literal>\r</literal></entry> <entry>Carriage return (ASCII 13)</entry> </row> <row> <entry><literal>\t</literal></entry> <entry>Tab (ASCII 9)</entry> </row> <row> <entry><literal>\v</literal></entry> <entry>Vertical tab (ASCII 11)</entry> </row> <row> <entry><literal>\</literal><replaceable>digits</replaceable></entry> <entry>Backslash followed by one to three octal digits specifies the byte with that numeric code</entry> </row> <row> <entry><literal>\x</literal><replaceable>digits</replaceable></entry> <entry>Backslash <literal>x</literal> followed by one or two hex digits specifies the byte with that numeric code</entry> </row> </tbody> </tgroup> </informaltable> Presently, <command>COPY TO</command> will never emit an octal or hex-digits backslash sequence, but it does use the other sequences listed above for those control characters. </para> <para> Any other backslashed character that is not mentioned in the above table will be taken to represent itself. However, beware of adding backslashes unnecessarily, since that might accidentally produce a string matching the end-of-data marker (<literal>\.</literal>) or the null string (<literal>\N</literal> by default). These strings will be recognized before any other backslash processing is done. </para> <para> It is strongly recommended that applications generating <command>COPY</command> data convert data newlines and carriage returns to the <literal>\n</literal>

This section details the specifics of the text format used with the COPY command. It describes how delimiters separate columns, the usage of a null string to represent null values, and the importance of the end-of-data marker. It clarifies how backslashes are used to escape special characters like delimiters, newlines, and carriage returns within column values. It also lists the specific backslash escape sequences recognized by COPY FROM, such as \b for backspace, \n for newline, and \x for hexadecimal byte codes, and explains the behavior of COPY TO in relation to these sequences. Finally, it advises applications generating COPY data to convert newlines and carriage returns to \n and \r respectively.