when the planner will <quote>flatten out</quote> subqueries, and the
other controls when it will flatten out explicit joins. Typically
you would either set <varname>join_collapse_limit</varname> equal to
<varname>from_collapse_limit</varname> (so that explicit joins and subqueries
act similarly) or set <varname>join_collapse_limit</varname> to 1 (if you want
to control join order with explicit joins). But you might set them
differently if you are trying to fine-tune the trade-off between planning
time and run time.
</para>
</sect1>
<sect1 id="populate">
<title>Populating a Database</title>
<para>
One might need to insert a large amount of data when first populating
a database. This section contains some suggestions on how to make
this process as efficient as possible.
</para>
<sect2 id="disable-autocommit">
<title>Disable Autocommit</title>
<indexterm>
<primary>autocommit</primary>
<secondary>bulk-loading data</secondary>
</indexterm>
<para>
When using multiple <command>INSERT</command>s, turn off autocommit and just do
one commit at the end. (In plain
SQL, this means issuing <command>BEGIN</command> at the start and
<command>COMMIT</command> at the end. Some client libraries might
do this behind your back, in which case you need to make sure the
library does it when you want it done.) If you allow each
insertion to be committed separately,
<productname>PostgreSQL</productname> is doing a lot of work for
each row that is added. An additional benefit of doing all
insertions in one transaction is that if the insertion of one row
were to fail then the insertion of all rows inserted up to that
point would be rolled back, so you won't be stuck with partially
loaded data.
</para>
</sect2>
<sect2 id="populate-copy-from">
<title>Use <command>COPY</command></title>
<para>
Use <link linkend="sql-copy"><command>COPY</command></link> to load
all the rows in one command, instead of using a series of
<command>INSERT</command> commands. The <command>COPY</command>
command is optimized for loading large numbers of rows; it is less
flexible than <command>INSERT</command>, but incurs significantly
less overhead for large data loads. Since <command>COPY</command>
is a single command, there is no need to disable autocommit if you
use this method to populate a table.
</para>
<para>
If you cannot use <command>COPY</command>, it might help to use <link
linkend="sql-prepare"><command>PREPARE</command></link> to create a
prepared <command>INSERT</command> statement, and then use
<command>EXECUTE</command> as many times as required. This avoids
some of the overhead of repeatedly parsing and planning
<command>INSERT</command>. Different interfaces provide this facility
in different ways; look for <quote>prepared statements</quote> in the interface
documentation.
</para>
<para>
Note that loading a large number of rows using
<command>COPY</command> is almost always faster than using
<command>INSERT</command>, even if <command>PREPARE</command> is used and
multiple insertions are batched into a single transaction.
</para>
<para>
<command>COPY</command> is fastest when used within the same
transaction as an earlier <command>CREATE TABLE</command> or
<command>TRUNCATE</command> command. In such cases no WAL
needs to be written, because in case of an error, the files
containing the newly loaded data will be removed anyway.
However, this consideration only applies when
<xref linkend="guc-wal-level"/> is <literal>minimal</literal>
as all commands must write WAL otherwise.
</para>
</sect2>
<sect2 id="populate-rm-indexes">
<title>Remove Indexes</title>
<para>
If you are loading a freshly created table, the fastest method is to
create the table, bulk load the table's data