<varname>maintenance_work_mem</varname></title>
<para>
Temporarily increasing the <xref linkend="guc-maintenance-work-mem"/>
configuration variable when loading large amounts of data can
lead to improved performance. This will help to speed up <command>CREATE
INDEX</command> commands and <command>ALTER TABLE ADD FOREIGN KEY</command> commands.
It won't do much for <command>COPY</command> itself, so this advice is
only useful when you are using one or both of the above techniques.
</para>
</sect2>
<sect2 id="populate-max-wal-size">
<title>Increase <varname>max_wal_size</varname></title>
<para>
Temporarily increasing the <xref linkend="guc-max-wal-size"/>
configuration variable can also
make large data loads faster. This is because loading a large
amount of data into <productname>PostgreSQL</productname> will
cause checkpoints to occur more often than the normal checkpoint
frequency (specified by the <varname>checkpoint_timeout</varname>
configuration variable). Whenever a checkpoint occurs, all dirty
pages must be flushed to disk. By increasing
<varname>max_wal_size</varname> temporarily during bulk
data loads, the number of checkpoints that are required can be
reduced.
</para>
</sect2>
<sect2 id="populate-pitr">
<title>Disable WAL Archival and Streaming Replication</title>
<para>
When loading large amounts of data into an installation that uses
WAL archiving or streaming replication, it might be faster to take a
new base backup after the load has completed than to process a large
amount of incremental WAL data. To prevent incremental WAL logging
while loading, disable archiving and streaming replication, by setting
<xref linkend="guc-wal-level"/> to <literal>minimal</literal>,
<xref linkend="guc-archive-mode"/> to <literal>off</literal>, and
<xref linkend="guc-max-wal-senders"/> to zero.
But note that changing these settings requires a server restart,
and makes any base backups taken before unavailable for archive
recovery and standby server, which may lead to data loss.
</para>
<para>
Aside from avoiding the time for the archiver or WAL sender to process the
WAL data, doing this will actually make certain commands faster, because
they do not to write WAL at all if <varname>wal_level</varname>
is <literal>minimal</literal> and the current subtransaction (or top-level
transaction) created or truncated the table or index they change. (They
can guarantee crash safety more cheaply by doing
an <function>fsync</function> at the end than by writing WAL.)
</para>
</sect2>
<sect2 id="populate-analyze">
<title>Run <command>ANALYZE</command> Afterwards</title>
<para>
Whenever you have significantly altered the distribution of data
within a table, running <link linkend="sql-analyze"><command>ANALYZE</command></link> is strongly recommended. This
includes bulk loading large amounts of data into the table. Running
<command>ANALYZE</command> (or <command>VACUUM ANALYZE</command>)
ensures that the planner has up-to-date statistics about the
table. With no statistics or obsolete statistics, the planner might
make poor decisions during query planning, leading to poor
performance on any tables with inaccurate or nonexistent
statistics. Note that if the autovacuum daemon is enabled, it might
run <command>ANALYZE</command> automatically; see
<xref linkend="vacuum-for-statistics"/>
and <xref linkend="autovacuum"/> for more information.
</para>
</sect2>
<sect2 id="populate-pg-dump">
<title>Some Notes about <application>pg_dump</application></title>
<para>
Dump scripts generated by <application>pg_dump</application> automatically apply
several, but not all, of the above guidelines. To restore a
<application>pg_dump</application> dump as quickly as possible, you need to
do