Optimizing PostgreSQL Data Restoration from pg

includes bulk loading large amounts of data into the table. Running <command>ANALYZE</command> (or <command>VACUUM ANALYZE</command>) ensures that the planner has up-to-date statistics about the table. With no statistics or obsolete statistics, the planner might make poor decisions during query planning, leading to poor performance on any tables with inaccurate or nonexistent statistics. Note that if the autovacuum daemon is enabled, it might run <command>ANALYZE</command> automatically; see <xref linkend="vacuum-for-statistics"/> and <xref linkend="autovacuum"/> for more information. </para> </sect2> <sect2 id="populate-pg-dump"> <title>Some Notes about <application>pg_dump</application></title> <para> Dump scripts generated by <application>pg_dump</application> automatically apply several, but not all, of the above guidelines. To restore a <application>pg_dump</application> dump as quickly as possible, you need to do a few extra things manually. (Note that these points apply while <emphasis>restoring</emphasis> a dump, not while <emphasis>creating</emphasis> it. The same points apply whether loading a text dump with <application>psql</application> or using <application>pg_restore</application> to load from a <application>pg_dump</application> archive file.) </para> <para> By default, <application>pg_dump</application> uses <command>COPY</command>, and when it is generating a complete schema-and-data dump, it is careful to load data before creating indexes and foreign keys. So in this case several guidelines are handled automatically. What is left for you to do is to: <itemizedlist> <listitem> <para> Set appropriate (i.e., larger than normal) values for <varname>maintenance_work_mem</varname> and <varname>max_wal_size</varname>. </para> </listitem> <listitem> <para> If using WAL archiving or streaming replication, consider disabling them during the restore. To do that, set <varname>archive_mode</varname> to <literal>off</literal>, <varname>wal_level</varname> to <literal>minimal</literal>, and <varname>max_wal_senders</varname> to zero before loading the dump. Afterwards, set them back to the right values and take a fresh base backup. </para> </listitem> <listitem> <para> Experiment with the parallel dump and restore modes of both <application>pg_dump</application> and <application>pg_restore</application> and find the optimal number of concurrent jobs to use. Dumping and restoring in parallel by means of the <option>-j</option> option should give you a significantly higher performance over the serial mode. </para> </listitem> <listitem> <para> Consider whether the whole dump should be restored as a single transaction. To do that, pass the <option>-1</option> or <option>--single-transaction</option> command-line option to <application>psql</application> or <application>pg_restore</application>. When using this mode, even the smallest of errors will rollback the entire restore, possibly discarding many hours of processing. Depending on how interrelated the data is, that might seem preferable to manual cleanup, or not. <command>COPY</command> commands will run fastest if you use a single transaction and have WAL archiving turned off. </para> </listitem> <listitem> <para> If multiple CPUs are available in the database server, consider using <application>pg_restore</application>'s <option>--jobs</option> option. This

To restore a pg_dump dump quickly, consider setting larger values for maintenance_work_mem and max_wal_size, disabling WAL archiving and streaming replication, using parallel dump and restore modes, restoring as a single transaction, and utilizing multiple CPUs with pg_restore's --jobs option to improve performance and minimize errors.