Home Explore Blog CI



postgresql

32th chunk of `doc/src/sgml/perform.sgml`
e7d8203a2e87887eabab2ba332fbe49c49fecb5b863143400000000100000eca
 includes bulk loading large amounts of data into the table.  Running
    <command>ANALYZE</command> (or <command>VACUUM ANALYZE</command>)
    ensures that the planner has up-to-date statistics about the
    table.  With no statistics or obsolete statistics, the planner might
    make poor decisions during query planning, leading to poor
    performance on any tables with inaccurate or nonexistent
    statistics.  Note that if the autovacuum daemon is enabled, it might
    run <command>ANALYZE</command> automatically; see
    <xref linkend="vacuum-for-statistics"/>
    and <xref linkend="autovacuum"/> for more information.
   </para>
  </sect2>

  <sect2 id="populate-pg-dump">
   <title>Some Notes about <application>pg_dump</application></title>

   <para>
    Dump scripts generated by <application>pg_dump</application> automatically apply
    several, but not all, of the above guidelines.  To restore a
    <application>pg_dump</application> dump as quickly as possible, you need to
    do a few extra things manually.  (Note that these points apply while
    <emphasis>restoring</emphasis> a dump, not while <emphasis>creating</emphasis> it.
    The same points apply whether loading a text dump with
    <application>psql</application> or using <application>pg_restore</application> to load
    from a <application>pg_dump</application> archive file.)
   </para>

   <para>
    By default, <application>pg_dump</application> uses <command>COPY</command>, and when
    it is generating a complete schema-and-data dump, it is careful to
    load data before creating indexes and foreign keys.  So in this case
    several guidelines are handled automatically.  What is left
    for you to do is to:
    <itemizedlist>
     <listitem>
      <para>
       Set appropriate (i.e., larger than normal) values for
       <varname>maintenance_work_mem</varname> and
       <varname>max_wal_size</varname>.
      </para>
     </listitem>
     <listitem>
      <para>
       If using WAL archiving or streaming replication, consider disabling
       them during the restore. To do that, set <varname>archive_mode</varname>
       to <literal>off</literal>,
       <varname>wal_level</varname> to <literal>minimal</literal>, and
       <varname>max_wal_senders</varname> to zero before loading the dump.
       Afterwards, set them back to the right values and take a fresh
       base backup.
      </para>
     </listitem>
     <listitem>
      <para>
       Experiment with the parallel dump and restore modes of both
       <application>pg_dump</application> and <application>pg_restore</application> and find the
       optimal number of concurrent jobs to use. Dumping and restoring in
       parallel by means of the <option>-j</option> option should give you a
       significantly higher performance over the serial mode.
      </para>
     </listitem>
     <listitem>
      <para>
       Consider whether the whole dump should be restored as a single
       transaction.  To do that, pass the <option>-1</option> or
       <option>--single-transaction</option> command-line option to
       <application>psql</application> or <application>pg_restore</application>. When using this
       mode, even the smallest of errors will rollback the entire restore,
       possibly discarding many hours of processing.  Depending on how
       interrelated the data is, that might seem preferable to manual cleanup,
       or not.  <command>COPY</command> commands will run fastest if you use a single
       transaction and have WAL archiving turned off.
      </para>
     </listitem>
     <listitem>
      <para>
       If multiple CPUs are available in the database server, consider using
       <application>pg_restore</application>'s <option>--jobs</option> option.  This
   

Title: Optimizing PostgreSQL Data Restoration from pg_dump
Summary
To restore a pg_dump dump quickly, consider setting larger values for maintenance_work_mem and max_wal_size, disabling WAL archiving and streaming replication, using parallel dump and restore modes, restoring as a single transaction, and utilizing multiple CPUs with pg_restore's --jobs option to improve performance and minimize errors.