File System Backup and Continuous Archiving in PostgreSQL

</para> </listitem> </orderedlist> </para>

<para>
 An alternative file-system backup approach is to make a <quote>consistent snapshot</quote> of the data directory, if the file system supports that functionality (and you are willing to trust that it is implemented correctly). The typical procedure is to make a <quote>frozen snapshot</quote> of the volume containing the database, then copy the whole data directory (not just parts, see above) from the snapshot to a backup device, then release the frozen snapshot. This will work even while the database server is running. However, a backup created in this way saves the database files in a state as if the database server had not been properly shut down; therefore, when you start the database server on the backed-up data, it will think the previous server instance crashed and will replay the WAL. This is not a problem; just be aware of it (and be sure to include the WAL files in your backup). You can perform a <command>CHECKPOINT</command> before taking the snapshot to reduce recovery time.
</para>

<para>
 If your database is spread across multiple file systems, there might not be any way to obtain exactly-simultaneous frozen snapshots of all the volumes. For example, if your data files and WAL are on different disks, or if tablespaces are on different file systems, it might not be possible to use snapshot backup because the snapshots <emphasis>must</emphasis> be simultaneous. Read your file system documentation very carefully before trusting the consistent-snapshot technique in such situations.
</para>

<para>
 If simultaneous snapshots are not possible, one option is to shut down the database server long enough to establish all the frozen snapshots. Another option is to perform a continuous archiving base backup (<xref linkend="backup-base-backup"/>) because such backups are immune to file system changes during the backup.
 This requires enabling continuous archiving only for the duration of the backup; restore is done using continuous archive recovery (<xref linkend="backup-pitr-recovery"/>).
</para>

<para>
 Another option is to use <application>rsync</application> to perform a file system backup. This is done by first running <application>rsync</application> while the database server is running, then shutting down the database server long enough to do an <command>rsync --checksum</command>. (<option>--checksum</option> is necessary because <command>rsync</command> only has a file modification-time granularity of one second.) The second <application>rsync</application> will be quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime.
</para>

<para>
 Note that a file system backup will typically be larger than an SQL dump. (For example, <application>pg_dump</application> does not need to dump the contents of indexes, just the commands to recreate them.) However, taking a file system backup might be faster.
</para>
</sect1>

<sect1 id="continuous-archiving">
 <title>Continuous Archiving and Point-in-Time Recovery (PITR)</title>

 <indexterm zone="backup">
  <primary>continuous archiving</primary>
 </indexterm>

 <indexterm zone="backup">
  <primary>point-in-time recovery</primary>
 </indexterm>

 <indexterm zone="backup">
  <primary>PITR</primary>
 </indexterm>

 <para>
  At all times, <productname>PostgreSQL</productname> maintains a <firstterm>write-ahead log</firstterm> (WAL) in the <filename>pg_wal/</filename> subdirectory of the cluster's data directory. The log records every change made to the database's data files. This log exists primarily for crash-safety purposes: if the system crashes, the database can be restored to consistency

This section covers file system backup approaches for PostgreSQL: taking a consistent snapshot of the data directory, shutting down the database server when simultaneous snapshots are not possible, and using rsync with a brief shutdown. It notes the limitations and considerations of each method, then introduces continuous archiving and point-in-time recovery (PITR), which build on the write-ahead log (WAL).
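
The two-pass rsync procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a recommended production script: the function name, the paths in the usage note, the `-a` and `--delete` options, and the use of `pg_ctl stop -m fast` to stop the server are choices made for this example and do not come from the text. Treating rsync exit code 24 (source files vanished during transfer) as acceptable for the first pass is likewise an assumption, based on the server still changing files at that point.

```python
import subprocess
from typing import Callable, List

Command = List[str]


def _run(cmd: Command) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(cmd).returncode


def two_pass_rsync_backup(
    data_dir: str,
    backup_dir: str,
    run: Callable[[Command], int] = _run,
) -> None:
    """Copy data_dir to backup_dir with rsync, stopping the server only
    for the second, short pass. `run` is injectable so the sequence can
    be inspected without touching a real server."""
    # Trailing slash: copy the directory's contents, not the directory itself.
    src = data_dir.rstrip("/") + "/"

    # Pass 1: server still running. The copy is inconsistent, but it moves
    # the bulk of the data. Exit code 24 means some source files vanished
    # mid-transfer, which is tolerated here (assumption).
    rc = run(["rsync", "-a", "--delete", src, backup_dir])
    if rc not in (0, 24):
        raise RuntimeError(f"first rsync pass failed with exit code {rc}")

    # Stop the server so the second pass sees files that no longer change.
    rc = run(["pg_ctl", "stop", "-D", data_dir, "-m", "fast"])
    if rc != 0:
        raise RuntimeError(f"pg_ctl stop failed with exit code {rc}")

    try:
        # Pass 2: --checksum compares file contents instead of relying on
        # modification times, whose granularity is one second.
        rc = run(["rsync", "-a", "--delete", "--checksum", src, backup_dir])
        if rc != 0:
            raise RuntimeError(f"second rsync pass failed with exit code {rc}")
    finally:
        # Restart the server even if the second pass failed.
        run(["pg_ctl", "start", "-D", data_dir])
```

A call such as `two_pass_rsync_backup("/var/lib/postgresql/data", "/mnt/backup/pgdata")` (both paths hypothetical) would run the four commands in order. The `try`/`finally` is a design choice to keep downtime bounded: a failed second pass leaves the backup unusable, but it should not leave the server stopped.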