



1st chunk of `doc/src/sgml/ref/pg_rewind.sgml`
<!--
doc/src/sgml/ref/pg_rewind.sgml
PostgreSQL documentation
-->

<refentry id="app-pgrewind">
 <indexterm zone="app-pgrewind">
  <primary>pg_rewind</primary>
 </indexterm>

 <refmeta>
  <refentrytitle><application>pg_rewind</application></refentrytitle>
  <manvolnum>1</manvolnum>
  <refmiscinfo>Application</refmiscinfo>
 </refmeta>

 <refnamediv>
  <refname>pg_rewind</refname>
  <refpurpose>synchronize a <productname>PostgreSQL</productname> data directory with another data directory that was forked from it</refpurpose>
 </refnamediv>

 <refsynopsisdiv>
  <cmdsynopsis>
   <command>pg_rewind</command>
   <arg rep="repeat"><replaceable>option</replaceable></arg>
   <group choice="plain">
    <group choice="req">
     <arg choice="plain"><option>-D</option></arg>
     <arg choice="plain"><option>--target-pgdata</option></arg>
    </group>
    <replaceable> directory</replaceable>
    <group choice="req">
     <arg choice="plain"><option>--source-pgdata=<replaceable>directory</replaceable></option></arg>
     <arg choice="plain"><option>--source-server=<replaceable>connstr</replaceable></option></arg>
    </group>
   </group>
  </cmdsynopsis>
 </refsynopsisdiv>

 <refsect1>
  <title>Description</title>

  <para>
   <application>pg_rewind</application> is a tool for synchronizing a <productname>PostgreSQL</productname> cluster
   with another copy of the same cluster, after the clusters' timelines have
   diverged. A typical scenario is to bring an old primary server back online
   after failover as a standby that follows the new primary.
  </para>
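
  <para>
   As a minimal sketch of that scenario, the old primary could be rewound
   against the new primary as follows. The data directory path, host name,
   and user in the connection string are placeholders, the target server is
   assumed to be shut down, and <option>--write-recovery-conf</option>
   assumes a release that provides that option:
<programlisting>
pg_rewind --target-pgdata=/path/to/old-primary/data \
          --source-server='host=new-primary.example.com port=5432 user=postgres dbname=postgres' \
          --write-recovery-conf
</programlisting>
   With <option>--write-recovery-conf</option>,
   <application>pg_rewind</application> creates
   <filename>standby.signal</filename> and appends connection settings to
   <filename>postgresql.auto.conf</filename> in the target directory, so
   the rewound server starts as a standby of the source.
  </para>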

  <para>
   After a successful rewind, the state of the target data directory is
   analogous to a base backup of the source data directory. Unlike taking
   a new base backup or using a tool like <application>rsync</application>,
   <application>pg_rewind</application> does not require comparing or copying
   unchanged relation blocks in the cluster. Only changed blocks from existing
   relation files are copied; all other files, including new relation files,
   configuration files, and WAL segments, are copied in full. As such, the
   rewind operation is significantly faster than other approaches when the
   database is large and only a small fraction of blocks differ between the
   clusters.
  </para>

  <para>
   <application>pg_rewind</application> examines the timeline histories of the source
   and target clusters to determine the point where they diverged, and
   expects to find WAL in the target cluster's <filename>pg_wal</filename> directory
   reaching all the way back to the point of divergence. The point of divergence
   can be found on the target timeline, the source timeline, or their common
   ancestor. In the typical failover scenario where the target cluster was
   shut down soon after the divergence, this is not a problem, but if the
   target cluster ran for a long time after the divergence, its old WAL
   files might no longer be present. In this case, you can manually copy them
   from the WAL archive to the <filename>pg_wal</filename> directory, or run
   <application>pg_rewind</application> with the <literal>-c</literal> option to
   automatically retrieve them from the WAL archive. The use of
   <application>pg_rewind</application> is not limited to failover: for example, a standby
   server can be promoted, run some write transactions, and then be rewound
   to become a standby again.
  </para>
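
  <para>
   The case of missing target WAL can be sketched as a trial run that
   fetches the old segments from the archive. The path and connection
   string below are placeholders, and the <option>-c</option> option relies
   on <varname>restore_command</varname> being set in the target cluster's
   configuration:
<programlisting>
pg_rewind --target-pgdata=/path/to/target/data \
          --source-server='host=source.example.com port=5432 user=postgres dbname=postgres' \
          -c --dry-run --progress
</programlisting>
   With <option>--dry-run</option>, <application>pg_rewind</application>
   does everything except modify the target directory, which makes it
   possible to check that WAL back to the point of divergence can be
   found before repeating the command without that option.
  </para>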

  <para>
   After running <application>pg_rewind</application>, WAL replay needs to
   complete for the data directory to be in a consistent state. When the
   target server is started again, it will enter archive recovery and replay
   all WAL generated in the source server from the last checkpoint before
   the point of divergence. If some of the WAL was no longer available in the
   source server when <application>pg_rewind</application> was run, and
   therefore could not be copied by the <application>pg_rewind</application>
   session, it must be made available when the target server is started.

Title: pg_rewind: Synchronize PostgreSQL Data Directories
Summary
pg_rewind synchronizes a PostgreSQL data directory with another that was forked from it, typically to bring an old primary server back online as a standby after a failover. It copies only the changed blocks of existing relation files and copies all other files in full, which makes it faster than a new base backup or rsync when the database is large and few blocks differ. pg_rewind examines the timeline histories of both clusters and expects the target cluster's pg_wal directory to contain WAL reaching back to the point of divergence. After running pg_rewind, WAL replay must complete before the data directory is in a consistent state.