<indexterm>
<primary>continuous archiving</primary>
<secondary>in standby</secondary>
</indexterm>
<para>
When continuous WAL archiving is used in a standby, there are two
different scenarios: the WAL archive can be shared between the primary
and the standby, or the standby can have its own WAL archive. When
the standby has its own WAL archive, set <varname>archive_mode</varname>
to <literal>always</literal>, and the standby will call the archive
command for every WAL segment it receives, whether by restoring it
from the archive or by streaming replication. The shared archive can
be handled similarly, but the <varname>archive_command</varname> or
<varname>archive_library</varname> must test whether the file being
archived already exists and whether the existing file has identical
contents. This requires more care in the
<varname>archive_command</varname> or <varname>archive_library</varname>:
it must never overwrite an existing file with different contents, but
must return success if exactly the same file is archived twice. All
of this must be done free of race conditions, in case two servers
attempt to archive the same file at the same time.
</para>
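<para>
For example, a shared archive could be managed with a wrapper script
along the following lines. This is only a sketch: the script name, the
archive directory, and the reliance on <command>cmp</command> and hard
links are illustrative assumptions, not a supported interface, and a
production script would also need to make the archived file durable
(for example with <command>sync</command>).
<programlisting>
#!/bin/sh
# Hypothetical shared-archive script; invoked as: archive_wal.sh %p %f
ARCHIVE_DIR=/mnt/server/archivedir        # illustrative path
SRC="$1"                                  # %p: path of the WAL file to archive
DST="$ARCHIVE_DIR/$2"                     # %f: file name only

# If a file with this name is already archived, succeed only when the
# contents match; never overwrite an existing file with different contents.
if [ -f "$DST" ]; then
    exec cmp -s "$SRC" "$DST"
fi

# Copy to a unique temporary name, then hard-link into place.  ln fails
# if the destination appears in the meantime, so two servers archiving
# the same segment concurrently cannot clobber each other.
TMP="$DST.tmp.$(hostname).$$"
cp "$SRC" "$TMP" || { rm -f "$TMP"; exit 1; }
if ln "$TMP" "$DST" 2>/dev/null; then
    rm -f "$TMP"
    exit 0
fi
rm -f "$TMP"
# Lost the race: another server archived the file first.  Accept that
# if, and only if, the contents are identical.
exec cmp -s "$SRC" "$DST"
</programlisting>
Both servers would then use something like
<literal>archive_command = '/usr/local/bin/archive_wal.sh %p %f'</literal>.
</para>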
<para>
If <varname>archive_mode</varname> is set to <literal>on</literal>, the
archiver is not enabled during recovery or standby mode. If the standby
server is promoted, it will start archiving after the promotion, but
will not archive any WAL or timeline history files that
it did not generate itself. To get a complete
series of WAL files in the archive, you must ensure that all WAL is
archived before it reaches the standby. This is inherently true with
file-based log shipping, as the standby can only restore files that
are found in the archive, but it is not true if streaming replication
is enabled.
When a server is not in recovery mode, there is no difference between
<literal>on</literal> and <literal>always</literal> modes.
</para>
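<para>
For example, a standby that maintains its own archive might have the
following in <filename>postgresql.conf</filename> (the command shown
is the illustrative script sketched above):
<programlisting>
archive_mode = always
archive_command = '/usr/local/bin/archive_wal.sh %p %f'
</programlisting>
With <literal>archive_mode = on</literal> instead, the same
<varname>archive_command</varname> would only take effect after the
standby is promoted.
</para>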
</sect2>
</sect1>
<sect1 id="warm-standby-failover">
<title>Failover</title>
<para>
If the primary server fails then the standby server should begin
failover procedures.
</para>
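<para>
For example, once the failure of the primary has been confirmed, the
standby can typically be promoted with (the data directory path is
illustrative):
<programlisting>
$ pg_ctl promote -D /usr/local/pgsql/standby
</programlisting>
or by calling <function>pg_promote()</function> from a connection to
the standby.
</para>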
<para>
If the standby server fails then no failover need take place. If the
standby server can be restarted, even some time later, then the recovery
process can also be restarted immediately, taking advantage of
restartable recovery. If the standby server cannot be restarted, then a
full new standby server instance should be created.
</para>
<para>
If the primary server fails and the standby server becomes the
new primary, and then the old primary restarts, you must have
a mechanism for informing the old primary that it is no longer the primary. This is
sometimes known as <acronym>STONITH</acronym> (Shoot The Other Node In The Head), which is
necessary to avoid situations where both systems think they are the
primary, which will lead to confusion and ultimately data loss.
</para>
<para>
Many failover systems use just two systems, the primary and the standby,
connected by some kind of heartbeat mechanism to continually verify the
connectivity between the two and the viability of the primary. It is
also possible to use a third system (called a witness server) to prevent
some cases of inappropriate failover, but the additional complexity
might not be worthwhile unless it is set up with sufficient care and
rigorous testing.
</para>
<para>
<productname>PostgreSQL</productname> does not provide the system
software required to identify a failure on the primary and notify
the standby database server. Many such tools exist and are well
integrated with the operating system facilities required for
successful failover, such as IP address migration.
</para>
<para>
Once failover to the standby