Logical Replication Architecture: Processes, Roles, Triggers, and Initial Snapshot

box) that do not have a default operator class for B-tree or Hash. However, this limitation can be overcome by ensuring that the table has a primary key or replica identity defined for it.
</para>
</listitem>
</itemizedlist>
</sect1>

<sect1 id="logical-replication-architecture">
<title>Architecture</title>

<para>
Logical replication is built with an architecture similar to physical streaming replication (see <xref linkend="streaming-replication"/>). It is implemented by <literal>walsender</literal> and <literal>apply</literal> processes. The walsender process starts logical decoding (described in <xref linkend="logicaldecoding"/>) of the WAL and loads the standard logical decoding output plugin (<literal>pgoutput</literal>). The plugin transforms the changes read from WAL to the logical replication protocol (see <xref linkend="protocol-logical-replication"/>) and filters the data according to the publication specification. The data is then continuously transferred using the streaming replication protocol to the apply worker, which maps the data to local tables and applies the individual changes as they are received, in correct transactional order.
</para>

<para>
The apply process on the subscriber database always runs with <link linkend="guc-session-replication-role"><varname>session_replication_role</varname></link> set to <literal>replica</literal>. This means that, by default, triggers and rules will not fire on a subscriber. Users can optionally choose to enable triggers and rules on a table using the <link linkend="sql-altertable"><command>ALTER TABLE</command></link> command and the <literal>ENABLE TRIGGER</literal> and <literal>ENABLE RULE</literal> clauses.
</para>

<para>
The logical replication apply process currently only fires row triggers, not statement triggers. The initial table synchronization, however, is implemented like a <command>COPY</command> command and thus fires both row and statement triggers for <command>INSERT</command>.
</para>

<sect2 id="logical-replication-snapshot">
<title>Initial Snapshot</title>

<para>
The initial data in existing subscribed tables are snapshotted and copied in parallel instances of a special kind of apply process. These special apply processes are dedicated table synchronization workers, spawned for each table to be synchronized. Each table synchronization process will create its own replication slot and copy the existing data. As soon as the copy is finished the table contents will become visible to other backends. Once existing data is copied, the worker enters synchronization mode, which ensures that the table is brought up to a synchronized state with the main apply process by streaming any changes that happened during the initial data copy using standard logical replication. During this synchronization phase, the changes are applied and committed in the same order as they happened on the publisher. Once synchronization is done, control of the replication of the table is given back to the main apply process where replication continues as normal.
</para>

<note>
<para>
The publication <link linkend="sql-createpublication-params-with-publish"><literal>publish</literal></link> parameter only affects what DML operations will be replicated. The initial data synchronization does not take this parameter into account when copying the existing table data.
</para>
</note>

<note>
<para>
If a table synchronization worker fails during copy, the apply worker detects the failure and respawns the table synchronization worker to continue the synchronization process. This behaviour ensures that transient errors do not permanently disrupt the replication setup. See also <link linkend="guc-wal-retrieve-retry-interval"><varname>wal_retrieve_retry_interval</varname></link>.

Logical replication uses `walsender` and `apply` processes, much like physical streaming replication. The `walsender` process starts logical decoding of the WAL and loads the standard output plugin (`pgoutput`), which transforms the changes into the logical replication protocol and filters them according to the publication specification. The data is then streamed to the `apply` worker, which maps it to local tables and applies the changes in transactional order. The apply process always runs with `session_replication_role` set to `replica`, so triggers and rules do not fire on the subscriber unless they are explicitly enabled with the `ENABLE TRIGGER` and `ENABLE RULE` clauses of `ALTER TABLE`. Even then, the apply process fires only row triggers, whereas the initial table copy fires both row and statement triggers for `INSERT`. Initial table data is snapshotted and copied by parallel table synchronization workers, one per table, each with its own replication slot. After copying the existing data, each worker enters synchronization mode to catch up with changes made during the copy, then hands the table back to the main apply process. The `publish` parameter affects only which DML operations are replicated, not the initial data synchronization.
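
As a rough illustration of the trigger behavior and the initial synchronization described above, the following statements could be run on a subscriber. This is a sketch under assumptions rather than a recommended setup: the table `orders`, the trigger `orders_audit`, and the rule `orders_log_rule` are hypothetical names that do not appear in the text above. The sketch uses the `REPLICA` and `ALWAYS` forms of the `ENABLE TRIGGER` and `ENABLE RULE` clauses, because a trigger in its default enabled state does not fire while `session_replication_role` is `replica`.

```sql
-- Run on the subscriber. All object names below are hypothetical.

-- Fire the trigger only when session_replication_role = 'replica',
-- that is, for changes applied by the logical replication apply worker.
-- The apply worker fires row triggers only, so orders_audit is assumed
-- to be a FOR EACH ROW trigger.
ALTER TABLE orders ENABLE REPLICA TRIGGER orders_audit;

-- Alternatively, fire it regardless of session_replication_role,
-- so it runs for local writes and for replicated changes alike.
ALTER TABLE orders ENABLE ALWAYS TRIGGER orders_audit;

-- Rules accept the same REPLICA / ALWAYS options.
ALTER TABLE orders ENABLE ALWAYS RULE orders_log_rule;

-- Inspect the per-table state of the initial synchronization:
--   i = initialize, d = data is being copied, f = finished table copy,
--   s = synchronized, r = ready (normal replication)
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel;
```

Choosing `REPLICA` keeps the trigger silent for ordinary local sessions, while `ALWAYS` suits triggers that must run for every change regardless of its source. A table stays under the control of its table synchronization worker until its `srsubstate` reaches `r`.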