Chapter 52. Multimaster

Table of Contents

52.1. Architecture
52.1.1. Replication
52.1.2. Failure Detection and Recovery
52.1.3. Multimaster Background Workers
52.2. Installation and Setup
52.2.1. Setting up a Multi-Master Cluster
52.2.2. Tuning Configuration Parameters
52.2.3. Setting Timeout for Failure Detection
52.2.4. 2+1 Mode: Setting up a Standalone Referee Node
52.3. Multi-Master Cluster Administration
52.3.1. Monitoring Cluster Status
52.3.2. Accessing Disabled Nodes
52.3.3. Adding New Nodes to the Cluster
52.3.4. Removing Nodes from the Cluster
52.4. Reference
52.4.1. Configuration Parameters
52.4.2. Functions
52.5. Limitations

multimaster is a PgES extension, accompanied by a set of core patches, that turns PostgreSQL into a synchronous shared-nothing cluster, providing Online Transaction Processing (OLTP) scalability for read transactions and high availability with automatic disaster recovery.
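The exact installation steps are described in Section 52.2; as a minimal sketch, assuming the standard PostgreSQL configuration machinery (the exact parameter set required by your multimaster version may differ), the extension is enabled on each node roughly as follows:

    -- Run on every node; a server restart is required for these settings to take effect.
    ALTER SYSTEM SET shared_preload_libraries = 'multimaster';  -- load the extension library
    ALTER SYSTEM SET wal_level = 'logical';                     -- replication is based on logical decoding

    -- After the restart, register the extension in the database that will be replicated.
    CREATE EXTENSION multimaster;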

As compared to a standard PostgreSQL master-standby cluster, a cluster configured with the multimaster extension offers the benefits described below.

Important

Before deploying multimaster on production systems, make sure to take its replication restrictions into account (see Section 52.5).

The multimaster extension replicates your database to all nodes of the cluster and allows write transactions on each node. Write transactions are synchronously replicated to all nodes, which increases commit latency. Read-only transactions and queries are executed locally, without any measurable overhead.
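As an illustration of this behavior, a minimal sketch assuming a cluster of two already configured nodes (node1 and node2 are placeholder names) and plain SQL only:

    -- On node1: the write transaction commits only after it is replicated to all nodes.
    CREATE TABLE accounts (id integer PRIMARY KEY, balance numeric);
    INSERT INTO accounts VALUES (1, 100);

    -- On node2: the committed row is visible, and the read-only query is executed
    -- locally, without contacting node1.
    SELECT balance FROM accounts WHERE id = 1;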

To ensure high availability and fault tolerance of the cluster, multimaster determines the outcome of each transaction through the Paxos consensus algorithm and uses a custom recovery protocol together with heartbeats to detect failures. A multi-master cluster of N nodes can continue working as long as the majority of its nodes are alive and reachable by the other nodes. To be configured with multimaster, the cluster must include at least two nodes. Since all cluster nodes hold the same data, you do not typically need more than five nodes; three nodes are enough to ensure high availability in most cases. There is also a special 2+1 (referee) mode, in which two nodes hold data and an additional node, called the referee, only participates in voting. Compared to a traditional three-node setup, this mode is cheaper (the referee's resource demands are low), but availability is reduced.
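The majority requirement translates into simple arithmetic: a quorum of floor(N/2) + 1 nodes, so a three-node cluster tolerates one failed node and a five-node cluster tolerates two. The following plain SQL query (not part of the multimaster API) just tabulates this:

    -- Quorum size and tolerated node failures for typical cluster sizes.
    SELECT n AS cluster_size,
           n / 2 + 1 AS majority,                 -- integer division: floor(n/2) + 1
           n - (n / 2 + 1) AS tolerated_failures
    FROM generate_series(2, 5) AS n;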

When a failed node reconnects to the cluster, multimaster automatically fast-forwards it to the current state based on the Write-Ahead Log (WAL) data retained in the corresponding replication slot. If the node was excluded from the cluster, you can add it back using pg_basebackup.
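WAL retention for a disconnected node can be observed with the standard pg_replication_slots view on any live node; this is ordinary PostgreSQL functionality rather than a multimaster-specific interface:

    -- Inactive slots (active = false) typically correspond to disconnected peers;
    -- restart_lsn shows how far back WAL is being retained for them.
    SELECT slot_name, slot_type, active, restart_lsn
    FROM pg_replication_slots;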