How to do it…

The repmgr utility provides a set of single command-line actions that perform all the required activities on one node:

  1. To start a new cluster with repmgr, with the current node as its primary, use the following command:
repmgr primary register
  2. To add an existing standby to the cluster with repmgr, use the following command:
repmgr standby register
  3. Use the following command to request repmgr to create a new standby for you by copying node1. This will fail if you specify an existing data directory:
repmgr standby clone node1 -D /path/of_new_data_directory
  4. To reuse a former primary as a standby, use the rejoin command:
repmgr node rejoin -d 'host=node2 user=repmgr'
  5. To switch from one primary to another, run this command on the standby that you want to become the primary:
repmgr standby switchover
  6. To promote a standby to be the new primary, use the following command:
repmgr standby promote
  7. To request a standby to follow a new primary, use the following command:
repmgr standby follow
  8. Check the status of each registered node in the cluster, like this:
repmgr cluster show
  9. Request cleanup of monitoring data, as follows. This is relevant only if monitoring history is being recorded (monitoring_history=yes in repmgr.conf):
repmgr cluster cleanup
  10. Create a witness server for use with auto-failover voting, like this:
repmgr witness create
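
As a rough illustration of how the promote, follow, and cluster show actions fit together during a manual failover, the following sketch prints the commands a node would run for its role. The run and manual_failover helpers and the DRY_RUN switch are hypothetical names introduced here, not part of repmgr. The configuration path is the one used for repmgrd in this recipe, passed with repmgr's -f option:

```shell
#!/bin/sh
# Sketch only: print (or execute) the repmgr actions for a manual failover.
REPMGR_CONF=/var/lib/pgsql/repmgr/repmgr.conf
DRY_RUN=1   # hypothetical switch: 1 prints the commands, 0 executes them

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: choose the action for this node's role.
manual_failover() {
    case $1 in
        new-primary) run repmgr -f "$REPMGR_CONF" standby promote ;;
        standby)     run repmgr -f "$REPMGR_CONF" standby follow ;;
        *) echo "usage: manual_failover new-primary|standby" >&2; return 2 ;;
    esac
    # Either way, finish by checking how the cluster looks from this node.
    run repmgr -f "$REPMGR_CONF" cluster show
}

manual_failover new-primary
```

With DRY_RUN=1 nothing is executed, so the sequence can be reviewed before it is run on a real node.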

The preceding commands are presented in a simplified form. Each command also accepts further options; for example, -f (--config-file) specifies the path to the repmgr.conf file.

For each node, create a repmgr.conf file containing at least the following parameters. Note that the node_id and node_name parameters need to be different on each node:

node_id=2
node_name=beta
conninfo='host=beta user=repmgr'
data_directory=/var/lib/pgsql/10/data
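
Because only node_id and node_name (and the host in conninfo) change from node to node, the files can be generated from a small template. The following is a minimal sketch: write_repmgr_conf is a hypothetical helper, alpha is an invented name for a second node, and the host name is assumed to match the node name, as in the example above:

```shell
#!/bin/sh
# Hypothetical helper: write_repmgr_conf ID NAME DATADIR OUTFILE
# Assumes the node's host name is the same as its node_name.
write_repmgr_conf() {
    cat > "$4" <<EOF
node_id=$1
node_name=$2
conninfo='host=$2 user=repmgr'
data_directory=$3
EOF
}

# Write the files to a scratch directory so they can be reviewed first.
conf_dir=$(mktemp -d)
write_repmgr_conf 1 alpha /var/lib/pgsql/10/data "$conf_dir/alpha.conf"
write_repmgr_conf 2 beta  /var/lib/pgsql/10/data "$conf_dir/beta.conf"

cat "$conf_dir/beta.conf"
```

Each generated file would then be copied to its own node before the node is registered.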

Once all the nodes are registered, you can start the repmgr daemon on each node, like this:

repmgrd -d -f /var/lib/pgsql/repmgr/repmgr.conf &

If you would like the daemon to generate monitoring information for that node, you should set monitoring_history=yes in the repmgr.conf file.
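
One way to apply that setting without ending up with duplicate lines is sketched below. The enable_monitoring_history function is a hypothetical helper, not a repmgr command, and the demonstration works on a scratch file rather than a real repmgr.conf:

```shell
#!/bin/sh
# Hypothetical helper: make sure FILE contains exactly one
# monitoring_history line, set to yes.
enable_monitoring_history() {
    target=$1
    tmp=$(mktemp)
    # Drop any existing monitoring_history line (grep exits with 1 when
    # nothing is left to print, hence "|| true"), then append the setting.
    grep -v '^[[:space:]]*monitoring_history[[:space:]]*=' "$target" > "$tmp" || true
    echo 'monitoring_history=yes' >> "$tmp"
    # Copy the contents back so that the original file keeps its permissions.
    cat "$tmp" > "$target"
    rm -f "$tmp"
}

# Demonstration on a scratch file.
demo_conf=$(mktemp)
printf 'node_id=2\nnode_name=beta\n' > "$demo_conf"
enable_monitoring_history "$demo_conf"
enable_monitoring_history "$demo_conf"   # a second call adds nothing
cat "$demo_conf"
```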

Monitoring data can be read from the repmgr.replication_status view, like this:

repmgr=# select * from repmgr.replication_status;
-[ RECORD 1 ]-------------+------------------------------
primary_node_id           | 1
standby_node_id           | 2
standby_name              | node2
node_type                 | standby
active                    | t
last_monitor_time         | 2017-08-24 16:28:41.260478+09
last_wal_primary_location | 0/6D57A00
last_wal_standby_location | 0/5000000
replication_lag           | 29 MB
replication_time_lag      | 00:00:11.736163
apply_lag                 | 15 MB
communication_time_lag    | 00:00:01.365643
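
In this sample, the reported replication_lag matches the byte distance between last_wal_primary_location and last_wal_standby_location. The following sketch reproduces that arithmetic. It assumes the usual PostgreSQL LSN notation of two hexadecimal numbers, HIGH/LOW, where the high part counts units of 2^32 bytes. The lsn_to_bytes function is a hypothetical helper, not part of repmgr:

```shell
#!/bin/sh
# Hypothetical helper: convert an LSN such as 0/6D57A00 to a byte position.
lsn_to_bytes() {
    hi=${1%%/*}   # hexadecimal digits before the slash
    lo=${1##*/}   # hexadecimal digits after the slash
    echo $(( (0x$hi << 32) + 0x$lo ))
}

primary=$(lsn_to_bytes 0/6D57A00)
standby=$(lsn_to_bytes 0/5000000)
lag=$(( primary - standby ))

echo "$lag bytes"                   # prints: 30767616 bytes
echo "$(( lag / 1024 / 1024 )) MB"  # prints: 29 MB
```

The integer division truncates about 29.3 MB to 29 MB, which agrees with the replication_lag value shown in the sample record.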