When clients have high availability requirements for their Plone site, we recommend using the RelStorage implementation for the ZODB, combined with the PostgreSQL database platform. When we deploy this solution, we use ZFS on FreeBSD. Until recenty, this solution relied on rsync
to synchronize the PostgreSQL data directory from the primary server to the secondary server. The problem with this approach is that rsync
has to scan the entire data directory to find changed files. We decided to take advantage of the ZFS snapshot replication features to make this step more efficient.
Since ZFS snapshots have the deltas we need regarding changes to the PostgreSQL data directory, we wrote a script that will:
recovery.conf
/ postgresql.conf
filesThere are moving parts to this solution, so you'll need to do some leg work before being able to use the script.
In order to take a ZFS snapshot on the remote primary server and initiate replication back to the secondary, you need to have an operating system user setup with a password-less SSH key (on the primary database server):
$ sudo pw groupadd -n zfssync -g 6000
$ sudo pw useradd -n zfssync -u 6000 -g 6000 -m
$ sudo -H -u zfssync ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/zfssync/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/zfssync/.ssh/id_rsa.
Your public key has been saved in /home/zfssync/.ssh/id_rsa.pub.
The key fingerprint is:
48:da:0d:68:2e:66:96:ee:d8:ba:fc:6d:a3:b6:dd:8d davidb@dhmo.local
The key's randomart image is:
+--[ RSA 2048]----+
| |
| . |
| o o |
| + + + |
| `* o o S |
| = . |
| . |
|.+ .oo. o |
|++=++o.E . |
+-----------------+
# Copy /home/zfssync/.ssh to your secondary database server
$ sudo zfs allow -u zfssync create,mount,snapshot,send,receive,hold data/pgsql
The calls to pg_start_backup
and pg_stop_backup
require elevated privileges in PostgreSQL. Here's how to set up a PostgreSQL database user with the replication
role:
$ psql -U pgsql postgres
psql (9.3.2)
Type "help" for help.
postgres=# create user replicator with replication;
You can use md5
for the METHOD
, but you must then setup the .pgpass file for your operating system user.
In order to take advantage of efficient incremental replication, the secondary database server must first transfer an initial snapshot of the ZFS filesystem holding the current PostgreSQL data directory (this builds on the first requisite above, to be run on the secondary server):
$ sudo ssh -i /home/zfssync/.ssh/id_rsa zfssync@<IP of primary> zfs snapshot data/pgsql@init-secondary
$ sudo ssh -i /home/zfssync/.ssh/id_rsa zfssync@<IP of primary> zfs send -Rv data@init-secondary | sudo zfs recv -Fv data/pgsql
The reset_secondary.sh
script (and ancillary config file info) is on Github: https://gist.github.com/davidblewett/8282108 . It's usage is pretty simple (on secondary):
$ sudo /path/to/reset_secondary.sh <IP of primary>
After the script has run, the secondary server will be running off of an up-to-date ZFS snapshot. If you use the config file info in the gist, it will then continue to use PostgreSQL's built-in streaming replication to keep itself up to date.
It is possible to build on this setup and add automatic failover and promotion of the secondary to primary by using FreeBSD's CARP and ifstated mechanisms.