What does it mean to "pack" a database? In shortest form, it means to remove unused records in order to reduce the memory required to hold the database.
Think of it like this: each time you delete a page in your website, a copy of that page is kept in the database. These copies are kept in case you need to get it back later. Over time, as you create and delete content and make other changes in your website, the number of unused objects can grow and grow, expanding the memory needed to operate the website, and slowing down operations. When you pack the database, the system finds the unused objects and removes them. Performing this job regularly helps to keep the size of a database manageable.
What is the benefit of scheduling routine database packs? If the database is not being packed on a regular basis, the amount of resources and time to perform a database pack may increase exponentially. Making sure that the packs are happening on a scheduled basis will save time in the long run. These are tasks that can easily be added into your buildout and then released into a production environment.
As an example, we recently had a Plone site that was using RelStorage backed with MySQL that had not been packed in over a year. The size of the MySQL dump grew to nearly 8GB and took 45 minutes to complete. This means that any time a release was made, there would be 45 minutes of lead time just to make a backup of the data.
Having this large database does not only impact the ability to swiftly do a release. If a new bug is discovered that can only be re-produced on the production environment, you certainly don't want to be debugging it there. Instead, you will want to be able to quickly bring back the data to your testing instance, hence the benefit of keeping the size of the database to a manageable size.
When using a ZEO server, either from plone.recipe.zeoserver or plone.recipe.zope2zeoserver, a zeopack
script is automatically generated for you with all the necessary parameters to run the pack. This can simply be run from the command line:
$ cd path/to/buildout
$ bin/zeopack
Now your database is packed, that was easy! As programmers and systems administrators, we know that doing things manually will eventually fail. How do we automate this process? Simple, we add a cron job to take care of the work for us. Here is an example configuration that gets everything set up for us:
[buildout]
parts = instance zeoserver cron-pack
extends = http://dist.plone.org/release/4.1.3/versions.cfg
[instance]
recipe = plone.recipe.zope2instance
user = admin:admin
http-address = 8080
zeo-address = 8100
zeo-client = True
# Set the shared blob option so blobs work with zeoserver
shared-blob = on
blob-storage = ${buildout:directory}/var/blobstorage
eggs =
Pillow
Plone
[zeoserver]
recipe = plone.recipe.zeoserver
blob-storage = ${instance:blob-storage}
zeo-address = ${instance:zeo-address}
# keep history for 7 days instead of the default 1
pack-days = 7
[cron-pack]
recipe = z3c.recipe.usercrontab
times = @weekly
command = ${buildout:bin-directory}/zeopack
After running this buildout, there will be a cron job added to the cron of the user that ran the buildout. This job will run weekly and pack the database, leaving 7 days of history.
Now for a more complex scenario using RelStorage. The zeopack
script will no longer work, but fear not, RelStorage comes with a script called zodbpack
that can do the job.
First some setup to make the generation of the conf file work:
$ cd path/to/buidout
$ mkdir templates
$ mkdir etc
$ touch templates/zodbpack.conf.tmpl
This sets up a directory for our zodbpack.conf
template, and an output directory named etc
. In the zodbpack.conf.tmpl
, place the following code:
<relstorage>
<mysql>
host ${settings:mysql-host}
db ${settings:mysql-db}
user ${settings:mysql-user}
passwd ${settings:mysql-passwd}
</mysql>
</relstorage>
Now here is our example buildout configuration that sets up Plone to use RelStorage with MySQL:
[buildout]
parts = instance relstorage zodbpack-conf cron-pack
extends = http://dist.plone.org/release/4.1.3/versions.cfg
[settings]
# settings for RelStorage connection
mysql-host = localhost
mysql-db = mydbname
mysql-user = root
mysql-passwd =
# options for the cron job
pack-days = 7
[instance]
recipe = plone.recipe.zope2instance
user = admin:admin
http-address = 8080
eggs =
Pillow
Plone
RelStorage
MySQL-python
rel-storage =
type mysql
host ${settings:mysql-host}
db ${settings:mysql-db}
user ${settings:mysql-user}
passwd ${settings:mysql-passwd}
blob-dir ${buildout:directory}/var/blobstorage
# Generate the RelStorage scripts
[relstorage]
recipe = zc.recipe.egg
eggs =
${instance:eggs}
RelStorage
MySQL-python
[zodbpack-conf]
recipe = collective.recipe.template
input = ${buildout:directory}/templates/zodbpack.conf.tmpl
output = ${buildout:directory}/etc/zodbpack.conf
[cron-pack]
recipe = z3c.recipe.usercrontab
times = @weekly
command = ${buildout:bin-directory}/zodbpack -d ${settings:pack-days} ${zodbpack-conf:output}
In this example, we are generating a conf file for the zodbpack
script to use with the [zodbpack-conf]
part. Since the RelStorage information will be used in two places, and might be different on each deployment platform, there is a [settings]
section that both the [instance]
and [zodbpack-conf]
parts take advantage of.
Once this buildout has run, it will be set up similar to our conventional buildout. The cron job will pack the RelStorage database once a week, keeping 7 days of history.
Packing your Plone site's database on a regular basis is as simple as adding some configuration to buildout. There is really no reason you shouldn't be keeping up on this maintenance task with your Plone sites, especially when you run complex websites or intranets. It will save you time during your development and release cycles by making sure the database is at its leanest.