


How to Back Up a Heroku Postgres Database to an Encrypted AWS S3 Bucket

 

Heroku offers a robust backup system for its Postgres database add-on. Unfortunately, you can irreversibly lose all your data and backups just by typing a single command. It might seem improbable, but still, I would rather not bet my startup’s existence on a single faulty bash line. In this tutorial, I will describe how to set up your own redundant Heroku PostgreSQL backup system that stores snapshots in a secure AWS S3 bucket.

I will be covering various tools including AWS CLI, OpenSSL, GPG, Heroku buildpacks, and Heroku Scheduler, but you don’t need to be familiar with any of them. By following this guide, you will set up a reliable, custom backup system for your Heroku PostgreSQL database even if you don’t have much DevOps experience up your sleeve.

Let’s get started!

How to lose all the Heroku Postgres data

Though the first paragraph is a bit “clickbaity”, it’s actually true. Just by typing:

heroku apps:destroy app-name --confirm app-name

you can completely destroy the Heroku app, including all its add-ons, databases and backup data:


A single unfortunate command and all your team can go home


Read on if you’d like to safeguard yourself against a potential total loss of your startup’s data.

Set up an encrypted AWS S3 bucket

Our redundant backup system will periodically upload encrypted snapshots of the PostgreSQL database to a secure AWS S3 bucket.

Let’s start with adding a correctly configured S3 bucket. For a more in-depth tutorial on how to work with AWS S3 buckets, you can check out my other article.

Make sure to disable public access and enable encryption for your new S3 bucket:


S3 bucket public access blocked


S3 bucket AES-256 encryption enabled

Add an IAM user

You will need AWS credentials to upload the backup dump to the S3 bucket. One common mistake is to use your primary account credentials instead of creating an IAM user with limited permissions.

Check out the official docs for info on how to add an IAM user with the correct permission policies. The AmazonS3FullAccess policy is enough for the backup system to work, although it grants more than the script strictly needs.
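
If you would rather not grant full S3 access, a narrower policy is one option. The sketch below is my own assumption, not something taken from the official docs: it prints a policy limited to uploading, downloading and deleting objects in a single bucket, which is what the backup and restore steps in this tutorial do. The function name and the bucket name are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: prints an IAM policy document scoped to one bucket.
# Assumption: PutObject, GetObject and DeleteObject are enough for the
# upload, presigned download and cleanup steps used in this tutorial.
s3_backup_policy() {
  local bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}

# Example: print the policy so it can be pasted into the IAM console
s3_backup_policy "heroku-secondary-backups"
```

Test any custom policy against both the backup and the restore flow before relying on it.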

Make sure to copy both AWS Access Key ID and AWS Secret Access Key generated because we’ll need them later.

Let’s move on to writing the actual backup script:

Backup bash script

To avoid installing unnecessary heavy dependencies on Heroku dynos, we sign the S3 upload request manually. GPG is used to encrypt the database dump after downloading it with the Heroku CLI.

First, add a buildpack to enable Heroku CLI access from within the script:

# Create an API token and copy it from the command output
heroku authorizations:create
heroku config:set HEROKU_API_KEY=[TOKEN]
heroku buildpacks:add heroku-community/cli

For security reasons, you might want to consider forking the CLI buildpack repo and using the fork instead.

Now you need to set S3 credentials you generated when adding an IAM user and your bucket name:

heroku config:set S3_KEY=[Your AWS Access Key ID]
heroku config:set S3_SECRET=[Your AWS Secret Access Key]
heroku config:set S3_BUCKET_NAME=[Your S3 bucket name]

You also need to set a secure password that will be used to encrypt the database dump files before uploading them to S3. You can use OpenSSL for that:

heroku config:set PG_BACKUP_PASSWORD=$(openssl rand -base64 32)

Just make sure to save this password somewhere safe. Otherwise that destructive one-liner will also prevent you from decrypting the secondary backup.

You must also set your Heroku app name because it will be used by Heroku CLI to download the latest backup:

heroku config:set APP_NAME=app-name
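
Since the backup script depends on several config vars, a small pre-flight check can save some debugging time. This is a minimal sketch and not part of the original script: missing_vars is a hypothetical helper, and the variable list simply repeats the names used by the backup script in this tutorial.

```shell
#!/bin/bash
# Hypothetical pre-flight check: reports which of the given variables are
# unset or empty. Returns non-zero if at least one is missing.
missing_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion: the value of the variable
    # whose name is stored in $name, or empty if it is unset
    if [ -z "${!name:-}" ]; then
      echo "Missing config var: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Variable names used by the backup script in this tutorial
if missing_vars HEROKU_API_KEY S3_KEY S3_SECRET S3_BUCKET_NAME PG_BACKUP_PASSWORD APP_NAME; then
  echo "All config vars are set"
fi
```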

Now let’s see the actual script bin/pg_backup_to_s3:

#!/bin/bash

# Set the script to fail fast if there
# is an error or a missing variable

set -eu
set -o pipefail

# Download the latest backup from
# Heroku and gzip it

heroku pg:backups:download --output=/tmp/pg_backup.dump --app $APP_NAME
gzip /tmp/pg_backup.dump

# Encrypt the gzipped backup file
# using GPG passphrase

gpg --yes --batch --passphrase="$PG_BACKUP_PASSWORD" -c /tmp/pg_backup.dump.gz

# Remove the plaintext backup file

rm /tmp/pg_backup.dump.gz

# Generate backup filename based
# on the current date

BACKUP_FILE_NAME="heroku-backup-$(date '+%Y-%m-%d_%H.%M').gpg"

# Make sure to use the UTC
# date for S3 signature!

DATE=`date -R -u`

S3_PATH="/${S3_BUCKET_NAME}/${BACKUP_FILE_NAME}"

# Generate S3 signature needed
# to upload file to the bucket

S3_STRING="PUT\n\napplication/octet-stream\n${DATE}\n${S3_PATH}"
S3_SIGNATURE=$(echo -en "${S3_STRING}" | openssl sha1 -hmac "${S3_SECRET}" -binary | base64)

# Upload the file to S3 using
# the signature auth header

curl -X PUT -T "/tmp/pg_backup.dump.gz.gpg" \
  -H "Host: ${S3_BUCKET_NAME}.s3-eu-west-1.amazonaws.com" \
  -H "Date: ${DATE}" \
  -H "Content-Type: application/octet-stream" \
  -H "Authorization: AWS ${S3_KEY}:${S3_SIGNATURE}" \
  "https://${S3_BUCKET_NAME}.s3-eu-west-1.amazonaws.com/${BACKUP_FILE_NAME}"

# Remove the encrypted backup file

rm /tmp/pg_backup.dump.gz.gpg

Depending on your S3 bucket location, you might have to change s3-eu-west-1 to match another region.
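
The signing step is the trickiest part of the script, so here is the same computation as a standalone function you can experiment with. Treat it as a sketch: s3_put_signature is a hypothetical name, the secret and date are made-up values, and I use printf instead of echo -en because printf handles the \n escapes the same way in every shell.

```shell
#!/bin/bash
# Hypothetical helper mirroring the signing step of the script above.
# Arguments: secret key, Date header value, resource path (/bucket/key)
s3_put_signature() {
  local secret="$1" date_header="$2" resource="$3"
  # String to sign: verb, empty Content-MD5, content type, date, resource.
  # The result is an HMAC-SHA1 digest encoded as base64.
  printf 'PUT\n\napplication/octet-stream\n%s\n%s' "$date_header" "$resource" \
    | openssl sha1 -hmac "$secret" -binary \
    | base64
}

# Example with made-up values; a real call would pass "$S3_SECRET",
# "$(date -R -u)" and "/${S3_BUCKET_NAME}/${BACKUP_FILE_NAME}"
s3_put_signature "example-secret" \
  "Tue, 25 Jun 2019 01:30:00 +0000" \
  "/heroku-secondary-backups/heroku-backup-2019-06-25_01.30.gpg"
```

The date and path passed here must be exactly the values sent in the Date header and the request URL, otherwise S3 will reject the upload.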

Now make the file executable by typing:

chmod +x bin/pg_backup_to_s3

Before you test it, make sure you have an up-to-date backup by running:

heroku pg:backups:capture

You can try to run the script locally if you set all the correct shell variables. Just be careful if you are using zsh as your shell. I’ve had issues with generating the correct signature when working on zsh and had to switch to bash to make it work.

Now commit the changes to your repo and deploy them to Heroku. You can now test your script in Heroku environment by typing:

heroku run ./bin/pg_backup_to_s3

After the script execution you should see your encrypted backup file on the S3 bucket!


Troubleshooting

The script is a bit complex so if you fail to upload the backup file to S3 you can try the following:

  • change set -eu to set -eux for more verbose script output
  • if the S3 upload request returns a SignatureDoesNotMatch error, make sure that the signature string is the same as the one expected by AWS, and run the script with bash, not zsh
  • make sure that the date you are sending to S3 is in the UTC timezone
  • leave a comment, took me a while to make it work, happy to help here ;)
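
For the timezone point, a quick sanity check is to look at the offset at the end of the date string. A minimal sketch, assuming the format produced by date -R -u; is_utc_date_header is a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical check: succeeds only if the date string carries a
# +0000 (UTC) offset, as produced by `date -R -u`
is_utc_date_header() {
  case "$1" in
    *" +0000") return 0 ;;
    *) return 1 ;;
  esac
}

is_utc_date_header "Tue, 25 Jun 2019 01:30:00 +0000" && echo "UTC"
is_utc_date_header "Tue, 25 Jun 2019 03:30:00 +0200" || echo "not UTC"
```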

Use Heroku scheduler for automatic script execution

You can now run the backup script manually, so let’s make it automatic. Heroku Scheduler is a cron-like tool for running Heroku jobs at predefined intervals.

You can add it by typing:

heroku addons:create scheduler:standard
heroku addons:open scheduler

and configure it to run the backup script, ./bin/pg_backup_to_s3, once a day:

[Screenshot: Heroku scheduler config UI]

To make sure that the newest database dump will be stored daily you can schedule Heroku backup to take place just before the scheduler script execution:

heroku pg:backups:schedule DATABASE_URL --at '01:00'

That’s it. Now your Heroku PostgreSQL will be backed up daily to your own secure S3 bucket. You might also want to consider adding a bucket lifecycle rule to remove the older files and optimize storage costs.
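
A lifecycle rule can be added in the S3 console, or with the AWS CLI if you have it installed. The sketch below only generates the rule as JSON. The 30-day retention, the rule ID and the helper name are my assumptions, so adjust them to your needs. The prefix matches the file names produced by the backup script.

```shell
#!/bin/bash
# Hypothetical helper: prints a lifecycle configuration that expires
# objects with the given key prefix after a number of days
backup_lifecycle_config() {
  local prefix="$1" days="$2"
  cat <<EOF
{
  "Rules": [
    {
      "ID": "expire-old-backups",
      "Status": "Enabled",
      "Filter": { "Prefix": "${prefix}" },
      "Expiration": { "Days": ${days} }
    }
  ]
}
EOF
}

backup_lifecycle_config "heroku-backup-" 30 > /tmp/lifecycle.json

# Applying it requires the AWS CLI and permission to change the bucket
# configuration, which the backup IAM user may not have:
# aws s3api put-bucket-lifecycle-configuration \
#   --bucket heroku-secondary-backups \
#   --lifecycle-configuration file:///tmp/lifecycle.json
```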

How to restore Heroku S3 PostgreSQL backup

It is a good practice to double check that your backups actually work and can be restored in case they are needed. Let me show you how to do it.

AWS CLI configuration

AWS CLI will be needed to restore the backup. You can install it locally by following this tutorial.

Now authenticate the AWS CLI by running:

aws configure

and inputting your IAM user AWS Access Key ID and AWS Secret Access Key. You can just press ENTER when asked to provide Default region name and Default output format.

Once it’s up and running, you can generate a short-lived download URL for your encrypted backup file. Let’s assume that its S3 path is s3://heroku-secondary-backups/heroku-backup-2019-06-25_01.30.gpg. You can download it with the following command:

wget $(aws s3 presign s3://heroku-secondary-backups/heroku-backup-2019-06-25_01.30.gpg --expires-in 5) -O backup.gpg

Once you have it on your local disk, you can decrypt it by running:

gpg --batch --yes --passphrase="$PG_BACKUP_PASSWORD" -d backup.gpg | gunzip --to-stdout > backup.sql

Make sure that your dump file is in the pg_dump custom (binary) format and starts with the "PGDMP" string. Despite the .sql extension used here, it is not a plain-text SQL file.
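
Instead of opening the file in an editor, you can check the first five bytes from the shell. A minimal sketch, where is_pg_custom_dump is a hypothetical helper name and backup.sql is the decrypted file from the previous step:

```shell
#!/bin/bash
# Hypothetical check: succeeds if the file starts with the PGDMP magic string
is_pg_custom_dump() {
  [ "$(head -c 5 "$1" 2>/dev/null)" = "PGDMP" ]
}

# Usage, assuming the decrypted file from the previous step
if is_pg_custom_dump backup.sql; then
  echo "Looks like a pg_dump custom-format archive"
else
  echo "Unexpected file header, check the passphrase and the download"
fi
```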


Now you have to upload the decrypted version of the backup back to the S3 bucket, use it to restore the Heroku database, and remove it from the bucket right after it has been used. We will start by testing it out on a newly provisioned database add-on:

heroku addons:create heroku-postgresql:hobby-dev
aws s3 cp backup.sql s3://heroku-secondary-backups/backup.sql
heroku pg:backups:restore "$(aws s3 presign s3://heroku-secondary-backups/backup.sql --expires-in 60)" HEROKU_POSTGRESQL_GRAY_URL -a app-name
aws s3 rm s3://heroku-secondary-backups/backup.sql

Remember to replace HEROKU_POSTGRESQL_GRAY_URL with the config var of your newly provisioned database add-on. You can check out the Heroku docs if you run into trouble.


You can now check if the content of your database looks correct by logging into it and running some queries:

heroku pg:psql HEROKU_POSTGRESQL_GRAY_URL

If everything looks OK you can now restore the backup file to your production database:

aws s3 cp backup.sql s3://heroku-secondary-backups/backup.sql
heroku pg:backups:restore "$(aws s3 presign s3://heroku-secondary-backups/backup.sql --expires-in 60)" DATABASE_URL -a app-name
aws s3 rm s3://heroku-secondary-backups/backup.sql

Alternatively, you could promote the new database add-on as your new primary database:

heroku pg:promote HEROKU_POSTGRESQL_GRAY_URL

Summary

I hope this blog post will help you secure your Heroku app data from random incidents. A secondary backup in your own secure S3 bucket is a best practice that every startup and non-trivial side project should follow.

I am not too much into devops, so tips on how this tutorial could be improved are welcome.


