Why You Should Migrate your Heroku Postgres Database to AWS RDS

 
[Image: a van, representing the migration of a Heroku PostgreSQL addon database to Amazon RDS. Photo by Nubia Navarro (nubikini) from Pexels]


The Heroku PostgreSQL addon is excellent for quickly setting up a new project. Once your web app matures, you should consider migrating to an alternative database service like Amazon RDS.

In this blog post, I’ll describe the benefits and drawbacks of using AWS RDS instead of the default Heroku addon. I’ll also compare the pricing and explain why projects that care about EU GDPR compliance should avoid using the Heroku database.

I write this blog post in the context of the default Heroku public spaces. Private spaces are an enterprise feature, and the pricing starts from $1000/month.

Heroku PostgreSQL addon and GDPR

Disclaimer: I am not a lawyer, and this article does not constitute legal advice.

I’ll start with potentially the most critical issue that many Heroku Postgres addon users might not be aware of. Even if your application is provisioned in the Heroku Europe region, its backups, Dataclips, and logs are still stored in the United States. Here’s an excerpt of my discussion with Heroku support about it:

Transcript:

Me:

Hi. I would like to know the physical location of the PostgreSQL addon database backups. I have a Heroku app provisioned in the EU. Does it mean that none of my data is stored in the US?

Heroku Support:

When a database gets provisioned, the data associated with that database is stored within the region in which it’s created. However, a number of services that are ancillary to Heroku Postgres as well as the systems that manage the fleet of databases may not be located within the same region as the provisioned databases. Here are some:

Postgres Continuous Protection for disaster recovery stores the base backup and write-ahead logs in the same region that the database is located.

Application logs are routed to Logplex, which is hosted in the US. In addition to logs from your application, this includes System logs and Heroku Postgres logs from any database attached to your application.

Logging of Heroku Postgres queries and errors can be blocked by using the --block-logs flag when creating the database with heroku addons:create heroku-postgres:…

PG Backup snapshots are stored in the US.

Dataclips are stored in the US.

========================

European Union GDPR regulations are clear that a European location is preferred for your users’ personally identifiable data. If you want to be compliant, then you should plan a move to a different database.

For some of my clients, the discovery that their users’ data is kept in the United States was an instant dealbreaker and a good enough reason to ditch the Heroku Postgres addon. It’s too bad that Heroku does not state the data location more prominently in their docs.

You can read my other blog post for more info about GDPR compliance for web apps.

Is there a performance overhead of using the RDS database on Heroku?

Heroku uses AWS as an infrastructure provider. They currently offer two regions for public spaces: the United States and Europe. Europe is located in the AWS Ireland region (eu-west-1), and the United States in AWS Northern Virginia (us-east-1). Because the underlying hardware and network are the same, there’s no performance impact of using RDS with the Heroku application dynos, as long as the RDS instance is provisioned in the matching AWS region.
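To make the region pairing concrete, here is a minimal Python sketch of the mapping described above. The function name and the dictionary are my own illustration, not part of any Heroku or AWS tooling; "us" and "eu" are the region identifiers of the Heroku public spaces.

```python
# Mapping of Heroku public-space regions to the AWS regions they run in,
# as described in the paragraph above.
HEROKU_TO_AWS_REGION = {
    "us": "us-east-1",  # Northern Virginia
    "eu": "eu-west-1",  # Ireland
}


def rds_region_for(heroku_region):
    """Return the AWS region in which to provision RDS for a Heroku region."""
    try:
        return HEROKU_TO_AWS_REGION[heroku_region.lower()]
    except KeyError:
        raise ValueError(f"Unknown Heroku public region: {heroku_region!r}") from None


print(rds_region_for("eu"))
```

Provisioning the database in any other region would add cross-region network latency to every query.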

Is RDS as secure as the Heroku addon?

PostgreSQL on RDS can be significantly more secure than what Heroku has to offer. All the databases provisioned in Heroku are publicly accessible from any IP address. All you need to connect to the Heroku database instance is a valid connection URL:

psql "postgres://heroku_user:[email protected]:5432/database_name"

# psql (12.4, server 12.4 (Ubuntu 12.4-1.pgdg16.04+1))

An SSL connection is enforced, so you cannot connect with the SSL mode explicitly disabled:

psql "postgres://heroku_user:[email protected]:5432/database_name?sslmode=disable"

# psql: error: could not connect to server: FATAL:

Unlike with RDS, you cannot use the verify-full SSL mode because Heroku does not offer a CA certificate for database instances outside of the Private Spaces. It means that any time you directly connect to the Heroku database using psql or heroku pg:psql commands, you’re susceptible to a Man in the Middle attack. Check out these two somewhat dated but apparently still exploitable sources for more details:

MitM-ing Heroku Postgres

postgres-mitm script

If you provision an RDS database, it also has to be publicly accessible to talk to the Heroku dynos. A significant difference is that you can provide a path to the database CA certificate and specify the verify-full mode, which effectively protects you from MITM attempts:

psql "postgres://rds_user:[email protected]:5432/database_name?sslmode=verify-full&sslrootcert=config/amazon-rds-ca-cert.pem"

# psql (12.4 (Ubuntu 12.4-1.pgdg18.04+1), server 12.4)
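If your app reads the connection string from an environment variable, you may want to enforce these SSL parameters in code. Below is a small Python sketch using only the standard library. The function name, host, and credentials are hypothetical placeholders; the certificate path is the one from the psql example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_verify_full(database_url, ca_cert_path):
    """Return the URL with sslmode=verify-full and sslrootcert set.

    Existing values of these two parameters are overridden; any other
    query parameters are preserved.
    """
    parts = urlsplit(database_url)
    params = dict(parse_qsl(parts.query))
    params["sslmode"] = "verify-full"
    params["sslrootcert"] = ca_cert_path
    # safe="/" keeps the certificate path readable instead of escaping it as %2F
    query = urlencode(params, safe="/")
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


# Hypothetical host and credentials, for illustration only.
url = with_verify_full(
    "postgres://rds_user:secret@db.example.com:5432/database_name",
    "config/amazon-rds-ca-cert.pem",
)
print(url)
```

The helper overrides an insecure sslmode=disable if one is already present, so a misconfigured URL cannot silently downgrade the connection.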

You can check out this link for more info about different SSL modes in PostgreSQL

Bonus: meet your neighbors

Things get interesting if you’re using a free Hobby-dev plan of the Heroku PostgreSQL plugin. You can run the following query to see user, process and database names of people sharing the same node:

select datname, usename, application_name from pg_stat_activity;

 -- dxxxxxxxxxxxxx | kxxxxxxxxxxxxx | PostgreSQL JDBC Driver
 -- dyyyyyyyyyyyyy | kyyyyyyyyyyyyy | /app/vendor/bundle/ruby/2.6.0/bin/rake
 -- dzzzzzzzzzzzzz | kzzzzzzzzzzzzz | sidekiq 6.1.1 app [0 of 5 busy]
 -- ...
On my test instance, I had ~250 "neighbors".


It’s not a security threat, but maybe it’s something you’d like to keep in mind if you’re using the free plan. It’s worth noting that AWS offers one year of a free RDS instance that is not shared with anyone.

Pricing

Pricing for both solutions is comparable, with a growing advantage on the side of RDS for more expensive plans. Let’s compare two highly available database instances with 8GB and 16GB of RAM, respectively.

I chose the m5 instance type, which is the most suitable for general-purpose database workloads. Unlike the straightforward Heroku pricing page, RDS does not make it easy to calculate the total cost. Your best bet is to configure an instance in the console to display the final pricing. You can also use the Amazon RDS Instance Comparison tool, but remember to add 20% to cover the storage costs.

You don’t need to start with provisioned disk IOPS (input/output operations per second). General Purpose SSD comes with 3 IOPS/GB and burst ability, which should be enough for most cases.
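As a quick sanity check, the baseline IOPS for the two storage sizes compared in this post can be computed from the 3 IOPS/GB figure. This is a rough sketch with a function name of my own choosing; it ignores bursting and any minimum or maximum that AWS applies.

```python
def baseline_iops(storage_gb, iops_per_gb=3):
    """Baseline IOPS of a General Purpose SSD volume at 3 IOPS per GB."""
    return storage_gb * iops_per_gb


for size_gb in (256, 512):
    print(f"{size_gb}GB -> {baseline_iops(size_gb)} baseline IOPS")
```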

RDS prices are for the eu-west-1 Ireland region.

Provider          Heroku       RDS
Name              Premium 2    db.m5.large
RAM               8GB          8GB
Storage           256GB        256GB
Max connections   400          Custom
Price/month       $350         $318


Provider          Heroku       RDS
Name              Premium 3    db.m5.xlarge
RAM               16GB         16GB
Storage           512GB        512GB
Max connections   500          Custom
Price/month       $750         $637


It looks like we’ve got a winner! If your database has 16GB of RAM or more, switching to RDS will bring in considerable savings.
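To put numbers on it, here is a short Python sketch that computes the savings from the monthly prices in the comparison above. The function name is my own, and the prices are the ones quoted in this post, so treat the result as a snapshot rather than current pricing.

```python
def monthly_savings(heroku_price, rds_price):
    """Return (absolute savings in USD, savings as a percentage of the Heroku price)."""
    saved = heroku_price - rds_price
    return saved, saved / heroku_price * 100


# (label, Heroku price/month, RDS price/month) taken from the comparison above
plans = [
    ("8GB: Premium 2 vs db.m5.large", 350, 318),
    ("16GB: Premium 3 vs db.m5.xlarge", 750, 637),
]

for label, heroku_price, rds_price in plans:
    saved, percent = monthly_savings(heroku_price, rds_price)
    print(f"{label}: ${saved}/month ({percent:.1f}%), ${saved * 12}/year")
```

The gap widens from roughly 9% on the 8GB tier to roughly 15% on the 16GB tier.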

It is worth noting that Heroku enforces a per-plan connection limit that cannot be increased. On RDS, you can customize the maximum number of connections using the max_connections setting. The Heroku database instance could stop scaling just because of the connection limit, forcing you to apply convoluted solutions like PgBouncer. In RDS, you can tweak the maximum number of connections and other config variables to max out the currently provisioned hardware’s performance and throughput.
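A back-of-the-envelope estimate shows how quickly a growing app can hit a fixed limit. The sketch below is only an approximation: the dyno counts, process counts, and pool sizes are hypothetical numbers I made up for illustration, and the 500 limit is the Premium 3 value from the comparison above.

```python
def required_connections(web_dynos, processes_per_dyno, pool_size,
                         worker_dynos=0, worker_concurrency=0):
    """Estimate the peak number of database connections an app can open.

    Assumes every web process holds a full connection pool and every
    background worker thread holds one connection.
    """
    web = web_dynos * processes_per_dyno * pool_size
    workers = worker_dynos * worker_concurrency
    return web + workers


PREMIUM_3_LIMIT = 500  # connection limit of the Heroku Premium 3 plan

# Hypothetical setup: 20 web dynos, 4 processes each, pool of 5,
# plus 5 worker dynos with a concurrency of 25.
needed = required_connections(20, 4, 5, worker_dynos=5, worker_concurrency=25)
print(f"needed={needed}, limit={PREMIUM_3_LIMIT}, over_limit={needed > PREMIUM_3_LIMIT}")
```

In this made-up scenario the app would need more connections than the plan allows, even though the database hardware itself might still have plenty of headroom.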

You can also cut the RDS costs by ~40% if you commit to a year or more upfront with Reserved Instances. From my experience, it’s usually not possible to predict database spec requirements so far into the future, but your case could be different.
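Here is what that discount would mean for the 16GB instance from the comparison above. It is a rough sketch: the 40% figure is the approximate discount mentioned above, the function name is mine, and the actual Reserved Instance pricing depends on the term and payment option you choose.

```python
def reserved_monthly_price(on_demand_price, discount=0.40):
    """Approximate effective monthly price after a Reserved Instance discount."""
    return round(on_demand_price * (1 - discount), 2)


heroku_premium_3 = 750  # USD/month, from the comparison above
rds_m5_xlarge = 637     # USD/month on demand, from the comparison above

reserved = reserved_monthly_price(rds_m5_xlarge)
print(f"Reserved db.m5.xlarge: ~${reserved}/month vs ${heroku_premium_3}/month on Heroku")
```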

Monitoring and alerts

Heroku offers very limited ways to monitor database metrics. Console tools like:

heroku-pg-extras

pg-diagnose

give you point-in-time insights into database statistics. But since their output is text-only, it is cumbersome to integrate them into automated monitoring and alerting solutions.

BTW I’ve ported features offered by heroku-pg-extras to several programming languages. You can read my previous blogpost for more info about it.

In contrast, RDS shines when it comes to its monitoring and alerting toolkit. With a few clicks, you can build informative CloudWatch dashboards integrated with SNS alerts sent to the channel of your choice:

Abot for Slack AWS CloudWatch dashboard. Web servers and PG stats at a glance.


Custom AWS SNS Slack alert
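If you prefer code over clicking through the console, an alarm like this can also be defined programmatically. The sketch below only builds the parameters for the CloudWatch put_metric_alarm call; the instance identifier, SNS topic ARN, threshold, and alarm name are hypothetical values, and this is not necessarily how the alerts shown above were configured.

```python
def rds_cpu_alarm_params(db_instance_id, sns_topic_arn, threshold_percent=80.0):
    """Build keyword arguments for CloudWatch's put_metric_alarm call.

    The alarm fires when the average CPU utilization of the RDS instance
    stays above the threshold for two consecutive 5-minute periods.
    """
    return {
        "AlarmName": f"{db_instance_id}-high-cpu",
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 2,
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


# Hypothetical identifiers, for illustration only.
params = rds_cpu_alarm_params(
    "my-app-db", "arn:aws:sns:eu-west-1:123456789012:db-alerts"
)
print(params["AlarmName"])

# With boto3 installed and AWS credentials configured, the alarm could be created with:
#   import boto3
#   boto3.client("cloudwatch", region_name="eu-west-1").put_metric_alarm(**params)
```

Keeping the parameters in a plain dictionary makes it easy to review or version the alarm definition before anything is sent to AWS.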


But wait, there’s more. RDS offers an optional Enhanced Monitoring feature. Honestly, so far, CloudWatch and pg-extras have been enough for me to resolve database issues, and I haven’t had a chance to dive deep into those metrics. Still, they look super smart and useful:

RDS enhanced monitoring metrics

Other features

Dataclips

The one feature I’m missing in RDS after moving away from Heroku is Dataclips. They are perfect for giving instant data insights to less technical members of your team.

There’s a whole range of UI tools for PostgreSQL that can replicate parts of the same feature set, but none is as handy as Heroku Dataclips themselves.

Backups

Both solutions offer automated backups and point-in-time recovery. Personally, I find the RDS backup mechanism to be more robust and easier to work with.

The biggest drawback of the backup solution that Heroku offers is that the backups are all part of the PostgreSQL plugin instance. It means that accidental or malicious removal of your database plugin would irreversibly remove your database and all its backups. You can read more about this potentially disastrous issue and how to prevent it by adding secondary backups in my other blogpost.

Do you need DevOps experience to use AWS RDS?

Things cannot get any easier than the Heroku PostgreSQL plug-and-play approach. RDS is also a fully managed platform but requires a slightly more involved setup.

Choosing the instance type, storage type, configuring the security groups, and tweaking PostgreSQL settings… It might seem a bit overwhelming if you’ve never worked with AWS before.

If you want to do the migration, but your team is missing the required AWS/DevOps skills, then we’ve got you covered. You can now pre-order the new eBook with a 40% discount. The guide describes all the steps required to safely move your Heroku PostgreSQL database to AWS RDS. Zero prior AWS experience is required to complete the guide. All the steps are described in detail, together with code snippets and AWS console screenshots.

If you don’t want to pre-order right now but would like to get notified when the book is out you can follow me on Twitter or join the mailing list.

Once you’re up and running, the most significant change when working with RDS instead of the Heroku database is that you can no longer use handy Heroku CLI commands.

Instead, you’ll need to get familiar with standard PostgreSQL CLI tools like psql, pg_restore, etc. DevOps or not, every developer should know how to use them, so taking the time to master them is an excellent investment.
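To give a taste of these tools, here is a Python sketch that assembles a pg_dump and a pg_restore invocation. It only builds and prints the commands and does not run them. The URLs and the dump file name are hypothetical placeholders, and this simple dump and restore pair is an illustration, not a complete migration procedure.

```python
import shlex


def dump_command(source_url, dump_path):
    """pg_dump in the custom format, skipping ownership and privilege statements."""
    return ["pg_dump", "--format=custom", "--no-owner", "--no-acl",
            f"--file={dump_path}", source_url]


def restore_command(target_url, dump_path):
    """pg_restore the custom-format dump into the target database."""
    return ["pg_restore", "--no-owner", "--no-acl",
            f"--dbname={target_url}", dump_path]


# Hypothetical connection URLs, for illustration only.
source = "postgres://heroku_user:secret@heroku-db.example.com:5432/database_name"
target = "postgres://rds_user:secret@rds-db.example.com:5432/database_name"

for command in (dump_command(source, "latest.dump"),
                restore_command(target, "latest.dump")):
    # shlex.join quotes the arguments so the line can be pasted into a shell
    print(shlex.join(command))
```

The --no-owner and --no-acl flags are there because the role names on Heroku and RDS differ, so ownership and grants from the source database would not apply cleanly on the target.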

Summary

If your application is past the proof-of-concept stage, I’d recommend switching from the Heroku addon to RDS. For the cost of a one-time setup, you’ll get a cheaper database service that’s superior in security, compliance, and robustness.

Thanks for making it to the end. Time for a bit of self-marketing:

If you want to rethink your app’s database infrastructure but you’re not sure how to proceed, you can get in touch. I’m currently available for consulting gigs.



Pawel Urbanek, Full Stack Ruby on Rails developer
