UUID also known as GUID is an alternative primary key type for SQL databases. It offers some non-obvious advantages compared to standard integer-based keys. Rails 6 release fresh out of beta introduces a new feature in ActiveRecord that makes working with UUID primary keys more straightforward. In this tutorial, we will dive deep into UUIDs with all their cons and pros.
Advantages of using UUIDs over integers
UUID is a random string in a predefined format. Sample value looks like that:
ccbb63c0-a8cd-47b7-8445-5d85e9c80977
UUID is superior to integer-based primary keys in many ways. One caveat might be the size of database indexes, but for non-big-data tables, you shouldn’t notice the difference between integers and UUIDs.
Exposing non-public information in URLs
A primary key value is usually publicly discoverable in URLs and API network logs. In turn, everyone can roughly estimate the app’s resources, total number, and pace of growth. Do you really want to expose how many users are signing up to your service or how many products you are selling with public URLs like:
/orders/2234/checkout
/users/287/profile
This issue could be mitigated by adding slugs, but these are just duplicated unique keys with additional maintenance required.
Switching to UUID results in URLs that don’t expose any potentially confidential information:
/orders/cc7a4c8b-1a90-4287-a983-3f6e10bd88d4/checkout
/users/6b6cabb3-e37d-4dd1-ae18-a4eb893b07ae/profile
Access scoping bugs
Properly scoping access to resources in web apps with non-trivial business logic is hard. Rails makes it just too easy to ship code like:
instead of:
This example might seem obvious, but in apps with multiple user roles and complex logic for who can access what, it’s not always possible to completely prevent similar mistakes.
In the above example, if invoice ID was a UUID type, it would make it impossible for an attacker to sequentially scan integer ID values looking for a security hole. This simple change makes a range of potential security bugs extremely unlikely to exploit.
By no means, I claim that using UUID releases you from the necessity to scope access to resources in your web app. Still, it might save you in case a similar security loophole was discovered in your project.
Frontend “independence”
UUID primary keys allow frontend applications to independently generate new objects, together with IDs, without talking to the backend. A unique ID can be created using the JavaScript code, and the chances of collision with already existing objects are negligible.
This approach opens up a whole array of possibilities for frontend developers, e.g., to batch create objects together with their associations without the overhead of API calls.
How to start using UUID in Ruby on Rails apps
You can generate UUID with Ruby by running:
To enable UUID in PostgreSQL, you need to create the following migration:
Don’t forget to run it:
You can now configure new tables to use UUID for their primary keys:
Remember to set correct foreign key data type on relation models. For this sample case:
app/models/user.rb
app/models/comments.rb
Migration creating comments should look like that:
If you want all your future models to use UUID for primary keys by default you need to add the following file:
config/initializers/generators.rb
How to migrate a table from integer to UUID primary key?
Changing the primary key type in the table is not straightforward. You need to start by running a similar migration, that will create a new uuid
column. Then rename the old id
column to integer_id
, unset it as the primary key in favor of the new uuid
column after renaming it to id
.
I will not go into details on how to migrate associations because it will differ for every use case. You need to follow the similar steps of adding a new GUID type column and based on the value from old integer foreign key, you must assign correct UUID keys.
GUID implicit coercion
One surprising feature of ActiveRecord is how it treats invalid UUID values. Consider the following code snippet:
It generates the following SQL query:
Invalid UUID value is implicitly converted to nil
to avoid database level error. It should not affect your logic as long as you use UUIDs only as primary keys, because they cannot be nil
. Otherwise, you need to remember to cover these cases.
Another surprise will happen if you want to create objects with predefined UUID values that might be incorrect:
Running this code snippet will not raise an error but rather create a new user object, with id
randomly generated by the database.
UUID order issues
Before Rails 6, trying out UUID in your app might have been a bit discouraging. Apparently first
and last
ActiveRecord::Relation
methods no longer worked as expected, returning a seemingly random object from a collection.
Let’s take a look at an SQL query generated by running User.last
Integer primary keys are generated sequentially. We can safely assume that the most recently created object will have the highest ID value.
On the contrary, due to the totally random nature of UUID, it is generated in a non-sequential order. PostgreSQL can still sort them using the deterministic algorithm. It means that a single UUID from the table will always have the first place when sorting. Unfortunately, it has nothing to do with when it was generated compared to other UUID values from the same table.
It results in a seemingly buggy behavior of first
and last
methods before Rails 6 because, by default, they implicitly sort relations by ID values.
Rails 6 changes
Rails 6 introduced a new configuration option implicit_order_value
for ApplicationRecord
classes. You can use it like that:
With this setting running User.last
now generates the following query:
We’re back to an expected behavior of this method returning the most recently created object, even if it is using non-sequential UUID for a primary key.
The initial implementation implicit_order_column
had a potentially sneaky bug in it. In the case of nonunique values in created_at
column, running the above query would result in non-deterministic results. Timestamp values in Rails have millisecond precision, so it might seem unlikely that more than one object has exactly the same creation time. But batch creation of objects eg. using insert_all
method is a use case that could make it a standard situation.
My contribution to implicit ordering behavior
I can humblebrag that my pull request related to the mentioned issue has recently been merged to Rails master, and should go live in 6.0.2.
It modifies the behavior of implicit_order_column
to additionally sub sort results of the query by primary key if it’s available. It ensures deterministic results regardless of the potential duplicate values in the implicit order column. An SQL query generated by User.last
now looks like that:
Using custom implicit ordering in older Rails
Are you stuck on the older Rails version, but would like to start using implicit_order_column
right now? You can check out my new gem that backports this feature. It’s a bit rough, but I am using it without issues in my Rails 5 based side-project Abot.
Use UUID v6 to avoid ordering issues
Alternatively, you could use UUID v6 to generate sequentially ordered values. uuid_v6 gem allows you to generate them using Ruby. UUIDs v6 are partially generated based on the current time, so they are orderable.
But watch out because v6 values are significantly less random than v4, so they are more likely to be guessed by an attacker.
Summary
Switching to UUID as a default primary key type in your Rails apps is worth considering. I can hardly think of a use case that can only be handled by integer keys but not UUIDs. The other way around it is often the case. When creating a new model, you cannot imagine all the possible business logic requirements it will have to handle. Using UUID in the first place might spare you a cumbersome migration in the future.