UUID also known as GUID is an alternative primary key type for SQL databases. It offers some non-obvious advantages compared to standard integer-based keys. Phoenix provides reliable support for working with UUID using its Ecto PostgreSQL adapter. In this tutorial, we will dive deep into UUIDs with all their cons and pros.
Benefits of using UUIDs instead of integers
UUID is a random string in a predefined format. Sample value looks like that:
ccbb63c0-a8cd-47b7-8445-5d85e9c80977
UUID is superior to integer-based primary keys in many ways. One caveat might be the size of database indexes, but for non-big-data tables, you shouldn’t notice the difference between integers and UUIDs.
Exposing non-public information in URLs
A primary key value is usually publicly discoverable in URLs and API network logs. In turn, everyone can roughly estimate the app’s resources, total number, and pace of growth. Do you really want to expose how many users are signing up to your service or how many products you are selling with public URLs like:
/orders/2234/checkout
/users/287/profile
This issue could be mitigated by adding slugs, but these are just duplicated unique keys with additional maintenance required.
Switching to UUID results in URLs that don’t expose any potentially confidential information:
/orders/cc7a4c8b-1a90-4287-a983-3f6e10bd88d4/checkout
/users/6b6cabb3-e37d-4dd1-ae18-a4eb893b07ae/profile
Access scoping bugs
Properly scoping access to resources in web apps with non-trivial business logic is hard. It’s possible to ship code like:
instead of:
This example might seem obvious, but in apps with multiple user roles and complex logic for who can access what, it’s not always possible to completely prevent similar mistakes.
If invoice ID was a UUID type in the above example, it would make it impossible for an attacker to sequentially scan integer ID values looking for a security hole. This simple change makes a range of potential security bugs extremely unlikely to exploit.
By no means, I claim that using UUID releases you from the necessity to scope access to resources in your web app. Still, it might save you in case a similar security loophole was discovered in your project.
Frontend “independence”
UUID primary keys allow frontend applications to independently generate new objects, together with IDs, without talking to the backend. A unique ID can be created using the JavaScript code, and the chances of collision with already existing objects are negligible.
This approach opens up a whole array of possibilities for frontend developers, e.g., to batch create objects together with their associations without the overhead of API calls.
UUID formats in Elixir Ecto
You can generate UUID with Elixir by running:
Under the hood string format of UUID is converted to binary when interacting with the database layer. To generate a sample binary UUID run:
You can convert one format to another in the following way:
How to start using UUID in Elixir Phoenix apps
First, generate the following migration:
priv/repo/migrations/20201217143615_create_users.exs
Don’t forget to run it:
Now add the corresponding schema module:
lib/your_app/accounts/user.ex
You can use string UUID format when building queries. Just remember always to prefix even the literal values with ^
because otherwise, you’ll get a somewhat cryptic error:
Generating UUID using PostgreSQL function
autogenerate: true
configures Ecto to generate random UUID values for newly created objects. Alternatively, you could generate UUIDs at the PostgreSQL level by editing your migration like that:
If you decide to do it, remember to use the returning
option when inserting new objects. Without it, Ecto would not fetch default column values generated by PostgreSQL functions.
Let’s have a look at the resulting SQL query with and without returning: [:id]
config:
returning
optionreturning
optionUsing UUID as a default
If you’re starting a new project and would like all your schemas to leverage UUIDs for primary keys without customizing their migrations, you need to add the following line to your config file.
config/config.exs
If you did not create the project yet, you could use a --binary-id
flag to configure it automatically:
How to migrate a table from integer to UUID primary key?
Changing the primary key type in the table is not straightforward. You need to start by running a similar migration that will create a new uuid
column. Then rename the old id
column to integer_id
, unset it as the primary key in favor of the new uuid
column after renaming it to id
.
I will not detail how to migrate associations because it will differ for every use case. You need to follow the similar steps of adding a new GUID type column and based on the value from the old integer foreign key, you must assign correct UUID keys.
Caveats of working with UUID
Binary logs
You only need the string format to construct the UUID queries. Unfortunately, currently, Ecto displays the binary format in the logs, so you’ll need to do the manual conversion to work with them. I’ve opened a PR trying to improve it.
Implicit ordering issue
first
and last
Query.API methods may seem broken when used with UUIDs. By default, they sort objects based on their primary key.
Integer primary keys are generated sequentially. We can assume that the most recently created object will have the highest ID value.
On the contrary, due to UUID’s totally random nature, it is generated in a non-sequential order. PostgreSQL can still sort them using the deterministic algorithm. It means that a single UUID from the table will always have a first place when sorting. Unfortunately, it has nothing to do with when it was generated compared to other UUID values from the same table.
It results in a seemingly buggy behavior of first and last methods. To remediate that, make sure to always explicitly pass the column name by which you want to sort your collection. In most cases, you’ll probably want to use inserted_at
instead of id
. One catch here is that the inserted_at
column is not guaranteed to be unique. To avoid inconsistent results, always make sure to subsort your results by primary key:
BTW you can check out my EctoExtras package. It implements a set of simple helper functions that I find missing from the default Ecto implementation. With EctoExtras
, the above example could look like that:
Use UUID v6 to avoid ordering issues
Alternatively, you could use UUID v6 to generate sequentially ordered values. uuid_utils package allows you to generate them in Elixir. UUIDs v6 are partially generated based on the current time, so they are orderable.
But watch out because v6 values are significantly less random than v4, so they are more likely to be guessed by an attacker.
Summary
Switching to UUID as a default primary key type in your Elixir Phoenix apps is worth considering. I can hardly think of a use case that can only be handled by integer keys but not UUIDs. The other way around it is often the case. When creating a new model, you cannot imagine all the possible business logic requirements it will have to handle. Using UUID in the first place might spare you a cumbersome migration in the future.