UUID Primary Key in Elixir Phoenix with PostgreSQL and Ecto

UUID in PostgreSQL Ecto and Phoenix are presented by sheep Photo by Massimiliano Martini on Unsplash

UUID also known as GUID is an alternative primary key type for SQL databases. It offers some non-obvious advantages compared to standard integer-based keys. Phoenix provides reliable support for working with UUID using its Ecto PostgreSQL adapter. In this tutorial, we will dive deep into UUIDs with all their cons and pros.

Benefits of using UUIDs instead of integers

UUID is a random string in a predefined format. Sample value looks like that:

ccbb63c0-a8cd-47b7-8445-5d85e9c80977

UUID is superior to integer-based primary keys in many ways. One caveat might be the size of database indexes, but for non-big-data tables, you shouldn’t notice the difference between integers and UUIDs.

Exposing non-public information in URLs

A primary key value is usually publicly discoverable in URLs and API network logs. In turn, everyone can roughly estimate the app’s resources, total number, and pace of growth. Do you really want to expose how many users are signing up to your service or how many products you are selling with public URLs like:

/orders/2234/checkout
/users/287/profile

This issue could be mitigated by adding slugs, but these are just duplicated unique keys with additional maintenance required.

Switching to UUID results in URLs that don’t expose any potentially confidential information:

/orders/cc7a4c8b-1a90-4287-a983-3f6e10bd88d4/checkout
/users/6b6cabb3-e37d-4dd1-ae18-a4eb893b07ae/profile

Access scoping bugs

Properly scoping access to resources in web apps with non-trivial business logic is hard. It’s possible to ship code like:

defmodule YourAppWeb.InvoiceController do
  # ...

  def show(conn, %{"id" => id}) do
    invoice = Repo.get(Invoice, id)

    render(conn, "show.html", invoice: invoice)
  end
end

instead of:

defmodule YourAppWeb.InvoiceController do
  # ...

  def show(conn, %{"id" => id}, current_user) do
    invoice = Repo.one(
                from a in Invoice,
                where: a.id == ^id and a.user_id == ^current_user.id
              )

    render(conn, "show.html", invoice: invoice)
  end
end

This example might seem obvious, but in apps with multiple user roles and complex logic for who can access what, it’s not always possible to completely prevent similar mistakes.

If invoice ID was a UUID type in the above example, it would make it impossible for an attacker to sequentially scan integer ID values looking for a security hole. This simple change makes a range of potential security bugs extremely unlikely to exploit.

By no means, I claim that using UUID releases you from the necessity to scope access to resources in your web app. Still, it might save you in case a similar security loophole was discovered in your project.

Frontend “independence”

UUID primary keys allow frontend applications to independently generate new objects, together with IDs, without talking to the backend. A unique ID can be created using the JavaScript code, and the chances of collision with already existing objects are negligible.

This approach opens up a whole array of possibilities for frontend developers, e.g., to batch create objects together with their associations without the overhead of API calls.

UUID formats in Elixir Ecto

You can generate UUID with Elixir by running:

Ecto.UUID.generate()
# "b436517a-e294-4211-8312-8576933f2db1"

Under the hood string format of UUID is converted to binary when interacting with the database layer. To generate a sample binary UUID run:

Ecto.UUID.bingenerate()
# <<72, 143, 56, 198, 127, 12, 78, 152, 141, 149, 101, 55, 70, 15, 236, 224>>

You can convert one format to another in the following way:

Ecto.UUID.dump("b436517a-e294-4211-8312-8576933f2db1")

# {:ok, <<180, 54, 81, 122, 226, 148, 66, 17, 131, 18, 133, 118, 147, 63, 45, 177>>}

Ecto.UUID.load(<<180, 54, 81, 122, 226, 148, 66, 17, 131, 18, 133, 118, 147, 63, 45, 177>>)

# {:ok, "b436517a-e294-4211-8312-8576933f2db1"}

How to start using UUID in Elixir Phoenix apps

First, generate the following migration:

priv/repo/migrations/20201217143615_create_users.exs

defmodule YourApp.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def change do
    create table(:users, primary_key: false) do
      add :id, :uuid, primary_key: true, null: false

      timestamps()
    end
  end
end

Don’t forget to run it:

mix ecto.migrate

Now add the corresponding schema module:

lib/your_app/accounts/user.ex

defmodule YourApp.Accounts.User do
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key {:id, Ecto.UUID, autogenerate: true}

  schema "users" do
    timestamps()
  end
end

You can use string UUID format when building queries. Just remember always to prefix even the literal values with ^ because otherwise, you’ll get a somewhat cryptic error:

# Incorrect

Repo.one(
  from a in Area,
  where: a.id == "1e24165e-7f1d-4169-8ebe-869b7d1b7c90"
)

# ** (EXIT from #PID<0.470.0>) shell process exited with reason: an exception was raised:
# ** (ArgumentError) argument error
# (stdlib 3.13.2) :io.put_chars(:standard_io, :unicode

Repo.one(
  from a in Area,
  where: a.id == ^"1e24165e-7f1d-4169-8ebe-869b7d1b7c90"
)

# %YourApp.Accounts.User{ ... }

Generating UUID using PostgreSQL function

autogenerate: true configures Ecto to generate random UUID values for newly created objects. Alternatively, you could generate UUIDs at the PostgreSQL level by editing your migration like that:

defmodule YourApp.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def change do
    execute "CREATE EXTENSION IF NOT EXISTS pgcrypto"

    create table(:users, primary_key: false) do
      add :id, :uuid, primary_key: true, null: false, default: fragment("gen_random_uuid()")

      timestamps()
    end
  end
end

If you decide to do it, remember to use the returning option when inserting new objects. Without it, Ecto would not fetch default column values generated by PostgreSQL functions.

  def create_user(attrs) do
    %User{}
    |> User.changeset(attrs)
    |> Repo.insert(returning: [:id])
  end

Let’s have a look at the resulting SQL query with and without returning: [:id] config:

INSERT INTO "users" ("email","inserted_at","updated_at")
  VALUES ($1,$2,$3)
  ["[email protected]", ~N[2021-01-14 15:46:49], ~N[2021-01-14 15:46:49]]

Insert query without returning option

INSERT INTO "users" ("email","inserted_at","updated_at")
  VALUES ($1,$2,$3) RETURNING "id"
  ["[email protected]", ~N[2021-01-14 15:46:49], ~N[2021-01-14 15:46:49]]

Insert query with returning option

Using UUID as a default

If you’re starting a new project and would like all your schemas to leverage UUIDs for primary keys without customizing their migrations, you need to add the following line to your config file.

config/config.exs

config :your_app, yourapp.repo, migration_primary_key: [type: :uuid]

If you did not create the project yet, you could use a --binary-id flag to configure it automatically:

mix phx.new your_app --binary-id

How to migrate a table from integer to UUID primary key?

Changing the primary key type in the table is not straightforward. You need to start by running a similar migration that will create a new uuid column. Then rename the old id column to integer_id, unset it as the primary key in favor of the new uuid column after renaming it to id.

defmodule YourApp.Repo.Migrations.UsersPrimaryKey do
  use Ecto.Migration

  def up
    alter table("users") do
      add :uuid, :uuid, default: fragment("gen_random_uuid()"), null: false
    end

    rename_column table("users"), :id, to: :integer_id
    rename_column table("users"), :uuid, to: :id

    execute "ALTER TABLE users drop constraint users_pkey;"
    execute "ALTER TABLE users ADD PRIMARY KEY (id);"

    # Optionally you remove auto-incremented
    # default value for integer_id column
    execute "ALTER TABLE ONLY users ALTER COLUMN integer_id DROP DEFAULT;"

    alter table("users") do
      modify :integer_id, :bigint, null: true
    end

    execute "DROP SEQUENCE IF EXISTS users_id_seq"
  end

  def down do
    raise Ecto.MigrationError, "Irreversible migration"
  end
end

I will not detail how to migrate associations because it will differ for every use case. You need to follow the similar steps of adding a new GUID type column and based on the value from the old integer foreign key, you must assign correct UUID keys.

Caveats of working with UUID

Binary logs

You only need the string format to construct the UUID queries. Unfortunately, currently, Ecto displays the binary format in the logs, so you’ll need to do the manual conversion to work with them. I’ve opened a PR trying to improve it.

Binary UUID format displayed in Elixir Phoenix logs

Binary UUID format displayed in Phoenix logs

Implicit ordering issue

first and last Query.API methods may seem broken when used with UUIDs. By default, they sort objects based on their primary key.

Integer primary keys are generated sequentially. We can assume that the most recently created object will have the highest ID value.

On the contrary, due to UUID’s totally random nature, it is generated in a non-sequential order. PostgreSQL can still sort them using the deterministic algorithm. It means that a single UUID from the table will always have a first place when sorting. Unfortunately, it has nothing to do with when it was generated compared to other UUID values from the same table.

It results in a seemingly buggy behavior of first and last methods. To remediate that, make sure to always explicitly pass the column name by which you want to sort your collection. In most cases, you’ll probably want to use inserted_at instead of id. One catch here is that the inserted_at column is not guaranteed to be unique. To avoid inconsistent results, always make sure to subsort your results by primary key:

# Without UUIDs

User |> first |> Repo.one

# With UUIDs

User |> first([:inserted_at, :id]) |> Repo.one

BTW you can check out my EctoExtras package. It implements a set of simple helper functions that I find missing from the default Ecto implementation. With EctoExtras, the above example could look like that:

# Without UUIDs

Repo.first(User)

# With UUIDs

Repo.first(User, [:inserted_at, :id])

Use UUID v6 to avoid ordering issues

Alternatively, you could use UUID v6 to generate sequentially ordered values. uuid_utils package allows you to generate them in Elixir. UUIDs v6 are partially generated based on the current time, so they are orderable.

sequential = 1..100 |> Enum.map(fn _ -> UUID.uuid6() end)
ordered = Enum.sort sequential

sequential == ordered # => true

But watch out because v6 values are significantly less random than v4, so they are more likely to be guessed by an attacker.

Summary

Switching to UUID as a default primary key type in your Elixir Phoenix apps is worth considering. I can hardly think of a use case that can only be handled by integer keys but not UUIDs. The other way around it is often the case. When creating a new model, you cannot imagine all the possible business logic requirements it will have to handle. Using UUID in the first place might spare you a cumbersome migration in the future.