UUID Primary Key in Rails 6 with PostgreSQL and Active Record

 
UUID in PostgreSQL ActiveRecord in Rails are presented by berries

UUID also known as GUID is an alternative primary key type for SQL databases. It offers some non-obvious advantages compared to standard integer-based keys. Rails 6 release fresh out of beta introduces a new feature in ActiveRecord that makes working with UUID primary keys more straightforward. In this tutorial, we will dive deep into UUIDs with all their cons and pros.

Advantages of using UUIDs over integers

UUID is a random string in a predefined format. Sample value looks like that:

ccbb63c0-a8cd-47b7-8445-5d85e9c80977

UUID is superior to integer-based primary keys in many ways. One caveat might be the size of database indexes, but for non-big-data tables, you shouldn’t notice the difference between integers and UUIDs.


Exposing non-public information in URLs

A primary key value is usually publicly discoverable in URLs and API network logs. In turn, everyone can roughly estimate the app’s resources, total number, and pace of growth. Do you really want to expose how many users are signing up to your service or how many products you are selling with public URLs like:

/orders/2234/checkout
/users/287/profile

This issue could be mitigated by adding slugs, but these are just duplicated unique keys with additional maintenance required.

Switching to UUID results in URLs that don’t expose any potentially confidential information:

/orders/cc7a4c8b-1a90-4287-a983-3f6e10bd88d4/checkout
/users/6b6cabb3-e37d-4dd1-ae18-a4eb893b07ae/profile

Access scoping bugs

Properly scoping access to resources in web apps with non-trivial business logic is hard. Rails makes it just too easy to ship code like:

class InvoicesController < ApplicationController
  # ...

  def show
    @invoice = Invoice.find(params.fetch(:id))
  end
end

instead of:

class InvoicesController < ApplicationController
  # ...

  def show
    @invoice = current_user.invoices.find(params.fetch(:id))
  end
end

This example might seem obvious, but in apps with multiple user roles and complex logic for who can access what, it’s not always possible to completely prevent similar mistakes.

In the above example, if invoice ID was a UUID type, it would make it impossible for an attacker to sequentially scan integer ID values looking for a security hole. This simple change makes a range of potential security bugs extremely unlikely to exploit.

By no means, I claim that using UUID releases you from the necessity to scope access to resources in your web app. Still, it might save you in case a similar security loophole was discovered in your project.

Frontend “independence”

UUID primary keys allow frontend applications to independently generate new objects, together with IDs, without talking to the backend. A unique ID can be created using the JavaScript code, and the chances of collision with already existing objects are negligible.

This approach opens up a whole array of possibilities for frontend developers, e.g., to batch create objects together with their associations without the overhead of API calls.

How to start using UUID in Ruby on Rails apps

You can generate UUID with Ruby by running:

require "securerandom"

SecureRandom.uuid => "b436517a-e294-4211-8312-8576933f2db1"

To enable UUID in PostgreSQL, you need to create the following migration:

class EnableUUID < ActiveRecord::Migration
  def change
    enable_extension 'pgcrypto'
  end
end

Don’t forget to run it:

rake db:migrate

You can now configure new tables to use UUID for their primary keys:

class CreateUsers < ActiveRecord::Migration
  def change
    create_table :users, id: :uuid  do |t|
      t.string :name
      t.timestamps
    end
  end
end

Remember to set correct foreign key data type on relation models. For this sample case:

app/models/user.rb

class User < ApplicationRecord
  has_many :comments
end

app/models/comments.rb

class Comment < ApplicationRecord
  belongs_to :user
end

Migration creating comments should look like that:

class CreateComments < ActiveRecord::Migration
  def change
    create_table :comments, id: :uuid  do |t|
      t.string :content
      t.uuid :user_id
      t.timestamps
    end

    add_index :comments, :user_id
  end
end

If you want all your future models to use UUID for primary keys by default you need to add the following file:

config/initializers/generators.rb

Rails.application.config.generators do |g|
  g.orm :active_record, primary_key_type: :uuid
end

How to migrate a table from integer to UUID primary key?

Changing the primary key type in the table is not straightforward. You need to start by running a similar migration, that will create a new uuid column. Then rename the old id column to integer_id, unset it as the primary key in favor of the new uuid column after renaming it to id.

class AddUUIDToUsers < ActiveRecord::Migration
  def up
    add_column :users, :uuid, :uuid, default: "gen_random_uuid()", null: false
    rename_column :users, :id, :integer_id
    rename_column :users, :uuid, :id
    execute "ALTER TABLE users drop constraint users_pkey;"
    execute "ALTER TABLE users ADD PRIMARY KEY (id);"

    # Optionally you remove auto-incremented
    # default value for integer_id column
    execute "ALTER TABLE ONLY users ALTER COLUMN integer_id DROP DEFAULT;"
    change_column_null :users, :integer_id, true
    execute "DROP SEQUENCE IF EXISTS users_id_seq"
  end

  def down
    raise ActiveRecord::IrreversibleMigration
  end

I will not go into details on how to migrate associations because it will differ for every use case. You need to follow the similar steps of adding a new GUID type column and based on the value from old integer foreign key, you must assign correct UUID keys.

GUID implicit coercion

One surprising feature of ActiveRecord is how it treats invalid UUID values. Consider the following code snippet:

User.find_by!(id: 'not-an-uuid')

It generates the following SQL query:

SELECT  "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2  [["id", nil], ["LIMIT", 1]]

Invalid UUID value is implicitly converted to nil to avoid database level error. It should not affect your logic as long as you use UUIDs only as primary keys, because they cannot be nil. Otherwise, you need to remember to cover these cases.

Another surprise will happen if you want to create objects with predefined UUID values that might be incorrect:

User.create!(id: 'not-an-uuid')

Running this code snippet will not raise an error but rather create a new user object, with id randomly generated by the database.

UUID order issues

Before Rails 6, trying out UUID in your app might have been a bit discouraging. Apparently first and last ActiveRecord::Relation methods no longer worked as expected, returning a seemingly random object from a collection.

Let’s take a look at an SQL query generated by running User.last

SELECT * FROM users ORDER BY id DESC LIMIT 1
SQL code has been simplified for brevity


Integer primary keys are generated sequentially. We can safely assume that the most recently created object will have the highest ID value.

On the contrary, due to the totally random nature of UUID, it is generated in a non-sequential order. PostgreSQL can still sort them using the deterministic algorithm. It means that a single UUID from the table will always have the first place when sorting. Unfortunately, it has nothing to do with when it was generated compared to other UUID values from the same table.

It results in a seemingly buggy behavior of first and last methods before Rails 6 because, by default, they implicitly sort relations by ID values.

Rails 6 changes

Rails 6 introduced a new configuration option implicit_order_value for ApplicationRecord classes. You can use it like that:

class User < ApplicationRecord
  self.implicit_order_column = "created_at"
  ...
end

With this setting running User.last now generates the following query:

SELECT * FROM users ORDER BY created_at DESC LIMIT 1

We’re back to an expected behavior of this method returning the most recently created object, even if it is using non-sequential UUID for a primary key.

The initial implementation implicit_order_column had a potentially sneaky bug in it. In the case of nonunique values in created_at column, running the above query would result in non-deterministic results. Timestamp values in Rails have millisecond precision, so it might seem unlikely that more than one object has exactly the same creation time. But batch creation of objects eg. using insert_all method is a use case that could make it a standard situation.

My contribution to implicit ordering behavior

I can humblebrag that my pull request related to the mentioned issue has recently been merged to Rails master, and should go live in 6.0.2.

It modifies the behavior of implicit_order_column to additionally sub sort results of the query by primary key if it’s available. It ensures deterministic results regardless of the potential duplicate values in the implicit order column. An SQL query generated by User.last now looks like that:

SELECT * FROM users ORDER BY created_at DESC, id DESC LIMIT 1

Using custom implicit ordering in older Rails

Are you stuck on the older Rails version, but would like to start using implicit_order_column right now? You can check out my new gem that backports this feature. It’s a bit rough, but I am using it without issues in my Rails 5 based side-project Abot.

Use UUID v6 to avoid ordering issues

Alternatively, you could use UUID v6 to generate sequentially ordered values. uuid_v6 gem allows you to generate them using Ruby. UUIDs v6 are partially generated based on the current time, so they are orderable.

require 'uuid_v6'

seq = UUIDv6::Sequence.new

uid = seq.call

# => "137ab4b3-3748-6090-80e3-0c859007c113"

sequential = 100.times.map { seq.call }
ordered = sequential.sort

sequential == ordered # => true

But watch out because v6 values are significantly less random than v4, so they are more likely to be guessed by an attacker.

Summary

Switching to UUID as a default primary key type in your Rails apps is worth considering. I can hardly think of a use case that can only be handled by integer keys but not UUIDs. The other way around it is often the case. When creating a new model, you cannot imagine all the possible business logic requirements it will have to handle. Using UUID in the first place might spare you a cumbersome migration in the future.



Back to index