How to Improve Rails Performance with Subquery Caching

Slow database queries are a common performance bottleneck for Ruby on Rails apps. Simplifying a complex query is often not possible due to the underlying business logic. Instead, you can extract parts of a query, cache and reuse them to improve performance. In this tutorial, I’ll describe a range of techniques on how to do it.

We will cover more advanced ActiveRecord and PostgreSQL topics like EXPLAIN ANALYZE, and query plan visualizer tool.

Let’s get started!

When subquery caching makes sense for slow ActiveRecord queries?

Subquery caching is a bit more convoluted solution for optimizing slow queries. Before you decide to implement it you should double-check if more straightforward fixes could help instead:

add missing database indexes and to avoid inefficient Seq Scan on large collections
make sure that the database server is adequately scaled and has a decent cache hit rate. You can use my RailsPGExtras gem to validate that
check if PostgreSQL configuration is correct for the type of the database server that you’re using. PGTune is your friend here
validate that your queries are not locking for too long, e.g., because of too wide ActiveRecord transaction scope

You can read out my other blogpost for an overview of tips for optimizing slow Rails queries.

If none of the above approaches helps, let’s discuss how you can split slow queries into the reusable parts.

Sample database structure

How to determine which part of a query to cache?

Unfortunately, there’s no simple answer to this question. It’s totally dependent on your application’s logic.

For this tutorial, we will assume the following database structure to explore some common scenarios:

Sample PostgreSQL database structure

Image generated with dbdiagram.io

app/models/user.rb

class User < ApplicationRecord
  has_many :group_assignments, dependent: :destroy
  has_many :groups, through: :group_assignments
  has_many :posts
end

app/models/group.rb

class Group < ApplicationRecord
  has_many :group_assignments, dependent: :destroy
  has_many :users, through: :group_assignments
  has_many :posts
end

app/models/post.rb

class Post < ApplicationRecord
  belongs_to :user
  belongs_to :group

  def to_json
    {
      id: id,
      text: text,
      user_email: user.email,
      group_name: group.name
    }
  end
end

app/models/group_assignment.rb

class GroupAssignment < ApplicationRecord
  self.table_name = "group_assignments"

  belongs_to :user
  belongs_to :group
end

User can belong to many Groups with a different role via the GroupAssignment join model. Posts are published in a Group by a User.

Let’s now explore potential slow queries that could originate in a system with similar schema.

Analyzing slow queries

We will focus on optimizing a single controller. It returns not approved posts that were published in groups that a current user is an admin in. It also validates if a user is an admin in at least one group.

A similar complexity level requiring data from four and more tables to execute a single query is common in non-trivial Rails apps.

We’re using eager_load because relation tables are used for both filter query and post’s JSON structure. It ensures that a single query will be executed to fetch and instantiate all the records eliminating N+1 issue.

app/models/moderation_controller.rb

class PostsModerationController < ApplicationController
  before_action :authorize_admin!

  def index
    render json: posts_to_moderate.to_json
  end

  def approve
    post = posts_to_moderate.find(
      params.fetch(:id)
    )
    post.approve!

    render json: post.to_json
  end

  private

  def authorize_admin!
    admin_groups.present? || (raise AccessDenied)
  end

  def posts_to_moderate
    Post.joins(:user, :group)
    .where(
      group: admin_groups,
      approved: false
    )
    .where.not(user: current_user)
  end

  def admin_groups
    Group.joins(:group_assignments)
    .where(
      "role = 'admin' and user_id = ?",
      current_user.id
    )
  end
end

admin_groups returns a relation object so two queries are merged together. The final query from posts_to_moderate method generates the following SQL:

SELECT "posts"."id" AS t0_r0, "posts"."user_id" AS t0_r1, "posts"."group_id" AS t0_r2, "posts"."approved" AS t0_r3, "posts"."featured" AS t0_r4, "posts"."text" AS t0_r5, "users"."id" AS t1_r0, "users"."email" AS t1_r1, "groups"."id" AS t2_r0, "groups"."name" AS t2_r1
FROM "posts"
LEFT OUTER JOIN "users" ON "users"."id" = "posts"."user_id"
LEFT OUTER JOIN "groups" ON "groups"."id" = "posts"."group_id" WHERE "posts"."group_id" IN (
  SELECT "groups"."id" FROM "groups"
  INNER JOIN "group_assignments" ON "group_assignments"."group_id" = "groups"."id"
    WHERE (role = 'admin' and user_id = 67478)
  )
AND "posts"."approved" = FALSE AND "posts"."user_id" != 67478

and this EXPLAIN output:

 Nested Loop Left Join  (cost=18.01..18.24 rows=1 width=96)
   ->  Nested Loop Left Join  (cost=17.59..17.77 rows=1 width=79)
         ->  Nested Loop  (cost=17.17..17.29 rows=1 width=50)
               Join Filter: (posts.group_id = group_assignments.group_id)
               ->  HashAggregate  (cost=16.75..16.76 rows=1 width=12)
                     Group Key: groups_1.id
                     ->  Nested Loop  (cost=0.71..16.75 rows=1 width=12)
                           ->  Index Scan using index_group_assignments_on_user_id_and_group_id on group_assignments  (cost=0.29..8.31 rows=1 width=4)
                                 Index Cond: (user_id = 67478)
                                 Filter: ((role)::text = 'admin'::text)
                           ->  Index Only Scan using groups_pkey on groups groups_1  (cost=0.42..8.44 rows=1 width=8)
                                 Index Cond: (id = group_assignments.group_id)
               ->  Index Scan using index_posts_on_group_id on posts  (cost=0.42..0.52 rows=1 width=50)
                     Index Cond: (group_id = groups_1.id)
                     Filter: ((NOT approved) AND (user_id <> 67478))
         ->  Index Scan using users_pkey on users  (cost=0.42..0.48 rows=1 width=29)
               Index Cond: (id = posts.user_id)
   ->  Index Scan using groups_pkey on groups  (cost=0.42..0.47 rows=1 width=17)
         Index Cond: (id = posts.group_id)

I doubt you want to analyze this output line by line. I'm pasting to for reference to compare with the output after the optimization

You can also analyze your query using a visualizer tool. Check out my other tutorial for tips on how to do it.

PostgreSQL EXPLAIN ANALYZE visualizer query plan

All the tests were carried out on a database seeded with ~200k objects in each table.

When analyzing the visualizer plans, it’s always worth noting how many Join operations the query performs. One way to speed up query execution is to limit the number of joins.

Let’s learn how to split bottleneck queries into more performant and cacheable pieces.

Extracting query parts to instance variables

The simplest extraction you can do is to fetch IDs of admin groups and reuse them for both authorization method and posts fetching. You don’t even need the actual group objects. You can read only the join model, significantly reducing the complexity. Just make sure to have all the foreign keys in place. You can read more about ensuring data integrity on the database level in my other blog post.

app/models/moderation_controller.rb

class PostsModerationController < ApplicationController
  before_action :authorize_admin!

  def index
    render json: posts_to_moderate.to_json
  end

  def approve
    post = posts_to_moderate.find(
      params.fetch(:id)
    )
    post.approve!

    render json: post.to_json
  end

  private

  def authorize_admin!
    admin_group_ids.present? || (raise AccessDenied)
  end

  def posts_to_moderate
    Post.joins(:user, :group)
    .where(
      group_id: admin_group_ids,
      approved: false
    ).where.not(user_id: current_user.id)
  end

  def admin_group_ids
    @admin_group_ids ||= GroupAssignment.where(
      role: 'admin',
      user_id: current_user.id
    ).pluck(:group_id)
  end
end

This implementation will still perform two queries like the initial one, but they will be significantly less complex. Results of the query checking admin memberships are reused in posts fetching query, reducing the need for two join operations!

Let’s see how it looks under the EXPLAIN ANALYZE microscope:

SELECT "posts"."id" AS t0_r0, "posts"."user_id" AS t0_r1, "posts"."group_id" AS t0_r2, "posts"."approved" AS t0_r3, "posts"."featured" AS t0_r4, "posts"."text" AS t0_r5, "users"."id" AS t1_r0, "users"."email" AS t1_r1, "groups"."id" AS t2_r0, "groups"."name" AS t2_r1
FROM "posts"
LEFT OUTER JOIN "users" ON "users"."id" = "posts"."user_id"
LEFT OUTER JOIN "groups" ON "groups"."id" = "posts"."group_id"
WHERE "posts"."group_id" = 67478 AND "posts"."approved" = FALSE AND "posts"."user_id" != 67478

 Nested Loop Left Join  (cost=1.26..25.33 rows=1 width=96)
   Join Filter: (groups.id = posts.group_id)
   ->  Nested Loop Left Join  (cost=0.84..16.88 rows=1 width=79)
         ->  Index Scan using index_posts_on_group_id on posts  (cost=0.42..8.44 rows=1 width=50)
               Index Cond: (group_id = 67478)
               Filter: ((NOT approved) AND (user_id <> 67478))
         ->  Index Scan using users_pkey on users  (cost=0.42..8.44 rows=1 width=29)
               Index Cond: (id = posts.user_id)
   ->  Index Scan using groups_pkey on groups  (cost=0.42..8.44 rows=1 width=17)
         Index Cond: (id = 67478)

Optimized EXPLAIN ANALYZE query plan

Query plan after optimization

PostgreSQL EXPLAIN ANALYZE visualizer query plan

For reference, the query plan before optimization

As you can see, both EXPLAIN query output and visualizer plan are now simpler. We’ve eliminated two join operations from potentially bottleneck query.

The best part is that we did not have to change the business logic of our app at all. Learning to spot reusable parts of Active Record queries is an insanely useful technique for optimizing the database performance of non-trivial Rails apps.

BTW be careful to always cache primitive values like arrays of integer IDs. If you cache an ActiveRecord::ActiveRelation object and try to reuse it in the other query, it will just make the final query more complex by adding a subquery. Also, caching complex objects instead of primitives, will consume more memory.

Is fewer queries always better?

Not necesarily. In the above example number of queries remains the same. The result of one query was reused in the other, so it was a clear win.

In some cases, two more complex queries might share a common part. Extracting it into a separate third query and then reusing could lead to a better performance despite the additional database call. Make sure always to measure the results of your optimizations with a dataset resembling production.

Caching parts of queries in Rails cache

Using instance variables is the most straightforward caching technique. The caveat is that the value still has to be calculated for every request. In some cases, storing the value in a periodically refreshed, persistent cache might be a better solution.

To calculate admin group memberships for a given user only once every 15 minutes you could use Rails cache:

  def admin_group_ids
    @admin_group_ids ||= begin
      Rails.cache.fetch(
        "admin_group_ids_user_#{current_user.id}",
        expires_in: 15.minutes
      ) do
        GroupAssignment.where(
          role: 'admin',
          user_id: current_user.id
        ).pluck(:group_id)
      end
    end
  end

To avoid blocking the webserver thread, you can also consider refreshing the cached values in the recurring Sidekiq job:

class User::AdminGroupsRefreshJob
  include Sidekiq::Worker

  def perform
    User.pluck(:id).each do |user_id|
      admin_group_ids = GroupAssignment.where(
        role: 'admin',
        user_id: user_id
      ).pluck(:group_id)

      Rails.cache.write(
        "admin_group_ids_user_#{user_id}",
         admin_group_ids
      )
    end
  end
end

For larger collections, you should consider splitting refresh operation into more small-scale jobs

Alternatively, you could manually refresh the cache for each user whenever his group memberships are updated, e.g., using ActiveRecord callbacks.

Check out the official docs for in-depth info on how to use Rails cache.

Using materialized views to cache parts of logic

Another approach you could use to cache parts of your queries is materialized views. If the logic is more complex you can build a materialized view containing the cached data.

Make sure to use materialized views, not standard ones. Non materialized views are just cached queries, not the actual data, so they don’t give any performance benefit.

Materialized views have to be refreshed periodically but are fully readable during the refresh. A detailed explanation of how to use materialized views is outside of the scope of this tutorial. Check out the Scenic gem documentation for more info.

Summary

I’ve seen impressive performance improvements from caching the subqueries. Just remember that adding a cache layer always increases the complexity and is a potential source of sneaky bugs. Make sure to start with more straightforward SQL optimization techniques first before trying to implement caching.

How to Improve ActiveRecord Query Performance with Subquery Caching