By default, Rails ActiveRecord executes most SQL queries in a non-optimal way. In this blog post, I’ll describe how to fix this issue to speed up bottlenecks and reduce your app’s memory usage.
Bloated ActiveRecord queries
Active Record empowers developers to write fairly complex database queries without understanding the underlying SQL. Unfortunately, the same ActiveRecord is the source of the majority of the issues I encounter while conducting my Rails performance audits. Next to the infamous N+1 calls, “bloated” queries are another common problem.
Let’s see it in action:
It is common for core models in legacy Rails projects to consist of dozens of columns. In the above example, we fetch 200 rows from the database and display info based on logic implemented in a model method.
The SQL query executed in the controller looks like this:
SELECT "users".* FROM "users" ORDER BY "updated_at" ASC LIMIT 200
Can you spot the issue now? A little hint:
By default, ActiveRecord fetches ALL the model columns. But, in the above example, we only really need the id and blocked_until columns for our view to display correctly. If you only care about raw data, a recommended practice is to use the ActiveRecord pluck method to avoid the overhead of instantiating model objects.
But, in this case, we need the model logic. So to improve performance, we can use the select method to fetch only the columns we need.
The above implementation would produce the following SQL query:
SELECT "users"."id", "users"."blocked_until" FROM "users" ORDER BY "updated_at" ASC LIMIT 200
Contrary to pluck, chaining a select method still returns ActiveRecord object instances, so you’ll be able to call methods implemented in the model. One catch is that if you try to read a value from a column that was not fetched from the database, an ActiveModel::MissingAttributeError will be raised.
How to reduce ActiveRecord RAM usage
Now that we’ve discussed how to slim down bloated queries, let’s measure the potential impact on memory usage. The memory_profiler gem offers a simple API for measuring in-memory object allocations.
Let’s start by checking how many objects are allocated when we fetch 1000 instances of a sample model with 12 columns:
We’re appending ; 0 to the snippet to prevent IRB from printing the query output to the console and skewing the memory usage results.
MemoryProfiler outputs loads of data, but we’re interested in the top few lines:
Total allocated: 1723042 bytes represents the memory usage increase caused by running the code snippet. The absolute value is not important, but it will serve as a reference point when we compare more efficient ways to run this query. The absolute memory usage will vary based on the number of columns and the model implementation.
Let’s now see how memory usage looks if we pluck only two columns:
That’s a ~90% decrease! But, as previously mentioned, pluck could be limiting in some cases since it returns raw data instead of model objects. So let’s now measure the memory impact of using select to fetch only two columns:
That’s ~65% less than the original SELECT * query. And you can still interact with the full model API as long as non-fetched column values are not accessed.
Another advantage of putting bloated SQL queries on a diet is that fetching fewer columns decreases the number of disk-read operations. Database read IOPS can become a bottleneck for apps with higher traffic, and fetching fewer columns is a relatively straightforward way to improve this metric.
Like with most SQL-related issues, the benefits of fine-tuning queries will only have a measurable effect once the app and its dataset reach a certain scale. But I’ve seen significant performance and memory usage improvements from limiting the columns fetched from the database. You can check out my blog post about rails-pg-extras to learn how to find queries that are likely to benefit from this optimization technique.