Ruby MRI does not support parallel execution of CPU-bound code: the Global Interpreter Lock, kept in place largely to protect non-thread-safe C extensions, lets only one thread run Ruby code at a time. Blocking I/O operations like HTTP requests, however, are still a perfectly valid use case for spinning up multiple threads. Read on to learn what tools are available for request concurrency in Ruby, with all their pros and cons.
Global Interpreter Lock and blocking I/O
Let’s start by describing what blocking I/O is. Long story short, any operation that does not directly use CPU cycles from its thread, but instead delegates the work to an external system and waits for the result, is blocking I/O. Typical examples in the context of Ruby on Rails web apps are SQL database queries, reading/writing files, and HTTP requests.
To see the practical difference between a CPU-bound operation and blocking I/O, check out the following code snippets. I encourage you to run them in your local IRB.
The first example, generating random strings, is a CPU-bound operation. Spinning up multiple threads does not affect its execution time. In Ruby MRI, the Global Interpreter Lock (GIL) works as a mutex, preventing several Ruby threads from running in parallel.
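A minimal sketch of such a CPU-bound benchmark could look like this (the exact workload and iteration counts are illustrative, not the original snippet):

```ruby
require "benchmark"
require "securerandom"

# CPU-bound work: generating random hex strings burns CPU cycles in Ruby.
def generate_strings(count)
  count.times.map { SecureRandom.hex(64) }
end

TOTAL = 40_000

single_threaded = Benchmark.realtime { generate_strings(TOTAL) }

multi_threaded = Benchmark.realtime do
  4.times.map { Thread.new { generate_strings(TOTAL / 4) } }.each(&:join)
end

puts format("single: %.2fs, 4 threads: %.2fs", single_threaded, multi_threaded)
# On MRI both timings come out roughly the same - the GIL serializes CPU work.
```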
The second example performs external HTTP calls (blocking I/O), so every new thread effectively parallelizes the execution, significantly reducing its duration.
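To keep the sketch runnable without network access, the following example uses `sleep` as a stand-in for an HTTP call; like a socket read, `sleep` releases the GIL while waiting, so it demonstrates the same effect:

```ruby
require "benchmark"

# `sleep` releases the GIL while waiting, just like a socket read does,
# which makes it a convenient stand-in for a slow HTTP call.
def fake_http_call
  sleep 0.2
end

sequential = Benchmark.realtime { 4.times { fake_http_call } }

threaded = Benchmark.realtime do
  4.times.map { Thread.new { fake_http_call } }.each(&:join)
end

puts format("sequential: %.2fs, threaded: %.2fs", sequential, threaded)
# sequential takes ~0.8s, threaded ~0.2s - the waits overlap across threads
```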
The behavior will be different in JRuby and Rubinius because they don’t use a GIL, but we’ll focus solely on MRI in this tutorial. To dive deeper into the topic of concurrency in the different flavors of Ruby, I highly recommend the somewhat dated, but still surprisingly relevant eBook Working with Ruby Threads.
Now that we know what GIL, CPU, and I/O bound parallel operations are all about, let’s find out how this knowledge can be used in practice.
Case study: Slack API Requests
Abot is largely dependent on the Slack API. Every interaction with the anonymous bot, whether via commands or the UI, can issue multiple HTTP requests. ScoutAPM does a good job of alerting when some API endpoints take too long and showing where they spend most of their time.
Occasionally, the Slack API responded slower than usual, accounting for most of the endpoint’s execution time. Slack requires bots’ backend APIs to respond within a maximum of three seconds, so it was necessary to optimize the faulty endpoints.
The culprit was the following part of Abot UI, fetching both public and private channels data via two separate Slack API HTTP calls.
The code responsible for this part of the app executed the calls sequentially, each HTTP request blocking the single main thread:
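Since the original snippet isn't shown here, here is a minimal sketch of that sequential version. The `SlackApi` class and `get_channels_list` method are hypothetical stand-ins (the `types` values mirror Slack's `conversations.list` parameter), and `sleep` simulates network latency:

```ruby
require "benchmark"

# Hypothetical Slack client stub; a real implementation would perform
# an HTTPS request to conversations.list. `sleep` simulates latency.
class SlackApi
  def get_channels_list(types:)
    sleep 0.2
    { "channels" => [], "types" => types }
  end
end

api = SlackApi.new
public_channels = private_channels = nil

elapsed = Benchmark.realtime do
  # Both calls run one after another on the single main thread.
  public_channels  = api.get_channels_list(types: "public_channel")
  private_channels = api.get_channels_list(types: "private_channel")
end

puts format("total: %.2fs", elapsed) # roughly the sum of both latencies
```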
Let’s now discuss different approaches to parallelizing the HTTP calls:
Native Ruby Threads
A straightforward approach to solving this issue could be to rewrite the code as follows:
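A sketch of that rewrite, reusing the same hypothetical `SlackApi` stub (with `sleep` standing in for the HTTP round-trip):

```ruby
require "benchmark"

# Hypothetical client stub - `sleep` stands in for the HTTP round-trip.
class SlackApi
  def get_channels_list(types:)
    sleep 0.2
    { "channels" => [], "types" => types }
  end
end

api = SlackApi.new
public_channels = private_channels = nil

elapsed = Benchmark.realtime do
  threads = [
    Thread.new { public_channels  = api.get_channels_list(types: "public_channel") },
    Thread.new { private_channels = api.get_channels_list(types: "private_channel") }
  ]
  threads.each(&:join) # block until both requests complete
end

puts format("total: %.2fs", elapsed) # close to a single call's latency
```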
Every request is executed in its own thread, and the threads can run in parallel because the work is blocking I/O. But can you spot the catch here?
If you cannot, that’s exactly the point.
The get_channels_list method implementation is potentially non-thread-safe, and there is no simple way to validate it. You could check the method’s implementation details, but in a typical Ruby project it’s turtles all the way down, with the usual excess of external dependencies.
There’s no way of knowing if some gem down the call stack uses a shared mutable state, or a mutex that can cause a deadlock.
Concurrent Ruby gem promises
A somewhat better approach in terms of thread safety is to use one of the concurrency abstractions offered by the popular concurrent-ruby library. For the price of yet another gem dependency, you get some thread-safety guarantees, with an honest warning that:
“No concurrency library for Ruby can ever prevent the user from making thread safety mistakes…“
The discussed code example rewritten using promises would look like this:
Even if thread safety can reasonably be expected, there are still risks associated with parallelizing Ruby code execution. One worth mentioning is exhausting the SQL database connection pool.
Multithreading and Rails SQL database pool
Every Ruby process has a limit on how many connections it can establish to the database. Each spawned thread requires its own connection, so performing SQL queries in parallelized blocks of code can quickly deplete the process’s connection pool. As discussed before, it is challenging to guarantee that code further down the call stack will never connect to the database.
A case where the pool-exhaustion scenario is highly probable is parallelizing code execution within Sidekiq jobs. A single Sidekiq process usually executes jobs in a couple of threads. If any of those jobs also uses threads internally, then M x N database connections are required, where M is the Sidekiq process concurrency and N the number of threads spawned by a job.
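A quick worked example of that M x N math (the numbers are illustrative; 10 is Sidekiq's default concurrency and 5 is the Rails default pool size):

```ruby
sidekiq_concurrency = 10 # M: threads in a single Sidekiq process (default)
threads_per_job     = 3  # N: threads each job spawns internally

connections_needed = sidekiq_concurrency * threads_per_job
puts connections_needed # => 30, versus the Rails default pool size of 5
```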
Unless strictly necessary, avoid spawning new threads within Sidekiq jobs. It is usually better to enqueue more fine-grained jobs than to spawn threads inside a single job.
Another scenario in which the pool can be depleted is spawning threads while handling a request in the Puma server. Due to its multithreaded nature, Puma behaves similarly to Sidekiq. The number of Puma worker threads competing for a database connection can easily exceed the available pool, crashing your Rails production servers.
Typhoeus Hydra requests
Probably the safest solution is to stick with concurrency only at the layer of the HTTP requests themselves. An excellent tool for that is the Typhoeus gem.
Its Hydra API allows dispatching parallelized requests. Contrary to the previous approaches, Typhoeus does not spawn Ruby threads but instead delegates to libcurl’s multi interface, which executes the requests concurrently in native code.
Because no Ruby threads are involved, this solution doesn’t pose any of the risks described above. Concurrency is precisely scoped to the I/O section of the app, so shared mutable state, potential deadlocks, or sneaky SQL queries exhausting the database pool are not possible.
Implementing Typhoeus requires a bit more code changes than the previous solutions:
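A sketch of the Hydra approach follows. The endpoint and token handling are assumptions based on Slack's `conversations.list` API, not the app's actual code; running it requires the typhoeus gem (backed by libcurl), a network connection, and a valid bot token:

```ruby
require "typhoeus"
require "json"

# Assumed Slack endpoint and auth scheme - adjust for your own client.
SLACK_URL = "https://slack.com/api/conversations.list"
headers   = { "Authorization" => "Bearer #{ENV.fetch("SLACK_BOT_TOKEN")}" }

hydra = Typhoeus::Hydra.new

public_request  = Typhoeus::Request.new(SLACK_URL, params: { types: "public_channel" },  headers: headers)
private_request = Typhoeus::Request.new(SLACK_URL, params: { types: "private_channel" }, headers: headers)

hydra.queue(public_request)
hydra.queue(private_request)
hydra.run # blocks until both responses arrive; libcurl runs them concurrently

public_channels  = JSON.parse(public_request.response.body)
private_channels = JSON.parse(private_request.response.body)
```

Note that no `Thread.new` appears anywhere: the concurrency lives entirely inside the `hydra.run` call.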
If you plan your HTTP client architecture around Typhoeus and its Hydra API upfront, these implementation details can be abstracted away.
All of the proposed solutions are equivalent in terms of performance. For the case described, the speedup was close to 100% because two requests were parallelized. With more subsequent requests, e.g., when paginating over a large collection, the performance gain could be even more considerable. It’s always worth using a performance monitoring tool when applying similar optimizations to your project and observing the results.
Multithreading has to be applied with care. Spinning up threads to make code “run faster because concurrency” is a recipe for disaster. Every scenario is different, but for parallelizing HTTP requests, I would recommend sticking with Typhoeus Hydra as the safest of all the described methods.