How to Monitor Sidekiq Process Uptime in Rails Apps

Things usually work until they don’t. A Sidekiq background job process can crash, quietly shut down, or get stuck for a variety of reasons: random network errors, misconfigured email clients, a shortage of RAM, or Redis running out of disk space, to name a few. Adding proper monitoring can save you a lot of headaches and angry calls from customers. In this blog post, I’ll describe a simple way to monitor the uptime and responsiveness of Sidekiq processes in Rails apps.

Who’s watching the watcher?

In theory, you could monitor the Sidekiq process using Sidekiq itself. A pair of job classes like these could do the trick:

app/jobs/sidekiq_ping_job.rb

class SidekiqPingJob
  include Sidekiq::Worker
  sidekiq_options retry: false

  SIDEKIQ_PING_KEY = "SIDEKIQ_LAST_PING_DATE".freeze

  def perform
    $redis.set(SIDEKIQ_PING_KEY, Time.current.to_s)
  end
end

app/jobs/sidekiq_ping_check_job.rb

class SidekiqPingCheckJob
  include Sidekiq::Worker
  sidekiq_options retry: false

  PING_THRESHOLD = 5.minutes

  def perform
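    # The key is nil if SidekiqPingJob has never run, so guard before parsing
    last_ping = $redis.get(SidekiqPingJob::SIDEKIQ_PING_KEY)&.to_datetime

    # A missing timestamp counts as downtime, same as a stale one
    if last_ping.nil? || last_ping < PING_THRESHOLD.ago
      raise "Sidekiq queue downtime!"
    end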
  end
end

You can run those jobs periodically with Sidekiq Cron. I usually prefer it to the Whenever or Clockwork gems. One huge advantage is that it doesn’t require an additional scheduler process. Configuration is simple and only requires adding a single file (depending on the gem version, you may also need an initializer that loads it):

config/schedule.yml

sidekiq_ping:
  cron: "*/5 * * * *"
  class: "SidekiqPingJob"
  queue: default
sidekiq_ping_check:
  cron: "*/10 * * * *"
  class: "SidekiqPingCheckJob"
  queue: default

SidekiqPingJob periodically updates a Redis entry with the current time, and SidekiqPingCheckJob raises an exception if the entry has not been updated for too long.
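
The staleness rule at the heart of SidekiqPingCheckJob can be tried out in plain Ruby, without Redis or Sidekiq. This is only an illustrative sketch: `ping_stale?` and its parameters are hypothetical names, not part of the jobs above, and treating a missing timestamp as stale is an assumption about the desired behavior.

```ruby
require "time"

# Hypothetical helper (not part of Sidekiq or the jobs above).
# Returns true when the stored ping is missing or older than the threshold.
def ping_stale?(last_ping_string, threshold_seconds:, now: Time.now)
  # Assumption: no recorded ping at all is treated the same as a stale one
  return true if last_ping_string.nil? || last_ping_string.empty?

  last_ping = Time.parse(last_ping_string)
  now - last_ping > threshold_seconds
end

now = Time.utc(2024, 1, 1, 12, 0, 0)
puts ping_stale?("2024-01-01 11:58:00 UTC", threshold_seconds: 300, now: now) # false
puts ping_stale?("2024-01-01 11:50:00 UTC", threshold_seconds: 300, now: now) # true
puts ping_stale?(nil, threshold_seconds: 300, now: now)                       # true
```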

Can you spot the flaw in this setup? We’ve created a paradox: you’ll be notified that Sidekiq is unresponsive only if it is still responsive. If, for some reason, SidekiqPingCheckJob is not executed, you’ll never get notified about the downtime. You could try to configure a more_urgent queue that, in theory, will be more responsive than the default queue. But in the end, you’re constrained by the fact that you cannot reliably monitor infrastructure from the inside.

Cron monitoring to the rescue

Instead, you can delegate monitoring to a third-party tool. So-called cron monitors work perfectly for our use case. I won’t recommend any particular brand; just search for “cron monitoring” and compare feature sets and pricing.

They all work similarly. You create a check, and it generates a unique URL. You also have to configure how often this URL is expected to receive a GET request. If a request does not arrive, you’ll get notified by email, Slack, or another channel of your choice.
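
To make the mechanism concrete, here is a rough model of what such a check does on the service side. It is a simplified sketch, not any vendor’s implementation: the `HeartbeatCheck` class, its method names, and the grace period are all assumptions made for illustration.

```ruby
# Hypothetical model of a cron-monitor check; real services differ in details.
class HeartbeatCheck
  attr_reader :last_ping_at

  def initialize(period_seconds:, grace_seconds: 0)
    @period_seconds = period_seconds # how often a ping is expected
    @grace_seconds = grace_seconds   # assumed extra slack before alerting
    @last_ping_at = nil
  end

  # Called whenever the check's unique URL receives a request.
  def record_ping(at = Time.now)
    @last_ping_at = at
  end

  # True when the last ping is older than period + grace.
  def overdue?(now = Time.now)
    # Assumption: a check that has never been pinged is not yet alerting
    return false if @last_ping_at.nil?

    now - @last_ping_at > @period_seconds + @grace_seconds
  end
end

check = HeartbeatCheck.new(period_seconds: 300, grace_seconds: 60)
start = Time.utc(2024, 1, 1, 12, 0, 0)
check.record_ping(start)
puts check.overdue?(start + 200) # false
puts check.overdue?(start + 400) # true
```

The key property is that the alert fires from the outside when pings stop, so it does not depend on Sidekiq being alive.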

Let’s see it in action:

app/jobs/sidekiq_ping_job.rb

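require "net/http"

class SidekiqPingJob
  include Sidekiq::Worker
  sidekiq_options retry: false

  def perform
    # Skip the ping where no monitor URL is configured (e.g. in development)
    return unless (ping_url = ENV["SIDEKIQ_PING_URL"])

    # A GET request tells the cron monitor that Sidekiq is still processing jobs
    Net::HTTP.get(URI.parse(ping_url))
  end
end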

config/schedule.yml

sidekiq_ping:
  cron: "*/5 * * * *"
  class: "SidekiqPingJob"
  queue: default

That’s it! You’ll be notified if the URL is not pinged at the expected interval.
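
One thing the job above leaves open is what happens when the monitoring service itself is slow or unreachable. Below is a minimal sketch, assuming you would rather log the failure than let a hung request occupy a Sidekiq thread. The `ping_monitor` helper and its timeout values are illustrative choices, not something the setup above prescribes.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: sends the heartbeat with explicit timeouts.
# Returns true on a 2xx response and false on any error or non-2xx status.
def ping_monitor(url, open_timeout: 5, read_timeout: 5)
  uri = URI.parse(url)

  response = Net::HTTP.start(
    uri.host, uri.port,
    use_ssl: uri.scheme == "https",
    open_timeout: open_timeout,
    read_timeout: read_timeout
  ) { |http| http.get(uri.request_uri) }

  response.is_a?(Net::HTTPSuccess)
rescue StandardError => e
  # The missing ping will trigger the cron monitor's alert; just record why
  warn "Sidekiq ping failed: #{e.class}: #{e.message}"
  false
end
```

Inside SidekiqPingJob#perform, this could stand in for the Net::HTTP.get call, for example `ping_monitor(ping_url)`.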

Summary

Monitoring Sidekiq process uptime and responsiveness is a simple best practice that can save you a lot of trouble. Setup takes only a moment, and cron monitoring tools are cheap or even free of charge. I highly encourage you to implement it if you aren’t using it in your project yet.