
Things usually work until they don’t. A Sidekiq background job process can crash, quietly shut down, or get stuck for a variety of reasons: random network errors, misconfigured email clients, or a shortage of RAM or disk space on the Redis server, to name a few. Adding proper monitoring can save you a lot of headaches and angry calls from customers. In this blog post, I’ll describe a simple way to monitor the uptime and responsiveness of Sidekiq processes in Rails apps.
Who’s watching the watcher?
In theory, you could monitor the Sidekiq process using Sidekiq itself. A pair of jobs like the following could do the trick:
app/jobs/sidekiq_ping_job.rb
class SidekiqPingJob
  include Sidekiq::Worker
  sidekiq_options retry: false

  SIDEKIQ_PING_KEY = "SIDEKIQ_LAST_PING_DATE".freeze

  # Records the time of the latest run in Redis.
  def perform
    $redis.set(SIDEKIQ_PING_KEY, Time.current.to_s)
  end
end
app/jobs/sidekiq_ping_check_job.rb
class SidekiqPingCheckJob
  include Sidekiq::Worker
  sidekiq_options retry: false

  PING_THRESHOLD = 5.minutes

  def perform
    # A missing key means no ping was ever recorded, which also counts as downtime.
    last_ping = $redis.get(SidekiqPingJob::SIDEKIQ_PING_KEY)&.to_datetime
    if last_ping.nil? || last_ping < PING_THRESHOLD.ago
      raise "Sidekiq queue downtime!"
    end
  end
end
You can run those jobs periodically using Sidekiq Cron. I usually prefer it to the Whenever or Clockwork gems. One big advantage is that it doesn’t require an additional scheduler process. The config is simple and only requires adding a single file:
config/schedule.yml
sidekiq_ping:
  cron: "*/5 * * * *"
  class: "SidekiqPingJob"
  queue: default
sidekiq_ping_check:
  cron: "*/10 * * * *"
  class: "SidekiqPingCheckJob"
  queue: default
SidekiqPingJob periodically updates a Redis entry with the current time, and SidekiqPingCheckJob triggers an exception if the entry has not been updated for too long.
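To make the staleness rule concrete, here is a minimal, framework-free sketch of the comparison the check job performs. The ping_stale? helper, its parameters, and the 300-second default are hypothetical names and values I picked for illustration. It uses plain Ruby’s Time.parse instead of Redis and ActiveSupport, so treat it as a sketch rather than a drop-in replacement.

```ruby
require "time"

# Hypothetical helper (not part of Sidekiq or Redis): decides whether the
# last recorded ping is too old. `raw` is the string read from the store,
# or nil if the key was never written.
def ping_stale?(raw, now: Time.now, threshold_seconds: 300)
  return true if raw.nil? || raw.empty?

  last_ping = Time.parse(raw)
  (now - last_ping) > threshold_seconds
rescue ArgumentError
  # An unparseable value is treated the same as a missing ping.
  true
end

now = Time.utc(2024, 1, 1, 12, 0, 0)
puts ping_stale?("2024-01-01 11:58:00 UTC", now: now) # ping from 2 minutes ago
puts ping_stale?("2024-01-01 11:50:00 UTC", now: now) # ping from 10 minutes ago
puts ping_stale?(nil, now: now)                       # key never written
```

Treating a missing or unreadable value as stale is a deliberate choice here: a check that silently passes when there is no data would hide exactly the outage it is supposed to catch.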
Can you spot the flaw in this setup? We’ve created a paradox: you’ll be notified that Sidekiq is unresponsive only if it is still responsive. If, for some reason, SidekiqPingCheckJob is not executed, you’ll never be notified about the downtime. You could try to configure a more_urgent queue that, in theory, would be more responsive than the default queue. But in the end, you’re constrained by the fact that you cannot reliably monitor infrastructure from the inside.
Cron monitoring to the rescue
Instead, you can delegate monitoring to a third-party tool. So-called cron monitors work perfectly for this use case. I won’t recommend any particular vendor, so just search for “cron monitoring” and compare feature sets and pricing.
They all work similarly. You create a check, and it generates a unique URL. You also have to configure how often this URL is expected to receive a GET request. If a request does not arrive, you’ll get notified by email, Slack, or another channel of your choice.
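If it helps to picture what happens on the monitor’s side, the idea can be modeled in a few lines of Ruby. This is a simplified sketch under my own assumptions, not the implementation of any real service: the HeartbeatCheck class, its grace period, and the choice to treat a never-pinged check as down are all hypothetical.

```ruby
# Hypothetical, simplified model of a cron monitor check. Real services
# differ in details; this only illustrates "alert when pings stop arriving".
class HeartbeatCheck
  def initialize(period_seconds:, grace_seconds: 0)
    @period_seconds = period_seconds
    @grace_seconds = grace_seconds
    @last_ping_at = nil
  end

  # Called whenever the check's unique URL receives a request.
  def ping!(at: Time.now)
    @last_ping_at = at
  end

  # The check is down when no ping arrived within period + grace.
  # In this sketch, a check that was never pinged also counts as down.
  def down?(now: Time.now)
    return true if @last_ping_at.nil?

    (now - @last_ping_at) > (@period_seconds + @grace_seconds)
  end
end

check = HeartbeatCheck.new(period_seconds: 300, grace_seconds: 60)
start = Time.utc(2024, 1, 1, 12, 0, 0)
check.ping!(at: start)
puts check.down?(now: start + 200) # still inside the expected interval
puts check.down?(now: start + 900) # pings stopped arriving
```

The important property is that the clock runs outside your infrastructure. If Sidekiq, Redis, or the whole server goes away, the pings stop and the external check notices on its own.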
Let’s see it in action:
app/jobs/sidekiq_ping_job.rb
require "net/http"

class SidekiqPingJob
  include Sidekiq::Worker
  sidekiq_options retry: false

  def perform
    # Do nothing when no monitor URL is configured.
    return unless (ping_url = ENV["SIDEKIQ_PING_URL"])

    Net::HTTP.get(URI.parse(ping_url))
  end
end
config/schedule.yml
sidekiq_ping:
  cron: "*/5 * * * *"
  class: "SidekiqPingJob"
  queue: default
That’s it! You’ll be notified if the URL is not pinged at the expected interval.
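One thing you may want to harden is the HTTP call itself, since a slow or unreachable monitoring endpoint could otherwise keep a Sidekiq thread busy. Below is a minimal sketch of a ping with explicit timeouts. The send_ping helper and the 5-second default are my own assumptions, not something the monitoring tools require.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: sends the ping with explicit timeouts and reports
# success as a boolean instead of raising.
def send_ping(ping_url, timeout_seconds: 5)
  return false if ping_url.nil? || ping_url.empty?

  uri = URI.parse(ping_url)
  response = Net::HTTP.start(
    uri.host, uri.port,
    use_ssl: uri.scheme == "https",
    open_timeout: timeout_seconds,
    read_timeout: timeout_seconds
  ) { |http| http.get(uri.request_uri) }

  # Only 2xx responses count as a delivered ping.
  response.is_a?(Net::HTTPSuccess)
rescue StandardError => e
  # Covers timeouts, refused connections, DNS failures, and malformed URLs.
  warn "Sidekiq ping failed: #{e.class}: #{e.message}"
  false
end
```

Inside the job, perform could then call send_ping(ENV["SIDEKIQ_PING_URL"]). A failed ping does not need special handling: if pings keep failing, the monitor stops hearing from you and sends the alert anyway.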
Summary
Monitoring Sidekiq process uptime and responsiveness is a simple best practice that can save you a lot of trouble. Setup takes only a moment, and cron monitoring tools are cheap or even free. I highly encourage you to implement it if you’re not using it in your project yet.