Things usually work until they don’t. A Sidekiq background job process can crash, quietly shut down, or get stuck for a variety of reasons: random network errors, misconfigured email clients, or a shortage of RAM or disk space on Redis, to name a few. Setting up proper monitoring can save you a lot of headaches and angry calls from customers. In this blog post, I’ll describe a simple way to monitor the uptime and responsiveness of Sidekiq processes in Rails apps.
Who’s watching the watcher?
In theory, you could monitor the Sidekiq process using Sidekiq itself. Implementing classes similar to these could do the trick:
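Here is a minimal sketch of what such classes might look like. The Redis key, the five-minute threshold, and the `SidekiqHeartbeat` helper are my own assumptions for illustration. The heartbeat logic lives in a plain Ruby class so it can be exercised without a running Sidekiq, and the two jobs are thin wrappers around it.

```ruby
require "time"

# Hypothetical helper holding the heartbeat logic. `store` is any object that
# responds to `set(key, value)` and `get(key)`, such as a Redis connection.
class SidekiqHeartbeat
  KEY = "sidekiq_ping_at" # assumed Redis key name
  MAX_AGE = 5 * 60        # assumed staleness threshold, in seconds

  StaleError = Class.new(StandardError)

  def initialize(store)
    @store = store
  end

  # Record the given time (default: now) as an ISO 8601 string.
  def ping(now = Time.now)
    @store.set(KEY, now.utc.iso8601)
  end

  # Raise if the last ping is missing or older than MAX_AGE.
  # Otherwise return the age of the last ping in seconds.
  def check!(now = Time.now)
    raw = @store.get(KEY)
    raise StaleError, "Sidekiq has never pinged" if raw.nil?

    age = now - Time.iso8601(raw)
    raise StaleError, "Last Sidekiq ping was #{age.round}s ago" if age > MAX_AGE

    age
  end
end

# The guard only lets this sketch load without the gem; drop it in a real app.
if defined?(Sidekiq::Job)
  class SidekiqPingJob
    include Sidekiq::Job

    def perform
      Sidekiq.redis { |conn| SidekiqHeartbeat.new(conn).ping }
    end
  end

  class SidekiqPingCheckJob
    include Sidekiq::Job

    def perform
      Sidekiq.redis { |conn| SidekiqHeartbeat.new(conn).check! }
    end
  end
end
```

The exception raised by `check!` is what would reach your error tracker, assuming one is hooked into Sidekiq.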
You can run those jobs periodically using Sidekiq Cron. I usually prefer it to the Whenever or Clockwork gems. One big advantage is that it doesn’t require an additional scheduler process. The config is simple and only requires adding a single file:
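As a rough illustration, the schedule could be declared in a Rails initializer like the one below. The file path, the job names, and both intervals are assumptions of mine. `Sidekiq::Cron::Job.load_from_hash` is the sidekiq-cron call that registers a hash of jobs; a YAML schedule file is a common alternative.

```ruby
# config/initializers/sidekiq_cron.rb (hypothetical path)

# Job name => settings, in the shape sidekiq-cron's load_from_hash expects.
SIDEKIQ_PING_SCHEDULE = {
  "sidekiq_ping" => {
    "class" => "SidekiqPingJob",
    "cron"  => "* * * * *"   # every minute (assumed interval)
  },
  "sidekiq_ping_check" => {
    "class" => "SidekiqPingCheckJob",
    "cron"  => "*/5 * * * *" # every five minutes (assumed interval)
  }
}

# Register the schedule only inside the Sidekiq server process, and only
# when sidekiq-cron is actually loaded.
if defined?(Sidekiq::Cron::Job) && Sidekiq.server?
  Sidekiq::Cron::Job.load_from_hash(SIDEKIQ_PING_SCHEDULE)
end
```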
SidekiqPingJob periodically updates a Redis entry with the current time, and SidekiqPingCheckJob raises an exception if the entry has not been updated for too long.
Can you spot the flaw in this setup? We’ve created a paradox: you’ll be notified that your Sidekiq is not responsive only if it is still responsive. If, for some reason, SidekiqPingCheckJob is not executed, you’ll never get notified about the downtime. You could try to configure a more_urgent queue that, in theory, will be more responsive than the default queue. But, in the end, you’re always constrained by the fact that it’s not possible to reliably monitor the infrastructure from the inside.
Cron monitoring to the rescue
Instead, you can delegate monitoring to a third-party tool. So-called cron monitors work perfectly for our use case. I won’t recommend any particular brand, so just search for “cron monitoring” and compare the feature sets and pricing.
They all work similarly. You create a check, and it generates a unique URL. You also configure how often this URL is expected to receive a GET request. If a request does not arrive within that interval, you’ll get notified by email, Slack, or another channel of your choice.
Let’s see it in action:
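A job that pings the monitor could look roughly like this sketch. The `CronMonitorPing` class, the `SidekiqUptimePingJob` name, and the `SIDEKIQ_PING_URL` environment variable are hypothetical; the URL itself comes from whichever cron monitoring tool you choose. The HTTP call is injectable so the logic can be tried without touching the network.

```ruby
require "net/http"
require "uri"

# Hypothetical helper that sends a GET request to the check URL.
class CronMonitorPing
  PingFailed = Class.new(StandardError)

  # `http_get` is any callable taking a URI and returning a Net::HTTPResponse.
  def initialize(url, http_get: Net::HTTP.method(:get_response))
    @uri = URI.parse(url)
    @http_get = http_get
  end

  # Perform the request and raise unless the monitor answered with 2xx.
  def call
    response = @http_get.call(@uri)
    unless response.is_a?(Net::HTTPSuccess)
      raise PingFailed, "Cron monitor responded with #{response.code}"
    end

    response
  end
end

# The guard only lets this sketch load without the gem; drop it in a real app.
if defined?(Sidekiq::Job)
  class SidekiqUptimePingJob
    include Sidekiq::Job

    def perform
      # SIDEKIQ_PING_URL is an assumed env variable holding the check URL.
      CronMonitorPing.new(ENV.fetch("SIDEKIQ_PING_URL")).call
    end
  end
end
```

Schedule this job with Sidekiq Cron at the same interval you configured in the monitoring tool. If Sidekiq stops processing jobs, the GET requests stop too, and the alert comes from outside your infrastructure.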
That’s it! You’ll be notified if the URL is not pinged at the expected interval.
Monitoring Sidekiq process uptime and responsiveness is a simple best practice that can save you a lot of trouble. Setup takes only a moment, and cron monitoring tools are cheap or even free of charge. I highly encourage you to implement it if you’re not yet using it in your project.