Running Background Jobs in Ruby on Rails Revisited

CBQ, 21 May

A while back, we wrote an article on Running Background Jobs in Ruby on Rails.

The Ruby on Rails framework has a number of tools for running your code outside of the web request, including the venerable script/runner for one-off tasks, but using them can be a little heavy on your server. If you want to run a task on the minute, or on demand, script/runner will load your entire Rails environment each time, which can run from 20 to 50 MB, depending on how many libraries and how much code you’re pulling in.

There are also a few other good guides, recipes, and libraries that we’ve mentioned before.

We’ve found that it’s not terribly hard to build your own job server that runs continuously in the background and can handle all kinds of jobs, including those that should run on a specified interval. Here’s how we did it.

We’re going to make use of the Daemons gem, so install it first:


sudo gem install daemons

Let’s go ahead and build in two types of jobs:
  • those that Run Once (immediately) and
  • those that Run on an interval (every x seconds or minutes or days)

We’ll use ActiveRecord’s Single Table Inheritance (STI) to handle both types of jobs and dictate their differing behaviors.
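
As a rough illustration of what STI gives us here, the sketch below mimics the idea in plain Ruby, without ActiveRecord: the type column holds a class name, and each row is instantiated as that class. The SketchJob classes, the instantiate helper, and the hash rows are hypothetical stand-ins for illustration, not Rails API.

```ruby
# Plain-Ruby sketch of Single Table Inheritance (no ActiveRecord involved).
# SketchJob and its subclasses are stand-ins for the models built below.
class SketchJob
  attr_reader :job

  def initialize(attrs)
    @job = attrs[:job]
  end

  # Hypothetical helper: build an instance of the class named in the
  # row's :type column, falling back to the base class when it is blank.
  def self.instantiate(row)
    klass = row[:type] ? Object.const_get(row[:type]) : self
    klass.new(row)
  end
end

class SketchRunOnceJob < SketchJob; end
class SketchRunIntervalJob < SketchJob; end

rows = [
  { :type => "SketchRunOnceJob",     :job => 'puts "once"' },
  { :type => "SketchRunIntervalJob", :job => 'puts "every minute"' }
]

jobs = rows.map { |row| SketchJob.instantiate(row) }
jobs.each { |j| puts "#{j.class}: #{j.job}" }
```

Because every row comes back as its own subclass, each type of job can carry its own finder and cleanup logic while sharing one table and one run! method.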

Create a PeriodicJob model:


script/generate model PeriodicJob type:string \
 job:text interval:integer last_run_at:datetime

And migrate up. Now, fill in the PeriodicJob#run! method:


# app/models/periodic_job.rb
class PeriodicJob < ActiveRecord::Base

  # Runs a job and updates the +last_run_at+ field.
  def run!
    begin
      eval(self.job)
    rescue Exception
      logger.error "'#{self.job}' could not run: #{$!.message}\n#{$!.backtrace}" 
    end
    self.last_run_at = Time.now.utc
    self.save  
  end

end

Note that we’re using Time.now.utc so as not to cause confusion—our AR is configured to use UTC by default.
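
To make the behavior of run! concrete: a job whose code raises is logged, and it still gets a last_run_at timestamp, so a failed Run Once job is not retried. Here is a minimal plain-Ruby sketch of that behavior, assuming a Struct in place of the model and an in-memory array in place of the Rails logger (SketchedJob and errors are made-up names).

```ruby
# Plain-Ruby stand-in for PeriodicJob#run! (no ActiveRecord, no Rails logger).
# SketchedJob and its :errors array are hypothetical names for illustration.
SketchedJob = Struct.new(:job, :last_run_at, :errors) do
  def run!
    begin
      eval(job)
    rescue Exception => e
      # The article's version writes this message to the Rails logger.
      (self.errors ||= []) << "'#{job}' could not run: #{e.message}"
    end
    # The timestamp is set whether or not the job raised.
    self.last_run_at = Time.now.utc
  end
end

good = SketchedJob.new('1 + 1')
bad  = SketchedJob.new('raise "boom"')
[good, bad].each { |j| j.run! }

puts good.errors.inspect    # nothing was recorded for the good job
puts bad.errors.first       # the failure message for the bad job
puts bad.last_run_at.nil?   # false: the failed job still counts as run
```

One thing to keep in mind: rescue Exception also catches things like Interrupt, so a gentler variant might rescue StandardError and ScriptError instead (syntax errors raised by eval are ScriptErrors, not StandardErrors).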

Now, let’s create the subclass for Run Once jobs and let it inherit from our PeriodicJob model. We’ll add two class methods to it: a finder and a cleanup method:


# app/models/run_once_periodic_job.rb
class RunOncePeriodicJob < PeriodicJob

  # RunOncePeriodicJobs run if they have no PeriodicJob#last_run_at time.
  def self.find_all_need_to_run
    self.find(:all, :conditions => ["last_run_at IS NULL"])
  end

  # Cleans up all jobs older than a day.
  def self.cleanup
    self.destroy_all ['last_run_at < ?', 1.day.ago]
  end

end

Now let’s define the Run on an Interval job and add the interval-specific finder:


# app/models/run_interval_periodic_job.rb
class RunIntervalPeriodicJob < PeriodicJob

  # RunIntervalPeriodicJobs run if PeriodicJob#last_run_at time plus
  # PeriodicJob#interval (in seconds) is past the current time (Time.now).
  def self.find_all_need_to_run
    self.find(:all).select {|job| job.last_run_at.nil? || 
      (job.last_run_at + job.interval <= Time.now.utc)}
  end

end
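
The rule inside that finder can be tried out on its own. The sketch below pulls it into a hypothetical due? helper (not part of the models above) and feeds it the same three situations the fixtures further down set up; the fixed clock value is arbitrary.

```ruby
# Plain-Ruby version of the "is this interval job due?" rule.
# due? is a hypothetical helper; the model above inlines the same test.
def due?(last_run_at, interval, now = Time.now.utc)
  last_run_at.nil? || (last_run_at + interval <= now)
end

now = Time.utc(2008, 5, 21, 12, 0, 0)  # arbitrary fixed clock for the example

puts due?(nil, 60, now)            # true: never run before
puts due?(now - 5 * 60, 60, now)   # true: last ran 5 minutes ago
puts due?(now - 5, 60, now)        # false: last ran 5 seconds ago
```

Note that the model version loads every interval job and filters in Ruby, so it is probably best suited to a modest number of jobs.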

Now, let’s write some tests, to make it clear how it should work:


# test/unit/periodic_job_test.rb
require File.dirname(__FILE__) + '/../test_helper'

class PeriodicJobTest < Test::Unit::TestCase
  fixtures :periodic_jobs

  def test_should_run_job
    assert_nothing_thrown { periodic_jobs(:run_once_job).run! }
  end

  def test_should_find_run_once_job
    assert RunOncePeriodicJob.find_all_need_to_run.include?(periodic_jobs(:run_once_job))
  end

  def test_should_not_find_run_job_already_run
    assert !RunOncePeriodicJob.find_all_need_to_run.include?(periodic_jobs(:run_once_job_to_be_deleted))
  end

  def test_should_find_run_interval_job
    assert RunIntervalPeriodicJob.find_all_need_to_run.include?(periodic_jobs(:run_interval_job_needs_run))        
  end

  def test_should_not_find_run_interval_job_not_within_interval
    assert !RunIntervalPeriodicJob.find_all_need_to_run.include?(periodic_jobs(:run_interval_job_does_not_need_run))
  end

  def test_should_cleanup_old_jobs
    jobs_count = RunOncePeriodicJob.count

    assert periodic_jobs(:run_once_job_to_be_deleted).last_run_at
    RunOncePeriodicJob.cleanup

    assert_equal jobs_count - 1, RunOncePeriodicJob.count
  end

end

Here are our fixtures that set up the scenarios:


# test/fixtures/periodic_jobs.yml
run_once_job:
  id: 1
  type: RunOncePeriodicJob
  job: 'what = "w00t once!"'
run_interval_job_needs_run:
  id: 2
  type: RunIntervalPeriodicJob
  interval: 60
  job: 'what = "w00t on the minute dood!"'
  last_run_at: <%= (Time.now.utc - 5.minutes).to_s(:db) %>
run_interval_job_does_not_need_run:
  id: 3
  type: RunIntervalPeriodicJob
  interval: 60
  job: 'what = "w00t on the minute dood!"'
  last_run_at: <%= (Time.now.utc - 5).to_s(:db) %>
run_once_job_to_be_deleted:
  id: 4
  type: RunOncePeriodicJob
  job: 'what = "w00t once!"'
  last_run_at: <%= (Time.now.utc - 8.days).to_s(:db) %>
run_interval_job_needs_run_never_run_before:
  id: 5
  type: RunIntervalPeriodicJob
  interval: 60
  job: 'what = "w00t on the minute dood!"'

Now we have a built-in system for running periodic jobs. All we have to do is create a new PeriodicJob with the code we would normally toss to script/runner in the PeriodicJob#job field; when we call the PeriodicJob#run! method, it will evaluate that code.

We now need a way to always run a background task server to check these PeriodicJobs and run them.

Create a file called task_server.rb in your script directory.


#!/usr/bin/env ruby
# script/task_server.rb
#
# Background Task Server
#
# Relies on ActiveRecord PeriodicJob and STI table (periodic_jobs):
#
# type:         string    ("RunOncePeriodicJob", or "RunIntervalPeriodicJob")
# interval:     integer   (in seconds)
# job:          text      (actual ruby code to eval)
# last_run_at:  datetime  (stored time of last run)
#
# Main algorithm is daemon process runs every XX seconds, wakes up and
# looks for jobs. Jobs placed in the RunOncePeriodicJob queue are run 
# immediately (if no last_run_at time) and stored until they are cleaned up 
# (deleted). Jobs placed in the RunIntervalPeriodicJob queue are run if: 
# their last_run_at time + their interval (in seconds) is past the current 
# time (Time.now).
#

require 'optparse' # provides ARGV.options

options = {}
ARGV.options do |opts|

  opts.on( "-e", "--environment ENVIRONMENT", String,
           "The Rails Environment to run under." ) do |environment|
    options[:environment] = environment
  end

  opts.parse!
end

RAILS_ENV = options[:environment] || 'development'  

require File.dirname(__FILE__) + '/../config/environment.rb'

if RAILS_ENV == "development" or RAILS_ENV == "test"
  SLEEP_TIME = 10
else
  SLEEP_TIME = 60
end

loop do
  # Find all Run Once jobs, and run them
  RunOncePeriodicJob.find_all_need_to_run.each do |job|
    job.run!
  end

  # Find all Run on Interval jobs, and run them  
  RunIntervalPeriodicJob.find_all_need_to_run.each do |job|
    job.run!
  end

  # Cleans up periodic jobs, removes all RunOncePeriodicJobs over one
  # day old.
  RunOncePeriodicJob.cleanup

  sleep(SLEEP_TIME)
end
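
Since the loop only looks for due jobs when it wakes up, SLEEP_TIME puts a floor under how often any job can actually run. The sketch below illustrates this with a simulated clock (plain integers for seconds, no sleeping, no database). It assumes jobs take no time to run, and simulate and its parameters are made-up names.

```ruby
# Simulated clock: no sleeping, no database. All names are hypothetical.
# Returns the times (in seconds) at which a single interval job would run.
def simulate(interval, sleep_time, duration)
  run_times = []
  last_run_at = nil
  now = 0
  while now <= duration
    # Same rule as RunIntervalPeriodicJob.find_all_need_to_run
    if last_run_at.nil? || last_run_at + interval <= now
      run_times << now
      last_run_at = now
    end
    now += sleep_time   # the server sleeps, then wakes and checks again
  end
  run_times
end

p simulate(30, 60, 180)  # => [0, 60, 120, 180]
p simulate(90, 60, 360)  # => [0, 120, 240, 360]
```

In the first call a 30-second job only runs once a minute, because that is how often the server wakes. In the second, a 90-second job ends up running every 120 seconds, the first wake-up at or after its interval has elapsed.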

That’s it. Now, we create a control script using the daemons gem.


#!/usr/bin/env ruby
# script/task_server_control.rb
#
# Background Task Server Control - A daemon for running jobs
#

require 'rubygems'
require 'daemons'

options = {}

default_pid_dir = "/var/run/task_server" 

if File.exist?(default_pid_dir)
  options[:dir_mode] = :normal
  options[:dir] = default_pid_dir
end

Daemons.run(File.dirname(__FILE__) + '/../script/task_server.rb', options)

Create an optional /var/run/task_server dir if you’re running on a server (in production mode):


mkdir -p /var/run/task_server
chown deploy:deploy /var/run/task_server

We can start it up in the normal server mode, as a daemon (using the start/stop commands), or we can start it up in interactive mode (so we can see the results) using the run command:


ruby script/task_server_control.rb run

In another window, add some jobs:


ruby script/console
>> RunOncePeriodicJob.create(:job => 'puts "This job will only run once."')
=> #...
>> RunIntervalPeriodicJob.create(:job => 'puts "This job runs every 30 seconds, and it ran: #{Time.now.utc}"', :interval => 30)
=> #...

You should see the output of these jobs in the window running task_server_control.rb as the task server wakes up.

And now, it wouldn’t be complete without some Capistrano support to enable restarting after we make code changes to the model, and to allow start/stop/restart:


# config/deploy.rb

# In case you're running on multiple app servers,
# we define the task_server to make sure that 
# jobs only run on one server.
role :task_server, "app_server1.example.com" 

namespace :background_task_server do

  task :setup, :roles => :task_server do
    run "mkdir -p /var/run/task_server" 
    run "chown #{user}:#{group} /var/run/task_server" 
  end

  # start background task server
  task :start, :roles => :task_server do
    run "#{current_path}/script/task_server_control.rb start -- -e production" 
  end

  # stop background task server
  task :stop, :roles => :task_server do
    run "#{current_path}/script/task_server_control.rb stop -- -e production" 
  end

  # restart background task server
  task :restart, :roles => :task_server do
    # TODO: restart won't start a server that isn't already running. We
    # could check the output of the status command first, and if it returns:
    #    task_server.rb: no instances running
    # simply issue the start command instead.
    run "#{current_path}/script/task_server_control.rb restart -- -e production" 
  end

end

# optional:
# after "deploy", "background_task_server:restart" 


Note the use of the task_server role, which lets you designate a single app server as your task server (if you’re running on multiple servers).
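
The TODO in the restart task could be approached along these lines. This is only a sketch: control_command_for is a hypothetical helper, the "no instances running" text is taken from the TODO comment itself, and the second sample string is purely illustrative, so check both against what your installed daemons version actually prints.

```ruby
# Hypothetical helper for the restart TODO: pick "start" when the status
# output reports no running instance, and "restart" otherwise.
# The message text comes from the TODO comment in the Capistrano recipe;
# verify it against your daemons version before relying on it.
def control_command_for(status_output)
  status_output.include?("no instances running") ? "start" : "restart"
end

puts control_command_for("task_server.rb: no instances running")  # start
puts control_command_for("task_server.rb: running [pid 1234]")    # restart
```

Inside the Capistrano task, the status output would first have to be fetched from the server (Capistrano 2's capture should be able to do this), and the chosen command would then be passed to task_server_control.rb in place of the hard-coded restart.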

And now, because I’m feeling generous, let’s set monit up to monitor your task server, so that if it ever goes down for some strange reason, monit should boot it back up (this also ensures that restarts will boot your task server back up):


# /etc/monit.d/task_server.conf

check process task-server with pidfile /var/run/task_server/task_server.rb.pid
  group task-server
  start program = "/usr/bin/ruby /var/www/apps/example/current/script/task_server_control.rb start -- --environment=production" 
  stop program  = "/usr/bin/ruby /var/www/apps/example/current/script/task_server_control.rb stop -- --environment=production" 

That’s it! If there’s any interest in forming a plugin around this with generators to create the migration and models, I’ll give it a go and stick it on github.

Feedback appreciated!

Comments

  1. Charles Brian Quinn said about 14 hours later:

    The approach described above processes all pending tasks sequentially each time it wakes up. This means that if the wake-up period is low (every 60 seconds in the example code above) and you have a lot of 60-second interval jobs, they’ll be processed one after another.

    This ends up working well in that you are assured you never have two of the same jobs running at the same time. Imagine running a job that sends out email notifications: it ensures that you don’t ever duplicate those notifications if you have to select or find all those that need notifications as part of the job.

    That also means, though, that this processor may not be the best fit if you have several potentially long-running tasks.

    This background processor is not good for very time-sensitive tasks, i.e. something that has to run exactly every 30 seconds or right on the :00 and :30 marks. It attempts to run the tasks as close to 30 seconds apart as possible (assuming your interval is set low enough), but generally it is better to think about the interval as the minimum amount of time between runs: a task with an interval of 30 seconds will not run if it has been run in the last 30 seconds.

    Other than that, for some real Scout results, check out our blog post by Derek on our sister blog entitled How We Handle Background Jobs: http://blog.scoutapp.com/articles/2008/05/22/how-we-handle-background-jobs

  2. Tammer Saleh said about 20 hours later:

    It seems like a lock column would be a good idea for the jobs table – set to true when starting the job, false when done, and don’t run the job again if it’s set.

  3. aleco said 1 day later:

    Thanks a lot for this article. I’m closely watching the various attempts at daemonizing processes (there are a few projects on github, such as daemon_generator, rufus-scheduler, workling or background-fu), and they all seem to have their advantages and disadvantages.

    As for your solution, there are two things I’d like to see (btw, I like your approach of using monit instead of another ruby process like daemon_generator does):

    1) scheduled tasks, similar to cronjobs (rufus can do cronjobs btw). An example: I have a small app that polls latest stock data from finance.yahoo.com. So I’d like to create a job that runs monday to friday 6pm. And if for some reason the stock data isn’t available yet, I’d like the daemon to run new attempts every 30min until the data is finally available. Besides that, there’s a task I run each night at 2am to do some calculations and send some daily emails. Etc.

    My current solution is to run a daemon every 30sec and then check which time and day it is, to then decide if the daemon goes to sleep again or if it should do something. I don’t like that approach :)

    ...but this leads to problem #2:

    2) tasks are saved in the DB. I’m unsure if I like that idea. I’d prefer to e.g. have a single yaml file for recurring tasks that contains all information about the task, aka when to run it (e.g. every 30min between 6pm and 10pm monday-friday) and which method to call. Otherwise, how would you ever delete a job without opening a sql console?

    Besides that, and maybe I’m missing something, but I still wonder why rails has no simple way of listening to ‘events’. Aka methods can send events, and others listen to these events and act accordingly. This feels much dryer than keeping class methods public only because we need to trigger them from e.g. a daemon process.
