Monday, August 15, 2011

Replace DelayedJob with Resque

Background tasks in Ruby on Rails

One of our tasks was to migrate background jobs from the Delayed_job server to the Resque server, but I had kept it at low priority in my task list.

Since last week, however, our IT team has been facing an issue with the Delayed_job server: it crashes 20 times in 24 hours.

I haven't done the root cause analysis, but I decided to take this as an opportunity to replace delayed_job with Resque on high priority.

This does not mean that Resque is better than Delayed_job. Each has its own features, and both do the same basic job.

Resque fits our requirements best, so I chose it in place of Delayed_job.

Also, our IT team had no way to get the list of background tasks that are completed, pending, or currently in process.

Why Resque server:

The following comparison from https://github.com/defunkt/resque is sufficient:

Resque vs DelayedJob

How does Resque compare to DelayedJob, and why would you choose one over the other?

  • Resque supports multiple queues
  • DelayedJob supports finer grained priorities
  • Resque workers are resilient to memory leaks / bloat
  • DelayedJob workers are extremely simple and easy to modify
  • Resque requires Redis
  • DelayedJob requires ActiveRecord
  • Resque can only place JSONable Ruby objects on a queue as arguments
  • DelayedJob can place any Ruby object on its queue as arguments
  • Resque includes a Sinatra app for monitoring what's going on
  • DelayedJob can be queried from within your Rails app if you want to add an interface

If you're doing Rails development, you already have a database and ActiveRecord. DelayedJob is super easy to setup and works great. GitHub used it for many months to process almost 200 million jobs.

Choose Resque if:

  • You need multiple queues
  • You don't care / dislike numeric priorities
  • You don't need to persist every Ruby object ever
  • You have potentially huge queues
  • You want to see what's going on
  • You expect a lot of failure / chaos
  • You can setup Redis
  • You're not running short on RAM

Choose DelayedJob if:

  • You like numeric priorities
  • You're not doing a gigantic amount of jobs each day
  • Your queue stays small and nimble
  • There is not a lot failure / chaos
  • You want to easily throw anything on the queue
  • You don't want to setup Redis

In no way is Resque a "better" DelayedJob, so make sure you pick the tool that's best for your app.

Action:

I did this task as a quick solution to a production issue, so you might use it too if you have the same issue.

I did this on one of our projects, which runs on Rails 2.3.5 with REE 1.8.7, but it should work the same way on Rails 3.

Before diving into the migration, I highly encourage you to check out Railscasts 171 (delayed_job) and Railscasts 271 (Resque).

1. Install Redis server

Redis is a dependency of Resque. Just as Delayed_job uses the database (the delayed_jobs table), Resque uses a Redis server.

I am using Ubuntu, so:

> sudo apt-get install redis-server

If you want the latest version of Redis, you can install it manually.

2. Install resque gem

If you are using Bundler, add the gem to your Gemfile; otherwise add it to config/environment.rb, depending on your application.

3. Load Resque server tasks

Add a file called lib/tasks/resque.rake:

# loads all rake tasks of resque
require 'resque/tasks'

# the following statement is required only if your background tasks need the Rails environment; otherwise skip it
task "resque:setup" => :environment

4. Replace your Delayed_job code with Resque

A.

Find the places in the code where you or your developers have used delayed_job.

To do this, simply search your code for "send_later" or "Delayed::Job.enqueue".

Use your development IDE or the grep tool from the command line:

> cd your_project

> grep -rn "send_later" .

and

> grep -rn "Delayed::Job.enqueue" .

B.

From the results above you will be able to find every location where delayed_job is used.
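If grep is not at hand (for example on a Windows development machine), the same search can be sketched in plain Ruby. This is only an illustration: the method name find_delayed_job_calls, the constant name and the list of file extensions are my own assumptions, so adjust them to your project.

```ruby
# Patterns that indicate a delayed_job call site (the same two strings
# that the grep commands in step A search for).
DELAYED_JOB_PATTERN = /send_later|Delayed::Job\.enqueue/

# Hypothetical helper: returns [path, line_number, line] triples for every
# matching line in the .rb, .rake and .erb files under root.
def find_delayed_job_calls(root)
  matches = []
  Dir.glob(File.join(root, '**', '*.{rb,rake,erb}')).sort.each do |path|
    File.foreach(path).with_index(1) do |line, number|
      # scrub replaces invalid bytes so that odd encodings do not raise
      matches << [path, number, line.strip] if line.scrub =~ DELAYED_JOB_PATTERN
    end
  end
  matches
end

# Usage (run from the project root):
#   find_delayed_job_calls('.').each { |path, n, line| puts "#{path}:#{n}: #{line}" }
```

Unlike grep -rn, this sketch only looks at the listed file types, so it skips logs and other non-source files.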

If you are using SomeModel.send_later(:function_name, parameter1, parameter2, ..., parameter_n),

replace the line like this:

#SomeModel.send_later(:function_name, parameter1, parameter2, ..., parameter_n)
Resque.enqueue(NameOfModelOrClass, parameter1, parameter2, ..., parameter_n)

NameOfModelOrClass is defined in the following step.

Use either of the following options:

1. Modify the respective model

Add the following line to the model:

@queue = :name_of_queue

The queue name can be anything you want. You can also use queue names to separate background tasks logically.

Modify function_name as follows:

def self.function_name(parameter1, parameter2, ..., parameter_n)
  # Definition of the function
end

to,

#def self.function_name(parameter1, parameter2, ..., parameter_n)
def self.perform(parameter1, parameter2, ..., parameter_n)
  # Definition of the function
end

2. Separate the code from the model (the option I preferred)

Define a class in the lib folder (or any location you want):

class FileImportWorker
  @queue = :name_of_the_queue

  # perform must accept the same parameters that are passed to Resque.enqueue
  def self.perform(parameter1, parameter2, ..., parameter_n)
    # cut the definition of function_name and paste it here
  end
end

Pass this class name to the Resque.enqueue method.
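To make the relation between Resque.enqueue and self.perform easier to see, here is a minimal sketch that runs without Redis. FakeResque is a hypothetical stand-in that I wrote for illustration, not Resque's real implementation: it only mimics the convention that the job is stored as a JSON payload with the class name and the arguments, and that a worker later calls perform on that class. The file path, the user id and the imported helper are made up as well.

```ruby
require 'json'

# Hypothetical stand-in for Resque, for illustration only: jobs are kept in
# an in-memory Hash instead of Redis.
module FakeResque
  @queues = Hash.new { |hash, name| hash[name] = [] }

  # Mirrors the calling convention Resque.enqueue(JobClass, *args).
  def self.enqueue(job_class, *args)
    queue_name = job_class.instance_variable_get(:@queue)
    raise ArgumentError, "#{job_class} does not define @queue" if queue_name.nil?

    @queues[queue_name] << JSON.generate('class' => job_class.name, 'args' => args)
  end

  # Takes one job off the queue, decodes it and calls JobClass.perform(*args).
  # Returns false when the queue is empty.
  def self.work_one(queue_name)
    payload = @queues[queue_name].shift
    return false if payload.nil?

    job = JSON.parse(payload)
    Object.const_get(job['class']).perform(*job['args'])
    true
  end
end

class FileImportWorker
  @queue = :file_import

  # Records the calls so that the effect of perform can be inspected.
  def self.imported
    @imported ||= []
  end

  def self.perform(path, user_id)
    imported << [path, user_id]
  end
end

FakeResque.enqueue(FileImportWorker, '/tmp/data.csv', 42)
FakeResque.work_one(:file_import)
```

After the last line, FileImportWorker.imported holds the path and the user id that were enqueued, which shows that the parameters given to enqueue arrive unchanged as the parameters of perform as long as they are simple values.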

C.

If you are using Delayed::Job.enqueue(ClassOfJob.new(parameter1, parameter2, ..., parameter_n)), replace it like this:

#Delayed::Job.enqueue(ClassOfJob.new(parameter1, parameter2, ..., parameter_n))
Resque.enqueue(ClassOfJob, parameter1, parameter2, ..., parameter_n)

e.g.

class NewsletterJob < Struct.new(:text, :emails)
  def perform
    emails.each { |e| NewsletterMailer.deliver_text_to_email(text, e) }
  end
end
Delayed::Job.enqueue NewsletterJob.new('lorem ipsum...', Customers.find(:all).collect(&:email))

to,

class NewsletterJob
  @queue = :name_of_the_queue

  def self.perform(text, emails)
    emails.each { |e| NewsletterMailer.deliver_text_to_email(text, e) }
  end
end
Resque.enqueue(NewsletterJob, 'lorem ipsum...', Customers.find(:all).collect(&:email))
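If you want to convince yourself that the two versions do the same work, you can run both styles side by side without delayed_job, Resque or a real mailer. The sketch below is only an illustration: FakeNewsletterMailer, LegacyNewsletterJob, the queue name and the e-mail addresses are hypothetical names I made up, and perform is called directly instead of going through a queue.

```ruby
# Hypothetical mailer stub that records deliveries instead of sending mail.
class FakeNewsletterMailer
  def self.deliveries
    @deliveries ||= []
  end

  def self.deliver_text_to_email(text, email)
    deliveries << [email, text]
  end
end

# delayed_job style: the arguments are stored in the job object and
# perform is an instance method.
LegacyNewsletterJob = Struct.new(:text, :emails) do
  def perform
    emails.each { |e| FakeNewsletterMailer.deliver_text_to_email(text, e) }
  end
end

# Resque style: the class names its queue and perform is a class method
# that receives the arguments directly.
class NewsletterJob
  @queue = :newsletters

  def self.perform(text, emails)
    emails.each { |e| FakeNewsletterMailer.deliver_text_to_email(text, e) }
  end
end

emails = ['a@example.com', 'b@example.com']

LegacyNewsletterJob.new('lorem ipsum...', emails).perform
legacy_deliveries = FakeNewsletterMailer.deliveries.dup

FakeNewsletterMailer.deliveries.clear
NewsletterJob.perform('lorem ipsum...', emails)
resque_deliveries = FakeNewsletterMailer.deliveries.dup
```

Both runs record the same two deliveries, so the only thing that changes in the migration is where the arguments live: in the Struct for delayed_job, in the parameter list of self.perform for Resque.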

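5. Now start your Resque worker

You can start a worker using the following command:

rake resque:work QUEUE=* COUNT=1

QUEUE=* : specifies which queue to work on; * means all queues.

COUNT=1 : number of worker processes. Note that COUNT is read by the resque:workers task; resque:work always starts a single worker, so use rake resque:workers if you need more than one.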

6. Access the queue

Now you can run your code and look at the pending and in-progress tasks. To do this, run the following command:

resque-web

It will start the Resque web interface on port 5678:

http://localhost:5678

Bonus Point:

1. Mount the Resque web interface into your application

Instead of accessing the Resque web interface at localhost:5678, you can mount it into your application.

In Rails 3, add the following line to your config/routes.rb (resque/server has to be required first, for example in the Gemfile entry or in an initializer):

mount Resque::Server, :at => "/resque"

Now access it via http://your_server/resque

You can also add basic authentication to it.

Add a file called config/initializers/resque_auth.rb with the following code:

Resque::Server.use(Rack::Auth::Basic) do |user, password|
  password == "secret"
end

2. You can group the background task classes (which we defined in the lib folder) into a single folder

As done in the Railscast, you can add a folder app/workers and keep all your task files in it,

or

do what I did and add a folder lib/workers.

You only need to load this folder into your application.

In Rails 2,

add a file config/initializers/load_workers.rb with the following code:

Dir["#{RAILS_ROOT}/lib/workers/*.rb"].each { |f| require(f) }

In rails 3,

add the following line to config/application.rb:

config.autoload_paths += %W(#{config.root}/lib/workers)

Troubleshooting:

1. Resque database connection issue

We use PostgreSQL in our portal. When I pushed my code to the testing server, which uses the same configuration as production (Nginx + Passenger + REE), Resque was not working.

Looking into the resque-web interface, I found the following error:

PGError: server closed the connection unexpectedly

I solved it by adding the following code to lib/tasks/resque.rake:

task "resque:setup" => :environment do
  ENV['QUEUE'] = '*'

  Resque.after_fork do |job|
    ActiveRecord::Base.establish_connection
  end

end

desc "Alias for resque:work (To run workers on Heroku)"
task "jobs:work" => "resque:work"

You can find this solution at the following sources:

RAILSCASTS-271(Resque)

http://stackoverflow.com/questions/2611747/rails-resque-workers-fail-with-pgerror-server-closed-the-connection-unexpectedly

I didn't find time to do a root cause analysis for this issue, but I think it is due to Passenger, as it worked well on my development machine in production mode.

2. Sequence issue

I forgot the exact error, but I faced a sequence-related error. The solution was not to pass complex objects, such as ActiveRecord objects or file data, in the call to Resque.enqueue.

3. undefined method `******' for #Hash:000

This happens if you store a complex object (such as an ActiveRecord object) in Resque. Resque serializes job arguments as JSON into Redis, which is a key-value store, so the worker gets the object back as a plain Hash.

To tackle this, simply pass the object's ID as the parameter, and in the background processing code load the original object again using that ID.
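The effect is easy to reproduce with Ruby's standard json library alone. In the sketch below, Customer is a hypothetical in-memory class standing in for an ActiveRecord model, its to_json is my assumption of how such an object serializes to its attributes, and round_trip only simulates the encode and decode steps that job arguments go through; none of this is Resque's own code.

```ruby
require 'json'

# Hypothetical in-memory model standing in for an ActiveRecord class.
class Customer
  attr_reader :id, :email

  def initialize(id, email)
    @id = id
    @email = email
  end

  # Assumption: the object serializes to a JSON object of its attributes.
  def to_json(*args)
    { 'id' => id, 'email' => email }.to_json(*args)
  end

  def self.records
    @records ||= {}
  end

  def self.create(id, email)
    records[id] = new(id, email)
  end

  def self.find(id)
    records.fetch(id)
  end
end

# Simulates the trip through the queue: arguments are encoded as JSON when
# the job is enqueued and decoded again before perform is called.
def round_trip(*args)
  JSON.parse(JSON.generate(args))
end

customer = Customer.create(7, 'jane@example.com')

# Passing the object itself: the worker side receives a plain Hash, so
# calling received_object.email would raise NoMethodError.
received_object = round_trip(customer).first

# Passing only the ID: the worker side can load the real object again.
received_id = round_trip(customer.id).first
reloaded = Customer.find(received_id)
```

Here received_object is a Hash with string keys and no email method, while reloaded is a real Customer again. Passing the ID has one more benefit: the worker reads the record when the job actually runs, so it sees the current data rather than a copy taken at enqueue time.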
