
Non-Blocking Methods Ruby

Writing Ruby methods that return immediately on blocking operations

I'm going to go through some background first; feel free to skip to the meat and potatoes.


Background

I need to write a method that returns immediately (or nearly immediately) but continues to do work in the background.

I'm building a formatter for Cucumber. As a Cucumber test pack is executed, Cucumber sends events to its formatters, which can then do things with the results. I want to send a variety of HTTP requests when one of these events occurs. These might come in quick succession, and I don't want to slow the test pack down with network activity that can happen in the background.

I do want the background activity to complete though. I don't want the runtime to quit while there's still some work to do.

Async

The first thing I reach for is the Async gem. This is great and works like this:

require 'async'

# this is the normal flow of a program
puts 'normal procedural code'

Async do
  # This will start an event loop
  # but we're still synchronous at this point
  Async do
    # this Async block will run asynchronously with the other
  end

  Async do
    # this Async block will run asynchronously with the other
  end
end

This is fantastic, but all the asynchronous jobs have to sit inside an outer Async block. I have multiple entry points (one per Cucumber event) and I can't wrap the whole thing.

Threads

Making a new thread is easy:

Thread.new do
  # Some work here
end

But don't make too many. They're expensive, you can end up in context-switching hell, and if the main thread exits, any others you've created will die.

This means creating and managing thread pools, shunting work onto a thread-safe queue and making sure the main thread waits for the pool to drain before terminating.
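For illustration, the pool-and-queue approach could look roughly like this. It's a minimal sketch using only the standard library, not something I ended up using; BackgroundPool and its post and shutdown methods are names made up for this example, and the sleep stands in for an HTTP request.

```ruby
# BackgroundPool is a hypothetical name for this sketch, not a library class.
class BackgroundPool
  def initialize(size: 4)
    @queue = Queue.new # Queue is thread-safe
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop blocks until a job arrives. Once the queue is closed
        # and empty it returns nil, which ends the loop and the thread.
        while (job = @queue.pop)
          job.call
        end
      end
    end
  end

  # Returns immediately; the job runs later on a worker thread.
  def post(&job)
    @queue << job
    nil
  end

  # Stop accepting work and wait for the queued jobs to finish.
  def shutdown
    @queue.close
    @workers.each(&:join)
  end
end

pool = BackgroundPool.new(size: 2)
at_exit { pool.shutdown } # keep the process alive until the pool drains

5.times do |i|
  pool.post do
    sleep 0.1 # stand-in for an HTTP request
    puts "job #{i} done"
  end
end
puts 'Main thread carries on'
```

The at_exit hook is what stops the runtime quitting with work still queued. A real version would also need to decide what happens when a job raises, since an unhandled exception kills its worker thread.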

None of this is hard, but, urgh, really? Do I have to?

Ractors

Oh Ractors, you make my heart ache for all the possibilities you hold, but you break my heart by being completely unusable. I didn't bother trying.

Ractors don't work with shared state while Ruby programmers (including me) love a bit of it. We've all got classes and modules with instance variables all over the place (lovely config blocks) which will immediately make your Ractor go pop.

For the love of all that's good in the world, can we please have copy-on-write semantics for Ractors and class / module state?


Meat and potatoes

Fibers

Fibers are a low-ish level construct, much lighter weight than a Thread, and they run concurrently but cooperatively: a fiber must actively yield control and be resumed. However, Ruby 3.0 added a fiber scheduler interface that lets us write or use a scheduler to control fiber execution.
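To make "actively yield control and be resumed" concrete, here is a small sketch of driving a fiber by hand, with no scheduler involved. It uses only core Ruby; manual_fiber_demo is a made-up helper name for this example.

```ruby
# Returns the order in which the steps ran, so the hand-off is visible.
def manual_fiber_demo
  log = []
  fiber = Fiber.new do
    log << 'fiber: started'
    Fiber.yield # hand control back to whoever resumed us
    log << 'fiber: resumed'
  end

  fiber.resume # runs the fiber up to Fiber.yield
  log << 'main: doing other work'
  fiber.resume # continues after Fiber.yield until the block ends
  log
end

puts manual_fiber_demo
# fiber: started
# main: doing other work
# fiber: resumed
```

Without a scheduler, every one of those hand-offs is your responsibility. A scheduler takes over that job by resuming fibers when the operation they were waiting on is ready.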

A list of schedulers is available here. I'm going to use the Async gem's scheduler.

module API
  scheduler = Async::Scheduler.new
  Fiber.set_scheduler(scheduler)

  def self.perform
    Fiber.schedule do
      puts 'Making request...'
      req = HTTParty.get('http://localhost:9292')
      puts " \tDone with request #{req.body[0..30]}"
    end
  end
end

A call to API.perform will create and schedule the fiber which will run immediately. As soon as it hits the HTTParty.get, a blocking operation, the fiber will yield control and the method will return. This lets me start lots of fibers, all running asynchronously.

Putting a wrapper on the above:

require 'async/scheduler'
require 'httparty'
require_relative 'api'

puts 'Starting up...'
5.times do
  API.perform
end
puts 'Finished!'

And with a trivial Roda server running in another process which looks like this:

require 'roda'

class App < Roda
  route do |r|
    # GET / request
    r.root do
      sleep(rand(0.3..1)) # Simulate some work
      'Hello world!'
    end
  end
end

run App.freeze.app

My code produces this output:

Starting up...
Making request...
Making request...
Making request...
Making request...
Making request...
Finished!
 	Done with request Hello world!
 	Done with request Hello world!
 	Done with request Hello world!
 	Done with request Hello world!
 	Done with request Hello world!

You can see each fiber running as far as the puts and then returning control. The Finished! line comes from the main fiber, but the other fibers all run to completion asynchronously, because a scheduler set with Fiber.set_scheduler is run when its thread exits. Perfect, this is exactly what I want.


Bonus

The simple Roda server, fronted by Rackup, was able to cope with about 500 connections on my machine before refusing any additional connections. Swapping in the Falcon server, written by Samuel Williams, author of Async, let me push this much harder. In fact, I exhausted my machine's open file handles before maxing out the server.
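Each open socket uses a file descriptor, so the per-process descriptor limit is a likely ceiling for a test like this. A small sketch for checking that limit from Ruby on a Unix-like system (open_file_limits is a made-up helper name; the numbers it prints depend on the machine):

```ruby
# Soft and hard limits on open file descriptors for the current process.
# Process.getrlimit returns a [soft, hard] pair.
def open_file_limits
  soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
  { soft: soft, hard: hard }
end

limits = open_file_limits
puts "soft limit: #{limits[:soft]}, hard limit: #{limits[:hard]}"
```

The soft limit is the one that applies in practice; it can be raised as far as the hard limit, for example with ulimit -n in the shell before starting the client.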

With 2,000 fibers connecting to the web server, and each request pausing for between 0.3 and 1.0 seconds, overall timings were:

ruby main.rb  0.65s user 0.26s system 45% cpu 1.992 total

Under 2 seconds for two thousand requests. That's crazy for a single-threaded client and includes the YJIT startup cost.