Say we’re writing a program that performs several time-consuming operations, like network requests, disk accesses, or complex calculations. We want them to execute concurrently, but we also want our code to remain simple and easy to understand. There are many different ways of approaching this problem, but in this post I’ll focus on implicit futures and how they can be implemented in Ruby.
This code demonstrates the type of scenario we’re dealing with:
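Something along these lines, where expensive_call_1 and friends are placeholder stand-ins for real slow operations (here they just sleep briefly):

```ruby
# Placeholder slow operations; imagine network requests, disk accesses,
# or heavy calculations.
def expensive_call_1; sleep(0.1); 1; end
def expensive_call_2; sleep(0.1); 2; end
def expensive_call_3; sleep(0.1); 3; end
def other_stuff; end

a = expensive_call_1   # blocks until finished
b = expensive_call_2   # doesn't even start until the previous call is done
c = expensive_call_3

other_stuff            # has to wait for all three, despite not needing them

puts a + b + c         # finally, use the results
```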
We perform one or more time-consuming operations, potentially do something else, then use the results of the operations.
In the code above, the time-consuming operations are executed sequentially. This can result in very poor performance (e.g. the CPU might be forced to sit idle while a network request completes). Also, the call to other_stuff won’t begin running until all three of the previous calls have finished, even though it doesn’t depend on their results.
This is the solution we’re working towards:
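Roughly like this. The placeholder operations are the same as before, and a minimal future helper (the one the rest of this post builds up to) is included up front so the sketch is self-contained:

```ruby
# A minimal future helper; deriving this is the subject of the post.
class Future
  def initialize(&block)
    @thread = Thread.new(&block)
  end

  def method_missing(name, *args, &block)
    @thread.value.send(name, *args, &block)
  end
end

def future(&block)
  Future.new(&block)
end

# Placeholder slow operations, as before.
def expensive_call_1; sleep(0.1); 1; end
def expensive_call_2; sleep(0.1); 2; end
def expensive_call_3; sleep(0.1); 3; end
def other_stuff; end

a = future { expensive_call_1 }   # returns immediately
b = future { expensive_call_2 }
c = future { expensive_call_3 }

other_stuff                       # runs while the expensive calls execute

puts [a, b, c].map(&:to_i).sum    # blocks here only if results aren't ready
```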
(If you’re unfamiliar with Ruby, foo { bar } is equivalent to foo(lambda: bar()) in Python.)
The expensive calls are now executed asynchronously, and other_stuff can begin running immediately. If the final line, which uses the results, is reached before all three are available, the main thread will automatically block and wait for them to finish.
Nice! How on Earth can this be implemented though…? At first glance, it looks like we’d need to modify the language itself. As we’ll see, there’s actually a much simpler approach.
Threads in Ruby
We’ll use Ruby’s Thread class to do most of the heavy lifting. Here’s a quick primer on it:
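A tiny example (the 2 + 2 is obviously a stand-in for real work):

```ruby
t = Thread.new { 2 + 2 }  # spawn a thread to run the block
t.value                   # => 4 (waits for the thread if it hasn't finished)
```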
Thread.new spawns a thread to execute the given code block, and t.value returns the thread’s result (waiting for it to finish if necessary).
Let’s rewrite our original code to use Thread:
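With the same placeholder operations as before, that looks something like this:

```ruby
# Placeholder slow operations, as before.
def expensive_call_1; sleep(0.1); 1; end
def expensive_call_2; sleep(0.1); 2; end
def expensive_call_3; sleep(0.1); 3; end
def other_stuff; end

a = Thread.new { expensive_call_1 }   # all three start straight away
b = Thread.new { expensive_call_2 }
c = Thread.new { expensive_call_3 }

other_stuff

puts a.value + b.value + c.value      # explicitly wait for each result
```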
This almost gets us where we want to be, but not quite.
Explicit vs implicit
In the code above, we have to explicitly retrieve the results by calling value on the futures/threads. This means they’re explicit futures rather than implicit ones.
This might not seem like much of an issue, but what if we wanted to pass one of the results on to another piece of code? We could either:
- Call value on the future, and pass along the result directly.
- Pass along the future, and let the other piece of code call value when it needs the result.
Neither of these options is very good. The first can result in suboptimal performance, because we’re calling value before we really need to (remember that value might block execution if the result isn’t ready yet). The second limits the reusability of the other piece of code, as it will now only be able to work with futures.
Implicit futures don’t have either of these issues. The code that uses them can remain blissfully ignorant of the fact that they’re futures, and no blocking will occur until the result is actually needed (i.e. a method is called on it).
Delegating method calls
To summarise what we’re trying to achieve, we want the future object to behave as though it were the result object. When we call a method on it, it should delegate the call to the result (blocking if necessary until the result becomes available).
It’s not immediately obvious how methods can be delegated like this. The future should be able to work with all types of result objects, so we don’t know in advance which methods need forwarding.
Ruby has a rather interesting feature called method_missing that comes in handy here. Calling a non-existent method would normally result in an error, but if we define a method called method_missing, Ruby will call that instead.
Here’s a quick demo:
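Something along these lines, using a deliberately useless class whose only behaviour is method_missing (the names foo and bar are arbitrary):

```ruby
class Useless
  def method_missing(name, *args)
    puts "Someone called '#{name}' on me"
    puts "They gave me these arguments: #{args.inspect}"
  end
end

u = Useless.new
u.foo
u.bar(1, 2)

# Output:
#   Someone called 'foo' on me
#   They gave me these arguments: []
#   Someone called 'bar' on me
#   They gave me these arguments: [1, 2]
```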
As the output shows, method_missing receives the name of the missing method along with any arguments that were passed.
Our future class can use method_missing to intercept method calls. However, it still needs to actually forward the intercepted methods to the result. This can be done using send, which lets us dynamically call any method on an object:
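A couple of quick examples (these particular calls are just illustrations):

```ruby
"hello".send(:upcase)   # => "HELLO"  (same as "hello".upcase)
10.send(:+, 5)          # => 15  (method name and arguments supplied at runtime)
```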
Putting it all together
We now have everything we need to implement implicit futures. We can define a Future class that:
- Uses Thread to:
  - Asynchronously compute the result.
  - Block execution if the result is requested before it’s ready.
- Uses method_missing and send to delegate all method calls to the result.
The code ends up being very simple:
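Something like this:

```ruby
class Future
  def initialize(&block)
    @thread = Thread.new(&block)   # start computing the result immediately
  end

  def method_missing(name, *args, &block)
    @thread.value.send(name, *args, &block)   # forward the call to the result
  end
end
```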
When a future is created, a thread is immediately spawned to execute the given code block. When a method is called on the future, it simply forwards it to @thread.value (which will block execution if the thread hasn’t finished running yet).
To make it slightly easier to use, we’ll define a standalone future function that creates and returns a Future object:
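Like so (with the Future class from above repeated so the snippet stands alone):

```ruby
# The Future class from above, repeated so this snippet runs on its own.
class Future
  def initialize(&block)
    @thread = Thread.new(&block)
  end

  def method_missing(name, *args, &block)
    @thread.value.send(name, *args, &block)
  end
end

def future(&block)
  Future.new(&block)
end
```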
We can now create futures like this (where some_expensive_call stands in for any slow operation):

f = future { some_expensive_call }
This looks exactly like what we set out to achieve! Might be a good idea to check if it works though…
Testing!
Let’s start with a quick sanity check. To mimic an expensive operation, I’ve placed a call to sleep in the code given to the future:
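Something like the following, with the Future implementation repeated so it runs standalone; the string being computed is an arbitrary placeholder:

```ruby
# Implicit-future plumbing from earlier, repeated for self-containment.
class Future
  def initialize(&block)
    @thread = Thread.new(&block)
  end

  def method_missing(name, *args, &block)
    @thread.value.send(name, *args, &block)
  end
end

def future(&block)
  Future.new(&block)
end

f = future do
  sleep(3)              # mimic an expensive operation
  "expensive result"
end

puts "Future created; main thread carries on immediately"
puts f.upcase           # blocks for ~3 seconds, then prints "EXPENSIVE RESULT"
```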
The call to f.upcase should cause the main thread to block for several seconds until the result is ready. If we run the code, this is exactly what we see happen.
It worked!
Now let’s try a more realistic example. We’ll download a bunch of webpages and record how long it takes, with and without futures:
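A self-contained sketch of the benchmark looks like this. To keep it runnable offline, download here just sleeps for a fixed time per page rather than making a real HTTP request (the timings quoted below came from fetching real webpages, e.g. via Net::HTTP), and the Future plumbing from earlier is repeated:

```ruby
# Implicit-future plumbing from earlier.
class Future
  def initialize(&block)
    @thread = Thread.new(&block)
  end

  def method_missing(name, *args, &block)
    @thread.value.send(name, *args, &block)
  end
end

def future(&block)
  Future.new(&block)
end

# Stand-in for fetching a webpage; a real version might use
# Net::HTTP.get(URI(url)). Sleeping keeps the sketch offline-friendly.
def download(page)
  sleep(0.3)
  "<html>contents of #{page}</html>"
end

PAGES = %w[page-1 page-2 page-3 page-4 page-5]

def measure
  start = Time.now
  yield
  Time.now - start
end

sequential = measure do
  PAGES.each { |page| download(page) }
end

concurrent = measure do
  results = PAGES.map { |page| future { download(page) } }
  results.each { |r| r.length }   # touching each result waits for it
end

puts "Without futures: #{sequential.round(2)}s"
puts "With futures:    #{concurrent.round(2)}s"
```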
And the results are…
13.8 seconds down to 2.6 seconds. Not bad!
Caveats
The Future class given above isn’t exactly production-ready. If you were using it in a real project, you’d want it to:
- Capture exceptions that occur during the thread’s execution and re-raise them when the result is accessed.
- Inherit from BasicObject instead of Object, so methods defined on Object are also delegated to the result.
- Use a thread pool to prevent the system from being overloaded by a large number of threads.
- Probably a bunch of other things I haven’t thought of.
A basic implementation in 9 lines of code isn’t too shabby though!
Conclusion
Futures are awesome. Ruby is awesome. The next time you need to deal with concurrency, consider whether this kind of approach could be beneficial.