Some people say “effect handlers are the new monads”. So I thought, before all these “Effect Handlers are like X” (for X in { Burritos, Containers, … }) blog posts come into existence, I contribute another point-of-view. This blog post does not really explain the theory behind algebraic effects and handlers. Neither does it explain why you should use them, or how. We just look at how effect handlers work operationally, by implementing them.
This post is a bit longer. If you are in a hurry, I recommend you jump to Step 2.
I do recommend the blog post Algebraic effects for the rest of us which explores another aspect of the topic.
I spend most of my awake time doing research on effect handlers (in particular in Scala and Java). Unsurprisingly, I am happy to see that effect handlers not only catch on in research, but also programmers in industry find them interesting. In particular, some developers at Facebook seem to use them as conceptual model for aspects of React.
Programming language researchers like to cut the semantics of a programming language in little pieces. Each one should be easy to explain. Understanding the language as a whole, should be understanding the little pieces. Turns out that’s already difficult with simple effects like state (that is, mutable variables, heap, etc.), exceptions, I/O (that is, files, console input, println, etc.). Like monads, algebraic effects are motivated by the fact that it is terribly hard to think about programs that use effects. But nothing of this will concern us here.
When reading about the topic, you will see different terminology: algebraic effects, effects, effect handlers, … What’s the difference really? In short: The word “effects” is very overloaded – don’t try to understand it; algebraic effects means thinking about effect operations (like “throw”, or “print”) as building blocks to do calculus with (as in, “algebraic” reasoning); effect handlers are a generalization of exception handlers (as also explained in this post). They are practically relevant to write modular programs. That’s what we are interested in.
First of all, it should be made clear that the purpose of this post is not to provide an industry strength implementation, but to give one that’s as simple as possible. The first thing we will do is to implement dynamically scoped variables.
What? Are you crazy? Lisp got dynamic scoping by accident. Isn’t that a bad thing?
Yes, and no. But let’s not go down this rabbit hole and instead focus on the coding. What we want to implement is the following. Given a function
def render
url = Eff.get :url
puts url
end
we want to be able to (locally) provide a context, such that url
is bound to a desired value.
For instance
Eff.with url: "http://github.com" do
render
end
Eff.with url: "http://b-studios.de" do
render
end
should print both urls. Alternatively, we could also pass the url as an additional argument to render
. However,
this way we would not only pass it to render
, but also to all methods that call render
and so forth.
That is,
def view
puts "... rendering ..."
render
end
Eff.with url: "http://github.com" do
view
end
should print
... rendering ...
http://github.com
Furthermore, it should always resolve to the closest binding. That is,
Eff.with url: "http://github.com" do
Eff.with url: "http://b-studios.de" do
render
end
end
should print "http://b-studios.de"
.
Maybe you already have an idea how we could implement this API? In fact,
this is not very difficult. We can just use a mutable field and carefully keep
track that it always contains the correct value. Eff.with
writes the value, Eff.get
reads it. That’s not how we gonna do it. Instead we are going to use fibers.
What? What? What’s the connection between dynamic scoping and fibers?
Bear with me, there is an immediate connection (that Daan Leijen at Microsoft Research and I explored in the new language design of the Koka language, explained here). It will also greatly help us in Step 2.
So, how do we start? First of all, we think of binding variables and looking them up as communication between separate processes (or fibers). Looking up a variable corresponds to sending a request to an outer process. The outer process can decide what to respond to the request. It determines what value the variable is bound to.
Enough talk, here is the implementation:
require 'fiber'
module Eff
def self.get(name)
Fiber.yield name
end
def self.with(bindings, &block)
Binder.new(bindings, &block).run(nil)
end
class Binder < Fiber
def initialize(bindings, &block)
@bindings = bindings
super(&block)
end
def run(arg)
# receive a request from the running program / fiber
req = resume(arg)
.
# done, nothing to do here - the request is actually the result
return req if not alive?
# we received an op
binding = @bindings[req]
if not binding.nil?
run(binding) # we got it, continue execution
else
run(Fiber.yield req) # forward request to an outer binder
end
end
end
end
Let us walk through this, step by step. Variable lookup with get
is yielding the fiber, sending the requested variable name:
Fiber.yield name
We also say the fiber is sending a request or sending an effect operation call.
The method get
assumes that the current code is executed in some fiber. We’ll get to that in a second. Yielding the fiber means suspending
execution and handing over control to the creator of the fiber. But what fiber?
The next snippet of code, we’ll look at is the implementation of with
.
def self.with(bindings, &block)
Binder.new(bindings, &block).run(nil)
end
Given some bindings, and a block these bindings should be valid in, we create a new Binder
and run it with run
.
We pass nil
, which is ignored.
In fact, a Binder
is just a fiber (hence, Binder < Fiber
). The bindings (like url: "http://..."
) are stored in an
equally named field.
Let us now look at the implementation of run
. Running the fiber for the first time, that is,
invoking run(nil)
will trigger a call to resume
. This
conceptually starts running the block, that we provided to the constructor.
If running the block results in a variable lookup (that is, a call to get :variable
), the execution of the fiber
will be suspended. In this case alive?
will return true
, and req
(short for “request”) will be the
name we tried to lookup (that is, :variable
). Otherwise it will return false
and the fiber completed its
execution. In this case req
will be our overall result.
If we find a binding (not binding.nil?
), we resume the execution of the fiber with run(binding)
.
Remember, the first thing run
does is invoking resume
. Otherwise, we don’t know the binding.
Maybe another, surrounding binder does? So we call Fiber.yield req
again.
This will return the resulting binding, which we pass to run
.
Forwarding to an outer binder is necessary to support something like:
Eff.with x: 1 do
Eff.with y: 2 do
puts (Eff.get :x) + (Eff.get :y)
end
end
#> 3
This sounds crazy complicated! Why go through all that trouble?
Well, dynamically scoped variables are very easy to implement. But that’s not all there is to effect handlers. Otherwise we wouldn’t call them like this. They can do more.
As a simple motivating example, let us assume we have a method async_get(url, &block)
that fetches the given
url
asynchronously and calls the block with the result. We have to call it like
def my_fun(&callback)
async_get "http://github.com" do |result|
# do something with the result
myResult = ...
callback.call(myResult)
end
end
and locate everything that should happen once the result is available inside of the block. This also
affects the callers of the methods like my_fun
– it’s infectious. This style of programming
is also called continuation passing style (or as I sometimes call it: “callback passing style”).
It is so confusing and verbose, that languages like JavaScript added language support for async
/ await
to relieve programmers.
As it turns out, we don’t need special support for async
/ await
but can use effect handlers for this.
Here is an example. First, we write a method ping
that uses the :http_get
effect:
def ping(url)
puts "Requesting #{url} ..."
res = Eff.send :http_get, url
status = res.code
puts "Response from #{url}: #{status}"
status
end
It sends an http-get request to print (and return) the status code. Note, that we did not provide a callback
to Eff.send
but assume it blocks and returns the value.
Now we write a very simple user program that uses ping
:
def ping_both
ping "http://b-studios.de"
ping "http://github.com"
end
To run it, we need to say what :http_get
means. Like with dynamic binding, we do this by wrapping the call to ping_both
into a binder:
Eff.with :http_get => lambda { |url, resume| async_get(url, &resume) } do
ping_both
end
Now that’s a bit special. Instead of binding :http_get
to a value like 1
or 2
above, we bind it to a lambda. The lambda receives two arguments: The original argument passed to Eff.send :http_get, url
and a block – the continuation. All the magic of passing around callbacks has been reduced to this one position. Neither the call to Eff.send :http_get, url
required a callback, nor the call to ping
, nor the call to ping_both
. That’s great!
So, how do we implement this? Turns out, we already did large parts of it in the previous step…
require 'fiber'
module Eff
# We just renamed `get` to `send`
def self.get(name)
send(name)
end
# That's new: sending operations with arguments
def self.send(name, *args)
Fiber.yield operation: name, arguments: args
end
def self.with(bindings, &block)
Binder.new(bindings, &block).run(nil)
end
class Binder < Fiber
def initialize(bindings, &block)
@bindings = bindings
super(&block)
end
def run(arg)
req = resume(arg)
return req if not alive?
# Now the request is a map with keys :operation, and :arguments
binding = @bindings[req[:operation]]
# is it something we can call?
if (binding.is_a? Proc) or (binding.is_a? Method)
binding.call(*req[:arguments], lambda { |res| self.run(res) })
# just as before
elsif not binding.nil?
run(binding)
else
run(Fiber.yield req)
end
end
end
end
There are two important changes, compared to the previous implementation:
send(name, args*)
now takes arguments in addition to the name. We are sending a
request to the surrounding binder. The request is a map with entries :operation
(e.g., :http_get
) and
:arguments
(e.g., url
).call
we now check, whether the requested operation resolves to something we can call.
(that is, a Proc
or method). If that’s not the case, we proceed as before. Nothing changed.All the important magic happens, if the binder resolves to something we can call:
binding.call(*req[:arguments], lambda { |res| self.run(res) })
There is a lot happening in this single line. Let us remember that binding
is something like:
lambda { |url, resume| async_get(url, &resume) }
Firstly, we forward the arguments that we received to the lambda. This way, the url http://b-studios.de
is
passed as parameter url
. Secondly, we pass a lambda as additional argument. It will be bound to resume
in
the lambda. This lambda is the callback (the continuation). The binder implementation can call it to resume
computation at the point where the effect operation was originally called.
If we do not call it in the binder, computation will abort there:
Eff.with :abort => lambda { |resume| nil } do
puts "first"
Eff.send :abort
puts "second"
end
puts "done"
will print
first
done
Getting access to the continuation is essential to express many more advanced use cases of effect handlers.
We have seen how to implement effect handlers in two steps. In this post, I purposefully referred to handlers as “binders” to make the connection to dynamically scoped variables.
The implementation itself has some flaws:
It will quickly run out of stack space since run
is recursive. Two calls to run
are tail-calls and can
easily be rewritten to a loop. However, the call to run
in the lambda is not and would require special treatment
to be stack-safe.
As far as I know, fibers in Ruby cannot be cloned. So we can only call resume
once. This rules out many interesting
use cases of effect handlers like probabilistic programming, parsers, backtracking search, and many more.
Feel free to “fork” the implementations and try to solve those problems :) Please let me know if you do