Supervision trees, an example in Elixir

So any time recently that I’ve gone looking for a good overview of supervision trees in Elixir I haven’t found what I want. I’m pretty sure I used to find some that covered making simple supervisors and workers without assuming you want a module for each Supervisor. I now believe those were following ye olde Supervisor.Spec, which had helpers for that. How to make a module-based Supervisor is in the docs, so I won’t be spending time on that.

Since that method was deprecated I figured it was time I bit the bullet, got comfortable with child specs and the way they work, and figured out whether I could avoid creating a module for normal use of a simple bog-standard Supervisor. Spoiler: I could.

This repo has all the code for what I built. So if we dive into lib/supervisor_sample/application.ex we find the following:

  # ..
  def start(_type, _args) do
    children = [
      worker(:root_worker),
      supervisor(
        :one_for_one,
        [
          worker(:worker_1),
          worker(:worker_2)
        ],
        name: :supervisor_1
      ),
      supervisor(
        :rest_for_one,
        [
          worker(:worker_3),
          worker(:worker_4),
          worker(:worker_5),
          supervisor(
            :one_for_one,
            [worker(:subworker_1)],
            name: :subsupervisor_1
          )
        ],
        name: :supervisor_2
      ),
      supervisor(
        :one_for_all,
        [
          worker(:worker_6),
          worker(:worker_7),
          worker(:worker_8)
        ],
        name: :supervisor_3
      ),
      worker(:transient_root_worker, :transient)
    ]

    # The root of the tree is a supervisor that runs everything we defined above
    opts = [strategy: :one_for_one, name: SupervisorSample.Supervisor]
    Supervisor.start_link(children, opts)
  end

  # ..

GitHub link

This is my supervision tree definition. It uses the utility functions supervisor and worker to make it easier to get an overview. Each call to one of these functions generates a child spec. Even with some experience using OTP I never really spent any time understanding child specs. I won’t go too deep into them here, since the docs linked above honestly cover them, but I’ll try to make it digestible in my own way.

This is what the supervisor child spec looks like if I call supervisor(:one_for_one, [], name: :my_supervisor):

%{
  id: -576460752303423326,
  start: {Supervisor, :start_link,
   [[], [strategy: :one_for_one, name: :my_supervisor]]}
}

So the :id is fairly arbitrary. I asked on Twitter and more knowledgeable people responded: the supervisor uses it to identify the child internally, for restarts and whatnot. It is also what you pass to functions like Supervisor.terminate_child/2 and Supervisor.restart_child/2, which is handy when doing interesting things with your supervisor implementation. Aside from needing to be unique within a supervisor, it is not important unless you have specific plans for it.
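To make the role of the :id a bit more concrete, here is a minimal sketch that addresses a child by its id. The Agent used as a stand-in worker and the id :counter are assumptions for illustration, not part of the sample project.

```elixir
# A stand-in child: an Agent holding 0, given the id :counter.
spec = %{id: :counter, start: {Agent, :start_link, [fn -> 0 end]}}
{:ok, sup} = Supervisor.start_link([spec], strategy: :one_for_one)

# The id is how the supervisor reports the child...
[{:counter, first_pid, :worker, _modules}] = Supervisor.which_children(sup)

# ...and how we point at it when stopping and restarting it by hand.
:ok = Supervisor.terminate_child(sup, :counter)
{:ok, second_pid} = Supervisor.restart_child(sup, :counter)

IO.inspect(first_pid != second_pid, label: "got a fresh process")
```

If you never call functions like these, a unique throwaway value for the id is enough.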

The :start key gives what we are actually starting. The format might be familiar to you: a tuple with a module atom, a function atom and a list of args to pass into the function. This matches the arguments of apply/3. In this case the args are a list of children and some options, because that is what Supervisor.start_link/2 takes.
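As a small sketch of that correspondence, the tuple can be taken apart and handed straight to apply/3. The empty child list and the variable names are only for illustration.

```elixir
# The same shape as the :start tuple in the child spec above.
{module, function, args} = {Supervisor, :start_link, [[], [strategy: :one_for_one]]}

# Roughly what a parent supervisor does with the tuple when it starts the child.
{:ok, pid} = apply(module, function, args)

IO.inspect(Process.alive?(pid), label: "supervisor running")
```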

That’s all that is necessary. The child spec docs cover the other options. We can check our worker example as well:

%{
  id: -576460752303423294,
  restart: :permanent,
  start: {SupervisorSample.Worker, :start_link,
   [[label: :my_worker, name: :my_worker]]}
}

Here we also use the :restart key because I have an example with :transient and so I set it explicitly.

These maps aren’t complicated to create and they shouldn’t be intimidating. But they are visually bulky and I think there are many ways of building the tree that could be done in a visually pleasing and less noisy way. That’s what I use the utility functions in the sample project for.
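For a rough idea of what such utility functions could look like, here is a minimal sketch that produces maps of the same shape as the two shown above. It is not copied from the repo: the module name SpecHelpers, the default arguments and the use of System.unique_integer/0 for the :id are all assumptions.

```elixir
defmodule SpecHelpers do
  # Hypothetical helper: a child spec for a plain Supervisor with the given
  # strategy and children. Extra options (such as :name) are passed through.
  def supervisor(strategy, children, opts \\ []) do
    %{
      id: System.unique_integer(),
      start: {Supervisor, :start_link, [children, [strategy: strategy] ++ opts]}
    }
  end

  # Hypothetical helper: a child spec for the sample worker module, using the
  # same atom as label and registered name. SupervisorSample.Worker is only
  # referenced here, so building the map works without the module loaded.
  def worker(name, restart \\ :permanent) do
    %{
      id: System.unique_integer(),
      restart: restart,
      start: {SupervisorSample.Worker, :start_link, [[label: name, name: name]]}
    }
  end
end

# Building a small piece of the tree; nothing is started here.
spec =
  SpecHelpers.supervisor(
    :one_for_one,
    [SpecHelpers.worker(:worker_1), SpecHelpers.worker(:worker_2, :transient)],
    name: :supervisor_1
  )

IO.inspect(spec)
```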

GenServer and child_spec/1

In many cases you don’t actually have to create the child spec yourself. Any module that calls use GenServer gets a default child_spec/1. So then we can reduce the above to {SupervisorSample.Worker, name: :my_worker}. Or without a name it could be SupervisorSample.Worker. Very clean. A bit of convention saving you a bunch of repetitive detail. But the Supervisor module doesn’t offer child_spec/1. It offers child_spec/2, which is used to modify child specs for modules that already have them, usually because you want to override something in the default child spec. Such as the :id, which defaults to the module name and so has to be overridden if you start several children from the same module under one supervisor.
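A minimal sketch of both pieces, the generated child_spec/1 and Supervisor.child_spec/2 for overriding the :id. DemoWorker is a hypothetical stand-in and not the worker from the sample project.

```elixir
defmodule DemoWorker do
  use GenServer

  # Hypothetical worker: only registers under the given name and keeps its opts.
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, Keyword.take(opts, [:name]))
  end

  @impl true
  def init(opts), do: {:ok, opts}
end

# `use GenServer` generated this function for us:
IO.inspect(DemoWorker.child_spec(name: :a))
# %{id: DemoWorker, start: {DemoWorker, :start_link, [[name: :a]]}}

# Two children from the same module need distinct ids, so override them:
children = [
  Supervisor.child_spec({DemoWorker, name: :a}, id: :a),
  Supervisor.child_spec({DemoWorker, name: :b}, id: :b)
]

{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)
```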

Most libraries you’d use where you need to start an instance of them as part of your supervision tree would already provide you with a child_spec. If they don’t, you can create one yourself quite easily just as we did for Supervisor.

Another thing you can do is create a module that provides a child_spec/1 for starting a supervisor, as detailed in this conversation on the forum. The code is partial, but I think the idea is complete. Then you could use that module instead of Supervisor.
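A minimal sketch of that idea follows. It is not the code from the forum thread: the module name PlainSupervisor and the :children option are assumptions made for illustration.

```elixir
defmodule PlainSupervisor do
  # Hypothetical module: it only provides a child spec, the stock Supervisor
  # module does the actual supervising.
  def child_spec(opts) do
    {children, sup_opts} = Keyword.pop(opts, :children, [])

    %{
      # Use the registered name as id so several of these can be siblings.
      id: Keyword.get(sup_opts, :name, __MODULE__),
      type: :supervisor,
      start: {Supervisor, :start_link, [children, sup_opts]}
    }
  end
end

tree = [
  {PlainSupervisor,
   name: :group_a,
   strategy: :one_for_one,
   children: [%{id: :agent_a, start: {Agent, :start_link, [fn -> :a end]}}]},
  {PlainSupervisor, name: :group_b, strategy: :one_for_all, children: []}
]

{:ok, root} = Supervisor.start_link(tree, strategy: :one_for_one)
IO.inspect(Supervisor.count_children(root))
```

With this in place the tree definition can use the same {Module, opts} tuples as any other child.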

The tests & strategies

So the supervision tree above showcases the different strategies available. It also shows that we can supervise a supervisor, which is how you build a bigger tree with processes that depend on one another.

To demonstrate how these work I’ll direct you to test/supervisor_sample_test.exs. Every test looks something like this:

  # ..
  test "restart root worker" do
    Worker.stop(:root_worker)

    # Should restart
    assert_receive {:stopped, :root_worker}
    assert_receive {:started, :root_worker}
    # Shouldn't restart anything else
    refute_received {:stopped, _}
    refute_received {:started, _}
  end
  # ..

They all follow the same model: tell a worker process to stop and assert that we receive the messages we expect. I expect this one to be stopped and then restarted. These messages are sent in lib/supervisor_sample/worker.ex and we register for listening in the setup hook in the test module.
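To show the general mechanism outside of ExUnit, here is a minimal sketch of a worker that reports when it starts and stops. It is not the worker from the repo: NotifyingWorker, the :listener option and the expect helper are assumptions, and the real project registers its listener differently.

```elixir
defmodule NotifyingWorker do
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: Keyword.fetch!(opts, :name))
  end

  # Ask the worker to stop itself; the supervisor then decides what happens.
  def stop(name), do: GenServer.cast(name, :stop)

  @impl true
  def init(opts) do
    # Trap exits so terminate/2 also runs when the supervisor shuts us down.
    Process.flag(:trap_exit, true)
    state = %{label: Keyword.fetch!(opts, :label), listener: Keyword.fetch!(opts, :listener)}
    send(state.listener, {:started, state.label})
    {:ok, state}
  end

  @impl true
  def handle_cast(:stop, state), do: {:stop, :shutdown, state}

  @impl true
  def terminate(_reason, state), do: send(state.listener, {:stopped, state.label})
end

# Wait for one specific message, failing loudly if it never arrives.
expect = fn message ->
  receive do
    ^message -> :ok
  after
    1_000 -> raise "did not receive #{inspect(message)}"
  end
end

child = {NotifyingWorker, name: :demo_worker, label: :demo_worker, listener: self()}
{:ok, _sup} = Supervisor.start_link([child], strategy: :one_for_one)

:ok = expect.({:started, :demo_worker})
NotifyingWorker.stop(:demo_worker)
# The default restart value is :permanent, so the worker comes back.
:ok = expect.({:stopped, :demo_worker})
:ok = expect.({:started, :demo_worker})
```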

The different strategies are succinctly explained in the Elixir docs. I’ll briefly restate it here:

  • :one_for_one, the supervisor restarts each child separately if it terminates. Only if the supervisor itself goes down does it affect the whole group.
  • :one_for_all, if a single child process terminates, the whole set of child processes will be restarted.
  • :rest_for_one, this one is interesting. If a child process terminates, that child and all the children started after it in the list will be restarted. This might seem odd but it is useful when later children depend on earlier ones.

So you can look at the tests to see examples of these behaviors.
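For a quick feel of :rest_for_one without the sample project, here is a minimal sketch using three Agents as stand-in workers. The StrategyDemo module, the ids :a, :b and :c, and the polling helper are assumptions for illustration.

```elixir
defmodule StrategyDemo do
  # Map of child id => current pid for the given supervisor.
  def child_pids(sup) do
    for {id, pid, _type, _modules} <- Supervisor.which_children(sup), into: %{}, do: {id, pid}
  end

  # Poll until the child with this id runs under a new pid.
  def await_restart(sup, id, old_pid) do
    case child_pids(sup) do
      %{^id => pid} when is_pid(pid) and pid != old_pid ->
        :ok

      _ ->
        Process.sleep(10)
        await_restart(sup, id, old_pid)
    end
  end
end

children =
  for id <- [:a, :b, :c] do
    %{id: id, start: {Agent, :start_link, [fn -> id end]}}
  end

{:ok, sup} = Supervisor.start_link(children, strategy: :rest_for_one)

before = StrategyDemo.child_pids(sup)
# Kill the middle child and wait for the supervisor to react.
Process.exit(before.b, :kill)
:ok = StrategyDemo.await_restart(sup, :b, before.b)
later = StrategyDemo.child_pids(sup)

IO.inspect(later.a == before.a, label: ":a kept its pid")
IO.inspect(later.b != before.b, label: ":b was restarted")
IO.inspect(later.c != before.c, label: ":c was restarted")
```

Changing the strategy to :one_for_one or :one_for_all in this sketch is an easy way to compare which pids change.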

This doesn’t really cover DynamicSupervisor at all, which has a lot of its own considerations. It is very useful and maybe I should cover it at some point. Alex Koutmos, one of my co-hosts on Elixir Mix, has a good piece on DynamicSupervisor.

I hope this is helpful to people. Thank you for your attention.

If you have questions, thoughts or more of a comment, really, you can find me on twitter {{ lars_twitter }} or reach me via email {{ lars_email }}.
