Scripting with Elixir
2023-06-12
Underjord is a tiny, wholesome team doing Elixir consulting and contract work. If you like the writing you should really try the code. See our services for more information.
I was a Python developer for some time and one great joy of Python is that you have an expressive language that you can use for your serious apps as well as for your hacky little one-off script or bespoke pieces of automation. By expressive I mean that typing very little can give you a lot of progress towards your result. I’ve scripted a fair bit in Python. Now I do it with Elixir.
Overall I would recommend scripting in whatever language you are most comfortable in, as long as it is reasonably suited to scripting. A quick script benefits from a low barrier between thought and execution. Use what makes sense for you. I find Elixir surprisingly good as a scripting language ever since the introduction of Mix.install.
Elixir actually answers a question that Python, to my knowledge, does not. While you do need the language and its runtime installed, how about dependencies? In Python I usually installed requests globally, and maybe an AWS SDK library, because I’d need them eventually.
In Elixir each script can briefly define dependencies.
#!/usr/bin/env elixir
Mix.install([:req, :jason])
"https://api.github.com/users/lawik"
|> Req.get!()
|> IO.inspect()
The Mix.install part installs the top two things I need: an easy-to-use HTTP client (Erlang httpc works but is kinda clunky) and a JSON library. I could similarly pull in CSV parsers or what-have-you. In Python you have JSON and CSV built in but you generally do want a better HTTP client.
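Mix.install takes the same dependency tuples as a mix.exs file, so a script can also pin version requirements. A minimal sketch follows; the version numbers are illustrative assumptions, not something the examples in this post rely on.

```elixir
#!/usr/bin/env elixir
# The version requirements below are illustrative assumptions.
Mix.install([
  {:req, "~> 0.3"},
  {:jason, "~> 1.4"}
])

"https://api.github.com/users/lawik"
|> Req.get!()
# Print only the HTTP status code of the response
|> then(& &1.status)
|> IO.inspect(label: "status")
```

Pinning keeps a script from silently picking up a new major version the next time its dependencies are resolved.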
I have occasionally found Elixir a bit less convenient for nice sloppy dictionaries, since mutable state is less readily available. If I’m parsing a bunch of files or larger structures and want to build up convenient key/values for them, I often want to pull some key and value out of multiple nested loops. This usually leads to Enum.reduce in Elixir and it is less convenient. It totally works though. And if you want to be really gnarly you can do Process.put/2 and Process.get/2. I bet you could do it reasonably comfortably with an ETS table but I haven’t really used those while scripting. Yet.
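For the gnarly route, here is a rough sketch of both options: the process dictionary and an ETS table. The sample data and the {:index, id} key shape are hypothetical, made up for illustration.

```elixir
# Hypothetical sample data standing in for a decoded JSON file.
items = [
  %{"children" => [%{"id" => 1, "name" => "a"}, %{"id" => 2, "name" => "b"}]},
  %{"children" => [%{"id" => 3, "name" => "c"}]}
]

# Option 1: the process dictionary, mutable state scoped to the current process.
for item <- items, subitem <- item["children"] do
  Process.put({:index, subitem["id"]}, subitem)
end

from_process_dict = Process.get({:index, 2})

# Option 2: an ETS table, an in-memory key/value store owned by this process.
table = :ets.new(:index, [:set])

Enum.each(items, fn item ->
  Enum.each(item["children"], fn subitem ->
    :ets.insert(table, {subitem["id"], subitem})
  end)
end)

# :ets.lookup/2 returns a list of matching {key, value} tuples.
[{3, from_ets}] = :ets.lookup(table, 3)

IO.inspect(from_process_dict, label: "process dictionary")
IO.inspect(from_ets, label: "ets")
```

Both give you write-from-anywhere state inside nested loops, at the cost of hiding the data flow that a reduce makes explicit.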
Let’s compare.
#!/usr/bin/env python3
import json

index = {}
with open("file.json") as f:
    # json.load reads and parses the open file object
    for item in json.load(f):
        for subitem in item["children"]:
            index[subitem["id"]] = subitem
Versus
#!/usr/bin/env elixir
Mix.install([:jason])

"file.json"
|> File.read!()
|> Jason.decode!()
|> Enum.reduce(%{}, fn item, index ->
  Enum.reduce(item["children"], index, fn subitem, index ->
    Map.put(index, subitem["id"], subitem)
  end)
end)
It can get a bit involved in nested cases. Especially if you need to operate on multiple data structures and such. Lacking mutability means paying more attention to data structures and how they flow through functions.
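For this particular shape of problem, a for comprehension with multiple generators and the into: option is one alternative to nested reduces. A minimal sketch, using hypothetical in-memory data in place of the decoded file:

```elixir
# Hypothetical sample data standing in for the decoded contents of file.json.
items = [
  %{"children" => [%{"id" => 1, "name" => "a"}, %{"id" => 2, "name" => "b"}]},
  %{"children" => [%{"id" => 3, "name" => "c"}]}
]

# Each generator acts like one level of a nested loop;
# into: %{} collects the {key, value} tuples into a map.
index =
  for item <- items,
      subitem <- item["children"],
      into: %{} do
    {subitem["id"], subitem}
  end

IO.inspect(index)
```

This only covers the case where every iteration produces one key/value pair. Anything that needs to carry state between iterations still ends up as a reduce.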
Overall though I’ve also found that whenever I do something slightly time-consuming I can optimize it significantly by using a single Task.async_stream/3 which will effectively utilize all my cores as it works through whatever list I’m processing. Getting this equivalent thing going in Python is significantly more painful.
#!/usr/bin/env elixir
Mix.install([:req, :jason])

"https://api.github.com/users/lawik/repos"
|> Req.get!()
|> then(& &1.body)
# And now we get very parallel
|> Task.async_stream(fn %{"stargazers_url" => url} ->
  url
  |> Req.get!()
  |> then(& &1.body)
end)
|> Enum.flat_map(fn result ->
  case result do
    {:ok, stargazers} ->
      stargazers

    err ->
      IO.inspect(err, label: "Error fetching stargazers")
      []
  end
end)
|> Enum.uniq()
|> Enum.count()
|> IO.inspect()
When writing out this script and trying it I hit the GitHub API limit for unauthenticated calls on like my third or fourth run :)
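Task.async_stream/3 also takes options that are handy when the work is a remote call. The sketch below uses a hypothetical slow_double function in place of an HTTP request so it runs without network access; the option values are arbitrary choices for illustration.

```elixir
# Hypothetical stand-in for a slow call such as an HTTP request.
slow_double = fn n ->
  Process.sleep(100)
  n * 2
end

results =
  1..8
  |> Task.async_stream(slow_double,
    # Cap the number of tasks in flight; the default is the number of online schedulers.
    max_concurrency: 4,
    # Give each task up to 5 seconds (the default) before it counts as timed out.
    timeout: 5_000,
    # Emit {:exit, :timeout} for a slow task instead of exiting the calling process.
    on_timeout: :kill_task
  )
  |> Enum.flat_map(fn
    {:ok, value} -> [value]
    {:exit, _reason} -> []
  end)

IO.inspect(results)
```

Results come back in input order by default, so the pipeline reads the same as a sequential Enum.map even though the work runs concurrently.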
I also like that most of the Elixir syntax ends up being basic function calls. Instead of with open(...) as f to open a file I’ll call the File module and the read! function. The exclamation point (or bang) is a special convention. In most cases there are matching functions without that symbol and those generally return an {:ok, content} tuple on success or an error tuple on failure. The bang functions will raise an error on failure. These bang functions make for very convenient pipelines. If you need to operate on the failure case it is generally better to use the regular one and a case statement.
This is very Erlang “let it crash” but applied in a different context. By using them in a script we are saying that any failure to execute the function has no reasonable mitigation or that we don’t care to spend time defending against small deviations from expectation. Handling all conceivable possible outcomes of attempting to read a file is generally a waste of time and effort in a script that requires the file to exist. Just error out if it doesn’t.
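For the cases where the failure does matter, here is a small sketch of the non-bang File.read/1 with a case statement. The path and the fallback value are hypothetical, picked so the error branch runs.

```elixir
# Hypothetical path, chosen so that the error branch runs.
path = "definitely_missing_file.json"

content =
  case File.read(path) do
    {:ok, content} ->
      content

    {:error, reason} ->
      # reason is a POSIX error atom such as :enoent when the file does not exist.
      IO.puts("Could not read #{path}: #{:file.format_error(reason)}")
      # Fall back to an empty JSON list so the rest of the script can carry on.
      "[]"
  end

IO.inspect(content)
```

With File.read!/1 the same missing file would raise File.Error and stop the script, which is often exactly what you want.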
The first run can be a bit slow if Mix.install needs to pull down and compile the packages (just as the initial pip install would) but subsequent runs will be reasonably snappy.
If you want more examples of doing scripting in Elixir I suggest looking at Wojtek Mach’s Mix Install Examples repo. It has a lot of things. Wallaby for driving a browser. A single-file web server with Phoenix LiveView. Or why not write a NIF in C. Lots of fun stuff in there.
This is intentionally a shorter post. Hope you enjoyed it and it made you a bit curious. If you have thoughts, questions or anything of the sort you can reach me at lars@underjord.io or as @lawik@fosstodon.org on the Fediverse.