A wall too tall - Nerves & k3s

Short update on the general state of things. Pandemic quarantine in full swing. Me and mine are doing fine. Thankfully. We are staying at home awaiting a baby. I'm likely to be fairly sporadic for a few months. But I do intend to keep writing. Most of my blog posts are written to be useful in the longer term. If you want more in-the-moment writing, the newsletter is a more temporally anchored publication (signup further down, no pop-up).

With all this and finishing up client work for April I really haven't had much time to dive into the good Elixir stuff I want to provide here. But I stopped my client work from the 1st of May to have some time off before parental leave. So obviously I started experimenting. This is a story about giving up on something for the foreseeable future after an intense investigation.

The idea - A cheap but resilient distributed server infrastructure

From a discussion with some activist people on Twitter the question was raised why there aren't activist data centers for the established networks of grassroots political activism. The discussion centered on Swedish political movements, but I intend to center this post on the technical topic. If you have a genuine interest in my politics, get in touch, I'm not secretive. I started thinking about how to keep the investment cost low since I don't think most indie groups have the budget for an actual datacenter.

As you might know I really like the Nerves project (new website, all grown up). I find their way of doing IoT or Connected Devices a healthy change of pace. Add NervesHub for safe and secure OTA firmware updates. Nerves is a central pillar in what I wanted to achieve with this idea. Putting device reliability and contributor simplicity front and center.

So my thoughts went to the Raspberry Pi 4 which is the most recent iteration and a quite performant little machine. There are almost certainly better options for optimizing compute per dollar but Pis are ubiquitous and not particularly intimidating. And I'm pretty familiar with both Nerves and these devices.

So I figured that getting a device ready takes a Pi, a power supply, an SD card and a NervesKey chip with a private key burned into hardware, which allows an operations group to update firmware in the field through NervesHub. Build your system on this and you can ship it out to whoever wants to contribute bandwidth and power. Once installed at an appropriate location it can call back to NervesHub as it starts up. After that authentication we can use the secure channel to provide additional information to bootstrap whatever software process and configuration we want. But this is outside the scope of what Nerves provides out of the box. So what to do there?

I have a buddy in the consulting space, Robin Morero of Spoonish fame, who gets excited by devops, containers, deployment, automation, the finer points of scaling and high availability, all that stuff. He follows this world like I follow the Elixir ecosystem. So he offered me a few recommendations that I found sound. First, k3s as a lightweight Kubernetes distribution (Kubernetes is also known as k8s). It can be delivered as a single Go binary, around 40 MB in total. Then Rio, another Go binary, this time to provide a bunch of opinionated niceties on top of the Kubernetes concepts. We also discussed NAT, networking and some potential centralization issues there and he brought up RadVPN which seems super interesting.

Armed with this, and having called him up and asked roughly a thousand questions about the details and what to expect, I moved forward. I can dev the occasional ops but that's not what I know best. And this is precisely what he enjoys and excels at.

What would this give us if we can set up a Raspberry Pi with Nerves for underpinnings that basically goes directly into joining a lightweight Kubernetes cluster? It would allow us to push updates via Nerves, down to the OS-as-firmware level and remote management even if the device can't join the cluster. The hardware encryption chip should give a fair bit of safety and security in preventing random people running these devices from maliciously joining your cluster. It would at the very least make it non-trivial. Once joined to the cluster we can distribute application containers or whatever workloads we find necessary across these off-the-shelf hardware units. If an SD-card breaks, flash the firmware to a new one, nothing sensitive in the firmware. The thing starts back up and bootstraps via the hardware key.

This would make the buy-in to contribute to the server farm the cost of a Pi, case, power supply and the dirt cheap NervesKey chip. The trusted authorities that hold the crypto keys would have to bless the specific device before sending it out of course.

Note, I'm not trying to achieve decentralized, zero-trust infrastructure here. Because I want something that works (heh) in organizational structures that are already bootstrapped off of human relationships and a measure of trust. I'm trying to make something very flexible and resilient with commodity hardware which requires minimal knowledge to host on the contributor end. I wanted to try to verify if the idea works. So let's see where my explorations took me.

At this point I almost murdered my laptop to get Erlang + Homebrew to agree that OpenSSL exists on macOS Catalina. Fair enough, I start my Nerves project. I throw some ARM binaries into the project via the rootfs overlay folder as /cluster/k3s. Then I spend some time fiddling with SD cards, network cables and getting the Pi hooked up. Nerves did fine, my home network has too much NAT. I got the project up and running.

I run the binary from the iex prompt with a simple /cluster/k3s server command. It logs a bunch of errors about being on a read-only filesystem. Yeah. Makes sense. Nerves sets you up with a writable /root for your application and some ephemeral tmpfs storage for /tmp but the system partition and all of that stuff is read-only to avoid SD card corruption bricking your device. Cool, cool.

My first approach was to see if I could restrict where k3s checks for files, writes files, etc. The answer is, partly. There is a --data-dir option which, as far as I went, worked fine for the data directory. But it still expects to create configuration in /etc/rancher/k3s and do some other stuff in /var/lib/rancher/k3s.
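As a rough sketch, pointing the data directory at the writable /root could look like the following. The /root/k3s path is an assumption for illustration, not necessarily the path used in this experiment. The script only prints the command instead of running it, so it is safe to try on any machine.

```shell
#!/bin/sh
# Location of the k3s binary from the rootfs overlay, as described in the post.
K3S_BIN="${K3S_BIN:-/cluster/k3s}"
# Hypothetical data directory under the writable /root (an assumption).
K3S_DATA_DIR="${K3S_DATA_DIR:-/root/k3s}"

# Print the server invocation with the data directory redirected.
k3s_server_command() {
  printf '%s server --data-dir %s\n' "$K3S_BIN" "$K3S_DATA_DIR"
}

k3s_server_command
```

This only moves the data directory. The writes to /etc/rancher/k3s and /var/lib/rancher/k3s still need somewhere writable to land.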

Not entirely unreasonable. So I went for setting up separate partitions for just these folders. This meant making a custom version of the nerves_system_pi4 repo and adding a bunch of partition-making in fwup.conf and some more stuff in the erlinit config file. After figuring out some compilation errors for the system I could settle in for the long compile time and I got my "system" built so it could be used from my project. This would have worked, but I hit the limit in fwup that says I can't use more than 4 partitions with an MBR approach. Not a shocker, I just didn't think about it because I wasn't doing it in fdisk.

I had some other stuff to deal with from the logs as well, so to improve my odds of providing the environment that this would need I ripped most of the changes from Justin Schneck's Pi4 Docker system.

New approach, I made sure the directories it required were in place. And then I mounted tmpfs filesystems at those locations to allow k3s to write what it wanted. This seemed to work fine. I had to bump the size allocation a bit.
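A minimal sketch of that tmpfs approach, under a couple of assumptions: the 256m size is a made-up number (all that is recorded here is that the allocation needed bumping), and the directories already exist in the read-only root filesystem. It prints the mount commands rather than running them, since actually mounting requires root on the device.

```shell
#!/bin/sh
# The two directories k3s insisted on writing to, per the post.
K3S_DIRS="/etc/rancher/k3s /var/lib/rancher/k3s"
# Hypothetical size limit for each tmpfs (an assumption).
TMPFS_SIZE="${TMPFS_SIZE:-256m}"

# Print one tmpfs mount command per directory (dry run).
tmpfs_mount_commands() {
  for dir in $K3S_DIRS; do
    printf 'mount -t tmpfs -o size=%s tmpfs %s\n' "$TMPFS_SIZE" "$dir"
  done
}

tmpfs_mount_commands
```

Keep in mind that tmpfs lives in RAM, so whatever k3s writes there is gone after a reboot and counts against the memory of the Pi.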

Next problem. Cgroups. k3s seems to mostly expect systemd or System V. Nerves is neither afaik, it is a fairly minimal Buildroot setup. With Justin's Docker changes I did however seem to have most of the system details in place for cgroups and I figured out how to enable cpuacct specifically. I couldn't quite get k3s to absorb this information though and I think it expects some things that my system doesn't actually provide.
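For anyone picking this up, one quick way to see what the kernel reports is to read /proc/cgroups, which lists each cgroup v1 controller with an "enabled" column. The sketch below is only a diagnostic aid with hypothetical helper names. It assumes awk and grep are available on the device (for example through BusyBox) and says nothing about which controllers k3s actually requires.

```shell
#!/bin/sh
# List the cgroup controllers marked as enabled.
# $1 is a file in the /proc/cgroups format:
#   #subsys_name  hierarchy  num_cgroups  enabled
enabled_cgroups() {
  awk '!/^#/ && $4 == 1 { print $1 }' "$1"
}

# Succeed if controller $1 (for example cpuacct) is enabled in file $2.
has_cgroup() {
  enabled_cgroups "$2" | grep -qx "$1"
}

# On a Linux device, show what the running kernel reports.
if [ -r /proc/cgroups ]; then
  enabled_cgroups /proc/cgroups
fi
```

On the device, something like has_cgroup cpuacct /proc/cgroups would then tell you whether that specific controller shows up as enabled.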

That was pretty much where I stopped my dive, after spending some evening time with it for three days. Buildroot and Linux internals are at the edge of what I know. Customizing these cluster-y devops setups is also at the point where I need to do constant reading and research to even know what I'm doing. I think this is feasible. But it was fairly clear to me that this uphill struggle was starting to look like a wall. I'm sure I could work through it. But I can't justify sinking more time into it right now. There is plenty of lower-hanging fruit which I might actually be able to pick without developing new skillsets from scratch.

This was incredibly educational. Nerves was well-behaved. The issues I'm having are inherent in the mismatch between running k3s where it wasn't intended to run and the tightness of a Nerves system, which stays resilient by avoiding bloat and complexity. k3s seemed like a nice project. I only got to barely scratch Rio as part of testing on Raspbian and I didn't get into RadVPN at all. The Nerves channel was helpful as always. Thanks to Justin and Frank specifically, but I know a few others chimed in as well. Robin was very helpful and if you want to sanity-check the way you run your infrastructure, he's definitely one to talk to.

I don't think this idea is dead. But I need to set it aside for now and focus on more fruitful things. The system repo is here though you can basically start from a fresh pi4 fork. If you have questions about my thinking or want to pick it up and run with it, feel free :)

For me, it was nice to revisit Nerves as I haven't had an opportunity in quite a bit. It was good to improve my understanding of Kubernetes and that ecosystem, and also to try some alternative approaches. Overall I wanted to proof-of-concept this thing and it turned out there were a lot of finicky details to figure out to even try it properly. I guess the proof-of-concept might be better done from Raspbian just to verify. But I already know the Pi4 can cluster from "Will it cluster? k3s on your Raspberry Pi". So that part wasn't interesting.

So the lesson today is, sometimes things are not worth doing or stop being fun when they are supposed to. It's OK to quit things when they suck or to cut your losses and go do something that feels more worthwhile. Especially when the only reason you are doing the thing is for your own sake. I feel very adult about all of this ;)

Stay safe out there.

If you need to tell me what I should have done instead, or have actual questions, I'm available at lars@underjord.io and occasionally on Twitter where I'm @lawik.