Post-mortem: 10 years in the vertical - Part 2

Underjord is a tiny, wholesome team doing Elixir consulting and contract work. If you like the writing you should really try the code. See our services for more information.

This continues our dive from part 1.

Content warning: Contains poor technical decisions, inexperience and stories of a developer just starting out and up to roughly present day. Be kind to who I was, he gets enough shit from me.

The better system

The team that had formed inside our little division of this larger organisation ended up breaking out to run an agency startup. Apps and web development would be our game. And it was. I did not quite know what a REST API was, every problem could be solved with Drupal. But damn if I wasn’t on the right side of the curve to not know how much I did not know.

So I was the tech lead and most senior web dev. I had my trusty guy from the cheese factory by my side. We had a young, enthusiastic designer who was still a goddamn delight of a person last time I saw him. We had a somewhat grizzled veteran of solo entrepreneurship running the ship and doing all the sales, project management, marketing. A lot.

We also brought on a newly schooled app developer who worked on apps for this system, remotely.

There were a bunch of things going on and a lot of craziness throughout. But to the product point, we got tasked with building a modernized version of the system we had built, with mobile apps and video support. This time for a tech-focused company in the education sector.

I am not sure I have ever pulled a more finicky tooth of technology than the Video Field module that was available for Drupal 6. FFMPEG conversion commands, presets that didn’t work, cron-jobs and JS libraries for display. Abstraction layers for days. It was a whole thing. Today I think I’d pick that thing up quite quickly. But at the time it was absurdly complex.
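To give a feel for what one of those conversion commands involves, here is a minimal sketch that only builds an FFmpeg command line. It is not what the Video Field module generated: the helper name, the paths, and the choice of H.264/AAC in an MP4 container are all assumptions made for illustration.

```python
import shlex
from pathlib import Path


def build_ffmpeg_command(source: Path, target: Path,
                         crf: int = 23, preset: str = "medium") -> list[str]:
    """Build (but do not run) an FFmpeg command for a web-friendly MP4.

    Hypothetical helper: codecs and defaults are illustrative assumptions.
    """
    return [
        "ffmpeg",
        "-y",                       # overwrite the target if it already exists
        "-i", str(source),          # input file
        "-c:v", "libx264",          # H.264 video
        "-preset", preset,          # encoding speed vs. compression trade-off
        "-crf", str(crf),           # constant quality; lower means better quality
        "-c:a", "aac",              # AAC audio
        "-movflags", "+faststart",  # move metadata up front for progressive playback
        str(target),
    ]


# Example paths are made up for the sketch.
cmd = build_ffmpeg_command(Path("uploads/clip.mov"), Path("converted/clip.mp4"))
print(shlex.join(cmd))
```

Every one of those flags is a place where a preset can quietly stop working, which is roughly where the finickiness came from.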

During this work we started learning how to build things The Drupal Way, there were a lot of interesting flavors of Kool Aid. I won’t dismiss all of The Drupal Way, it has some very interesting properties. But I do not want to revisit that tech. It is all about managing whatever configuration and state you end up having in your database. Because anything that isn’t code ends up in the database.

If you know any Drupal from this era, this was CCK, Views, CTools, Pages. The ecosystem is honestly fascinating and powerful in an extremely odd way that barely transfers to anything else.

Agency life

As you might imagine from our relative inexperience in building services and being very new at building apps and APIs, the process was quite fraught. Our fledgling company had trouble balancing different client projects, and a bunch of delays and complications led to a hefty backlog, overtime, an expanding team and the company swelling like a balloon. The system had releases but keeping it moving forward and keeping the client happy was a rough process.

The client was honestly very reasonable and patient. But as a company we were struggling. Mostly because we couldn’t deliver and close things down. This was again largely due to our inexperience but also had to do with scoping projects, pitching with enough margin and generally saying no to things.

It was an extreme learning process for everyone involved. The product was launched, updated and put to use. Bugs and continuous delays aside it was a good product. It had great market fit. It was the first commercial system anyone ever heard of for the preschool segment in Sweden. It had charming design, it had native apps that allowed smartphones to be used for publishing to the system. Parents could see what their children were doing and receive push notifications.

We could never quite finish up our commitments to the client for this product, there was always something else. Always one more bug, always one more promised feature that hadn’t quite landed. Around this time our company folded because of the rapidly expanding payroll and continued limited cashflow.

You could say the way this business was run heavily influences how restrictive I am with my commitments when running Underjord. I am very particular about not committing to more than I can deliver on and my aim has always been to grow organically, in a sustainable fashion, if I am to grow it at all.

I quit just before the company folded actually. Not that it mattered. I got in writing that if I wished I could work for our client that ordered the product because I’d spent some time at their office towards the end and it seemed like a good team, good offices, good company.

Maintenance hell

Oh yes, the product was successful. Probably not profitable, I was not privy to those details at the time. But it was popular, sold well, was appreciated. It was also getting dog slow.

The “node” table grew huge. We had to raise the integer size on the primary key. And we did not have that many customers. In the Drupal 6 days this was the table which gathered every piece of content. In the end, most data aside from user profiles ended up in there.
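For a sense of the ceilings involved, here is a small sketch that computes the largest key value each MySQL integer type can hold. Which type the key started as and which one it was raised to is not stated here, so the table is illustrative only. The helper name is made up for the example.

```python
# Storage size in bytes of each MySQL integer column type.
MYSQL_INT_BYTES = {
    "TINYINT": 1,
    "SMALLINT": 2,
    "MEDIUMINT": 3,
    "INT": 4,
    "BIGINT": 8,
}


def max_id(type_name: str, unsigned: bool = True) -> int:
    """Largest value a key column of the given type can store."""
    bits = MYSQL_INT_BYTES[type_name] * 8
    # Signed types give up one bit to the sign.
    return 2 ** bits - 1 if unsigned else 2 ** (bits - 1) - 1


for name in MYSQL_INT_BYTES:
    print(f"{name:<10} signed max {max_id(name, unsigned=False):>26,}"
          f"   unsigned max {max_id(name):>26,}")
```

When every piece of content in the system shares one auto-incrementing key, that ceiling arrives sooner than the customer count alone would suggest.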

I ended up freelancing for the company and working on this product, massaging away bottlenecks, figuring out performance issues. Finding a new solution to file storage which was getting wild. I’ll have you know we stored Terabytes of media!

The system was well-liked overall but was often slow at this point. It was hard to maintain, run and develop. Drupal 6 with heaps of custom code and configuration on top of it did not give us a clean path forward. Meanwhile the company had another product in Drupal 7 which also had performance issues and a lot of unwanted complexity. So our team at this company turned our eyes to the horizon. It was time for a new system.

But first. Let’s review.

What did it do well?

It delivered something the customers definitely wanted and were excited about. It delivered on an organizational requirement handed down to them by the government, which mandated a certain systematic documentation of activities in preschool education. It also served a number of practical needs and desires, such as sending information out to guardians.

In some sense it was built in the simplest way we knew. It was never over-engineered.

It was simple and easy to use. It was an incredibly common piece of feedback about the system. The customers understood it. It modeled their organizations well, it made sense. It largely acted as expected. It was simple and that was the single best thing about it. We had untrained preschool teachers working as effective admins for their orgs. It mostly worked well.

What technical flaws contributed to its end?

So many. File management was done through Drupal 6 and PHP, including a maddening video conversion cron-job that was incredibly flaky. It all ran on one server, a physical one. Not bad in itself but scalability was definitely a concern. After a while files were moved to a mounted file server when adding disks became untenable.

VPSes were a thing, but we didn’t know much about them when we started, and there was precious little breathing room for making transitions and doing non-critical work.

Performance, caching, page loading performance, file serving. So much performance-related stuff. PHP opcode caching was one thing that helped. But there was so much complexity, global state gone wild and so much weirdness. We used a lot of Views and Pages which are Drupal modules that do rather cool things in some of the most complex ways I can imagine. They are incredibly flexible, powerful and hard to not shoot yourself in the foot with. It’s an odd ecosystem. I was really into it at the time. But also falling out of it rather quickly.

Our users were change-sensitive. I think we did manage a fairly successful overhaul of the admin backend. But in general, changing anything related to finding your way through the system was heavily disliked. I think that was a good take-away honestly. Don’t change things unless there is good reason.

The LAMP stack wasn’t the problem, Drupal 6 wasn’t the problem, file storage wasn’t the problem. But I will say that our pressure-cooked code-base made by inexperienced and stressed people in this combination of tools was an absolute liability. But it was also a successful and much loved product. Go figure.

What did you learn from working on it?

It don’t gotta be fancy.

We were early on the app side of things. We were fairly old-school on the web side, a major version back, which in Drupal terms is like being a major version of Debian back (or at least how Debian used to be, glacial).

A good fit for the customer matters a lot.

People genuinely loved this product. It was heart-warming. They could be pissed at it and about things we did. But we had so much positive feedback.

Friendly and helpful customer support is golden.

Our product was largely represented by one person. An upbeat guy with patience for days and a real enthusiasm for helping people figure things out. And when he couldn’t wrangle the issue I’d usually be involved, or one of the other occasional team members.

Fit your purpose and know your people.

I think this was an underserved group. Actually, I know it was. It still is to some extent. And they rallied around their solution. The product was only for preschools. It only prioritized preschools. It didn’t aspire to be anything for primary school or above. The preschools in Sweden run quite differently from the stages of schooling that come after it. They loved the platform for this. It was theirs.

On more personal skillsets. I learned how to run and not run projects. How to get things done under intense pressures as well as under more reasonable work-life balances. I learned how much more productive I can be when I’m not splitting myself over a large number of priorities.

On a technical level I picked up a lot of Apache, Nginx, PHP, Drupal 6, MySQL, some Redis and plenty of Linux admin. All the PHP optimization tips and tricks. I did weird things with .htaccess files. FFMPEG and video conversion. AJAX, which was hot back then and often really bad to work with in Drupal (AHAH, I remember you). I had learned Python in parallel but didn’t get a chance to use it. RabbitMQ for moving away from the video cronjob was a thing. So much random stuff.
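The move from a cron-job to a message queue is mostly a change in shape: instead of a scheduled script polling for unconverted files, a worker blocks on a queue and picks up each upload as it is published. The sketch below uses Python's standard-library queue as a stand-in for the broker. It is not RabbitMQ's API, and the function names and file names are assumptions made for illustration.

```python
import queue
import threading

jobs = queue.Queue()         # stand-in for a broker queue (assumption, not RabbitMQ)
converted: list[str] = []


def convert(path: str) -> str:
    """Placeholder for the real video conversion; only renames the extension."""
    return path.rsplit(".", 1)[0] + ".mp4"


def worker() -> None:
    """Consume jobs as they arrive instead of waking up on a schedule."""
    while True:
        path = jobs.get()    # blocks until a job is available
        if path is None:     # sentinel value: shut the worker down
            jobs.task_done()
            break
        converted.append(convert(path))
        jobs.task_done()


thread = threading.Thread(target=worker)
thread.start()

# The web request only publishes a job and returns immediately.
for upload in ["a.mov", "b.avi"]:
    jobs.put(upload)

jobs.put(None)               # ask the worker to stop once the queue drains
jobs.join()
thread.join()
print(converted)
```

With a single worker the jobs are handled in the order they were published, and a slow conversion no longer overlaps with the next scheduled run the way a cron-job can.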

I’m not certain it matters how I feel about the total experience. Because it has been incredibly influential on my work and what I do. It is base facts and part of the groundwork for everything I know now. And I have a warm place in my heart for the product and its users, a touch of shame for some of the technical outcomes and plenty of understanding for why we ended up where we did.

Next, we knew we needed to scale…

Part 3 is slated to arrive in two weeks. So join the newsletter or subscribe to the RSS and you’ll find out when it happens.
