The First Tenant: Moving This Blog onto the Cluster (Series: Part 4)

The previous three posts built a Kubernetes platform on Talos and Proxmox (control plane, CNI, storage, ingress, certificates, secrets, observability) and then had nothing to run on it. A platform with no workloads is just an expensive way to run health checks. So: the first real tenant.

There’s a small joke here. The workload I migrated first is the blog you’re reading — until recently a Docker Compose stack (WordPress and MariaDB on an LXC), now running on the cluster it documents. It was a good first candidate: self-contained, recoverable, and if I broke it I’d know immediately. Here’s the migration, and the three things that went wrong — none of them the things I’d planned for.

The plan, briefly

Unremarkable: a custom WordPress image (theme baked in) from a private registry, MariaDB as a single-replica StatefulSet on NFS, secrets pulled from Vault by the External Secrets Operator, an ingress with an automatic TLS cert. Dump the database, point DNS at the new front door, keep the old Docker stack as a rollback until I trusted the new one.

The standard part worked. The interesting part was everything the plan didn’t mention.

Gotcha 1: an immutable image needs the whole dependency tree

The point of baking the theme into the image is immutability — the container is reproducible, and nobody can quietly change it through the admin panel. So I copied my theme in and moved on.

My theme is a child theme. Its stylesheet declares one line I’d stopped reading: Template: ollie. A child theme overrides a few pieces of a parent and inherits the rest; without the parent it doesn’t degrade gracefully, it doesn’t load at all. And the site’s active theme wasn’t the child — it was the parent.

So my image held a theme that depended on a theme that wasn’t in it, while the actual active theme was the missing one. On the old Docker stack this never showed: the parent had been installed through the admin panel months earlier and persisted in a volume I wasn’t migrating. The dependency was real — just invisible, because it lived somewhere I’d stopped looking.

A code review caught it before deploy, reading the stylesheet header I’d skimmed past a dozen times. The fix was to vendor the exact parent into the build and bake both:

COPY theme/ollie            /usr/src/wordpress/wp-content/themes/ollie
COPY theme/clean-technical  /usr/src/wordpress/wp-content/themes/clean-technical

The lesson: when you make something immutable, you own its entire dependency tree, not just the parts you wrote. The pieces that used to be “already there” — installed by hand, persisted in a volume, assumed — are exactly the ones an immutable build will be missing, because nobody declared them anywhere.
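
A check like this can run before the image is built. The sketch below is not part of my pipeline; it assumes each theme is a directory with a style.css whose header may carry a Template: line, and the function name missing_parents is made up for illustration. It scans a themes directory and reports every child theme whose declared parent is not sitting next to it.

```python
import re
from pathlib import Path

# Matches a "Template: <parent>" header line inside a style.css comment block.
TEMPLATE_RE = re.compile(r"^[ \t/*#@]*Template:[ \t]*(\S+)", re.MULTILINE | re.IGNORECASE)


def missing_parents(themes_dir):
    """Return {child_theme: parent} for every declared parent absent from themes_dir.

    Hypothetical helper: assumes one directory per theme, each with a style.css.
    """
    themes_dir = Path(themes_dir)
    present = {p.name for p in themes_dir.iterdir() if p.is_dir()}
    missing = {}
    for name in sorted(present):
        style = themes_dir / name / "style.css"
        if not style.is_file():
            continue
        # The header lives at the top of the file, so the first few KB are enough.
        header = style.read_text(encoding="utf-8", errors="replace")[:8192]
        match = TEMPLATE_RE.search(header)
        if match and match.group(1) not in present:
            missing[name] = match.group(1)
    return missing
```

Pointed at the theme/ directory of the build context, it would have returned {"clean-technical": "ollie"} before the parent was vendored, and an empty dict afterwards. Failing the build on a non-empty result turns the invisible dependency into a declared one.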

Gotcha 2: a container that chowns its data dir fights network storage

MariaDB came up, crash-looped, and died on one line:

chown: changing ownership of '/var/lib/mysql/': Operation not permitted

Started as root, the official MariaDB image tries to chown its data directory to the mysql user before dropping privileges. On a local disk that’s free. My data lives on an NFS export that squashes root: the server maps the client’s root to an unprivileged user, so it refuses the ownership change outright, and the entrypoint treats that as fatal. The storage already makes provisioned directories world-writable, so the chown wasn’t even necessary; it was reflexive.

The fix is to not start as root. Run the container as mysql directly and the entrypoint skips the chown, writing straight to the already-writable directory:

securityContext:
  runAsUser: 999      # mysql
  runAsGroup: 999
  runAsNonRoot: true

The lesson: any stateful image that touches ownership or permissions on its data directory is betting on local storage. On network storage that bet often loses, and the tell is Operation not permitted at startup. Run as a user that can already write to the data directory and the whole class disappears.
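
Running as a non-root user only works if that user really can write to the volume, and that is cheap to verify before the database starts. The sketch below is an assumption about how one might probe it, not something the MariaDB image does; the name datadir_writable is hypothetical. It creates and removes a small file in the directory as the current user, with no ownership change involved.

```python
import os
import tempfile


def datadir_writable(path):
    """Return True if the current user can create and delete a file in path.

    Hypothetical preflight check: it never calls chown, so it behaves the
    same on local disks and on root-squashed network storage.
    """
    try:
        fd, probe = tempfile.mkstemp(prefix=".write-probe-", dir=path)
    except OSError:
        # Missing directory, read-only mount, or no write permission.
        return False
    os.close(fd)
    os.unlink(probe)
    return True
```

Run as uid 999 against /var/lib/mysql (for example from an init container that mounts the same volume), a False result means the non-root approach would fail for a different reason than the chown did.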

Gotcha 3: a cutover isn’t done until every resolver agrees

I migrated the data, pointed public DNS at the cluster, watched real traffic land on the new pod, and let it run. Satisfied, I shut the old Docker stack down. Within seconds: 502 Bad Gateway — but only from inside the house.

Externally the site was fine; on my own LAN it was broken. Split-horizon DNS: my network had a local override for the blog’s hostname pointing at the internal reverse proxy, which forwarded to the Docker container. Outside visitors resolved through the public path and reached the cluster; I, on the LAN, resolved to the old internal shortcut. While the Docker stack was up, both paths worked and the seam stayed invisible. Take it down, and the internal path had nowhere to go.

The fix: point the internal record at the cluster ingress too, and retire the old proxy entry, so both paths land in the same place:

# internal resolver: blog.example.com  A  198.51.100.20   (the cluster ingress)

The lesson: “I cut over the DNS” usually means one DNS. A name can resolve differently depending on who’s asking and where they stand, and a migration isn’t finished until every resolution path points at the new origin. The one you’ll forget is the internal shortcut you set up so long ago you no longer think of it as DNS.
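
One way to make "every resolver agrees" checkable is to run the same small script from each vantage point (a LAN machine, a phone on mobile data, a VPS) and compare what the name resolves to against the new origin. This is a minimal sketch using only the standard library; the function names are made up for illustration, and it only sees the resolver the machine it runs on is configured to use, which is exactly why it has to be run from more than one place.

```python
import socket


def system_resolve(hostname):
    """Resolve hostname to IPv4 addresses using this machine's configured resolver."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def stale_addresses(hostname, expected, resolve=system_resolve):
    """Return addresses hostname resolves to that are not in expected.

    An empty set means this vantage point already points at the new origin.
    The resolve parameter exists so the lookup can be swapped out or faked.
    """
    return resolve(hostname) - set(expected)
```

With the values from this post, stale_addresses("blog.example.com", {"198.51.100.20"}) should come back empty both inside and outside the house before the old stack is shut down. On my LAN it would have returned the internal proxy's address and saved me the 502.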

The general lesson

Every one of these had the same shape: a dependency that had worked for so long I’d stopped seeing it. A parent theme installed by hand. A chown local disks always allowed. A DNS record that had quietly worked for a year. Moving a workload doesn’t break what you’re thinking about; you ported that deliberately. It breaks what had gone invisible because it used to be free.

The first workload is the expensive one: it’s where all those invisible assumptions get billed at once. The second pays almost none of it — the path is proven, the patterns are written down, the next migration is mostly fill-in-the-blanks. Which is exactly why you move the cheap, recoverable workload first.

This blog now serves from the cluster, inside and out, and the Docker stack is gone. Part 5, whenever the next tenant moves in, should be a lot shorter.