16 Mar 2026
Google OAuth was broken in production. Not intermittently. Completely. Every user who clicked “Login with Google” hit an error page. The callback URL was wrong, NextAuth was rejecting it, and nothing in our recent deployments explained why.
The logs showed a clean 308 redirect. HAProxy was adding a trailing slash to our OAuth callback path. One character, silently appended, was enough to invalidate the entire flow.
What made it worse was that we had not touched the Ingress responsible for that route. No config changes, no deployments to that service, nothing. The Ingress looked exactly as it should.
More …
16 Mar 2026
There was a time when the database schema was clean. Table names followed a consistent pattern. Database changes went through a defined process. Developers knew what they were responsible for and where the boundary was. The infrastructure team could manage environments with confidence because what they saw in development roughly resembled what they would see in production.
Then the principal backend engineer left.
What followed was not immediate, and not a single dramatic incident. It was gradual, the kind of collapse that you only recognize clearly in hindsight. Standards started becoming suggestions, and suggestions started becoming optional.
More …
02 Mar 2026
Switching ingress controllers is not a lift-and-shift operation. NGINX and HAProxy are built on different architectural assumptions, and those differences compound at every layer: how configuration is loaded, how certificates are selected, and how the system behaves when a single rule is malformed.
This is a post-migration review of what we found, what broke, and what needs to be in place before any team runs this migration in production.
More …
01 Dec 2025
We needed an API gateway. Kong was $30k/year, and AWS API Gateway had its own cost trap. Tyk’s open source gateway looked like the answer: free, performant, written in Go.
The problem was route management. Tyk uses imperative API calls by default, but our infrastructure is fully declarative. Everything lives in Git, deployed with kubectl apply. We needed an operator.
Tyk has one. It’s called Tyk Operator and it’s exactly what we needed: declarative, GitOps-ready.
More …
20 Oct 2025
The alert sound is burned into my brain now. That specific PagerDuty tone that means something is really wrong. Not “a pod restarted” wrong. Not “latency spike” wrong. The kind of wrong that makes your stomach drop before you even look at your phone.
Late Sunday night. I’d finally convinced myself to stop checking Slack every five minutes and actually relax. Big mistake.
More …