The HAProxy Ingress Behavior That Broke Our Google SSO in Production

Google OAuth was broken in production. Not intermittently. Completely. Every user who clicked “Login with Google” hit an error page. The callback URL was wrong, NextAuth was rejecting it, and nothing in our recent deployments explained why.

The logs showed a clean 308 redirect. HAProxy was adding a trailing slash to our OAuth callback path. One character, silently appended, was enough to invalidate the entire flow.
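
To see why a single appended character matters, here is a minimal sketch, assuming the callback check is an exact comparison of scheme, host, and path. This illustrates the general strictness of OAuth redirect URI matching, not NextAuth's actual code, and the hostname and the `callback_matches` helper are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical registered callback URL; the real one is not given in the post.
REGISTERED_CALLBACK = "https://app.example.com/api/auth/callback/google"


def callback_matches(registered: str, received: str) -> bool:
    """Return True only if scheme, host, and path are identical."""
    a, b = urlsplit(registered), urlsplit(received)
    return (a.scheme, a.netloc, a.path) == (b.scheme, b.netloc, b.path)


# The URL as the application registered it.
original = REGISTERED_CALLBACK
# The same URL after a redirect that appends a trailing slash.
redirected = REGISTERED_CALLBACK + "/"

print(callback_matches(REGISTERED_CALLBACK, original))    # True
print(callback_matches(REGISTERED_CALLBACK, redirected))  # False
```

Under that assumption, the redirected URL is a different callback as far as the check is concerned, even though it looks the same to a human reader.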

What made it worse was that we had not touched the Ingress responsible for that route. No config changes, no deployments to that service, nothing. The Ingress looked exactly as it should.


After the Architect Left, Everything Became Optional

There was a time when the database schema was clean. Table names followed a consistent pattern. Database changes went through a defined process. Developers knew what they were responsible for and where the boundary was. The infrastructure team could manage environments with confidence because what they saw in development roughly resembled what they would see in production.

Then the principal backend engineer left.

The decline was not immediate, and there was no single dramatic incident. It was gradual, the kind of collapse that you only recognize clearly in hindsight. Standards started becoming suggestions, and suggestions started becoming optional.


What NGINX to HAProxy Migration Taught Us About Config Blast Radius

Switching ingress controllers is not a lift-and-shift operation. NGINX and HAProxy are built on different architectural assumptions, and those differences compound at every layer: how configuration is loaded, how certificates are selected, and how the system behaves when a single rule is malformed.

This is a post-migration review of what we found, what broke, and what needs to be in place before any team runs this in production.


The Open Source Bait and Switch Nobody Talks About

We needed an API gateway. Kong was $30k/year, and AWS API Gateway had its own cost trap. Tyk’s open source gateway looked like the answer: free, performant, written in Go.

The problem was route management. Tyk uses imperative API calls by default, but our infrastructure is fully declarative. Everything lives in Git, deployed with kubectl apply. We needed an operator.

Tyk has one. It’s called Tyk Operator and it’s exactly what we needed: declarative, GitOps-ready.


Why Auto-Upgrade Is Playing Russian Roulette With Your Uptime

The alert sound is burned into my brain now. That specific PagerDuty tone that means something is really wrong. Not “a pod restarted” wrong. Not “latency spike” wrong. The kind of wrong that makes your stomach drop before you even look at your phone.

Late Sunday night. I’d finally convinced myself to stop checking Slack every five minutes and actually relax. Big mistake.
