Why "100 more servers" did not get faster — 6 scale-out pitfalls
Problems horizontal scale alone cannot fix: single DB bottleneck, cold cache spike, microservice chain latency, sticky sessions, connection pools, cascading failures. Each demonstrated in the simulator.
“Traffic grew. Just add more servers, right?” is the most common claim at the whiteboard. In production, six pitfalls regularly keep horizontal scaling from paying off.
1. Single-DB bottleneck
App instances multiply, but every write still hits one primary DB. Read replicas help reads; writes still serialize on the primary. Solution: read/write split, sharding, write-through cache for idempotent operations, and eventually a move to a NewSQL database (CockroachDB / Spanner-class).
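The plateau can be put into a few lines of arithmetic. This is a minimal sketch, not the simulator's actual model: the function name, the per-instance rate, the primary's write capacity, and the write fraction are all made-up illustration values.

```python
def cluster_throughput(instances, per_instance_rps, db_write_capacity_wps, write_fraction):
    """Sustainable requests/s when every write goes to a single primary.

    All parameters are hypothetical knobs for illustration.
    """
    app_capacity = instances * per_instance_rps
    if write_fraction <= 0:
        return app_capacity  # read-only traffic: the primary is not the limit
    # The primary accepts db_write_capacity_wps writes/s, so the total request
    # rate it can back is that capacity divided by the share of writes.
    db_ceiling = db_write_capacity_wps / write_fraction
    return min(app_capacity, db_ceiling)


for n in (10, 50, 100, 200):
    print(n, cluster_throughput(n, per_instance_rps=200,
                                db_write_capacity_wps=2500, write_fraction=0.25))
```

With these numbers throughput grows linearly up to 50 instances and then stays flat: the 51st instance and every one after it adds app capacity the primary cannot absorb.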
2. Cold cache spike
Add an app instance — its in-process cache starts empty. Every request misses for the first few minutes, hammering the DB. Solution: warm caches on startup, distributed cache (Redis) shared across instances, request coalescing on misses.
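Request coalescing is the least familiar of these three fixes, so here is a minimal sketch of it. The class name `CoalescingCache`, the `loader` callback (standing in for the expensive DB lookup), and the demo helper are hypothetical; a production version would also need eviction and TTLs.

```python
import threading
import time
from concurrent.futures import Future


class CoalescingCache:
    """In-process cache where concurrent misses for one key share a single load."""

    def __init__(self, loader):
        self._loader = loader        # hypothetical: the expensive DB lookup
        self._lock = threading.Lock()
        self._values = {}            # key -> cached value
        self._inflight = {}          # key -> Future for a load in progress

    def get(self, key):
        with self._lock:
            if key in self._values:
                return self._values[key]
            future = self._inflight.get(key)
            is_leader = future is None
            if is_leader:
                # First miss for this key: this caller performs the load.
                future = Future()
                self._inflight[key] = future
        if not is_leader:
            # Someone else is already loading this key: wait for their result.
            return future.result()
        try:
            value = self._loader(key)
        except Exception as exc:
            with self._lock:
                del self._inflight[key]
            future.set_exception(exc)  # waiting callers see the same error
            raise
        with self._lock:
            self._values[key] = value
            del self._inflight[key]
        future.set_result(value)
        return value


def demo_stampede(n_threads=20):
    """Fire n_threads concurrent requests for one key at a cold cache."""
    db_calls = []

    def slow_db_lookup(key):
        db_calls.append(key)
        time.sleep(0.05)             # simulated query latency
        return key.upper()

    cache = CoalescingCache(slow_db_lookup)
    results = []
    threads = [threading.Thread(target=lambda: results.append(cache.get("user:1")))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(db_calls), results
```

Calling `demo_stampede()` sends 20 concurrent requests for the same key to a cold cache, and the simulated DB sees one query instead of 20.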
3. Microservice chain latency
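One user request fans out to N microservices. Calls made in sequence add up: end-to-end latency is roughly the sum of the per-hop latencies plus one RTT per hop. Calls made in parallel finish with the slowest branch (max, not sum), but the odds that at least one branch lands in its slow tail grow with N, so the chain's p99 is worse than any single service's p99. Each new service on the critical path adds ~5-10ms minimum. Solution: critical-path budget, parallel fan-out where possible, batched aggregation, careful service decomposition.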
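To put rough numbers on the tail-latency effect, here is a small sketch. It assumes the services' latencies are independent, which real systems only approximate, and the function names and sample latencies are made up for illustration.

```python
def p_any_branch_slow(n_services, quantile=0.99):
    """Chance that a parallel fan-out to n independent services has at least
    one call slower than that service's own `quantile` latency."""
    return 1.0 - quantile ** n_services


def sequential_latency_ms(hop_latencies_ms, rtt_ms):
    """Latency of a strictly sequential chain: per-hop work plus one RTT per hop."""
    return sum(hop_latencies_ms) + rtt_ms * len(hop_latencies_ms)


for n in (1, 10, 100):
    print(n, round(p_any_branch_slow(n), 3))

# Hypothetical chain: four services called one after another, 1 ms RTT each.
print(sequential_latency_ms([5, 8, 5, 10], rtt_ms=1))
```

Under the independence assumption, a request that touches 10 services hits some service's p99 tail about 9.6% of the time, and with 100 services about 63% of the time. This is why a per-service p99 target says little about the user-facing p99.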
4. Sticky sessions
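In-memory session state pins a user to one instance: that instance becomes a hot spot, scaling becomes uneven, and deploys risk dropping sessions. Sticky routing at the load balancer (HAProxy cookie / source IP) keeps in-memory sessions working but preserves the hot-spot problem, so treat it as a stopgap. Solution: move the state out of the instance, with an external session store (Redis) or stateless JWT (short TTL + refresh token). Modern default: JWT.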
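The idea behind the stateless option can be sketched with the standard library alone. This is a simplified HMAC-signed token, not a spec-compliant JWT (in practice a vetted JWT library would be used); the function names, the claim names, and the 300-second TTL are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload: str, secret: bytes) -> str:
    return _b64(hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).digest())


def issue_token(user_id, secret, ttl_s=300, now=None):
    """Return 'payload.signature'; any instance holding `secret` can verify it."""
    now = time.time() if now is None else now
    payload = _b64(json.dumps({"sub": user_id, "exp": now + ttl_s}).encode("utf-8"))
    return f"{payload}.{_sign(payload, secret)}"


def verify_token(token, secret, now=None):
    """Return the user id, or None if the token is malformed, forged, or expired."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
    except ValueError:
        return None
    expected = _sign(payload, secret)
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(_unb64(payload))
    if claims["exp"] <= now:
        return None
    return claims["sub"]
```

Because verification needs only the shared secret, the load balancer can send each request to any instance. The short TTL limits the damage from a leaked token, which is the reason for pairing it with a refresh token.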
5. Connection pool exhaustion
Each app instance opens its own DB pool (say 50 conns). 100 instances → 5000 connections, exhausting the DB’s max_connections. Solution: connection pooler middleware (PgBouncer / ProxySQL / managed DB proxy) that multiplexes the apps’ 5000 client connections onto ~100 actual DB connections.
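The arithmetic is worth running against your own limits before scaling out. A small sketch: the pool size of 50 and the 100 instances come from the example above, while the `max_connections` value of 500 and the reserved count are assumed purely for illustration.

```python
def total_connections(instances, pool_size):
    """Connections the DB sees when every instance keeps its own full pool."""
    return instances * pool_size


def max_instances_without_pooler(max_connections, pool_size, reserved=0):
    """How many instances fit before the DB refuses new connections.

    `reserved` covers connections kept back for admin and replication use.
    """
    return (max_connections - reserved) // pool_size


print(total_connections(instances=100, pool_size=50))
print(max_instances_without_pooler(max_connections=500, pool_size=50, reserved=10))
```

With these assumed limits only 9 instances fit before connections run out, well short of 100. A pooler in front of the DB changes the equation because client connections no longer map one-to-one onto server connections.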
6. Cascading failure
One downstream slows down. Upstream callers retry. Each retry holds a thread. Soon every app instance has all its threads stuck on the slow downstream, and even healthy endpoints stop responding. Solution: circuit breaker, bulkhead, request timeouts, retry with jitter, load shedding.
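Two of these defenses fit in a short sketch: a circuit breaker that fails fast once a downstream looks unhealthy, and retry delays with full jitter. The class, the thresholds, and the delay values are assumptions for illustration. The breaker is deliberately simplified (not thread-safe, and after the cool-down it does not limit the number of trial calls), so a maintained resilience library is the better choice in production.

```python
import random
import time


class CircuitOpenError(RuntimeError):
    """Raised instead of calling a downstream that is marked unhealthy."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=5.0, clock=time.monotonic):
        self._failure_threshold = failure_threshold
        self._reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after_s:
                # Open: fail immediately so no thread waits on the slow service.
                raise CircuitOpenError("downstream marked unhealthy; failing fast")
            # Cool-down elapsed: fall through and let a trial call probe it.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()   # open, or re-open after a failed trial
            raise
        self._failures = 0
        self._opened_at = None       # success closes the circuit
        return result


def backoff_with_jitter(attempt, base_s=0.1, cap_s=2.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap_s, base_s * 2**attempt)]."""
    return rng() * min(cap_s, base_s * 2 ** attempt)
```

A caller would wrap each downstream request in `breaker.call(...)` with its own request timeout, and sleep for `backoff_with_jitter(attempt)` between retries so that many instances do not retry in lockstep.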
Summary
Horizontal scaling solves stateless CPU- and IO-bound work. The six pitfalls above each require dedicated architectural work. The simulator’s “scale-out single DB” preset demonstrates pitfall #1 directly: add app instances and watch DB utilization saturate while throughput plateaus.
The scenarios in this post are runnable in the simulator. Turn the knobs and watch the result change.