serversim

Flash sale, 10k users on the same item → DB serialization → payment failures

2026-05-14 · 7 min
Tags: hot-key, payment, flash-sale, database

Hot-key contention. Concurrent updates on the same row trigger RDBMS row-lock serialization → backpressure → app threads exhausted → cascading failure. Four solutions (Redis INCR / sub-counter / Kafka serial / cell architecture).

Situation

A flash sale launches. 10,000 users click the same item at once. Every request runs UPDATE inventory SET qty = qty - 1 WHERE id = 123.

Result: DB row-lock serialization. Only one request commits at a time. The remaining 9,999 queue up, each holding an app thread. Backpressure builds upstream and the whole pipeline cascades to failure.

Why does this happen?

A pessimistic SELECT ... FOR UPDATE, or any UPDATE on the same row, holds an exclusive lock until commit. With p99 commit at ~10ms, you cap at ~100 successful writes/sec per row (1 s / 10 ms), regardless of how many app instances you add. Horizontal scaling does not help: every instance funnels through the same row lock.
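
The ceiling is plain arithmetic. Here is a minimal sketch, assuming every transaction holds the row lock for a fixed time; the function names are illustrative, not taken from any library.

```python
def max_writes_per_sec(lock_hold_ms: float) -> float:
    """Upper bound on committed writes/sec for one row when each
    transaction holds the exclusive lock for lock_hold_ms."""
    return 1000.0 / lock_hold_ms


def drain_time_sec(requests: int, lock_hold_ms: float) -> float:
    """Time needed to push `requests` updates through one row lock,
    one after another."""
    return requests * lock_hold_ms / 1000.0


if __name__ == "__main__":
    print(max_writes_per_sec(10))      # 100.0
    print(drain_time_sec(10_000, 10))  # 100.0
```

Under these assumptions the last of the 10,000 requests waits about 100 seconds, which is why app threads run out long before the queue drains.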

Four solutions

1. Redis INCR (atomic counter)

Move the inventory counter to Redis. DECR inventory:123 is atomic because Redis executes commands on a single thread, and Redis processes ~100k+ ops/sec, far above the lock ceiling. Persist totals back to DB asynchronously.

Trade-off: weaker durability. With snapshot persistence, decrements made since the last Redis snapshot can be lost on a crash. Acceptable if the DB is the system of record and reconciliation runs every few seconds.
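
A rough sketch of the reserve step follows. It assumes a client object exposing `decr` and `incr` that return the new integer value, which is the shape redis-py's `Redis.decr` and `Redis.incr` have. `FakeRedis` and `try_reserve` are hypothetical names, and the in-memory fake exists only so the example runs without a server.

```python
class FakeRedis:
    """In-memory stand-in exposing the calls used below.
    Not thread-safe, unlike a real Redis server."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = int(value)

    def decr(self, key, amount=1):
        self._data[key] = self._data.get(key, 0) - amount
        return self._data[key]

    def incr(self, key, amount=1):
        self._data[key] = self._data.get(key, 0) + amount
        return self._data[key]


def try_reserve(client, item_id: int) -> bool:
    """Atomically take one unit; undo the decrement if stock ran out."""
    key = f"inventory:{item_id}"
    remaining = client.decr(key)
    if remaining < 0:
        client.incr(key)  # compensate so the counter does not drift negative
        return False
    return True


if __name__ == "__main__":
    r = FakeRedis()
    r.set("inventory:123", 2)
    print([try_reserve(r, 123) for _ in range(3)])  # [True, True, False]
```

The counter can dip below zero for a moment between the `decr` and the compensating `incr`. If that matters, a server-side script that checks before decrementing is one alternative.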

2. Sub-counter sharding

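Split the single counter into N sub-counters (e.g. 16 sub-rows). Each request decrements a random sub-counter, falling back to another one if the chosen sub-counter is already empty. Periodically sum the sub-counters for the real total. Lock contention drops roughly 16x, assuming requests spread evenly across the sub-counters.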

Trade-off: total reads are approximate until a periodic sum runs. Use when an eventually-accurate total is acceptable.

3. Kafka serial worker

Producers publish purchase events to Kafka with the item ID as the partition key. A single consumer per partition processes events sequentially. The bottleneck remains, but the backpressure sits in the queue rather than in app threads, so the system stays responsive and you can advertise a fair ETA.

Trade-off: extra latency (queue + processing). Best for fairness scenarios — ticket sales, limited drops.
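
The routing and ordering idea can be sketched in-process without a Kafka client. Everything here is a stand-in: `partition_for` uses crc32 purely for illustration (Kafka's default partitioner uses a different hash), and `SerialWorker` is a hypothetical class that plays the role of the single consumer for one partition.

```python
import zlib
from collections import deque

NUM_PARTITIONS = 4  # assumed partition count


def partition_for(item_id: int) -> int:
    """Stable key-to-partition mapping, so one item always lands
    on the same partition."""
    return zlib.crc32(str(item_id).encode()) % NUM_PARTITIONS


class SerialWorker:
    """Single consumer for one partition: applies events in arrival order."""

    def __init__(self):
        self.stock = {}
        self.queue = deque()

    def publish(self, user_id: str, item_id: int) -> int:
        self.queue.append((user_id, item_id))
        return len(self.queue)  # position in line, usable for an ETA

    def drain(self):
        results = []
        while self.queue:
            user_id, item_id = self.queue.popleft()
            ok = self.stock.get(item_id, 0) > 0
            if ok:
                self.stock[item_id] -= 1
            results.append((user_id, ok))
        return results


if __name__ == "__main__":
    workers = [SerialWorker() for _ in range(NUM_PARTITIONS)]
    worker = workers[partition_for(123)]
    worker.stock[123] = 2
    for user in ("u1", "u2", "u3"):
        worker.publish(user, 123)
    print(worker.drain())  # [('u1', True), ('u2', True), ('u3', False)]
```

Because `publish` returns the position in line, multiplying it by the per-event processing time gives a rough ETA to show the buyer.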

4. Cell architecture (partition the workload)

Split traffic into N cells (e.g. by user ID hash). Each cell holds its own independent slice of the inventory, sized for cell capacity (e.g. 100 units each when 1600 total units are split across 16 cells). Cells operate in isolation, with no cross-cell contention.

Trade-off: operational complexity. Best when contention dominates and you need full horizontal scale.
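
A minimal sketch of cell routing, using the 16 cells and 1600 units from the example above. The `Cell` class, `cell_for`, and the choice of SHA-256 for hashing are illustrative assumptions. In a real deployment each cell would be its own service and database rather than an in-memory object.

```python
import hashlib

NUM_CELLS = 16
TOTAL_UNITS = 1600


def cell_for(user_id: str) -> int:
    """Stable user-to-cell mapping via a hash of the user ID."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_CELLS


class Cell:
    """One isolated cell holding its own slice of the inventory."""

    def __init__(self, units: int):
        self.units = units

    def try_buy(self) -> bool:
        if self.units > 0:
            self.units -= 1
            return True
        return False


def build_cells():
    """Split the total evenly: 1600 units over 16 cells is 100 per cell."""
    return [Cell(TOTAL_UNITS // NUM_CELLS) for _ in range(NUM_CELLS)]


def purchase(cells, user_id: str) -> bool:
    """Route the request to the user's cell; no other cell is touched."""
    return cells[cell_for(user_id)].try_buy()


if __name__ == "__main__":
    cells = build_cells()
    print(purchase(cells, "user-1"))   # True
    print(sum(c.units for c in cells)) # 1599
```

One consequence visible in this sketch: a busy cell can sell out while other cells still hold stock, so rebalancing between cells is part of the operational cost.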

Summary

Adding servers does not break row-lock contention. Pick a strategy that matches your consistency needs:

  • Redis INCR — fastest, async DB reconcile.
  • Sub-counters — middle ground.
  • Kafka serial — fairness + ETA.
  • Cell architecture — full isolation.
© 2025-2026 serversim · Architecture simulation tool