MSA vs Monolith — compared with a real load simulator
Is service decomposition always the right answer? This post runs the same workload against a single-server monolith and a chain of microservices (MSA, microservice architecture), then compares throughput, latency, and operational cost. It is a core whiteboard interview topic.
One-line conclusion
The monolith is faster, until your team structure and traffic shape force decomposition. MSA pays off in organizational scale, not in throughput.
The experiment
Same workload, two topologies:
- Monolith: NGINX → Spring (single service) → MySQL
- MSA: NGINX → Gateway → 4 microservices in a chain, each with its own MySQL
Drive 1000 RPS into both. Watch p99 latency, throughput, and CPU.
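As a rough illustration of what "drive 1000 RPS and watch p99" involves, here is a minimal open-loop load driver. It is a sketch, not the simulator used in this post: `drive` and `fake_send` are hypothetical names, and `fake_send` stands in for a real HTTP call so the example runs without a server.

```python
import asyncio
import statistics
import time


async def drive(send, rps, duration_s):
    """Open-loop load: start one request every 1/rps seconds,
    without waiting for earlier requests to finish."""
    interval = 1.0 / rps
    total = round(rps * duration_s)
    latencies_ms = []

    async def one_request():
        start = time.perf_counter()
        await send()
        latencies_ms.append((time.perf_counter() - start) * 1000)

    began = time.perf_counter()
    tasks = []
    for i in range(total):
        # Sleep until the scheduled start time of request i.
        delay = began + i * interval - time.perf_counter()
        if delay > 0:
            await asyncio.sleep(delay)
        tasks.append(asyncio.create_task(one_request()))
    await asyncio.gather(*tasks)
    return latencies_ms


async def fake_send():
    # Hypothetical stand-in for a request to the system under test.
    await asyncio.sleep(0.005)


if __name__ == "__main__":
    samples = asyncio.run(drive(fake_send, rps=1000, duration_s=1.0))
    p99 = statistics.quantiles(samples, n=100)[98]  # 99th percentile cut point
    print(f"requests={len(samples)} p99={p99:.1f} ms")
```

The driver is open-loop on purpose: a closed-loop client that waits for each response before sending the next one slows down when the server slows down, which hides tail latency.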
Result
Monolith: ~50ms p99 at full throughput. MSA: ~120ms p99 (each of the 4 extra hops adds latency) and slightly lower throughput. In exchange, each service can scale independently, and a slow downstream affects only the chain that calls it instead of the whole app.
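The direction of this result can be reproduced with a toy model. The sketch below assumes the business work is identical in both topologies and that every network hop adds a log-normally distributed delay. The hop counts and distribution parameters are illustrative assumptions, not measurements from the experiment.

```python
import random

# Assumed hop counts: the monolith crosses NGINX -> app and app -> MySQL.
# The MSA path crosses NGINX -> gateway, gateway -> service 1, three
# service-to-service calls, and one DB call per service.
MONOLITH_HOPS = 2
MSA_HOPS = 9


def percentile(samples, pct):
    """Approximate nearest-rank percentile of a list of numbers."""
    ordered = sorted(samples)
    rank = max(1, round(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def request_latency_ms(rng, network_hops):
    """Latency of one request: fixed amount of work plus per-hop overhead."""
    work = rng.lognormvariate(3.0, 0.4)  # same total work in both topologies
    network = sum(rng.lognormvariate(1.5, 0.7) for _ in range(network_hops))
    return work + network


def simulate(network_hops, requests=20_000, seed=42):
    """Return (p50, p99) in milliseconds for the given number of hops."""
    rng = random.Random(seed)
    samples = [request_latency_ms(rng, network_hops) for _ in range(requests)]
    return percentile(samples, 50), percentile(samples, 99)


if __name__ == "__main__":
    for name, hops in (("monolith", MONOLITH_HOPS), ("msa", MSA_HOPS)):
        p50, p99 = simulate(hops)
        print(f"{name:8s} hops={hops} p50={p50:.1f} ms p99={p99:.1f} ms")
```

The model ignores queueing and CPU contention, so it only shows why extra hops raise latency. It does not say how throughput changes.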
When does MSA pay off?
- Different services have wildly different scale needs (recommendation vs payment).
- Different teams own different domains and want independent deploys.
- Different SLAs — payment must not be brought down by analytics.
- Different languages / runtimes — image processing in Go, ML in Python.
When monolith wins
- Small team (< 10 engineers) — coordination cost dominates.
- Low traffic — one DB is enough.
- Tightly coupled domain — distributed transactions become a tax.
- Early product — schema and boundaries are still moving.
Common mistakes
- Starting with MSA: you pay distributed-system cost from day one with no organizational benefit. Start monolith, modularize, then extract services once boundaries are clear.
- Sharing a DB across services: services depend on each other’s schema. Now you have distributed-system pain and monolith coupling.
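- Sync chain calls: A → B → C → D. Every hop adds its latency to the request, and the longer the chain, the likelier it is that at least one hop is slow, so end-to-end p99 grows with chain length. Use async messaging or CQRS where you can.
- No saga / outbox: without one, a write that spans services is left half-applied at the first failure. Pick an eventual-consistency pattern up front.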
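To make the outbox idea concrete, here is a minimal sketch using an in-memory SQLite database. The table names, the `order.placed` topic, and the `place_order` / `relay_once` helpers are all hypothetical. A real setup would run the relay as a separate process and publish to a message broker.

```python
import json
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            topic TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)
    return conn


def place_order(conn, order_id, item):
    """Write the order and its event in one local transaction."""
    with conn:  # commits on success, rolls back if either insert fails
        conn.execute(
            "INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item)
        )
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order.placed", json.dumps({"order_id": order_id, "item": item})),
        )


def relay_once(conn, publish):
    """Publish pending events; mark each as sent only after publish returns."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for event_id, topic, payload in rows:
        publish(topic, json.loads(payload))  # if this raises, the row stays pending
        with conn:
            conn.execute(
                "UPDATE outbox SET published = 1 WHERE id = ?", (event_id,)
            )
        sent += 1
    return sent


if __name__ == "__main__":
    db = open_db()
    place_order(db, 1, "book")
    relay_once(db, lambda topic, event: print(topic, event))
```

Because the event row commits together with the order, a crash cannot leave an order without its event. The relay may publish the same event twice if it crashes between publishing and marking the row, so consumers need to be idempotent.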
Whiteboard answer
“I would start with a monolith. Once a domain shows clear ownership and scale signals (payment requiring a tighter SLA, recommendation needing GPU instances), I would extract it as a service with its own DB and the Saga / Outbox patterns. Throughput gain alone never justifies the operational cost.”
The scenarios in this post are runnable in the simulator. Turn the knobs and watch the result change.