Microservices: a pattern language, not a silver bullet


The microservice architecture is an architectural style. That single sentence is worth stating plainly, because most of the confusion in the industry comes from people treating it as a product, a team structure, or a badge of seriousness. It is none of those. It is a way of structuring an application as a collection of two or more services that are independently deployable and loosely coupled, aligned with business capabilities, and owned by small teams.

The reason this style exists is not fashion. It is that large, successful software — the kind that runs a business — tends to grow past the point where a single deployable unit can keep up with the rate of change the business demands. You feel it as a slowdown: the build takes longer, the test suite gets flaky, a one-line change waits behind somebody else’s risky refactor, the on-call rotation inherits problems nobody can isolate. That is not a failure of the monolith as a concept. The monolithic architecture is not an anti-pattern. It is a good choice for small teams and small projects. It becomes a poor choice when the application, the organization, or both have outgrown what a single deployable unit can carry.

The microservice architecture is one response to that outgrowth. It is not the only response, and it is not free.

The forces: dark energy and dark matter

Before we talk about patterns, we should talk about forces — the pressures that push a design toward or away from decomposition. I find it useful to group them into two opposing sets.

Dark energy forces push a system apart. They reward decomposition:

  1. Simple components. A service that owns one or two subdomains is easier to understand and maintain than a component that owns twelve.
  2. Team autonomy. A team needs to develop, test, and deploy its software without waiting on other teams’ calendars.
  3. Fast deployment pipeline. Fast feedback and high deployment frequency require components that are fast to build and fast to test.
  4. Support multiple technology stacks. Different subdomains have different needs; a single stack rarely fits all of them equally well.
  5. Segregate by characteristics. Resource, availability, and security requirements differ across the system. Segregating them lets you scale, secure, and harden each one independently.

Dark matter forces pull a system together. They resist decomposition:

  1. Simple interactions. An operation local to a single component is easier to reason about than a distributed one.
  2. Efficient interactions. A distributed operation that requires many network round-trips or large data transfers is often too slow.
  3. Prefer ACID over BASE. It is easier to implement a transaction as an ACID transaction inside a single database than as an eventually consistent saga across several.
  4. Minimize runtime coupling. Every synchronous call between services is a place where availability and latency can degrade.
  5. Minimize design-time coupling. Services that must change together reduce the independence the whole style is trying to buy.

An architecture is not a choice between these sets of forces. It is an equilibrium between them. Every boundary you draw, every API you expose, every database you split is a local judgment about which force wins in that place.

The pattern: microservice architecture

Context. You are developing a business-critical enterprise application. You need to deliver changes rapidly, frequently, and reliably, with multiple teams contributing in parallel. The application is large enough, and the organization complex enough, that a single deployable unit is slowing everyone down.

Problem. How should you organize subdomains into deployable, executable components so that the dark energy forces can do their work without the dark matter forces overwhelming them?

Solution. Structure the application as a set of two or more independently deployable, loosely coupled services. Each service:

  • Is owned by a single team.
  • Is aligned with a business capability or subdomain.
  • Has its own persistent state, private to the service and accessible only through its API.
  • Can be built, tested, deployed, scaled, and replaced on its own schedule.

Resulting context — benefits. Simple services are easier to understand and maintain. Teams become cross-functional and autonomous. Each service can use the technology that fits it. Scaling and availability decisions can be made per service.

Resulting context — potential drawbacks. The application is now a distributed system. Developers must handle partial failure, inter-process communication, and eventual consistency. Testing is harder. Deployment is harder. The organization itself has to change.

Resulting context — issues. Every one of those drawbacks is the entry point to another pattern. How do you decompose? Decompose by business capability or decompose by subdomain. How do you manage data? Database per service. How do you implement transactions that cross services? Saga. How do you query across them? API composition or CQRS. How do clients reach them? API gateway. How do they find each other? Service registry and client- or server-side discovery. The microservice architecture is not a single design — it is a language of patterns, each of which exists to resolve a specific force that decomposition introduces.

Decomposition: by business capability

The most durable way to draw service boundaries is to align them with what the business does to generate value. A business capability is an organizational concept: product catalog, inventory, order management, delivery, payment, customer accounts. Capabilities change slowly. Technology changes quickly. Aligning services with capabilities gives the architecture a stable spine.

An online store decomposed this way might have:

  • A Product Catalog service that owns what is for sale.
  • An Inventory service that owns how much is available.
  • An Order Management service that owns the lifecycle of a customer order.
  • A Delivery service that owns getting the order to the customer.
  • A Payment service that owns charging the customer.

The corresponding teams are cross-functional and own their capability end to end. Changes that affect multiple capabilities require coordination across teams — so the cost of a bad boundary is immediate and organizational, not merely technical. That is, in fact, the point.

An alternative decomposition strategy — decompose by subdomain — applies Domain-Driven Design. The two strategies often converge on a similar answer, because business capabilities and bounded contexts describe the same organizational reality from different angles.

Data: database per service

Decomposition of code without decomposition of data is not decomposition. If two services read and write the same tables, a schema change requires both services to deploy together, and the dark energy force of team autonomy has lost.

The database per service pattern makes each service’s persistent data private. It is part of the service’s implementation; other services may not read it directly, and they may not write it at all. Access is only through the service’s API. Depending on your infrastructure, “its own database” may mean its own tables, its own schema, or its own database cluster — what matters is that the ownership boundary is real.

The benefits are straightforward. Services are loosely coupled: a change to one service’s schema does not ripple outward. Each service can choose a data store that fits its workload — a relational store for the order history, a document store for the product catalog, a search engine for full-text queries, a time-series store for telemetry.
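As a concrete, if deliberately toy, illustration of the ownership boundary, here is a sketch in which in-memory dicts stand in for each service's private database. All class and method names are hypothetical; the point is only that one service reaches another's data through its API, never through its tables.

```python
# Each service's store is private; the only cross-service access is the API.

class OrderService:
    def __init__(self):
        self._orders = {}  # private: only this class ever touches it

    def place_order(self, order_id, product_id, quantity):
        self._orders[order_id] = {"product_id": product_id, "quantity": quantity}

    # The API: the only way other services see order data.
    def get_order(self, order_id):
        return dict(self._orders[order_id])


class DeliveryService:
    def __init__(self, order_service):
        # Depends on the Order service's API, never on its tables.
        self._order_api = order_service
        self._deliveries = {}  # this service's own private state

    def schedule_delivery(self, order_id):
        order = self._order_api.get_order(order_id)  # an API call, not a JOIN
        self._deliveries[order_id] = {"status": "scheduled",
                                      "items": order["quantity"]}
        return self._deliveries[order_id]
```

If Order Management later changes how it stores orders, Delivery is unaffected as long as `get_order` keeps its contract; that is the loose coupling the pattern buys.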

The drawbacks are the reason most of the rest of the pattern language exists:

  • Transactions that cross service boundaries. You cannot use a local ACID transaction. You cannot practically use two-phase commit either.
  • Queries that cross service boundaries. There is no JOIN across databases you do not own.
  • Operational surface area. You now run more than one data store.

Each of those drawbacks has a pattern on the other side.

Transactions: the saga

A saga is a sequence of local transactions. Each local transaction updates a single service’s database and publishes a message or event that triggers the next local transaction. If a step fails, the saga executes a sequence of compensating transactions that undo the work of the earlier steps.

There are two coordination styles:

  • Choreography. Each service reacts to events from the previous step and emits events for the next. There is no central coordinator. The workflow lives in the shape of the event graph.
  • Orchestration. A dedicated orchestrator tells each service which step to perform next. The workflow lives inside the orchestrator.

Choreography is attractive for simple sagas because there is nothing extra to build: no coordinator, no new service. It gets painful when the saga is long or the graph of events becomes hard to picture. Orchestration is attractive for complex sagas because the workflow is explicit in one place; the cost is a new service to own and operate.

The pattern’s hardest constraints are the ones least often discussed:

  • No automatic rollback. Unlike a database transaction, a saga does not unwind itself. The developer must design a compensating transaction for every step that needs to be undone. Some steps cannot be undone at all — emails sent, payments captured, warehouses dispatched — and must be handled with semantic compensation instead.
  • No isolation. A saga gives up the I in ACID. Between the first local transaction and the last, another saga may read state that is only partially consistent. You handle this with countermeasures — semantic locks, commutative updates, pessimistic views, version files — rather than pretending it does not exist.

If any step of a saga is implemented as a conversational exchange rather than as a transaction with its own local commit, you have built something that looks like a saga but behaves like a distributed function call. That is a different, worse thing.
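The orchestration style, including the hand-built compensation logic the pattern demands, can be sketched as follows. The orchestrator class and step functions are illustrative, not a reference implementation; in a real system each action would be a local transaction in one service's database, triggered by a message.

```python
# Minimal orchestrated saga: each step pairs an action with a compensating
# transaction. On failure, completed steps are undone in reverse order --
# there is no automatic rollback, so the developer supplies every undo.

class SagaOrchestrator:
    def __init__(self, steps):
        self._steps = steps  # list of (action, compensation) pairs

    def execute(self, ctx):
        completed = []
        for action, compensate in self._steps:
            try:
                action(ctx)
                completed.append(compensate)
            except Exception:
                # The failed step never committed, so it is not compensated;
                # everything before it is, newest first.
                for undo in reversed(completed):
                    undo(ctx)
                return "rolled back"
        return "completed"
```

For example, a two-step order saga might pair a `reserve` action with a `release` compensation and a `charge` action with a `refund` compensation; if `charge` fails, only `release` runs. Note what the sketch does not give you: isolation. Between steps, other sagas can observe the partially complete state.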

Queries across services

Two patterns, roughly complementary:

API composition. The client (or a composing service) calls each participating service and joins the results in memory. This is the simplest thing that works. It is acceptable when the number of services is small, when the data sets returned are small, and when the latency budget allows for the worst case of the slowest dependency.
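API composition is simple enough to sketch in a few lines. Here `fetch_order` and `fetch_delivery` are hypothetical stand-ins for HTTP clients talking to the owning services; the "join" is just merging their responses in memory.

```python
# In-memory join across two service calls. The fetchers are injected so the
# composer knows nothing about transport; in practice they would be HTTP or
# gRPC clients, and the calls could be issued concurrently.

def get_order_details(order_id, fetch_order, fetch_delivery):
    order = fetch_order(order_id)        # one call per owning service
    delivery = fetch_delivery(order_id)
    return {**order, "delivery_status": delivery["status"]}
```

The worst-case latency of this function is the sum (or, with concurrent calls, the max) of its slowest dependencies, which is exactly the constraint the paragraph above describes.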

CQRS. Maintain a separate view database that is kept up to date by subscribing to the events published by the services that own the source data. Queries run against the view; commands run against the source services. This is more machinery — more services, more event handling, more eventual consistency to reason about — in exchange for query performance and flexibility that API composition cannot match.

Pick API composition unless you have a concrete reason not to. The reasons tend to be: the join is too expensive to do per request, the query needs data the source services cannot shape, or the clients cannot tolerate the latency tail of a fan-out.
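For contrast, here is a minimal sketch of the CQRS read side. The event shapes are hypothetical, and an in-memory dict stands in for the real view database; the essential structure is that event handlers keep a denormalized view eventually consistent with the source services, and queries touch only the view.

```python
# CQRS read side: a query-optimized view maintained by subscribing to events
# published by the services that own the source data.

class OrderHistoryView:
    def __init__(self):
        self._by_customer = {}  # denormalized: shaped for the query, not the write

    # Event handlers: called by the message broker subscription.
    def on_order_placed(self, event):
        self._by_customer.setdefault(event["customer_id"], []).append(
            {"order_id": event["order_id"], "status": "placed"})

    def on_order_shipped(self, event):
        for order in self._by_customer.get(event["customer_id"], []):
            if order["order_id"] == event["order_id"]:
                order["status"] = "shipped"

    # The query runs against the view, never against the source services.
    def orders_for(self, customer_id):
        return self._by_customer.get(customer_id, [])
```

Everything the paragraph above warns about is visible even at this scale: the view is stale until the events arrive, and the handlers are one more piece of machinery to run and monitor.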

The edge: API gateway

External clients should not know about your internal service topology, and they should not be the ones paying the cost of fanning out across it. An API gateway sits at the edge and does the work the clients should not: request routing, protocol translation, composition of backend calls into client-shaped responses, authentication, rate limiting. Different client types (mobile, web, partner) often deserve different gateway surfaces — the Backends for Frontends variant makes that explicit.

The gateway is real infrastructure with real costs. It is a deployable, it is on the request path, and its availability is now part of your product’s availability. Make it somebody’s job.
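The routing half of a gateway's job can be sketched as a prefix-match dispatch; real gateways layer protocol translation, authentication, and rate limiting on top of this, and the route table and backend callables here are hypothetical.

```python
# Toy gateway: clients see one surface; the route table hides the topology.

class ApiGateway:
    def __init__(self, routes):
        self._routes = routes  # path prefix -> backend callable

    def handle(self, path, *args):
        for prefix, backend in self._routes.items():
            if path.startswith(prefix):
                return backend(*args)
        raise LookupError(f"no route for {path}")
```

The Backends for Frontends variant is, in this sketch, simply a different route table (and different composition logic) per client type.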

The failure mode: distributed monolith

If you get the decomposition wrong, you do not end up with microservices. You end up with a distributed monolith — services that must deploy together, services that call each other synchronously for every operation, services that share a schema through a back door, services whose tests only pass when the whole constellation is running. A distributed monolith has all the operational cost of a microservice architecture and none of the autonomy. It is slower than the monolith it replaced.

The diagnostic signals are consistent:

  • Releases are coordinated across multiple services in the same change window.
  • A single incident pages multiple teams because failure cascades through synchronous calls.
  • The integration test suite can only run in a shared environment.
  • Teams block on each other’s roadmap to get data they need.
  • A schema change requires a migration in more than one service.

If any of those is true, the architecture is not yet what the diagram says it is. The dark matter forces have won in a place where the dark energy forces were needed. The fix is almost never to add a service. It is usually to redraw an existing boundary — to move a capability, a piece of data, or a workflow to the side it actually belongs on.

Getting there from here: the strangler fig

Greenfield microservices are rare. Most of the time, the starting point is a monolith that works and pays salaries, and the question is how to evolve it without a rewrite. The rewrite is almost always a mistake. Two years of parallel development against a moving target is how organizations discover that their monolith contained knowledge the team had forgotten it had.

The strangler fig pattern is the honest alternative. You leave the monolith running. You add new capabilities as new services. When you want to migrate an existing capability, you carve a seam in the monolith, stand up a new service that takes over that capability, route traffic to the new service, and remove the old code once it is cold. The monolith shrinks as the service estate grows, a piece at a time. At no point is there a big-bang cutover.
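The routing mechanics are the least interesting part, which is itself the point: migration is incremental precisely because the router is a lookup with a fallback. A sketch, with all names hypothetical:

```python
# Strangler routing: traffic for a migrated capability goes to its new
# service; everything else still goes to the monolith. Migrating one more
# capability is one more entry in the table, not a cutover.

def make_router(monolith, migrated):
    # migrated: capability name -> new service callable
    def route(capability, request):
        handler = migrated.get(capability, monolith)
        return handler(request)
    return route
```

Rolling a migration back is equally boring: remove the entry, and traffic falls through to the monolith again.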

Two practical rules:

  1. Build new things outside the monolith when it is cheap to do so. This keeps the monolith from growing and gives the team practice with the operational patterns (deployment, observability, data ownership) before they have to do it under pressure.
  2. Extract by capability, not by convenience. The tempting extraction is “the code I happen to be working in.” The useful extraction is “the capability whose pace of change is mismatched with the rest of the monolith.” Those are rarely the same thing.

Migration is a program of work, not a project. Treat it that way.

What the style will and will not buy you

The microservice architecture, when it fits, buys you the ability to let many teams ship independently against a shared application, and to make independent decisions about technology, scaling, and availability. Those are valuable things — and they are organizational benefits at least as much as technical ones.

It does not buy you reliability for free. It does not buy you performance. It does not buy you clarity. Those come from the patterns you apply on top of it to resolve the distributed-system problems that decomposition creates. Saga. Database per service. API composition. Event-driven communication. Observability of the interactions, not just the services. A deployment pipeline that takes each service from commit to production on its own schedule.

If you adopt the style without the patterns, you will build a distributed monolith. If you adopt the style without the organizational change — teams aligned with capabilities, empowered to own their services end to end — you will build a distributed monolith on top of an org chart that cannot deploy it.

The architecture is not the picture on the whiteboard. It is the set of forces the design has resolved, the patterns it has chosen to resolve them with, and the organization that can keep operating them. Draw your boundaries where those three agree. Move them when they stop agreeing. That is the whole job.