Database selection: the default is Postgres, and the deviations need reasons


A database choice is one of the most durable decisions in a system’s architecture. Code gets rewritten. Frameworks get replaced. Services get decomposed. Databases, once they have data in them, tend to outlive everything around them — because migrating a running database is a program of work, not a weekend refactor. That durability is why getting the choice approximately right matters more than getting it perfectly right; the first store you pick will be running years after you expected it to be replaced.

The honest advice for most applications is: use Postgres. Not because Postgres is the best at any single thing — it is not — but because it is good enough at almost everything, mature enough to be operationally boring, and general enough that a system outgrowing its database usually needs a specific capability the relational engine lacks, rather than a wholesale different paradigm. Starting with Postgres and adding specialized stores for the specific workloads that justify them is a more reliable path than starting with five specialized stores and hoping they compose.

That said, the specialized stores exist for reasons. This post is a map of the landscape — relational, document, key-value, graph, time-series, columnar, search, wide-column — and of the axes that separate them, so that when the Postgres answer is wrong, it is wrong for a specific reason you can name.

The axes that actually matter

Most database comparison tables lead with “scalability” and “performance,” which are marketing words that do not survive contact with a real workload. The axes that actually discriminate between stores are:

  • Access pattern. Is the data accessed by key, by relationship, by time range, by content, by aggregation? A store optimized for one access pattern is usually inefficient at the others.
  • Schema flexibility. Is the shape of the data stable (you can migrate the schema when it changes) or genuinely variable (you get arbitrary documents with arbitrary fields)?
  • Consistency model. Do you need strict serializability, snapshot isolation, causal consistency, or eventual consistency? Discussed at length in the consistency-models post.
  • Read/write ratio and pattern. Is the workload mostly reads (catalog, content), mostly appends (logs, telemetry), mostly updates (stateful objects)? Each favors different storage.
  • Scale axis. Does scale come from more data, more throughput, or more geographic distribution? These are different problems.
  • Operational weight. How hard is it to run, back up, upgrade, and recover? A “faster” database that nobody on the team knows how to operate is slower in practice than a well-run boring one.

A database fits a workload when its optimizations match the workload’s dominant access pattern. A database is mis-fit when you are fighting it — when the workload’s access pattern requires emulating a capability the database does not natively provide.

Relational (Postgres, MySQL)

The default. Tables, rows, schemas, SQL, transactions, ACID, foreign keys. Forty years of operational maturity. Rich query language. JOINs, subqueries, window functions, CTEs — everything you need to ask questions you had not anticipated when you designed the schema.

Postgres specifically has evolved into something closer to a multi-paradigm store than a pure relational database. jsonb columns give you document storage with index support. Arrays, range types, and composite types give you richer modeling than traditional SQL. Full-text search is built in. hstore, ltree, PostGIS, pgvector — the extension ecosystem handles specialized workloads without forcing a separate store. Logical replication enables CDC. Partitioning handles large tables. With pg_stat_statements, query performance is observable in production.
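
The jsonb idea, a document column you can query through an index, can be sketched with SQLite standing in for Postgres (assuming a SQLite build with the JSON1 functions; json_extract and the expression index play the roles of the jsonb operators and a jsonb index, and the table and field names are illustrative):

```python
import json
import sqlite3

# SQLite stands in for Postgres here: json_extract plays the role of the
# jsonb operators, and the expression index plays the role of a jsonb index.
# (In Postgres: payload->>'kind' plus a GIN or expression index.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute(
    "CREATE INDEX idx_events_kind ON events (json_extract(payload, '$.kind'))"
)

docs = [
    {"kind": "login", "user": "ada"},
    {"kind": "purchase", "user": "ada", "sku": "B-17"},  # extra field: fine
    {"kind": "login", "user": "grace"},
]
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)", [(json.dumps(d),) for d in docs]
)

# Query by a field inside the document, served by the expression index.
users = [
    row[0]
    for row in conn.execute(
        "SELECT json_extract(payload, '$.user') FROM events "
        "WHERE json_extract(payload, '$.kind') = ? ORDER BY id",
        ("login",),
    )
]
```

The documents have different shapes, yet the field query works and is indexed: the document model inside the relational engine.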

Where relational hits walls:

  • Extreme write throughput. Single-writer OLTP databases top out somewhere — it varies by workload, but usually in the tens of thousands of writes per second for a beefy Postgres — before the write-ahead log becomes the bottleneck. Sharding buys more, at operational cost.
  • Very large data volumes (terabytes of live operational data). Postgres handles this but becomes painful to operate at the top end.
  • Unpredictable schema. If each row really has a different shape and the shape is user-defined, you are fighting the schema system.
  • Graph traversal. Recursive CTEs work for shallow queries; they get expensive fast for deep traversal.
  • Analytics over billions of rows. Row-oriented storage loses to columnar engines by one to two orders of magnitude on aggregation queries.
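
The graph-traversal wall is visible in miniature with a recursive CTE. Each additional level of depth re-joins the frontier against the whole edge table, which is the cost that grows quickly for deep traversals. A sketch in SQLite syntax as a stand-in for Postgres, with an illustrative follows table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE follows (src TEXT, dst TEXT);
    INSERT INTO follows VALUES ('a','b'), ('b','c'), ('c','d'), ('a','e');
""")

# Everyone reachable from 'a' within 3 hops.
reachable = [
    row[0]
    for row in conn.execute("""
        WITH RECURSIVE reach(node, depth) AS (
            SELECT 'a', 0
            UNION
            SELECT f.dst, r.depth + 1
            FROM follows f JOIN reach r ON f.src = r.node
            WHERE r.depth < 3
        )
        SELECT DISTINCT node FROM reach WHERE node <> 'a' ORDER BY node
    """)
]
```

Fine at depth 3 on four edges; on millions of edges at depth 5 or 6, this query plan is where the relational engine starts losing to a dedicated graph store.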

The default should be Postgres (or MySQL, if your stack is already there) until one of those walls shows up as a real problem, not a theoretical one. The majority of applications that introduce a specialized store alongside Postgres could have kept going on Postgres alone and avoided the operational cost of running a second system.

Document (MongoDB, DocumentDB)

Documents are JSON-shaped records, indexed by id and by selected fields. The promise is schema flexibility — store arbitrary documents, query by field — plus horizontal scaling by sharding on a chosen key.

Where document stores fit:

  • Content-heavy workloads where each record is a self-contained document with variable structure: CMS content, product catalogs with heterogeneous attributes, user-generated content with plugin fields.
  • Rapidly evolving schemas in early-stage products where the data shape is genuinely unsettled. (Note: in mature products, the shape settles, and the flexibility becomes a liability.)
  • Workloads that naturally denormalize — an “order” that contains line items, addresses, and customer snapshots as a single document maps cleanly onto a document model.

Where document stores mis-fit:

  • Queries across documents. A JOIN between two collections is either impossible or slow. Workloads that need to query relationships end up denormalizing heavily — duplicating data across documents, with the resulting consistency headaches.
  • Transactions. MongoDB has had multi-document transactions since 4.0; they work, but they are slower than the single-document operations the engine is optimized for. If you need transactions routinely, you are fighting the tool.
  • “We’ll figure out the schema later.” In practice, the schema is in the code — the code reads and writes fields with specific names and types — and letting the database not enforce it produces bugs that relational’s NOT NULL would have caught at insert.

The “document store vs. Postgres+jsonb” decision is close enough in most cases that the tiebreaker is usually “what does the team already run?” Postgres with a jsonb column gives you a document model with the escape hatch of full relational queries when you need them; that escape hatch is worth more than most teams expect it to be.

Key-value (Redis, DynamoDB, RocksDB)

The simplest model. A value (a string, a number, a serialized record) stored under a key. Get by key, set by key, delete by key. No query language. No joins. No secondary indexes (except sometimes, and with care).

This constraint is also the optimization. When your access pattern is genuinely “get by known key, set by known key,” a key-value store is one to two orders of magnitude faster than a relational store, scales trivially, and is operationally simple.

Where key-value stores fit:

  • Caching. The single biggest deployment of key-value stores is caches — Redis, Memcached — in front of a slower source of truth.
  • Session storage, rate-limiting counters, feature flags, short-lived state. The data is small, simple, accessed by a clear key, and often ephemeral.
  • At scale, when the access pattern is designed around the key. DynamoDB at scale, Cassandra for wide-column workloads — these stores win when the data model is shaped like “partition by X, sort by Y within each partition.” Fighting that shape produces bad performance and high bills.
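
The cache-aside pattern from the first bullet fits in a few lines. A sketch, with a dict standing in for Redis; CacheAside and load_fn are illustrative names, not any real client API:

```python
import time

class CacheAside:
    """Minimal cache-aside: a dict stands in for Redis or Memcached."""

    def __init__(self, load_fn, ttl_seconds=60):
        self.load_fn = load_fn      # the slow source of truth (e.g. a DB read)
        self.ttl = ttl_seconds
        self.store = {}             # key -> (value, expires_at)
        self.misses = 0

    def get(self, key):
        entry = self.store.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]         # hit: served from memory
        self.misses += 1
        value = self.load_fn(key)   # miss: fall through to the source of truth
        self.store[key] = (value, time.monotonic() + self.ttl)
        return value

cache = CacheAside(load_fn=lambda key: key.upper())  # stand-in for a slow query
first = cache.get("user:1")      # miss: loads from the source, fills the cache
second = cache.get("user:1")     # hit: no second load
```

Note what the pattern buys and costs: reads are served from memory, and the TTL bounds how stale a cached value can get. The cache is derived state, rebuildable from the source of truth.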

The DynamoDB-style discipline — designing the access patterns first, then shaping the data to fit them — is the opposite of the relational discipline, which lets you model the data and query it flexibly later. Neither is wrong; they fit different team styles and different workloads. Teams that adopt DynamoDB and try to use it like a relational database build expensive, slow systems that they eventually migrate off.
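
The "partition by X, sort by Y" shape can be sketched as a toy in-memory store (PartitionSortStore and its key layout are illustrative, not any real client API):

```python
from bisect import insort
from collections import defaultdict

class PartitionSortStore:
    """Toy in-memory sketch of the 'partition by X, sort by Y' data model."""

    def __init__(self):
        # partition key -> list of (sort_key, value), kept sorted on insert
        self.partitions = defaultdict(list)

    def put(self, pk, sk, value):
        insort(self.partitions[pk], (sk, value))

    def query(self, pk, sk_from, sk_to):
        # Range scan within ONE partition: the query shape this model makes
        # cheap. Anything cross-partition means touching every partition,
        # which is exactly the query these stores punish.
        return [v for sk, v in self.partitions.get(pk, [])
                if sk_from <= sk <= sk_to]

store = PartitionSortStore()
store.put("customer#42", "2024-01-05", "order-a")
store.put("customer#42", "2024-03-10", "order-b")
store.put("customer#7", "2024-02-01", "order-c")
```

"All orders for customer 42 in Q1" is one cheap partition scan; "all orders for sku B-17 across customers" is not expressible without scanning everything, and that asymmetry is the design.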

Graph (Neo4j, Neptune, Dgraph)

Graph stores model data as nodes and edges, both with properties. Queries are traversals — find all friends of friends who also follow this account — and are fundamentally recursive.

The use cases for graph stores are narrower than the marketing suggests but real:

  • Social networks at scale: friend graphs, follower graphs.
  • Recommendation engines that compute over relationships.
  • Fraud detection — finding rings of connected accounts, chains of transactions.
  • Knowledge graphs and semantic data — ontologies, taxonomies with rich interconnection.
  • Identity and permission systems with deeply nested group membership (see ReBAC in the authn/authz post).

Where graph stores mis-fit:

  • Data that is mostly tabular, with some relationships. A customer with orders is not a graph problem; it is a relational problem with two tables and a foreign key. Using a graph store because “everything is connected” produces the wrong optimization.
  • Transactional workloads with heavy updates. Most graph stores prioritize read-heavy traversal over high-throughput writes.

The practical test for whether you need a graph database: are your queries recursive to depth greater than 2, and do you run them frequently? If yes, a graph store is worth considering. If no, a relational store with a few well-indexed joins is usually better.
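
The depth-limited traversal in that test is just a breadth-first search over an adjacency list. A sketch, with an illustrative follower graph:

```python
from collections import deque

def neighbors_within(graph, start, max_depth):
    """All nodes reachable from start within max_depth hops (start excluded).

    This breadth-first traversal is the query shape graph stores optimize;
    graph is a plain adjacency list: node -> list of neighbor nodes.
    """
    seen = {start: 0}           # node -> depth at which it was first reached
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_depth:
            continue            # depth limit: do not expand further
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return {n for n, d in seen.items() if d > 0}

social = {"a": ["b", "e"], "b": ["c"], "c": ["d"]}
```

In a relational store, each depth level becomes another self-join; a graph store follows stored edge pointers instead, which is why the gap widens past depth 2.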

Time-series (TimescaleDB, InfluxDB, Prometheus)

Time-series workloads have a specific shape: append-heavy, ordered by time, queried primarily by time range, often aggregated by time buckets, rarely updated after insertion. Metrics, telemetry, IoT sensor data, financial ticks, application events.

A time-series store is optimized for this shape. Data is partitioned by time, often compressed in large chunks, indexed by the common query patterns (time range + tag filters), and retained with time-based policies (keep raw for 7 days, downsample to 1-minute aggregates for 30 days, 1-hour aggregates for 1 year).

TimescaleDB is a Postgres extension that adds this to a relational database — which is useful because time-series data is often correlated with business data you already have in Postgres. Prometheus is a dedicated time-series database optimized for the observability use case specifically (label-based metrics, PromQL, pull-based scraping). InfluxDB is a standalone general-purpose time-series database.

The diagnostic: if you are regularly querying “give me the p99 of this value, bucketed by 1-minute intervals, over the last 7 days, filtered by these tags” — a time-series database will make this 10× to 100× faster than a naive relational schema. If you are not, you do not need one.
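
That diagnostic query can be sketched in plain Python to show what the engine actually does; a real time-series store runs the same bucketing over compressed, time-partitioned chunks (bucket_percentile and its nearest-rank percentile rule are illustrative):

```python
import math
from collections import defaultdict

def bucket_percentile(samples, bucket_seconds, pct):
    """Group (timestamp, value) samples into fixed time buckets and take a
    percentile per bucket: the canonical time-series query shape.

    Uses the nearest-rank rule; real engines vary in interpolation method.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % bucket_seconds].append(value)  # floor to bucket start
    out = {}
    for start, values in buckets.items():
        values.sort()
        idx = max(0, min(len(values) - 1,
                         math.ceil(pct / 100 * len(values)) - 1))
        out[start] = values[idx]
    return out

# Two minutes of per-second samples, bucketed into 1-minute windows.
samples = [(ts, float(ts)) for ts in range(120)]
medians = bucket_percentile(samples, bucket_seconds=60, pct=50)
```

The time-series engine's advantage is that the bucketing, filtering by tag, and retention are pushed into the storage layout rather than computed over raw rows.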

Columnar / OLAP (ClickHouse, BigQuery, Snowflake, DuckDB)

Relational databases store data row-by-row: all the columns of row 1, then all the columns of row 2, and so on. Columnar databases store data column-by-column: all the values of column A across all rows, then all the values of column B. For aggregation queries — "what is the average of column A?" — columnar storage reads only the queried columns (roughly 1/N of the data when you need one of N columns) and compresses dramatically better because adjacent values within a column tend to be similar.

For OLTP workloads — insert a row, read a row, update a row — row-oriented storage wins. For OLAP workloads — scan billions of rows, aggregate, group, report — columnar storage wins by orders of magnitude.
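
A toy illustration of the layout difference: both layouts compute the same average, but the row layout must touch every field of every row, while the columnar layout scans one contiguous array:

```python
# Row layout: a list of records. Averaging column "a" still walks every
# record, including the wide padding field it never needs.
rows = [{"a": i, "b": i * 2, "pad": "x" * 100} for i in range(1000)]
avg_row = sum(r["a"] for r in rows) / len(rows)

# Columnar layout: one contiguous array per column. The aggregation reads
# only the "a" array; "b" and "pad" are never touched. Sorted, similar
# adjacent values are also why columnar data compresses so well.
columns = {
    "a": list(range(1000)),
    "b": [i * 2 for i in range(1000)],
    "pad": ["x" * 100] * 1000,
}
avg_col = sum(columns["a"]) / len(columns["a"])
```

In memory the difference is modest; on disk, where the row store must read the full row pages and the column store reads one compressed column file, it is the orders-of-magnitude gap described above.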

Where columnar stores fit:

  • Analytics dashboards, BI, reporting over large volumes.
  • Event-stream analysis — clickstream, logs, user behavior.
  • Financial and business analytics aggregated across customers, time, geographies.
  • Data warehouses, almost by definition.

The typical modern architecture is a transactional store (Postgres, MySQL, or a stream of events) as the source of truth, with CDC or batch ETL feeding a columnar store for analytics. The transactional store handles the day-to-day; the columnar store handles the questions the business asks at the end of the quarter. Trying to run analytics on the OLTP store is how the OLTP store gets slow for users.

ClickHouse in particular has become the pragmatic choice for self-hosted OLAP: much faster than general-purpose warehouses for the workloads it fits, operationally heavier than a managed service, and much cheaper at scale. DuckDB applies the same engineering philosophy in a single-node, embeddable form — often the right answer for analytics up to a few hundred gigabytes.

Search (Elasticsearch, OpenSearch, Meilisearch, Typesense)

Full-text search is a distinct problem. Relational LIKE '%term%' queries are slow and do not rank results. Proper search requires an inverted index — for each word, the list of documents that contain it — plus tokenization, stemming, stop-word handling, and a ranking function.
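
The inverted index itself is simple to sketch; what real engines add on top is the analysis chain and ranking (the stop-word list and the AND query semantics here are illustrative):

```python
from collections import defaultdict

STOP_WORDS = {"the", "a", "an", "of"}

def tokenize(text):
    # Toy analysis: lowercase, whitespace split, drop stop words. Real
    # engines add stemming, unicode folding, and n-grams here.
    return [t for t in text.lower().split() if t not in STOP_WORDS]

def build_index(docs):
    """Inverted index: word -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    # AND semantics: documents containing every query term. No ranking;
    # scoring (e.g. BM25) is the other half of a real search engine.
    postings = [index.get(t, set()) for t in tokenize(query)]
    return set.intersection(*postings) if postings else set()

docs = {1: "the quick brown fox", 2: "quick red fox", 3: "slow brown bear"}
index = build_index(docs)
```

A LIKE '%term%' query scans every row; the inverted index jumps straight to the posting list for each term, which is the whole performance story.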

Search engines are expensive to run well. Elasticsearch clusters are notoriously finicky; Meilisearch and Typesense are simpler but less featureful. Postgres’s built-in full-text search (tsvector, tsquery) is often good enough for moderate corpuses and avoids running a second store.

The practical test: if search is the primary feature (an e-commerce site, a documentation portal, a social platform’s search bar), a dedicated search engine is justified. If search is incidental (a basic filter on a list page), Postgres full-text search or ILIKE on a trigram index is usually enough.

Search engines also tend not to be the source of truth. Data lives in the primary store; the search engine is a derived index rebuilt from the primary. Treating the search engine as authoritative is a mistake — you will need to rebuild it, and you will need the source data to do so.

Polyglot persistence and its costs

The term of art for “use the right database for each job” is polyglot persistence. The marketing is good; the reality is operational.

Each additional database is:

  • A backup and recovery story that must be tested.
  • A monitoring surface with its own metrics and quirks.
  • An upgrade path with its own release cycle.
  • A skill set the team must maintain.
  • A data-consistency boundary — the data between two stores is eventually consistent, at best.
  • A cost line item that grows with scale.

A polyglot architecture pays off when the specialization is real — when a single-store approach is demonstrably inadequate for a specific workload. It does not pay off when it is aspirational — when the team adopts a new store because it is interesting, or because a blog post said it was scalable, and now has two databases to operate instead of one.

The working heuristic: every additional database needs a story. “We use DynamoDB for X because the access pattern is pure key-lookup at volumes Postgres cannot sustain.” “We use ClickHouse for Y because the analytics query pattern is orders of magnitude faster there.” If you cannot produce a sentence of this form, you are paying for polyglot persistence without benefiting from it.

The rule, once more

Start with Postgres. Stay on Postgres until a specific workload has outgrown it in a way you can name. Add specialized stores for those workloads, one at a time, with a clear operational owner for each. Keep the source of truth in as few places as possible. Treat derived stores — caches, search indexes, analytics warehouses — as rebuildable from the source.

The mistake is not “we chose the wrong database.” The mistake is “we chose the database before we understood the access pattern.” Postgres buys you the time to understand. Spend that time.