CAP Theorem and PACELC: What They Mean for Real Database Decisions

Why These Theorems Matter

When you choose between PostgreSQL and Cassandra, between DynamoDB and CockroachDB, between eventual consistency and strong consistency — you are making a trade-off that CAP theorem formalizes. Most engineers who use these databases have only a vague understanding of the trade-off they are accepting.

CAP theorem does not tell you which database to use. It tells you what is impossible — and therefore what every distributed database must sacrifice. Understanding what was sacrificed helps you understand when a database will fail you and why.

PACELC goes further. CAP only describes behavior during network partitions. PACELC also describes the trade-off that exists when everything is working normally, which is the vast majority of the time.

CAP Theorem: The Three Properties

CAP theorem states that a distributed data store can guarantee at most two of three properties simultaneously:

Consistency (C): Every read returns the most recent write or an error. All nodes see the same data at the same time. This is linearizability — the strongest form of consistency.

Availability (A): Every request receives a response (not an error). The system does not refuse to answer. Note: the response may contain stale data.

Partition Tolerance (P): The system continues operating when network partitions occur — when some nodes cannot communicate with others.

The CAP Triangle:

                 Consistency (C)
                       /\
                      /  \
                     /    \
                CA  /      \  CP
     (single node  /        \
            only) /          \
                 /____________\
  Availability (A)     AP     Partition Tolerance (P)

The Key Insight: P Is Not Optional

The most important — and most misunderstood — aspect of CAP is that partition tolerance is not a choice for distributed systems. Network partitions happen. They are not hypothetical edge cases; they are a physical reality of running software across multiple machines.

Network partitions happen because:
  - Network cables fail
  - Switches malfunction
  - Datacenters lose connectivity
  - A server's NIC fails
  - Cloud provider experiences a network event
  - GC pause makes a node temporarily unresponsive
  - Asymmetric routing causes one-way connectivity

In any system running on multiple machines,
you will experience network partitions.
You must decide what to do when they occur.
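Why does a GC pause belong on the same list as a cut cable? From a peer's point of view the two are indistinguishable: in both cases the heartbeats stop arriving. A minimal sketch, assuming a simple timeout-based detector (the function name, node names, and the 5-second timeout are illustrative, not taken from any particular database):

```python
def unreachable_peers(last_seen, now, timeout=5.0):
    """Return peers whose last heartbeat is older than `timeout` seconds."""
    return sorted(peer for peer, seen in last_seen.items() if now - seen > timeout)


# Timestamps (in seconds) of the last heartbeat received from each peer.
last_seen = {
    "node-b": 98.0,   # healthy, heartbeat 2 s ago
    "node-c": 91.0,   # cable unplugged 9 s ago
    "node-d": 92.5,   # alive, but stuck in a 7.5 s GC pause
}

suspects = unreachable_peers(last_seen, now=100.0)
print(suspects)  # ['node-c', 'node-d']
```

The detector cannot tell node-c from node-d, so the system has to treat both as partitioned and decide how to behave.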

Since P is mandatory, the real choice is: when a partition occurs, do you choose Consistency or Availability?

CP system: When a partition occurs, some nodes refuse to answer requests (sacrifice availability) to ensure every response is consistent.
AP system: When a partition occurs, nodes continue answering requests with potentially stale data (sacrifice consistency) to remain available.

There is no CA system in a distributed context — you cannot have both consistency and availability when a network partition exists. CA is only possible on a single machine.
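The choice can be made concrete with a toy model. This is a sketch only: `Replica`, `Mode`, and `Unavailable` are hypothetical names, not any database's API. A replica that has lost contact with its peers either refuses to answer (CP) or answers from its local, possibly stale copy (AP).

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    CP = "cp"  # prefer consistency: refuse when peers are unreachable
    AP = "ap"  # prefer availability: answer from local state


class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm it has the latest data."""


@dataclass
class Replica:
    mode: Mode
    data: dict = field(default_factory=dict)
    partitioned: bool = False  # True when this node cannot reach its peers

    def read(self, key):
        if self.partitioned and self.mode is Mode.CP:
            raise Unavailable("cannot reach peers; refusing a possibly stale read")
        # AP mode, or a healthy network: answer from the local copy.
        return self.data.get(key)


cp = Replica(Mode.CP, {"stock": 5}, partitioned=True)
ap = Replica(Mode.AP, {"stock": 5}, partitioned=True)

try:
    cp.read("stock")
    cp_result = "answered"
except Unavailable:
    cp_result = "refused"

ap_result = ap.read("stock")  # may be stale, but the client gets an answer
print(cp_result, ap_result)   # refused 5
```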

What Happens During a Partition

CP system behavior (PostgreSQL with synchronous replication):

Normal operation:
  Client writes → Primary → confirms to Replica → returns success

During partition (Primary cannot reach Replica):
  Client writes → Primary → cannot confirm to Replica
  Primary does NOT acknowledge the write (the commit waits for the Replica)
  → Client sees a hang or timeout instead of success
  → Availability sacrificed to preserve consistency

When partition heals:
  Replica catches up and pending commits are acknowledged
  Primary resumes acknowledging writes
  All nodes are consistent

AP system behavior (Cassandra with eventual consistency):

Normal operation:
  Client writes → Any node accepts → replicates in background

During partition (Node A cannot reach Node B):
  Client writes to Node A → Node A accepts → returns success
  Node A cannot reach Node B → B gets the write later
  Client reads from Node B → sees stale data (the old value)
  → Availability preserved, consistency temporarily sacrificed

When partition heals:
  Nodes sync via anti-entropy
  Eventually both nodes have the same data
  Conflict resolution (LWW, CRDT) handles any divergence
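To show what "handles any divergence" means for last-write-wins (LWW), here is a minimal sketch. It assumes each replica stores a `(timestamp, value)` pair per key and that the timestamps are comparable across nodes. The function name and the sample keys are made up for illustration.

```python
def lww_merge(a, b):
    """Merge two replica states; each maps key -> (timestamp, value)."""
    merged = dict(a)
    for key, (ts, value) in b.items():
        # Keep whichever side holds the newer timestamp for this key.
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged


# During the partition, each side accepted a different write to the same key.
node_a = {"cart:42": (100, ["book"])}
node_b = {"cart:42": (105, ["pen"]), "cart:7": (90, ["lamp"])}

healed = lww_merge(node_a, node_b)
print(healed["cart:42"])  # (105, ['pen'])
```

Note what happened to the write on node A: it was silently discarded. LWW guarantees convergence, not that every accepted write survives, and it is only as reliable as the clocks that produce the timestamps.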

The CAP Classification of Real Databases

CP Databases (sacrifice availability during partitions):
  PostgreSQL      — synchronous replication refuses writes when replica unreachable
  MySQL           — with semi-synchronous replication (the default is asynchronous)
  HBase           — a region is unavailable while its RegionServer is unreachable
  ZooKeeper       — refuses writes without quorum
  Etcd            — refuses writes without Raft quorum
  MongoDB         — with write concern majority

AP Databases (sacrifice consistency during partitions):
  Cassandra       — accepts writes on any node, eventually consistent
  DynamoDB        — eventually consistent by default
  CouchDB         — accepts writes, syncs later
  Riak            — eventual consistency, conflict resolution
  DNS             — classically AP: stale responses during partition

"CA" (single-node, not truly distributed):
  SQLite          — single file, no distribution
  Single-node PostgreSQL without replication
  — these are not distributed systems; CAP doesn't apply

Important nuance — tunable consistency:

Many "AP" databases allow you to tune consistency per operation. Cassandra can be configured for strong consistency (CP behavior) by requiring quorum reads and writes. DynamoDB has strongly consistent reads as an option. The classification is about the default or typical configuration.

-- Cassandra: trade availability for consistency per query.
-- In CQL 3 the level is set per request by the driver, or per session in
-- cqlsh with the CONSISTENCY command (the old USING CONSISTENCY clause was removed).

-- AP default (fast, eventually consistent):
CONSISTENCY ONE
SELECT * FROM orders WHERE id = 1;

-- CP behavior (slower, strongly consistent, may error during partition):
CONSISTENCY QUORUM
SELECT * FROM orders WHERE id = 1;
INSERT INTO orders ...;
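Why do quorum reads plus quorum writes give strong consistency? With N replicas, a read that contacts R nodes and a write that contacts W nodes must share at least one node whenever R + W > N, so the read always reaches a replica holding the latest acknowledged write. A small sketch of that arithmetic, assuming a replication factor of 3 (the helper names are illustrative):

```python
def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def overlaps(n, r, w):
    """True when every read set must intersect every write set."""
    return r + w > n


n = 3
print(overlaps(n, r=1, w=1))                  # False: ONE/ONE can miss the latest write
print(overlaps(n, r=quorum(n), w=quorum(n)))  # True: QUORUM/QUORUM always overlap
print(n - quorum(n))                          # 1: replicas that may be down before quorum fails
```

The last line is the availability cost: with three replicas, losing contact with two of them makes QUORUM operations fail.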

The Problem with CAP: It Only Covers Failures

CAP theorem only describes behavior during network partitions, but partitions are rare: most of the time the network is healthy. During normal operation, every replicated database makes a different trade-off: latency versus consistency.

This is what PACELC captures.

PACELC: The More Complete Model

PACELC was proposed by Daniel Abadi in 2012 as an extension to CAP. It describes trade-offs in both failure and normal operation:

PACELC:

P: if there is a Partition
  A: choose Availability
  C: or Consistency

E: Else (no partition, normal operation)
  L: choose Latency
  C: or Consistency

In plain English: during partitions, you choose A or C. During normal operation, you choose L (low latency) or C (strong consistency).

Why the EL/EC trade-off matters:

To achieve strong consistency (C) during normal operation, every write must be confirmed by multiple nodes before returning success. This adds network round-trips → higher latency.

To achieve low latency (L), writes return as soon as the local node commits — before replicas confirm. This means different nodes may have different values momentarily.

EC (Else — favor Consistency over Latency):
  Write to primary
  Wait for majority of replicas to confirm (network round-trip)
  Return success to client
  Latency: primary write time + 1 network RTT (10–50ms cross-datacenter)

EL (Else — favor Latency over Consistency):
  Write to local node
  Return success immediately
  Replicate asynchronously in background
  Latency: local write time only (< 1ms)
  Trade-off: reads may return stale data briefly
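The two write paths can be compared with a simple latency model. This is a sketch under stated assumptions: replicas are contacted in parallel, the primary counts toward the majority, and the round-trip times are invented numbers chosen to match the cross-datacenter range above.

```python
def ec_write_latency_ms(local_ms, replica_rtts_ms):
    """EC: wait until a majority of all copies (primary included) has the write."""
    copies = len(replica_rtts_ms) + 1
    majority = copies // 2 + 1
    acks_needed = majority - 1  # the primary already holds the write
    if acks_needed == 0:
        return local_ms
    # Replicas are contacted in parallel, so the wait is the k-th fastest RTT.
    return local_ms + sorted(replica_rtts_ms)[acks_needed - 1]


def el_write_latency_ms(local_ms, replica_rtts_ms):
    """EL: acknowledge after the local commit; replication happens later."""
    return local_ms


rtts = [12.0, 35.0]  # assumed round-trip times to two remote replicas
print(ec_write_latency_ms(0.5, rtts))  # 12.5
print(el_write_latency_ms(0.5, rtts))  # 0.5
```

With three copies, the EC write only waits for the fastest replica, which is why placing one replica close to the primary noticeably reduces the cost of strong consistency.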

PACELC Classification of Real Databases

Database                      Partition  Else    Notes
──────────────────────────────────────────────────────────────────────────────
PostgreSQL (sync replicas)    CP         EC      Commit waits for replica confirmation
PostgreSQL (async replicas)   CP         EL      Default async replication
Cassandra                     AP         EL      Default eventual consistency;
                                                 QUORUM reads/writes → CP/EC
DynamoDB                      AP         EL      Eventually consistent default;
                                                 strongly consistent reads → EC
CockroachDB                   CP         EC      Always strongly consistent;
                                                 higher write latency by design
Google Spanner                CP         EC      TrueTime for global consistency;
                                                 ~10ms write latency cross-region
MongoDB                       CP         EC/EL   Depends on write concern:
                                                 majority = EC, w:1 = EL
HBase                         CP         EC      Strong consistency always
Riak                          AP         EL      Tunable via N/W/R values

Concrete Examples: Where the Trade-off Shows Up

Example 1 — E-commerce checkout (need CP/EC):

User completes payment.
Two operations must succeed atomically:
  1. Charge payment method
  2. Decrement inventory

If the system is AP/EL:
  Payment charges successfully.
  Network partition occurs before inventory decrement replicates.
  User sees "order confirmed."
  Another user also buys the last item.
  Inventory goes to -1.

If the system is CP/EC:
  Payment charges successfully.
  Inventory decrement waits for majority acknowledgment.
  If partition occurs: operation returns error.
  User sees "order failed, please retry."
  Inventory is never decremented without majority acknowledgment.
  Higher latency, but no overselling.

Decision: CP/EC for financial transactions, inventory updates.

Example 2 — Social media "like" counter (AP/EL is fine):

User clicks Like on a post.
Post like count is 10,000.

If AP/EL:
  Like written to local node immediately.
  Other nodes may still show 10,000 instead of 10,001 for a few milliseconds.
  User never notices — approximate counts are acceptable.
  Zero impact on user experience.
  Very low write latency.

If CP/EC:
  Like write must be confirmed by majority before returning.
  10–50ms additional latency per like.
  At 1M likes/day: 10,000–50,000 seconds (about 3–14 hours) of cumulative user wait per day.
  No observable correctness benefit for this use case.

Decision: AP/EL for non-critical counters, activity feeds, view counts.
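Counters are also a good fit for AP/EL because they can be merged without losing updates. One well-known way to do this is a grow-only counter (G-Counter), a simple CRDT: each node increments only its own slot, and replicas merge by taking the per-node maximum. The class below is a minimal sketch of that idea, not the implementation any specific database uses.

```python
class GCounter:
    """Grow-only counter: one slot per node, merged by element-wise max."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + 1

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment()  # one like recorded on node a
b.increment()  # two concurrent likes recorded on node b
b.increment()
print(a.value(), b.value())  # 1 2  (each node sees only its own likes so far)

a.merge(b)
b.merge(a)
print(a.value(), b.value())  # 3 3  (converged, no like lost)
```

Merging is idempotent and order-independent, so replicas can exchange state as often and in whatever order the network allows.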

Example 3 — Bank account balance (CP/EC, always):

User checks account balance: $1,000.
User initiates transfer: -$500.

If AP/EL:
  Transfer written locally, returns success.
  User immediately reads balance from another node: still shows $1,000.
  User initiates another transfer: -$700.
  Both transfers process. Balance goes to -$200.

If CP/EC:
  Transfer confirmed by majority before success returned.
  Every subsequent read reflects $500, whichever node serves it.
  Second transfer attempt rejected (insufficient funds).

Decision: CP/EC mandatory for financial balances.
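The overdraft above is a check-then-act bug: the funds check runs against a stale balance. A minimal sketch of both outcomes, using the amounts from the example (the `transfer` helper and the variable names are hypothetical):

```python
def transfer(balance_seen, amount):
    """Approve a transfer based on the balance this node currently sees."""
    if amount > balance_seen:
        raise ValueError("insufficient funds")
    return balance_seen - amount


# AP/EL: node B has not yet received the first transfer.
balance_on_a = transfer(1000, 500)   # node A now holds 500
stale_view_on_b = 1000               # node B still sees the old balance
transfer(stale_view_on_b, 700)       # approved against stale data
true_balance_ap = 1000 - 500 - 700   # what the account holds once both apply

# CP/EC: the second transfer is checked against the confirmed balance.
try:
    transfer(balance_on_a, 700)
    second_cp = "approved"
except ValueError:
    second_cp = "rejected"

print(true_balance_ap, second_cp)  # -200 rejected
```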

The Decision Framework

Does incorrect data cause financial loss, legal liability, or safety issues?
  └─ Yes → CP/EC (strong consistency always)
     Examples: payments, inventory, account balances, medical records

Is the data used for display/analytics where approximate values are acceptable?
  └─ Yes → AP/EL (eventual consistency, low latency)
     Examples: view counts, like counters, activity feeds, search indexes

Do users expect to read their own writes immediately?
  └─ Yes → CP or AP with read-your-own-writes guarantee
     (Cassandra with QUORUM reads and writes, or route reads to the node just written)

Is the system deployed across multiple datacenters?
  └─ Yes, active-active → AP/EL natural fit (avoid cross-DC write latency)
  └─ Yes, active-passive → CP feasible (async replica in secondary DC)

What is the acceptable write latency?
  └─ < 5ms → EL usually required across datacenters (EC adds at least one network RTT)
  └─ 10–50ms acceptable → EC viable (strong consistency worth the cost)

Are you building a financial system, healthcare app, or e-commerce inventory?
  └─ Start with CP/EC. The cost of inconsistency exceeds the cost of latency.

Are you building social features, analytics, or user activity tracking?
  └─ AP/EL. Eventual consistency is usually invisible to users for these features, and it keeps writes fast.
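The first questions in this framework can be condensed into a few lines of code. Treat this as a conversation starter rather than a rule engine: the `Workload` fields and the `recommend` function are hypothetical, the 5 ms threshold is the one used above, and falling back to CP/EC when nothing else applies is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    wrong_data_is_costly: bool    # financial loss, legal liability, safety
    approximate_ok: bool          # display or analytics data
    max_write_latency_ms: float   # latency budget for a single write


def recommend(w: Workload) -> str:
    # Correctness requirements win over everything else.
    if w.wrong_data_is_costly:
        return "CP/EC"
    if w.approximate_ok or w.max_write_latency_ms < 5:
        return "AP/EL"
    return "CP/EC"  # assumption: default to the safer choice when unsure


payments = Workload(wrong_data_is_costly=True, approximate_ok=False, max_write_latency_ms=50)
likes = Workload(wrong_data_is_costly=False, approximate_ok=True, max_write_latency_ms=2)
print(recommend(payments), recommend(likes))  # CP/EC AP/EL
```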

The Practical Summary

CAP theorem tells you what is impossible. PACELC tells you what you are trading even when nothing is broken. Together they answer one question:

"What happens to my data when something goes wrong — and what am I paying in normal operation for that guarantee?"

Most applications need both: strong consistency for transactional data (payments, inventory, auth) and eventual consistency for non-critical data (feeds, analytics, counters). The architecture that serves both is not one database configured one way — it is the right database for each workload.

🧭 What's Next

Post 18: Distributed Transactions — once data spans multiple nodes or services, ACID transactions no longer come for free; two-phase commit and the Saga pattern are the two main answers, and they trade safety for availability in opposite ways
