Transactions & ACID: Isolation Levels, Locking and the Bugs They Prevent
The Bug That Appears Only in Production
You have a banking application. Two users simultaneously transfer money from the same account. In your local environment, this never happens — you test one operation at a time. In production, with thousands of concurrent users, both transfers execute simultaneously, and the account balance becomes negative. No error is thrown. No log entry is written. The money simply vanishes.
This class of bug — invisible in development, catastrophic in production — is a concurrency anomaly. It is the reason transactions exist. Without transactions, every multi-step database operation is a potential source of silent data corruption.
This post explains exactly how transactions protect you: what ACID properties guarantee, what can still go wrong at each isolation level, and how the database implements these guarantees under the hood.
What Is a Transaction?
A transaction is a sequence of database operations that the database treats as a single indivisible unit. Either all operations succeed and are committed, or none of them are — the database rolls back to the state before the transaction started.
-- Without a transaction: each statement is its own unit
-- If the second UPDATE fails, Alice loses money but Bob never receives it
UPDATE accounts SET balance = balance - 1000 WHERE id = 1; -- Alice
UPDATE accounts SET balance = balance + 1000 WHERE id = 2; -- Bob
-- With a transaction: both succeed or both roll back
BEGIN;
UPDATE accounts SET balance = balance - 1000 WHERE id = 1;
UPDATE accounts SET balance = balance + 1000 WHERE id = 2;
COMMIT; -- both changes persist
-- If anything goes wrong:
BEGIN;
UPDATE accounts SET balance = balance - 1000 WHERE id = 1;
-- application crashes here
ROLLBACK; -- Alice's money is restored automaticallyThe database enforces four properties on every transaction, known collectively as ACID.
ACID: The Four Guarantees
Atomicity — All or Nothing
A transaction is atomic: either every operation in it succeeds, or none of them do. There is no partial success.
If your server crashes after deducting from Alice's account but before crediting Bob's, the database recovers to the state before the transaction began. Alice's balance is restored. No money is lost.
How it works: The database records every change to a Write-Ahead Log (WAL) before applying it. On crash recovery, incomplete transactions (those without a COMMIT record in the WAL) are rolled back.
BEGIN;
INSERT INTO orders (customer_id, amount) VALUES (42, 299.99);
INSERT INTO order_items (order_id, product_id, qty) VALUES (LASTVAL(), 7, 1);
UPDATE inventory SET stock = stock - 1 WHERE product_id = 7;
COMMIT;
-- All three succeed, or none do. Inventory is never decremented without an order.Consistency — Rules Are Always Enforced
A transaction can only bring the database from one valid state to another valid state. All constraints — primary keys, foreign keys, CHECK constraints, NOT NULL — are enforced at commit time.
If any constraint is violated, the transaction is rolled back in its entirety.
-- This transaction violates a CHECK constraint: balance >= 0
BEGIN;
UPDATE accounts SET balance = balance - 9999 WHERE id = 1;
-- Alice only has 500 — CHECK (balance >= 0) fails
COMMIT;
-- The entire transaction is rolled back. Balance unchanged.Important: Consistency is partly the database's responsibility (enforcing constraints) and partly the application's responsibility (writing correct business logic). The database cannot enforce rules it does not know about.
Isolation — Concurrent Transactions Do Not See Each Other's Incomplete Work
When multiple transactions run concurrently, each transaction behaves as if it is the only one running. Intermediate states of other transactions are invisible.
This is the most complex of the four properties — and the most nuanced. Total isolation (each transaction executes serially, one at a time) is perfectly safe but too slow for any real application. Weaker isolation allows more concurrency but introduces specific anomalies.
The four standard isolation levels are a spectrum from weakest (most anomalies allowed, most concurrency) to strongest (no anomalies, least concurrency). We will cover all four below.
Durability — Committed Data Survives Crashes
Once a transaction is committed, its changes persist even if the server crashes immediately after. The commit is permanent.
How it works: Before returning a COMMIT acknowledgment to the client, the database flushes the WAL record for this transaction to durable storage (disk). Even if the server crashes the next millisecond, the WAL record survives and the transaction is re-applied on restart.
Timeline:
T=0: BEGIN
T=1: UPDATE accounts ...
T=2: COMMIT sent by client
T=3: Database writes WAL record to disk (fsync)
T=4: Database returns "COMMIT" to client ← data is now durable
T=5: Server crashes
T=6: Server restarts, reads WAL, re-applies committed transaction
T=7: Data is intactThe fsync call at T=3 is expensive — it forces the OS to flush write buffers to the physical disk. This is why disabling fsync (fsync=off in PostgreSQL) makes writes dramatically faster but removes the durability guarantee. Never disable fsync in production.
The Concurrency Anomalies
Before explaining isolation levels, you need to understand the four anomalies they protect against.
Dirty Read
Transaction A reads data that Transaction B has written but not yet committed. If B rolls back, A has read data that never officially existed.
Time Transaction A Transaction B
───────────────────────────────────────────────────
T1 BEGIN
T2 UPDATE accounts SET balance = 0 WHERE id=1
T3 BEGIN
T4 SELECT balance FROM accounts -- reads 0 (B's uncommitted write)
WHERE id=1 → returns 0
T5 ROLLBACK -- B decides to abort
T6 -- A makes decisions based on -- balance is back to original value
-- a balance that never existedA dirty read is the most dangerous anomaly. A read an uncommitted, then-rolled-back value and made real decisions based on ghost data.
Non-Repeatable Read
Transaction A reads the same row twice and gets different values because Transaction B committed a change between A's two reads.
Time Transaction A Transaction B
───────────────────────────────────────────────────
T1 BEGIN
T2 SELECT balance FROM accounts
WHERE id=1 → returns 1000
T3 BEGIN
T4 UPDATE accounts SET balance=500 WHERE id=1
T5 COMMIT
T6 SELECT balance FROM accounts
WHERE id=1 → returns 500 -- different value! B changed it.
T7 COMMIT
-- A read 1000, then 500. Which is the "real" balance for this transaction?Phantom Read
Transaction A executes a range query twice. Between the two reads, Transaction B inserts or deletes rows that fall within the range. A sees different sets of rows on each read.
Time Transaction A Transaction B
─────────────────────────────────────────────────────────
T1 BEGIN
T2 SELECT COUNT(*) FROM orders
WHERE customer_id=42 → returns 5
T3 BEGIN
T4 INSERT INTO orders (customer_id, ...) VALUES (42, ...)
T5 COMMIT
T6 SELECT COUNT(*) FROM orders
WHERE customer_id=42 → returns 6 -- phantom row appeared
T7 COMMIT
-- A counted 5 orders, then 6. Which count is correct for this transaction?Lost Update
Two transactions read the same value, both modify it, and both write back. The second write overwrites the first, silently discarding it.
Time Transaction A Transaction B
───────────────────────────────────────────────────
T1 BEGIN BEGIN
T2 SELECT balance FROM accounts
WHERE id=1 → reads 1000
T3 SELECT balance FROM accounts
WHERE id=1 → reads 1000
T4 UPDATE accounts
SET balance = 1000 - 200 = 800
WHERE id=1
T5 UPDATE accounts
SET balance = 1000 - 300 = 700
WHERE id=1
T6 COMMIT COMMIT
-- Final balance: 700. A's deduction of 200 is lost. Should be 500.The Four Isolation Levels
SQL standard defines four isolation levels. Each level prevents a specific set of anomalies.
Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read | Lost Update |
|---|---|---|---|---|
Read Uncommitted | ✅ possible | ✅ possible | ✅ possible | ✅ possible |
Read Committed | ❌ prevented | ✅ possible | ✅ possible | ✅ possible |
Repeatable Read | ❌ prevented | ❌ prevented | ✅ possible* | ❌ prevented |
Serializable | ❌ prevented | ❌ prevented | ❌ prevented | ❌ prevented |
*PostgreSQL's Repeatable Read also prevents phantom reads in practice due to MVCC implementation.
Level 1: Read Uncommitted
The weakest isolation level. Transactions can read uncommitted changes from other transactions.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
BEGIN;
SELECT balance FROM accounts WHERE id = 1;
-- May return a value that another transaction wrote but hasn't committed yet
COMMIT;Used in practice: Almost never. The dirty read anomaly makes this unsafe for almost any real use case. Some analytics workloads use it to avoid lock overhead when approximate results are acceptable.
Level 2: Read Committed (Default in PostgreSQL and Oracle)
Each statement within a transaction sees only data committed before that statement began. Uncommitted changes from other transactions are invisible.
-- PostgreSQL default
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
SELECT balance FROM accounts WHERE id = 1; -- sees committed data only
-- another transaction commits a change here
SELECT balance FROM accounts WHERE id = 1; -- may see the NEW committed value
COMMIT;Prevents: Dirty reads. Allows: Non-repeatable reads, phantom reads, some lost updates.
Used in practice: The default for most applications. Sufficient for workloads where each statement is logically independent. Not safe for multi-statement transactions that need a consistent snapshot.
Level 3: Repeatable Read (Default in MySQL InnoDB)
A transaction sees a consistent snapshot of data as it existed when the transaction began. Other transactions' commits after this point are invisible — re-reading the same row always returns the same value.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
SELECT balance FROM accounts WHERE id = 1; -- reads 1000
-- another transaction commits UPDATE accounts SET balance=500 WHERE id=1
SELECT balance FROM accounts WHERE id = 1; -- still reads 1000 (snapshot)
COMMIT;Prevents: Dirty reads, non-repeatable reads, lost updates. Allows: Phantom reads (in standard SQL; PostgreSQL prevents these too via MVCC).
How it works — MVCC (Multi-Version Concurrency Control):
Instead of locking rows, the database keeps multiple versions of each row. When a transaction begins, it gets a snapshot ID. Reads return the most recent version committed before that snapshot ID. Other transactions write new versions without overwriting old ones.
Row versions for accounts WHERE id=1:
Version 1: balance=1000, created_at_txn=100, deleted_at_txn=NULL ← our snapshot sees this
Version 2: balance=500, created_at_txn=250, deleted_at_txn=NULL ← committed after our snapshot
Our transaction (snapshot_id=200) reads Version 1.
Transaction with snapshot_id=300 reads Version 2.
No locks needed for reads. Readers never block writers. Writers never block readers.MVCC is why PostgreSQL and MySQL can handle thousands of concurrent readers without contention — reads never take locks.
Level 4: Serializable
The strongest isolation level. Transactions execute as if they ran one at a time — serially. The result of any concurrent execution is identical to some serial ordering of those transactions.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
SELECT COUNT(*) FROM orders WHERE customer_id = 42; -- reads 5
-- another transaction inserts an order for customer 42
SELECT COUNT(*) FROM orders WHERE customer_id = 42; -- still reads 5
COMMIT; -- if a conflict is detected, this COMMIT may fail with serialization error
-- Application must retry the transaction on serialization failurePrevents: All anomalies. Cost: Higher overhead. Transactions may fail with a serialization error and require retry logic in the application.
How PostgreSQL implements it — SSI (Serializable Snapshot Isolation):
PostgreSQL uses SSI, which tracks read/write dependencies between concurrent transactions. If a cycle of dependencies is detected (meaning the transactions could not have run serially in any order), one transaction is aborted. This approach avoids most locking while still guaranteeing serializability.
# Application must handle serialization failures
import psycopg2
from psycopg2 import errors
def transfer_money(from_id, to_id, amount):
max_retries = 3
for attempt in range(max_retries):
try:
with conn.cursor() as cur:
cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
cur.execute("BEGIN")
cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
(amount, from_id))
cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
(amount, to_id))
conn.commit()
return # success
except errors.SerializationFailure:
conn.rollback()
if attempt == max_retries - 1:
raise # give up after max retriesLocking: The Other Side of Isolation
MVCC handles read-write conflicts without locks. Write-write conflicts still require locking.
Shared and Exclusive Locks
-- Shared lock (S): multiple transactions can hold simultaneously
-- Acquired automatically on SELECT in some isolation levels
SELECT * FROM accounts WHERE id = 1;
-- Explicit shared lock: "I'm reading this, please don't let anyone modify it"
SELECT * FROM accounts WHERE id = 1 FOR SHARE;
-- Exclusive lock (X): only one transaction can hold at a time
-- Acquired automatically on UPDATE, DELETE, INSERT
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- Explicit exclusive lock: "I'm about to modify this, lock it now"
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
-- Blocks other transactions from reading (FOR SHARE) or writing (FOR UPDATE) this rowPreventing Lost Updates with SELECT FOR UPDATE
The lost update scenario from earlier can be fixed with SELECT FOR UPDATE:
-- Transaction A
BEGIN;
SELECT balance FROM accounts WHERE id=1 FOR UPDATE; -- acquires exclusive lock
-- Transaction B tries the same: BLOCKS until A commits or rolls back
UPDATE accounts SET balance = balance - 200 WHERE id=1;
COMMIT; -- lock released, Transaction B proceeds with correct valueDeadlocks
A deadlock occurs when two transactions each hold a lock the other needs:
Transaction A holds lock on row 1, wants lock on row 2.
Transaction B holds lock on row 2, wants lock on row 1.
Neither can proceed. Both wait forever.Databases detect deadlocks automatically and abort one of the transactions (typically the one that has done less work). The aborted transaction receives an error and must be retried.
# Always be prepared to retry on deadlock
try:
execute_transaction()
except DeadlockDetected:
rollback()
retry_transaction()Prevention: Always acquire locks in the same order across all transactions. If every transaction locks row 1 before row 2, deadlocks on this pair are impossible.
Choosing the Right Isolation Level
Does your transaction read data and then make decisions based on it?
└─ Yes → Use REPEATABLE READ or SERIALIZABLE
└─ No (each statement is independent) → READ COMMITTED is fine
Does your transaction check a condition and then insert/update based on it?
(e.g. "if no duplicate exists, insert")
└─ Yes → Use SERIALIZABLE to prevent write skew anomaly
Does your transaction touch multiple rows that must be consistent together?
└─ Yes → Use REPEATABLE READ or SERIALIZABLE
Is your workload read-heavy analytics where approximate results are OK?
└─ Yes → READ COMMITTED with no explicit locks
Are you implementing optimistic concurrency (version/timestamp column)?
└─ READ COMMITTED + application-level version check is commonPractical default: Use READ COMMITTED (PostgreSQL default) for most operations. Upgrade to REPEATABLE READ for transactions that need a consistent snapshot. Upgrade to SERIALIZABLE for transactions with complex invariants that must hold across multiple rows.
Chapter 1 Complete
You have finished the Database Fundamentals chapter:
✅ Post 01: What is a Database & DBMS
✅ Post 02: Relational Model & Normalization
✅ Post 03: SQL Fundamentals
✅ Post 04: Database Indexes
✅ Post 05: Transactions & ACID
With these five concepts — the relational model, SQL, indexes, and transactions — you have the foundation to understand everything that follows: how data is stored on disk, how databases replicate across machines, how they scale to billions of rows, and how distributed systems keep data consistent without sacrificing availability.
🧭 What's Next — Chapter 2: SQL Deep Dive
Post 06: Advanced SQL — window functions, CTEs, and recursive queries that turn 20-line subquery nightmares into clean, readable SQL
Related
Database Indexes: B-Tree, Hash, Composite — When and Why to Use Them
Indexes are the biggest lever for database performance — and the most misused. Learn how B-Tree and Hash indexes work, when to add them, and when they hurt more than help.
SQL Fundamentals: SELECT, JOIN, GROUP BY and the Queries That Trip Everyone Up
SQL is easy to start and hard to master. This post covers every JOIN type, GROUP BY traps, subqueries, and the mental model that makes complex queries feel obvious.
Relational Model & Normalization: How to Design Tables That Last
Bad schema design creates bugs invisible until production. Learn the relational model from first principles, then normalize to 3NF/BCNF with real examples and trade-offs.
Comments