Recursion & Backtracking: Think in Trees, Code in Functions
The Mental Model That Changes Everything
Most candidates learn recursion by example: fibonacci, factorial, binary search. They learn backtracking by pattern-matching: "if the problem says 'all combinations,' use backtracking." Neither approach builds the mental model that makes recursion and backtracking feel intuitive rather than mysterious.
Here is that mental model, stated once and clearly:
Every recursive function call creates a node in a tree. The function's parameters are the node's label. The recursive calls are the node's children. The base cases are the leaves. When you write a recursive function, you are describing how to traverse a tree — even if no tree exists in the problem.
Once you see recursion as tree traversal, everything falls into place:
Base case = what to do at a leaf node (nothing left to recurse into)
Recursive case = what to do at an internal node (process current state, spawn children)
Backtracking = DFS on a tree where you undo state after visiting each child
Pruning = not visiting subtrees whose root already violates the constraints
Memoization = caching results at nodes you've visited before
Every technique in this post is a consequence of this one idea.
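To make the tree visible, a small tracing helper (illustrative only, not part of this post's problem set) can print each call of naive Fibonacci indented by its depth, so the printout is literally the recursion tree:

```python
# Illustrative: prints the recursion tree of naive Fibonacci.
def fib_trace(n, depth=0):
    print("  " * depth + f"fib({n})")  # this call is one tree node
    if n <= 1:                         # base case = leaf
        return n
    # recursive calls = this node's children
    return fib_trace(n - 1, depth + 1) + fib_trace(n - 2, depth + 1)

fib_trace(3)
# fib(3)
#   fib(2)
#     fib(1)
#     fib(0)
#   fib(1)
```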
Part 1: Recursion Fundamentals
The Three Questions Before Writing Any Recursive Function
Before writing code, answer these three questions. They define the function completely.
Q1: What does this function return / compute? Be precise. Not "it solves the problem" — what specifically does it return given its parameters?
Q2: What is the base case? What is the smallest input where the answer is trivially known? (Empty array, n=0, n=1, reached a leaf, etc.)
Q3: How does the recursive case use smaller subproblems? Assume the function works correctly for smaller inputs. How do you combine those results to solve the current input?
# Example: maximum depth of a binary tree
# Q1: Returns the maximum depth of the subtree rooted at `node`
# Q2: If node is None, depth is 0
# Q3: depth = 1 + max(depth of left subtree, depth of right subtree)
def max_depth(node):
    if not node:                      # base case
        return 0
    left = max_depth(node.left)       # trust the recursion
    right = max_depth(node.right)
    return 1 + max(left, right)       # combine

The phrase "trust the recursion" is not hand-waving — it's a formal principle called strong induction. If the function is correct for all inputs of size < n, and you write the recursive case correctly in terms of smaller inputs, the function is correct for size n. You never need to mentally simulate more than one level of recursion.
The Call Stack — Why Recursion Has O(n) Space
Each recursive call occupies a stack frame — a block of memory holding the function's local variables and return address. If a function calls itself n times before hitting the base case, there are n stack frames alive simultaneously: O(n) space.
# This uses O(n) call stack space
def sum_recursive(n):
    if n == 0:
        return 0
    return n + sum_recursive(n - 1)   # n frames alive at max depth

# This is tail-recursive: O(1) space in languages that optimize tail
# calls. Python doesn't, so it still uses O(n) stack here, but the
# concept is important for interviews.
def sum_tail(n, acc=0):
    if n == 0:
        return acc
    return sum_tail(n - 1, acc + n)   # the result is carried in `acc`

Python's default recursion limit is 1000. For n > 1000, use an iterative approach or sys.setrecursionlimit(). In interviews, mention this tradeoff explicitly — it signals awareness of production constraints.
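When the depth can exceed the limit, the standard escape hatch is to replace the call stack with an explicit one. A sketch for the max-depth example above; the minimal Node class is an assumption added for self-containment, not from this post:

```python
class Node:
    """Minimal binary tree node, assumed for this sketch."""
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def max_depth_iterative(root):
    """max_depth without recursion: an explicit stack replaces call frames."""
    if root is None:
        return 0
    best = 0
    stack = [(root, 1)]               # pairs of (node, depth of that node)
    while stack:
        node, depth = stack.pop()
        best = max(best, depth)
        if node.left:
            stack.append((node.left, depth + 1))
        if node.right:
            stack.append((node.right, depth + 1))
    return best

# A right-leaning chain of 5000 nodes would overflow the recursive version
deep = None
for _ in range(5000):
    deep = Node(right=deep)
print(max_depth_iterative(deep))  # 5000
```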
The Fibonacci Trap — Exponential vs Linear
The naive fibonacci demonstrates why recursion without memoization destroys performance:
# Naive: O(2^n) time — each call branches twice, creating an
# exponential number of calls
def fib_naive(n):
    if n <= 1:
        return n
    return fib_naive(n-1) + fib_naive(n-2)

# With memoization: O(n) time, O(n) space
# Note: the default cache={} is created once and shared across calls.
# That's convenient here, but it's a classic Python pitfall; pass an
# explicit dict if you need independent caches.
def fib_memo(n, cache={}):
    if n in cache:
        return cache[n]
    if n <= 1:
        return n
    cache[n] = fib_memo(n-1, cache) + fib_memo(n-2, cache)
    return cache[n]

# Bottom-up DP: O(n) time, O(1) space
def fib_dp(n):
    if n <= 1:
        return n
    a, b = 0, 1
    for _ in range(2, n+1):
        a, b = b, a + b
    return b

The recursion tree for fib(5):
                fib(5)
               /      \
          fib(4)      fib(3)
          /    \      /    \
     fib(3) fib(2) fib(2) fib(1)
       ...

fib(3) is computed twice, fib(2) three times. Memoization stores each result the first time — every node in the tree is visited exactly once after that. O(n) total.
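In Python it is also worth knowing that the standard library's functools.lru_cache gives you memoization without hand-rolling the cache dict:

```python
from functools import lru_cache

@lru_cache(maxsize=None)   # unbounded cache: each fib(k) computed once
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(50))  # 12586269025, instant; the naive version would take minutes
```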
Part 2: Backtracking
The Core Idea
Backtracking is DFS on a decision tree. At each node, you make a choice (add an element, place a queen, turn left). You explore that choice by recursing into it. After returning, you undo the choice (backtrack) and try the next option.
The undo step is what makes backtracking differ from plain recursion. Without undoing, state from one branch "leaks" into sibling branches.
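To see the leak concretely, here is a deliberately broken subset generator that makes each choice but never undoes it (the function name is illustrative):

```python
def subsets_broken(nums):
    result = []
    def backtrack(start, current):
        result.append(list(current))
        for i in range(start, len(nums)):
            current.append(nums[i])   # make the choice...
            backtrack(i + 1, current)
            # BUG: missing current.pop(), so the choice leaks into siblings
    backtrack(0, [])
    return result

print(subsets_broken([1, 2]))
# [[], [1], [1, 2], [1, 2, 2]]  <- [2] is missing; [1, 2, 2] isn't a subset
```

The state left behind by the [1] branch pollutes the [2] branch. Restoring `current.pop()` after the recursive call yields the correct [[], [1], [1, 2], [2]].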
The Universal Backtracking Template
def backtrack(result, current_state, choices, ...):
    # Base case: current state is a complete solution
    if IS_COMPLETE(current_state):
        result.append(SNAPSHOT(current_state))  # record — don't just append reference
        return
    for choice in choices:
        # Pruning: skip invalid choices early
        if NOT_VALID(choice, current_state):
            continue
        # Make the choice
        APPLY(choice, current_state)
        # Recurse with updated state
        backtrack(result, current_state, NEXT_CHOICES(choices, choice), ...)
        # Undo the choice (backtrack)
        UNDO(choice, current_state)

The snapshot rule: When recording a solution, always append a copy (list(path), path[:]), never the list itself. The list will be modified by subsequent backtracking. This is the single most common backtracking bug in interviews.
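The reference bug is just as easy to reproduce. In this deliberately buggy sketch, every recorded "solution" is the same list object, which later backtracking empties:

```python
def permutations_buggy(nums):
    result = []
    def backtrack(current, remaining):
        if not remaining:
            result.append(current)   # BUG: appends a reference, not a copy
            return
        for i in range(len(remaining)):
            current.append(remaining[i])
            backtrack(current, remaining[:i] + remaining[i+1:])
            current.pop()            # ...which later mutates what we "recorded"
    backtrack([], nums)
    return result

print(permutations_buggy([1, 2]))  # [[], []]: both entries are the same, now-empty list
```

Changing the append to `result.append(list(current))` fixes it.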
The 4 Backtracking Problem Types
All backtracking interview problems fall into one of these four categories. Each has a slightly different template.
Type 1: Subsets (Power Set)
Problem: Generate all subsets of a given set. Decision at each step: Include or exclude the current element. Tree shape: Binary tree — two branches at every node.
def subsets(nums):
    result = []
    def backtrack(start, current):
        result.append(list(current))       # every node is a valid subset
        for i in range(start, len(nums)):
            current.append(nums[i])        # include nums[i]
            backtrack(i + 1, current)      # recurse with remaining elements
            current.pop()                  # exclude nums[i] (backtrack)
    backtrack(0, [])
    return result

# nums = [1, 2, 3]
# result = [[], [1], [1,2], [1,2,3], [1,3], [2], [2,3], [3]]

With duplicates — sort first, then skip duplicate values at the same depth:
def subsets_with_dup(nums):
    nums.sort()
    result = []
    def backtrack(start, current):
        result.append(list(current))
        for i in range(start, len(nums)):
            if i > start and nums[i] == nums[i-1]:  # skip duplicates at same level
                continue
            current.append(nums[i])
            backtrack(i + 1, current)
            current.pop()
    backtrack(0, [])
    return result

Type 2: Permutations
Problem: Generate all permutations of a given list. Decision at each step: Which unused element goes in the current position? Tree shape: n-ary tree — n branches at root, n-1 at depth 1, etc.
def permutations(nums):
    result = []
    def backtrack(current, remaining):
        if not remaining:
            result.append(list(current))
            return
        for i in range(len(remaining)):
            current.append(remaining[i])
            backtrack(current, remaining[:i] + remaining[i+1:])
            current.pop()
    backtrack([], nums)
    return result

# More efficient: swap in-place instead of slicing
def permutations_inplace(nums):
    result = []
    def backtrack(start):
        if start == len(nums):
            result.append(list(nums))
            return
        for i in range(start, len(nums)):
            nums[start], nums[i] = nums[i], nums[start]  # choose
            backtrack(start + 1)
            nums[start], nums[i] = nums[i], nums[start]  # undo
    backtrack(0)
    return result

With duplicates:
def permutations_unique(nums):
    nums.sort()
    result = []
    used = [False] * len(nums)
    def backtrack(current):
        if len(current) == len(nums):
            result.append(list(current))
            return
        for i in range(len(nums)):
            if used[i]:
                continue
            # Skip: same value as previous, and previous was not used
            # (means we already explored this choice at this depth)
            if i > 0 and nums[i] == nums[i-1] and not used[i-1]:
                continue
            used[i] = True
            current.append(nums[i])
            backtrack(current)
            current.pop()
            used[i] = False
    backtrack([])
    return result

Type 3: Combinations
Problem: Generate all combinations of k elements from n elements. Decision at each step: Which element to add next (from remaining elements, maintaining order). Key difference from subsets: Fixed target length k, not all lengths.
def combinations(n, k):
    result = []
    def backtrack(start, current):
        if len(current) == k:
            result.append(list(current))
            return
        # Pruning: remaining elements must be enough to complete the combination
        remaining_needed = k - len(current)
        remaining_available = n - start + 1
        if remaining_available < remaining_needed:
            return  # impossible to complete — prune this branch
        for i in range(start, n + 1):
            current.append(i)
            backtrack(i + 1, current)
            current.pop()
    backtrack(1, [])
    return result

def combination_sum(candidates, target):
    """Combinations that sum to target. Elements can be reused."""
    result = []
    def backtrack(start, current, remaining):
        if remaining == 0:
            result.append(list(current))
            return
        if remaining < 0:
            return  # pruning: exceeded target
        for i in range(start, len(candidates)):
            current.append(candidates[i])
            backtrack(i, current, remaining - candidates[i])  # i, not i+1: can reuse
            current.pop()
    candidates.sort()  # optional: once sorted, you could break (not just return)
                       # as soon as candidates[i] exceeds remaining
    backtrack(0, [], target)
    return result

Type 4: Path Problems (Grid / Tree)
Problem: Find all paths satisfying a condition — in a grid, tree, or graph. Decision at each step: Which direction/neighbor to move to next. Key addition: Mark visited before recursing, unmark after (backtrack).
def word_search(board, word):
    rows, cols = len(board), len(board[0])
    def backtrack(r, c, idx):
        if idx == len(word):
            return True  # found the complete word
        if (r < 0 or r >= rows or c < 0 or c >= cols
                or board[r][c] != word[idx]):
            return False  # out of bounds or wrong character
        # Mark as visited (in-place, no extra space)
        temp = board[r][c]
        board[r][c] = '#'
        found = (backtrack(r+1, c, idx+1) or
                 backtrack(r-1, c, idx+1) or
                 backtrack(r, c+1, idx+1) or
                 backtrack(r, c-1, idx+1))
        board[r][c] = temp  # restore (backtrack)
        return found
    for r in range(rows):
        for c in range(cols):
            if backtrack(r, c, 0):
                return True
    return False

def all_paths_source_target(graph):
    """Find all paths from node 0 to node n-1 in a DAG."""
    n = len(graph)
    result = []
    def backtrack(node, path):
        if node == n - 1:
            result.append(list(path))
            return
        for neighbor in graph[node]:
            path.append(neighbor)
            backtrack(neighbor, path)
            path.pop()
    backtrack(0, [0])
    return result

Pruning: The Difference Between Passing and Timing Out
Backtracking without pruning is brute force. Pruning is what makes it efficient enough to pass interview problems within time limits. Every piece of pruning reduces the size of the decision tree you actually explore.
The three pruning strategies:
Strategy 1 — Constraint violation: If the current partial solution already violates a constraint, don't recurse further.
# Combination sum: if remaining < 0, no point going deeper
if remaining < 0:
    return

# N-Queens: if placing a queen here creates a conflict, skip
if not is_safe(board, row, col):
    continue

Strategy 2 — Impossibility: If it's mathematically impossible to complete the solution from the current state, prune.
# Combinations of k from n: if not enough elements left
remaining_needed = k - len(current)
if n - start + 1 < remaining_needed:
    return

Strategy 3 — Duplicate avoidance: Sort the input first, then skip duplicate values at the same level to avoid generating identical solutions via different orderings.
# Skip same value as previous at the same recursion depth
if i > start and nums[i] == nums[i-1]:
    continue

Classic Hard Problem: N-Queens
N-Queens is the canonical backtracking problem that combines all four ideas — state, choices, constraints, and pruning:
def solve_n_queens(n):
    result = []
    # Track which columns and diagonals are under attack
    cols = set()
    diag_pos = set()  # row - col
    diag_neg = set()  # row + col
    board = [['.'] * n for _ in range(n)]
    def backtrack(row):
        if row == n:
            result.append([''.join(r) for r in board])
            return
        for col in range(n):
            if col in cols or (row-col) in diag_pos or (row+col) in diag_neg:
                continue  # under attack — prune
            # Place queen
            cols.add(col); diag_pos.add(row-col); diag_neg.add(row+col)
            board[row][col] = 'Q'
            backtrack(row + 1)  # move to next row
            # Remove queen (backtrack)
            cols.discard(col); diag_pos.discard(row-col); diag_neg.discard(row+col)
            board[row][col] = '.'
    backtrack(0)
    return result

Why sets for attacked positions? O(1) lookup per safety check. Without them, verifying a candidate square means scanning every previously placed queen: O(n) per candidate column, so O(n²) per row. The sets reduce each column and diagonal check to O(1).
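A quick sanity check on any N-Queens solver is to count solutions, since the counts for small boards are well known (2 for n=4, 92 for n=8). A count-only variant of the same sets-based approach:

```python
def count_n_queens(n):
    cols, diag_pos, diag_neg = set(), set(), set()
    def place(row):
        if row == n:
            return 1                   # one complete placement found
        total = 0
        for col in range(n):
            if col in cols or (row - col) in diag_pos or (row + col) in diag_neg:
                continue               # square under attack: prune
            cols.add(col); diag_pos.add(row - col); diag_neg.add(row + col)
            total += place(row + 1)    # count solutions in this subtree
            cols.discard(col); diag_pos.discard(row - col); diag_neg.discard(row + col)
        return total
    return place(0)

print(count_n_queens(4), count_n_queens(8))  # 2 92
```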
The Decision Framework
PROBLEM ASKS FOR "ALL" SOMETHING?
        │
        ▼
What are you generating?
  ├─ All subsets / power set        → TYPE 1: Subsets
  ├─ All orderings / arrangements   → TYPE 2: Permutations
  ├─ All size-k selections          → TYPE 3: Combinations
  └─ All paths satisfying condition → TYPE 4: Path search
        │
        ▼
COMMON SETUP STEPS:
  1. Sort input if duplicates need to be handled
  2. Start with empty current[] and call backtrack(start=0, current=[])
  3. At base case: append list(current) to result (COPY, not reference)
  4. In loop: append → recurse → pop
        │
        ▼
PRUNING CHECKLIST:
  □ Constraint violated?                  → return immediately
  □ Not enough elements left?             → return immediately
  □ Same value as previous at same depth? → skip with continue
  □ Already visited this cell/node?       → skip (path problems)

The 9 Must-Know Problems
| # | Type                    | Difficulty |
|---|-------------------------|------------|
| 1 | Subsets                 | Medium     |
| 2 | Subsets + dedup         | Medium     |
| 3 | Permutations            | Medium     |
| 4 | Permutations + dedup    | Medium     |
| 5 | Combinations            | Medium     |
| 6 | Combinations (reuse)    | Medium     |
| 7 | Path (grid)             | Medium     |
| 8 | Constraints + all types | Hard       |
| 9 | Constraints + path      | Hard       |
Work through 1–6 in order — they build on each other directly. Problem 7 introduces the grid path type. Problems 8 and 9 are the capstones — both combine constraint checking with the backtracking template in non-obvious ways.
Bottom Line
Recursion is tree traversal. Backtracking is DFS with undo. Pruning is not visiting subtrees that are already invalid. State the three questions, draw the decision tree, write the template, add pruning. That is the complete algorithm for every backtracking problem you will ever encounter.
The candidates who find backtracking hard are usually skipping the drawing step. They try to hold the entire decision tree in their head while writing code. Draw it. Even just the first two or three levels. Once the tree is visible, the code is obvious.
🧭 What's Next in Chapter 3
Post 12: Dynamic Programming — from scary to systematic in one post, with the 6 patterns that cover every DP problem in interviews