Recursion & Backtracking: Think in Trees, Code in Functions
The Mental Model That Changes Everything
Most candidates learn recursion by example: fibonacci, factorial, binary search. They learn backtracking by pattern-matching: "if the problem says 'all combinations,' use backtracking." Neither approach builds the mental model that makes recursion and backtracking feel intuitive rather than mysterious.
Here is that mental model, stated once and clearly:
Every recursive function call creates a node in a tree. The function's parameters are the node's label. The recursive calls are the node's children. The base cases are the leaves. When you write a recursive function, you are describing how to traverse a tree — even if no tree exists in the problem.
Once you see recursion as tree traversal, everything falls into place:
Base case = what to do at a leaf node (nothing left to recurse into)
Recursive case = what to do at an internal node (process current state, spawn children)
Backtracking = DFS on a tree where you undo state after visiting each child
Pruning = not visiting subtrees whose root already violates the constraints
Memoization = caching results at nodes you've visited before
Every technique in this post is a consequence of this one idea.
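To make the tree visible, a small tracing helper (illustrative only, not part of this post's problem set) can print each call of naive Fibonacci indented by its depth, so the printout is literally the recursion tree:

```python
# Illustrative: prints the recursion tree of naive Fibonacci.
def fib_trace(n, depth=0):
    print("  " * depth + f"fib({n})")  # this call is one tree node
    if n <= 1:                         # base case = leaf
        return n
    # recursive calls = this node's children
    return fib_trace(n - 1, depth + 1) + fib_trace(n - 2, depth + 1)

fib_trace(3)
# fib(3)
#   fib(2)
#     fib(1)
#     fib(0)
#   fib(1)
```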
Part 1: Recursion Fundamentals
The Three Questions Before Writing Any Recursive Function
Before writing code, answer these three questions. They define the function completely.
Q1: What does this function return / compute? Be precise. Not "it solves the problem" — what specifically does it return given its parameters?
Q2: What is the base case? What is the smallest input where the answer is trivially known? (Empty array, n=0, n=1, reached a leaf, etc.)
Q3: How does the recursive case use smaller subproblems? Assume the function works correctly for smaller inputs. How do you combine those results to solve the current input?
# Example: maximum depth of a binary tree
# Q1: Returns the maximum depth of the subtree rooted at `node`
# Q2: If node is None, depth is 0
# Q3: depth = 1 + max(depth of left subtree, depth of right subtree)
def max_depth(node):
    if not node:                      # base case
        return 0
    left = max_depth(node.left)       # trust the recursion
    right = max_depth(node.right)
    return 1 + max(left, right)       # combine

The phrase "trust the recursion" is not hand-waving — it's a formal principle called strong induction. If the function is correct for all inputs of size < n, and you write the recursive case correctly in terms of smaller inputs, the function is correct for size n. You never need to mentally simulate more than one level of recursion.
The Call Stack — Why Recursion Has O(n) Space
Each recursive call occupies a stack frame — a block of memory holding the function's local variables and return address. If a function calls itself n times before hitting the base case, there are n stack frames alive simultaneously: O(n) space.
# This uses O(n) call stack space
def sum_recursive(n):
    if n == 0:
        return 0
    return n + sum_recursive(n - 1)   # n frames alive at max depth

# This is tail-recursive: O(1) space in languages that optimize tail
# calls. Python doesn't, so it still uses O(n) stack here, but the
# concept is important for interviews.
def sum_tail(n, acc=0):
    if n == 0:
        return acc
    return sum_tail(n - 1, acc + n)   # the result is carried in `acc`

Python's default recursion limit is 1000. For n > 1000, use an iterative approach or sys.setrecursionlimit(). In interviews, mention this tradeoff explicitly — it signals awareness of production constraints.
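When the depth can exceed the limit, the standard escape hatch is to replace the call stack with an explicit one. A sketch for the max-depth example above; the minimal Node class is an assumption added for self-containment, not from this post:

```python
class Node:
    """Minimal binary tree node, assumed for this sketch."""
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def max_depth_iterative(root):
    """max_depth without recursion: an explicit stack replaces call frames."""
    if root is None:
        return 0
    best = 0
    stack = [(root, 1)]               # pairs of (node, depth of that node)
    while stack:
        node, depth = stack.pop()
        best = max(best, depth)
        if node.left:
            stack.append((node.left, depth + 1))
        if node.right:
            stack.append((node.right, depth + 1))
    return best

# A right-leaning chain of 5000 nodes would overflow the recursive version
deep = None
for _ in range(5000):
    deep = Node(right=deep)
print(max_depth_iterative(deep))  # 5000
```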
The Fibonacci Trap — Exponential vs Linear
The naive fibonacci demonstrates why recursion without memoization destroys performance:
# Naive: O(2^n) time — each call branches twice, creating an
# exponential number of calls
def fib_naive(n):
    if n <= 1:
        return n
    return fib_naive(n-1) + fib_naive(n-2)

# With memoization: O(n) time, O(n) space
# Note: the default cache={} is created once and shared across calls.
# That's convenient here, but it's a classic Python pitfall; pass an
# explicit dict if you need independent caches.
def fib_memo(n, cache={}):
    if n in cache:
        return cache[n]
    if n <= 1:
        return n
    cache[n] = fib_memo(n-1, cache) + fib_memo(n-2, cache)
    return cache[n]

# Bottom-up DP: O(n) time, O(1) space
def fib_dp(n):
    if n <= 1:
        return n
    a, b = 0, 1
    for _ in range(2, n+1):
        a, b = b, a + b
    return b

The recursion tree for fib(5):
                fib(5)
               /      \
          fib(4)      fib(3)
          /    \      /    \
     fib(3) fib(2) fib(2) fib(1)
       ...

fib(3) is computed twice, fib(2) three times. Memoization stores each result the first time — every node in the tree is visited exactly once after that. O(n) total.
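In Python it is also worth knowing that the standard library's functools.lru_cache gives you memoization without hand-rolling the cache dict:

```python
from functools import lru_cache

@lru_cache(maxsize=None)   # unbounded cache: each fib(k) computed once
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(50))  # 12586269025, instant; the naive version would take minutes
```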
Part 2: Backtracking
The Core Idea
Backtracking is DFS on a decision tree. At each node, you make a choice (add an element, place a queen, turn left). You explore that choice by recursing into it. After returning, you undo the choice (backtrack) and try the next option.
The undo step is what makes backtracking differ from plain recursion. Without undoing, state from one branch "leaks" into sibling branches.
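To see the leak concretely, here is a deliberately broken subset generator that makes each choice but never undoes it (the function name is illustrative):

```python
def subsets_broken(nums):
    result = []
    def backtrack(start, current):
        result.append(list(current))
        for i in range(start, len(nums)):
            current.append(nums[i])   # make the choice...
            backtrack(i + 1, current)
            # BUG: missing current.pop(), so the choice leaks into siblings
    backtrack(0, [])
    return result

print(subsets_broken([1, 2]))
# [[], [1], [1, 2], [1, 2, 2]]  <- [2] is missing; [1, 2, 2] isn't a subset
```

The state left behind by the [1] branch pollutes the [2] branch. Restoring `current.pop()` after the recursive call yields the correct [[], [1], [1, 2], [2]].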
The Universal Backtracking Template
def backtrack(result, current_state, choices, ...):
    # Base case: current state is a complete solution
    if IS_COMPLETE(current_state):
        result.append(SNAPSHOT(current_state))  # record — don't just append reference
        return
    for choice in choices:
        # Pruning: skip invalid choices early
        if NOT_VALID(choice, current_state):
            continue
        # Make the choice
        APPLY(choice, current_state)
        # Recurse with updated state
        backtrack(result, current_state, NEXT_CHOICES(choices, choice), ...)
        # Undo the choice (backtrack)
        UNDO(choice, current_state)

The snapshot rule: When recording a solution, always append a copy (list(path), path[:]), never the list itself. The list will be modified by subsequent backtracking. This is the single most common backtracking bug in interviews.
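The reference bug is just as easy to reproduce. In this deliberately buggy sketch, every recorded "solution" is the same list object, which later backtracking empties:

```python
def permutations_buggy(nums):
    result = []
    def backtrack(current, remaining):
        if not remaining:
            result.append(current)   # BUG: appends a reference, not a copy
            return
        for i in range(len(remaining)):
            current.append(remaining[i])
            backtrack(current, remaining[:i] + remaining[i+1:])
            current.pop()            # ...which later mutates what we "recorded"
    backtrack([], nums)
    return result

print(permutations_buggy([1, 2]))  # [[], []]: both entries are the same, now-empty list
```

Changing the append to `result.append(list(current))` fixes it.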
The 4 Backtracking Problem Types
All backtracking interview problems fall into one of these four categories. Each has a slightly different template.
Type 1: Subsets (Power Set)
Problem: Generate all subsets of a given set. Decision at each step: Include or exclude the current element. Tree shape: Binary tree — two branches at every node.
def subsets(nums):
    result = []
    def backtrack(start, current):
        result.append(list(current))       # every node is a valid subset
        for i in range(start, len(nums)):
            current.append(nums[i])        # include nums[i]
            backtrack(i + 1, current)      # recurse with remaining elements
            current.pop()                  # exclude nums[i] (backtrack)
    backtrack(0, [])
    return result

# nums = [1, 2, 3]
# result = [[], [1], [1,2], [1,2,3], [1,3], [2], [2,3], [3]]

With duplicates — sort first, then skip duplicate values at the same depth:
def subsets_with_dup(nums):
    nums.sort()
    result = []
    def backtrack(start, current):
        result.append(list(current))
        for i in range(start, len(nums)):
            if i > start and nums[i] == nums[i-1]:  # skip duplicates at same level
                continue
            current.append(nums[i])
            backtrack(i + 1, current)
            current.pop()
    backtrack(0, [])
    return result

Type 2: Permutations
Problem: Generate all permutations of a given list. Decision at each step: Which unused element goes in the current position? Tree shape: n-ary tree — n branches at root, n-1 at depth 1, etc.
def permutations(nums):
    result = []
    def backtrack(current, remaining):
        if not remaining:
            result.append(list(current))
            return
        for i in range(len(remaining)):
            current.append(remaining[i])
            backtrack(current, remaining[:i] + remaining[i+1:])
            current.pop()
    backtrack([], nums)
    return result

# More efficient: swap in-place instead of slicing
def permutations_inplace(nums):
    result = []
    def backtrack(start):
        if start == len(nums):
            result.append(list(nums))
            return
        for i in range(start, len(nums)):
            nums[start], nums[i] = nums[i], nums[start]  # choose
            backtrack(start + 1)
            nums[start], nums[i] = nums[i], nums[start]  # undo
    backtrack(0)
    return result

With duplicates:
def permutations_unique(nums):
    nums.sort()
    result = []
    used = [False] * len(nums)
    def backtrack(current):
        if len(current) == len(nums):
            result.append(list(current))
            return
        for i in range(len(nums)):
            if used[i]:
                continue
            # Skip: same value as previous, and previous was not used
            # (means we already explored this choice at this depth)
            if i > 0 and nums[i] == nums[i-1] and not used[i-1]:
                continue
            used[i] = True
            current.append(nums[i])
            backtrack(current)
            current.pop()
            used[i] = False
    backtrack([])
    return result

Type 3: Combinations
Problem: Generate all combinations of k elements from n elements. Decision at each step: Which element to add next (from remaining elements, maintaining order). Key difference from subsets: Fixed target length k, not all lengths.
def combinations(n, k):
    result = []
    def backtrack(start, current):
        if len(current) == k:
            result.append(list(current))
            return
        # Pruning: remaining elements must be enough to complete the combination
        remaining_needed = k - len(current)
        remaining_available = n - start + 1
        if remaining_available < remaining_needed:
            return  # impossible to complete — prune this branch
        for i in range(start, n + 1):
            current.append(i)
            backtrack(i + 1, current)
            current.pop()
    backtrack(1, [])
    return result

def combination_sum(candidates, target):
    """Combinations that sum to target. Elements can be reused."""
    result = []
    def backtrack(start, current, remaining):
        if remaining == 0:
            result.append(list(current))
            return
        if remaining < 0:
            return  # pruning: exceeded target
        for i in range(start, len(candidates)):
            current.append(candidates[i])
            backtrack(i, current, remaining - candidates[i])  # i, not i+1: can reuse
            current.pop()
    candidates.sort()  # optional: once sorted, you could break (not just return)
                       # as soon as candidates[i] exceeds remaining
    backtrack(0, [], target)
    return result

Type 4: Path Problems (Grid / Tree)
Problem: Find all paths satisfying a condition — in a grid, tree, or graph. Decision at each step: Which direction/neighbor to move to next. Key addition: Mark visited before recursing, unmark after (backtrack).
def word_search(board, word):
    rows, cols = len(board), len(board[0])
    def backtrack(r, c, idx):
        if idx == len(word):
            return True  # found the complete word
        if (r < 0 or r >= rows or c < 0 or c >= cols
                or board[r][c] != word[idx]):
            return False  # out of bounds or wrong character
        # Mark as visited (in-place, no extra space)
        temp = board[r][c]
        board[r][c] = '#'
        found = (backtrack(r+1, c, idx+1) or
                 backtrack(r-1, c, idx+1) or
                 backtrack(r, c+1, idx+1) or
                 backtrack(r, c-1, idx+1))
        board[r][c] = temp  # restore (backtrack)
        return found
    for r in range(rows):
        for c in range(cols):
            if backtrack(r, c, 0):
                return True
    return False

def all_paths_source_target(graph):
    """Find all paths from node 0 to node n-1 in a DAG."""
    n = len(graph)
    result = []
    def backtrack(node, path):
        if node == n - 1:
            result.append(list(path))
            return
        for neighbor in graph[node]:
            path.append(neighbor)
            backtrack(neighbor, path)
            path.pop()
    backtrack(0, [0])
    return result

Pruning: The Difference Between Passing and Timing Out
Backtracking without pruning is brute force. Pruning is what makes it efficient enough to pass interview problems within time limits. Every piece of pruning reduces the size of the decision tree you actually explore.
The three pruning strategies:
Strategy 1 — Constraint violation: If the current partial solution already violates a constraint, don't recurse further.
# Combination sum: if remaining < 0, no point going deeper
if remaining < 0:
    return

# N-Queens: if placing a queen here creates a conflict, skip
if not is_safe(board, row, col):
    continue

Strategy 2 — Impossibility: If it's mathematically impossible to complete the solution from the current state, prune.
# Combinations of k from n: if not enough elements left
remaining_needed = k - len(current)
if n - start + 1 < remaining_needed:
    return

Strategy 3 — Duplicate avoidance: Sort the input first, then skip duplicate values at the same level to avoid generating identical solutions via different orderings.
# Skip same value as previous at the same recursion depth
if i > start and nums[i] == nums[i-1]:
    continue

Classic Hard Problem: N-Queens
N-Queens is the canonical backtracking problem that combines all four ideas — state, choices, constraints, and pruning:
def solve_n_queens(n):
    result = []
    # Track which columns and diagonals are under attack
    cols = set()
    diag_pos = set()  # row - col
    diag_neg = set()  # row + col
    board = [['.'] * n for _ in range(n)]
    def backtrack(row):
        if row == n:
            result.append([''.join(r) for r in board])
            return
        for col in range(n):
            if col in cols or (row-col) in diag_pos or (row+col) in diag_neg:
                continue  # under attack — prune
            # Place queen
            cols.add(col); diag_pos.add(row-col); diag_neg.add(row+col)
            board[row][col] = 'Q'
            backtrack(row + 1)  # move to next row
            # Remove queen (backtrack)
            cols.discard(col); diag_pos.discard(row-col); diag_neg.discard(row+col)
            board[row][col] = '.'
    backtrack(0)
    return result

Why sets for attacked positions? O(1) lookup per safety check. Without them, verifying a candidate square means scanning every previously placed queen: O(n) per candidate column, so O(n²) per row. The sets reduce each column and diagonal check to O(1).
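A quick sanity check on any N-Queens solver is to count solutions, since the counts for small boards are well known (2 for n=4, 92 for n=8). A count-only variant of the same sets-based approach:

```python
def count_n_queens(n):
    cols, diag_pos, diag_neg = set(), set(), set()
    def place(row):
        if row == n:
            return 1                   # one complete placement found
        total = 0
        for col in range(n):
            if col in cols or (row - col) in diag_pos or (row + col) in diag_neg:
                continue               # square under attack: prune
            cols.add(col); diag_pos.add(row - col); diag_neg.add(row + col)
            total += place(row + 1)    # count solutions in this subtree
            cols.discard(col); diag_pos.discard(row - col); diag_neg.discard(row + col)
        return total
    return place(0)

print(count_n_queens(4), count_n_queens(8))  # 2 92
```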
The Decision Framework
PROBLEM ASKS FOR "ALL" SOMETHING?
        │
        ▼
What are you generating?
  ├─ All subsets / power set        → TYPE 1: Subsets
  ├─ All orderings / arrangements   → TYPE 2: Permutations
  ├─ All size-k selections          → TYPE 3: Combinations
  └─ All paths satisfying condition → TYPE 4: Path search
        │
        ▼
COMMON SETUP STEPS:
  1. Sort input if duplicates need to be handled
  2. Start with empty current[] and call backtrack(start=0, current=[])
  3. At base case: append list(current) to result (COPY, not reference)
  4. In loop: append → recurse → pop
        │
        ▼
PRUNING CHECKLIST:
  □ Constraint violated?                  → return immediately
  □ Not enough elements left?             → return immediately
  □ Same value as previous at same depth? → skip with continue
  □ Already visited this cell/node?       → skip (path problems)

The 9 Must-Know Problems
| # | Type                    | Difficulty |
|---|-------------------------|------------|
| 1 | Subsets                 | Medium     |
| 2 | Subsets + dedup         | Medium     |
| 3 | Permutations            | Medium     |
| 4 | Permutations + dedup    | Medium     |
| 5 | Combinations            | Medium     |
| 6 | Combinations (reuse)    | Medium     |
| 7 | Path (grid)             | Medium     |
| 8 | Constraints + all types | Hard       |
| 9 | Constraints + path      | Hard       |
Work through 1–6 in order — they build on each other directly. Problem 7 introduces the grid path type. Problems 8 and 9 are the capstones — both combine constraint checking with the backtracking template in non-obvious ways.
Bottom Line
Recursion is tree traversal. Backtracking is DFS with undo. Pruning is not visiting subtrees that are already invalid. State the three questions, draw the decision tree, write the template, add pruning. That is the complete algorithm for every backtracking problem you will ever encounter.
The candidates who find backtracking hard are usually skipping the drawing step. They try to hold the entire decision tree in their head while writing code. Draw it. Even just the first two or three levels. Once the tree is visible, the code is obvious.
🧭 What's Next in Chapter 3
Post 12: Dynamic Programming — from scary to systematic in one post, with the 6 patterns that cover every DP problem in interviews