Arrays & Strings: The Interview Bread and Butter

The Most Common Topic in Technical Interviews

If you could only study one topic for your upcoming coding interview, it would be arrays and strings. They appear in over 40% of all technical interview questions. They're the foundation that every other data structure sits on. And they're deceptively easy to underestimate.

Most candidates feel comfortable with arrays. They know how to access elements, iterate with a loop, slice a subarray. What they don't know — and what interviewers are actually testing — are the five core techniques that turn naive O(n²) solutions into clean, optimal O(n) solutions.

Those five techniques are:

Prefix Sum — answer range queries in O(1) after O(n) preprocessing
Sliding Window — solve contiguous subarray problems without nested loops
Two Pointers — pair-finding and in-place manipulation in O(n) on sorted arrays
Hash Map + Array — eliminate inner loops by trading space for time
In-Place Techniques — modify arrays without extra space using index tricks

This post goes deep on all five. By the end, you'll have the templates, the trigger signals, and enough worked examples that these techniques feel automatic.

The Memory Model: Why Arrays Are Fast

Before patterns, understand why arrays are the fastest data structure for sequential access. Arrays store elements in contiguous memory locations. Element at index i lives at memory address base + i * element_size. That math takes one step — no pointer chasing, no node traversal.

This means:

Access by index: O(1) — calculate the address, read the value
Search (unsorted): O(n) — must check every element
Search (sorted): O(log n) with binary search
Insert/delete at end: O(1) amortized (dynamic arrays)
Insert/delete in middle: O(n) — must shift all subsequent elements

In Python, strings are immutable sequences of characters — they behave like read-only arrays of characters. Every string modification (concatenation, slicing) creates a new string object. This has a major interview implication: building a string with += in a loop is O(n²) in the worst case. Always use ''.join(list) instead.

# SLOW — O(n²) due to repeated string copies
result = ""
for char in chars:
    result += char      # creates a new string each time

# FAST — O(n)
result = "".join(chars) # one allocation, one copy

This is a real trap in interviews. Know it before you walk in.

Technique 1: Prefix Sum

The Core Idea

A prefix sum array stores the cumulative sum up to each index. Once built in O(n), it answers "what is the sum of elements from index i to j?" in O(1) — instead of O(n) for a naive loop.

# Build prefix sum: O(n) time, O(n) space
def build_prefix(nums):
    prefix = [0] * (len(nums) + 1)
    for i, num in enumerate(nums):
        prefix[i + 1] = prefix[i] + num
    return prefix

# Query range sum [l, r] inclusive: O(1)
def range_sum(prefix, l, r):
    return prefix[r + 1] - prefix[l]

Why the extra 0 at the front? It makes the formula clean. prefix[r+1] - prefix[l] works for all cases including l = 0 without special handling.

nums:   [3, 1, 4, 1, 5, 9, 2]
prefix: [0, 3, 4, 8, 9, 14, 23, 25]

Sum from index 2 to 5 (4+1+5+9 = 19):
  prefix[6] - prefix[2] = 23 - 4 = 19  ✓

When Prefix Sum Beats Sliding Window

This is the most important distinction to make in an interview. Both techniques turn O(n²) into O(n). But they solve different problems:

Situation	Use
Positive numbers only, find contiguous subarray	Sliding Window
Negative numbers present, find subarray sum = k	Prefix Sum + Hash Map
Multiple range queries on same array	Prefix Sum
Dynamic window that shrinks/expands	Sliding Window

The key rule: if the array can contain negative numbers, sliding window breaks. Prefix sum handles negatives cleanly.

The Prefix Sum + Hash Map Pattern

This combination solves "find subarray with sum equal to k" in O(n) — even with negative numbers.

def subarray_sum_equals_k(nums, k):
    count = 0
    current_sum = 0
    seen = {0: 1}        # prefix sum → how many times seen

    for num in nums:
        current_sum += num

        # If (current_sum - k) was seen before,
        # there's a subarray ending here with sum = k
        if current_sum - k in seen:
            count += seen[current_sum - k]

        seen[current_sum] = seen.get(current_sum, 0) + 1

    return count

Why does this work? If prefix[j] - prefix[i] = k, then the subarray from i to j-1 sums to k. Rearranged: prefix[i] = prefix[j] - k. So for each new prefix sum, we check if current_sum - k was seen before.

Canonical Problems

Technique 2: Sliding Window

The Core Idea

A window of elements slides across the array. Instead of recomputing the window's value from scratch at each position, we update it incrementally — add the new element entering the window, remove the element leaving it. This converts an O(n·k) computation to O(n).

There are two variants: fixed window (constant size k) and dynamic window (size changes based on a condition).

# ── FIXED WINDOW ──────────────────────────────────────────
def max_sum_fixed(nums, k):
    """Maximum sum of any subarray of size k."""
    window_sum = sum(nums[:k])         # initial window
    max_sum = window_sum

    for i in range(k, len(nums)):
        window_sum += nums[i]          # add incoming element
        window_sum -= nums[i - k]      # remove outgoing element
        max_sum = max(max_sum, window_sum)

    return max_sum


# ── DYNAMIC WINDOW ────────────────────────────────────────
def longest_substring_no_repeat(s):
    """Longest substring without repeating characters."""
    char_index = {}    # char → last seen index
    left = 0
    max_len = 0

    for right, char in enumerate(s):
        if char in char_index and char_index[char] >= left:
            left = char_index[char] + 1   # shrink: skip past duplicate

        char_index[char] = right
        max_len = max(max_len, right - left + 1)

    return max_len

The Dynamic Window Template

The structure is always the same — vary only the condition and the data structure you maintain inside:

def sliding_window_template(arr):
    left = 0
    result = 0
    window_state = {}    # or a counter, sum, set — depends on problem

    for right in range(len(arr)):
        # 1. Expand: add arr[right] to window
        window_state[arr[right]] = window_state.get(arr[right], 0) + 1

        # 2. Shrink: while window violates condition, move left
        while WINDOW_IS_INVALID(window_state):
            window_state[arr[left]] -= 1
            if window_state[arr[left]] == 0:
                del window_state[arr[left]]
            left += 1

        # 3. Update result with current valid window
        result = max(result, right - left + 1)

    return result

The only thing that changes between sliding window problems is the validity condition and what you track in the window. Once you see that, most sliding window problems collapse into the same template.

A Harder Example: Minimum Window Substring

def min_window_substring(s, t):
    """Find smallest substring of s that contains all chars of t."""
    from collections import Counter

    need = Counter(t)      # chars we need
    missing = len(t)       # total chars still needed
    left = 0
    result = ""
    start = 0

    for right, char in enumerate(s):
        if need[char] > 0:
            missing -= 1          # this char satisfies a requirement
        need[char] -= 1           # track count (can go negative)

        if missing == 0:          # valid window found
            # shrink from left while still valid
            while need[s[left]] < 0:
                need[s[left]] += 1
                left += 1

            # update result if this window is smaller
            if not result or right - left + 1 < len(result):
                result = s[left:right + 1]

            # prepare to find next window: remove leftmost char
            need[s[left]] += 1
            missing += 1
            left += 1

    return result

Canonical Problems

Maximum Average Subarray I (Easy)
Longest Substring Without Repeating Characters (Medium)
Minimum Window Substring (Hard)

Technique 3: Two Pointers

The Core Idea

Two index variables move through the array, either toward each other from opposite ends (for sorted arrays and pair-finding) or in the same direction at different speeds (for in-place manipulation). Either way, total movement is O(n) — each element is visited at most twice.

# ── OPPOSITE DIRECTION: pair sum in sorted array ──────────
def two_sum_sorted(nums, target):
    left, right = 0, len(nums) - 1

    while left < right:
        s = nums[left] + nums[right]
        if s == target:
            return [left, right]
        elif s < target:
            left += 1      # sum too small → move left forward
        else:
            right -= 1     # sum too large → move right back

    return []


# ── SAME DIRECTION: remove duplicates in-place ───────────
def remove_duplicates(nums):
    """Remove duplicates from sorted array in-place, return new length."""
    if not nums:
        return 0

    slow = 0    # write pointer: next position for unique element
    for fast in range(1, len(nums)):
        if nums[fast] != nums[slow]:   # found a new unique element
            slow += 1
            nums[slow] = nums[fast]    # write it at next unique position

    return slow + 1

Why Sorting + Two Pointers Is a Superpower

Many O(n²) problems on unsorted arrays become O(n log n) after sorting + O(n) two pointers = O(n log n) overall. This is the playbook for 3Sum, 4Sum, and their variants.

def three_sum(nums):
    """Find all unique triplets that sum to zero."""
    nums.sort()    # O(n log n) — enables two pointers
    result = []

    for i in range(len(nums) - 2):
        if i > 0 and nums[i] == nums[i - 1]:
            continue    # skip duplicate values for i

        left, right = i + 1, len(nums) - 1

        while left < right:
            s = nums[i] + nums[left] + nums[right]

            if s == 0:
                result.append([nums[i], nums[left], nums[right]])
                while left < right and nums[left] == nums[left + 1]:
                    left += 1    # skip left duplicates
                while left < right and nums[right] == nums[right - 1]:
                    right -= 1   # skip right duplicates
                left += 1
                right -= 1

            elif s < 0:
                left += 1
            else:
                right -= 1

    return result

The Palindrome Check Pattern

Two pointers from opposite ends — the classic string application:

def is_palindrome(s):
    left, right = 0, len(s) - 1
    while left < right:
        if s[left] != s[right]:
            return False
        left += 1
        right -= 1
    return True

def is_palindrome_ignoring_non_alphanumeric(s):
    left, right = 0, len(s) - 1
    while left < right:
        while left < right and not s[left].isalnum():
            left += 1
        while left < right and not s[right].isalnum():
            right -= 1
        if s[left].lower() != s[right].lower():
            return False
        left += 1
        right -= 1
    return True

Canonical Problems

Technique 4: Hash Map + Array

The Core Idea

When a problem requires O(n²) time to compare elements, a hash map often reduces it to O(n) by enabling O(1) lookup. The trade: O(n) extra space for the hash map. In interviews, this trade is almost always worth it.

The original Two Sum is the canonical example:

# NAIVE: O(n²) — check all pairs
def two_sum_naive(nums, target):
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]

# OPTIMAL: O(n) — hash map lookup
def two_sum(nums, target):
    seen = {}    # value → index

    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i

    return []

The Frequency Counter Pattern

Count occurrences of elements to answer frequency-based questions in O(n):

from collections import Counter

def is_anagram(s, t):
    """Are s and t anagrams of each other?"""
    return Counter(s) == Counter(t)

def group_anagrams(strs):
    """Group strings that are anagrams of each other."""
    groups = {}
    for s in strs:
        key = tuple(sorted(s))    # canonical form: sorted chars
        groups.setdefault(key, []).append(s)
    return list(groups.values())

def top_k_frequent(nums, k):
    """Return the k most frequent elements."""
    count = Counter(nums)
    return [item for item, _ in count.most_common(k)]

The Seen-Before Pattern

Track which elements you've already processed to detect duplicates, find complements, or skip revisited states:

def contains_duplicate(nums):
    seen = set()
    for num in nums:
        if num in seen:
            return True
        seen.add(num)
    return False

def longest_consecutive_sequence(nums):
    """Find length of longest consecutive sequence. O(n) time."""
    num_set = set(nums)
    best = 0

    for num in num_set:
        if num - 1 not in num_set:    # only start sequences from their beginning
            length = 1
            while num + length in num_set:
                length += 1
            best = max(best, length)

    return best

Canonical Problems

Technique 5: In-Place Array Techniques

The Core Idea

Some problems ask for O(1) space — meaning you must modify the input array directly instead of allocating new data structures. Three sub-techniques handle most in-place problems:

Swap to final position, use sign as a flag, and rotate with reversal.

Rotate Array Without Extra Space

def rotate(nums, k):
    """Rotate array to the right by k steps. O(n) time, O(1) space."""
    n = len(nums)
    k = k % n    # handle k > n

    def reverse(left, right):
        while left < right:
            nums[left], nums[right] = nums[right], nums[left]
            left += 1
            right -= 1

    reverse(0, n - 1)      # reverse entire array
    reverse(0, k - 1)      # reverse first k elements
    reverse(k, n - 1)      # reverse remaining elements

# Example: [1,2,3,4,5,6,7], k=3
# After reverse all:  [7,6,5,4,3,2,1]
# After reverse [0,2]: [5,6,7,4,3,2,1]
# After reverse [3,6]: [5,6,7,1,2,3,4] ✓

Use Index as a Flag (Find Missing/Duplicate)

When numbers are in range [1, n], you can use the sign of nums[nums[i] - 1] as a boolean flag — no extra space needed:

def find_all_duplicates(nums):
    """Find all duplicates in array [1..n]. O(n) time, O(1) space."""
    result = []

    for num in nums:
        idx = abs(num) - 1    # map value to index
        if nums[idx] < 0:     # already marked → it's a duplicate
            result.append(abs(num))
        else:
            nums[idx] *= -1   # mark as seen by negating

    return result

def find_disappeared_numbers(nums):
    """Find all numbers [1..n] missing from array. O(n), O(1) space."""
    for num in nums:
        idx = abs(num) - 1
        nums[idx] = -abs(nums[idx])    # mark as present

    return [i + 1 for i, val in enumerate(nums) if val > 0]  # positive = missing

Move Zeros In-Place

def move_zeros(nums):
    """Move all zeros to end, maintain relative order. O(n), O(1) space."""
    write = 0    # position for next non-zero element

    for num in nums:
        if num != 0:
            nums[write] = num
            write += 1

    while write < len(nums):
        nums[write] = 0
        write += 1

Canonical Problems

Rotate Array (Medium)
Find All Duplicates in an Array (Medium)
Move Zeroes (Easy)

The Decision Framework: Which Technique to Use?

When you see an array or string problem, work through this decision in order:

Array/String problem received
         │
         ▼
Need RANGE SUM queries or handle NEGATIVE numbers in subarray sum?
  └─ Yes → Prefix Sum (+ Hash Map for "count subarrays with sum k")
         │
         ▼
Problem about CONTIGUOUS subarray/substring with a condition?
  ├─ Positive numbers only, find longest/shortest → Sliding Window
  ├─ Need to track character frequency inside window → Sliding Window + Hash Map
  └─ No
         │
         ▼
Input is SORTED (or sorting first is allowed)?
  ├─ Find pairs/triplets summing to target → Two Pointers
  ├─ Palindrome / two-end comparison → Two Pointers
  └─ No
         │
         ▼
Need O(1) LOOKUP to eliminate inner loop?
  └─ Yes → Hash Map (Two Sum, anagram, frequency)
         │
         ▼
Problem requires O(1) SPACE and asks to MODIFY in-place?
  └─ Yes → In-Place Technique (swap, sign flag, reverse)

The 10 Array/String Problems Every Candidate Must Know

These 10 problems cover every technique in this post. Master them in the order listed — each one builds on the previous.

#	Problem	Technique	Difficulty
1	Two Sum	Hash Map	Easy
2	Best Time to Buy and Sell Stock	Single Pass	Easy
3	Contains Duplicate	Hash Set	Easy
4	Move Zeroes	Two Pointers (same dir)	Easy
5	Maximum Average Subarray I	Sliding Window (fixed)	Easy
6	3Sum	Sort + Two Pointers	Medium
7	Subarray Sum Equals K	Prefix Sum + Hash Map	Medium
8	Longest Substring Without Repeating Characters	Sliding Window (dynamic)	Medium
9	Group Anagrams	Hash Map + Sort	Medium
10	Minimum Window Substring	Sliding Window + Counters	Hard

Work through these in order. Don't move to the next until you can solve the current one from scratch without hints in under 20 minutes.

Three String-Specific Traps

Before moving on, three common interview traps that are specific to strings:

Trap 1: Forgetting strings are immutable in Python

# This is O(n²) — avoid in interviews
result = ""
for c in s:
    result += c

# This is O(n) — use this
result = "".join(list(s))

Trap 2: ASCII vs Unicode assumptions

Many solutions assume 26 lowercase letters and use a fixed-size array of 26. If the interviewer says "all lowercase English letters," this is fine and slightly faster than a hash map. But ask first — the problem might include uppercase, digits, or Unicode.

# O(1) space if input is only lowercase a-z
def char_frequency(s):
    freq = [0] * 26
    for c in s:
        freq[ord(c) - ord('a')] += 1
    return freq

Trap 3: Off-by-one in sliding window

Window length is right - left + 1, not right - left. When reporting the starting index of the best window: left, not right - window_size. Burn these formulas in — they cost candidates entire problems when they get them wrong.

Bottom Line

Arrays and strings are not a warm-up topic. They are the topic. Almost every data structure problem you encounter — linked lists, trees, graphs — eventually involves array or string manipulation at some level. The five techniques in this post are not array-specific tricks; they're fundamental problem-solving patterns that appear everywhere.

The signal that distinguishes a strong candidate in an array/string question isn't that they get the right answer (most do on easy problems). It's that they immediately say "this has negative numbers, so sliding window won't work — I'll use prefix sum plus a hash map." That instantaneous pattern recognition is what gets offers. Build it now.

🧭 What's Next in This Series

Arrays are your first tool. The tool that makes them dramatically faster on a huge class of problems is coming next:

Post 05: Hash Maps & Hash Sets — how they work under the hood, when O(1) breaks down, and the 5 patterns that cover 80% of hash-related interview problems

The Most Common Topic in Technical Interviews

The Memory Model: Why Arrays Are Fast

Technique 1: Prefix Sum

The Core Idea

When Prefix Sum Beats Sliding Window

The Prefix Sum + Hash Map Pattern

Canonical Problems

Technique 2: Sliding Window

The Core Idea

The Dynamic Window Template

A Harder Example: Minimum Window Substring

Canonical Problems

Technique 3: Two Pointers

The Core Idea

Why Sorting + Two Pointers Is a Superpower

The Palindrome Check Pattern

Canonical Problems

Technique 4: Hash Map + Array

The Core Idea

The Frequency Counter Pattern

The Seen-Before Pattern

Canonical Problems

Technique 5: In-Place Array Techniques

The Core Idea

Rotate Array Without Extra Space

Use Index as a Flag (Find Missing/Duplicate)

Move Zeros In-Place

Canonical Problems

The Decision Framework: Which Technique to Use?

The 10 Array/String Problems Every Candidate Must Know

Three String-Specific Traps

Bottom Line

🧭 What's Next in This Series

Related

Divide & Conquer: The Strategy Behind the Fastest Algorithms

Greedy Algorithms: When Being Selfish Gives the Optimal Answer

Dynamic Programming: From Scary to Systematic in One Post

Leave a comment

Comments