Prefix Sums

Answer “what is the sum of elements from index 3 to 7?” in O(1) — after a one-time O(n) setup. Prefix sums trade a little memory for the ability to answer unlimited range queries instantly. Once you understand the trick, you will see it in financial dashboards, databases, image processing, and competitive programming alike.

The Core Idea

Build a prefix array where prefix[i] holds the sum of all elements from index 0 up to and including index i.

arr    =  [3, 1, 4, 1, 5, 9, 2, 6]
index  =   0  1  2  3  4  5  6  7

prefix =  [3, 4, 8, 9, 14, 23, 25, 31]

Now, the sum of any range [l, r] is:

range_sum(l, r) = prefix[r] - prefix[l-1]   (when l > 0)
range_sum(0, r) = prefix[r]

From Array to Prefix Array to Range Query

flowchart TD
    A["Original array\n[3, 1, 4, 1, 5, 9, 2, 6]"] --> B

    B["Build prefix array — O(n)\nprefix[i] = prefix[i-1] + arr[i]\n[3, 4, 8, 9, 14, 23, 25, 31]"] --> C

    C{"Range query\nsum(l, r)?"} --> D
    D["prefix[r] − prefix[l−1]\nO(1) per query"]

    D --> E["sum(2,5) = prefix[5]−prefix[1]\n= 23 − 4 = 19\n✓  (4+1+5+9=19)"]

Problem 1 — Range Sum Query (The Classic)

def build_prefix(arr):
    prefix = [0] * len(arr)
    prefix[0] = arr[0]
    for i in range(1, len(arr)):
        prefix[i] = prefix[i - 1] + arr[i]
    return prefix

def range_sum(prefix, l, r):
    if l == 0:
        return prefix[r]
    return prefix[r] - prefix[l - 1]

arr    = [3, 1, 4, 1, 5, 9, 2, 6]
prefix = build_prefix(arr)

print(f"Array:  {arr}")
print(f"Prefix: {prefix}\n")

queries = [(0, 7), (2, 5), (3, 3), (0, 3), (5, 7)]
for l, r in queries:
    total = range_sum(prefix, l, r)
    print(f"sum({l},{r}) = {arr[l:r+1]}  →  {total}")

Time: O(n) to build, O(1) per query Space: O(n) for the prefix array

Why this beats recomputing each time

import time
import random

arr    = [random.randint(1, 100) for _ in range(10_000)]
prefix = build_prefix(arr)

queries = [(random.randint(0, 4999), random.randint(5000, 9999)) for _ in range(10_000)]

def build_prefix(a):
    p = [0] * len(a)
    p[0] = a[0]
    for i in range(1, len(a)):
        p[i] = p[i-1] + a[i]
    return p

# Brute force: O(n) per query
t0 = time.time()
for l, r in queries:
    _ = sum(arr[l:r+1])
brute_ms = (time.time() - t0) * 1000

# Prefix sums: O(1) per query
prefix = build_prefix(arr)
t1 = time.time()
for l, r in queries:
    _ = prefix[r] - (prefix[l-1] if l > 0 else 0)
prefix_ms = (time.time() - t1) * 1000

print(f"10 000 queries on 10 000-element array:")
print(f"  Brute force:  {brute_ms:.1f} ms")
print(f"  Prefix sums:  {prefix_ms:.1f} ms")
print(f"  Speedup:      {brute_ms/max(prefix_ms,0.001):.0f}×")

Problem 2 — Number of Subarrays with Sum Equal to k

This is where prefix sums become genuinely clever. We want to count subarrays whose elements sum to exactly k.

The insight: a subarray arr[l..r] sums to k if and only if prefix[r] - prefix[l-1] == k, which means prefix[l-1] == prefix[r] - k. Use a hashmap to count how many times each prefix sum has appeared so far.

from collections import defaultdict

def count_subarrays_with_sum(arr, k):
    count = 0
    prefix_sum = 0
    freq = defaultdict(int)
    freq[0] = 1    # empty prefix has sum 0

    for x in arr:
        prefix_sum += x
        # How many previous prefix sums equal (current - k)?
        count += freq[prefix_sum - k]
        freq[prefix_sum] += 1

    return count

arr = [1, 1, 1]
print(f"Array: {arr},  k=2  →  {count_subarrays_with_sum(arr, 2)} subarrays")
# Subarrays: [1,1] at (0,1) and (1,2)  →  2

arr2 = [1, 2, 3]
print(f"Array: {arr2},  k=3  →  {count_subarrays_with_sum(arr2, 3)} subarrays")
# [3] at index 2, and [1,2] at (0,1)  →  2

arr3 = [3, 4, 7, 2, -3, 1, 4, 2]
print(f"Array: {arr3},  k=7  →  {count_subarrays_with_sum(arr3, 7)} subarrays")

Trace to see the hashmap fill up

from collections import defaultdict

def count_subarrays_trace(arr, k):
    count = 0
    prefix_sum = 0
    freq = defaultdict(int)
    freq[0] = 1

    print(f"k = {k}\n")
    print(f"{'i':>3}  {'arr[i]':>6}  {'prefix':>7}  {'need':>5}  {'found':>6}  count")
    print("-" * 50)

    for i, x in enumerate(arr):
        prefix_sum += x
        need  = prefix_sum - k
        found = freq[need]
        count += found
        freq[prefix_sum] += 1
        print(f"{i:>3}  {x:>6}  {prefix_sum:>7}  {need:>5}  {found:>6}  {count}")

    print(f"\nTotal subarrays with sum={k}: {count}")
    return count

count_subarrays_trace([1, 2, 3, 0, 3], k=3)

Problem 3 — Product of Array Except Self

For each index i, compute the product of all other elements without division and in O(n).

The trick: build a left prefix product (product of everything to the left of i) and a right suffix product (product of everything to the right). Multiply them together.

def product_except_self(arr):
    n = len(arr)
    result = [1] * n

    # Pass 1: result[i] = product of arr[0..i-1]
    prefix = 1
    for i in range(n):
        result[i] = prefix
        prefix *= arr[i]

    # Pass 2: multiply in product of arr[i+1..n-1]
    suffix = 1
    for i in range(n - 1, -1, -1):
        result[i] *= suffix
        suffix *= arr[i]

    return result

arr = [1, 2, 3, 4]
print(f"Array:   {arr}")
print(f"Product: {product_except_self(arr)}")
# [24, 12, 8, 6]
# index 0: 2*3*4=24, index 1: 1*3*4=12, etc.

arr2 = [-1, 1, 0, -3, 3]
print(f"\nArray:   {arr2}")
print(f"Product: {product_except_self(arr2)}")

Visualising the two passes

def product_except_self_trace(arr):
    n = len(arr)
    result = [1] * n

    prefix = 1
    print("Left pass (prefix products):")
    for i in range(n):
        result[i] = prefix
        prefix *= arr[i]
        print(f"  i={i}  result[{i}]={result[i]}  prefix becomes {prefix}")

    suffix = 1
    print("\nRight pass (multiply suffix):")
    for i in range(n - 1, -1, -1):
        result[i] *= suffix
        suffix *= arr[i]
        print(f"  i={i}  result[{i}]={result[i]}  suffix becomes {suffix}")

    print(f"\nFinal: {result}")
    return result

product_except_self_trace([1, 2, 3, 4])

Real-World Applications

Cumulative Sales Totals

Every business dashboard shows “sales from date A to date B”. Build the prefix once at report-generation time; every date-range query is then O(1).

def build_prefix(arr):
    p = [0] * len(arr)
    p[0] = arr[0]
    for i in range(1, len(arr)):
        p[i] = p[i-1] + arr[i]
    return p

def range_sum(prefix, l, r):
    return prefix[r] - (prefix[l-1] if l > 0 else 0)

# Daily sales for January (31 days, indexed 0–30)
daily_sales = [
    120, 95, 140, 200, 180, 160, 175,   # week 1
    130, 145, 190, 210, 170, 155, 165,   # week 2
     90, 100, 220, 195, 185, 175, 160,   # week 3
    140, 155, 200, 180, 170, 165, 175,   # week 4
    190, 195, 210                         # final days
]

prefix = build_prefix(daily_sales)

print(f"Week 1 total:  ${range_sum(prefix, 0, 6):,}")
print(f"Week 2 total:  ${range_sum(prefix, 7, 13):,}")
print(f"Week 3 total:  ${range_sum(prefix, 14, 20):,}")
print(f"Full January:  ${range_sum(prefix, 0, 30):,}")
print(f"Days 10–20:    ${range_sum(prefix, 10, 20):,}")

2D Prefix Sums — Image Integral (Computer Vision)

The same idea extends to two dimensions. The integral image (or summed-area table) is used in computer vision to compute the sum of any rectangular region of pixel values in O(1), enabling real-time face detection (Viola-Jones algorithm).

def build_2d_prefix(grid):
    rows = len(grid)
    cols = len(grid[0])
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]

    for r in range(1, rows + 1):
        for c in range(1, cols + 1):
            prefix[r][c] = (
                grid[r-1][c-1]
                + prefix[r-1][c]
                + prefix[r][c-1]
                - prefix[r-1][c-1]
            )
    return prefix

def rect_sum(prefix, r1, c1, r2, c2):
    """Sum of rectangle from (r1,c1) to (r2,c2), inclusive, 0-indexed."""
    return (
        prefix[r2+1][c2+1]
        - prefix[r1][c2+1]
        - prefix[r2+1][c1]
        + prefix[r1][c1]
    )

# 5x5 grayscale pixel values (0-255)
image = [
    [10, 20, 30, 40, 50],
    [15, 25, 35, 45, 55],
    [20, 30, 40, 50, 60],
    [25, 35, 45, 55, 65],
    [30, 40, 50, 60, 70],
]

prefix = build_2d_prefix(image)

# Sum of a 3x3 region starting at (1,1)
total = rect_sum(prefix, 1, 1, 3, 3)
print(f"Sum of rectangle (1,1)→(3,3): {total}")

# Verify by brute force
brute = sum(image[r][c] for r in range(1, 4) for c in range(1, 4))
print(f"Brute force verification:      {brute}")
print(f"Match: {total == brute}")

# Full image sum in O(1)
print(f"Full image sum: {rect_sum(prefix, 0, 0, 4, 4)}")

Complexity Summary

Problem	Build	Query	Space
Range sum query	O(n)	O(1)	O(n)
Subarrays with sum = k (hashmap)	O(n)	—	O(n)
Product except self	O(n)	—	O(1)*
2D range sum (integral image)	O(n·m)	O(1)	O(n·m)

* Two extra O(1) variables for prefix/suffix; result array is required output, not overhead.

Prefix sums embody a core engineering trade-off: spend O(n) time and O(n) space once, then answer queries in O(1) forever. Whenever you find yourself recomputing the same running total across many queries, ask whether a prefix array can replace the loop.

Keyboard shortcuts

Note Book