Search Range

What if the array has duplicates and you need the first or last occurrence?

Standard binary search returns some matching index — but not necessarily the leftmost or rightmost one. For example, in [1, 2, 2, 2, 3, 4], searching for 2 might return index 1, 2, or 3 depending on where the midpoint happens to fall.

Finding a precise boundary — or searching over a numeric range rather than an array — is a powerful generalisation of binary search that unlocks a whole family of problems.

The Boundary Search Idea

When you find a match, instead of stopping immediately, keep searching in one direction to see if there is a closer match:

First occurrence (left boundary): on a hit, record the index and continue searching left (right = mid - 1).
Last occurrence (right boundary): on a hit, record the index and continue searching right (left = mid + 1).

graph TD
    subgraph Standard["Standard binary search — stops on first hit"]
        S1["left=0, right=5\nmid=2, nums[2]=2 → HIT, return 2"]
    end

    subgraph Left["find_first — keeps going left after a hit"]
        L1["left=0, right=5\nmid=2, nums[2]=2 → record 2, right=1"]
        L2["left=0, right=1\nmid=0, nums[0]=1 → too small, left=1"]
        L3["left=1, right=1\nmid=1, nums[1]=2 → record 1, right=0"]
        L4["left > right → return 1 ✓"]
        L1 --> L2 --> L3 --> L4
    end

    subgraph Right["find_last — keeps going right after a hit"]
        R1["left=0, right=5\nmid=2, nums[2]=2 → record 2, left=3"]
        R2["left=3, right=5\nmid=4, nums[4]=3 → too big, right=3"]
        R3["left=3, right=3\nmid=3, nums[3]=2 → record 3, left=4"]
        R4["left > right → return 3 ✓"]
        R1 --> R2 --> R3 --> R4
    end

Array used: [1, 2, 2, 2, 3, 4], target = 2

Implementing find_first and find_last

def find_first(nums, target):
    """
    Return the index of the FIRST occurrence of target, or -1 if not found.
    Time: O(log n)  Space: O(1)
    """
    left, right = 0, len(nums) - 1
    answer = -1

    while left <= right:
        mid = left + (right - left) // 2

        if nums[mid] == target:
            answer = mid        # record this hit...
            right = mid - 1    # ...but keep searching left
        elif nums[mid] < target:
            left = mid + 1
        else:
            right = mid - 1

    return answer


def find_last(nums, target):
    """
    Return the index of the LAST occurrence of target, or -1 if not found.
    Time: O(log n)  Space: O(1)
    """
    left, right = 0, len(nums) - 1
    answer = -1

    while left <= right:
        mid = left + (right - left) // 2

        if nums[mid] == target:
            answer = mid        # record this hit...
            left = mid + 1     # ...but keep searching right
        elif nums[mid] < target:
            left = mid + 1
        else:
            right = mid - 1

    return answer


def search_range(nums, target):
    """Return [first_index, last_index] of target, or [-1, -1] if absent."""
    return [find_first(nums, target), find_last(nums, target)]


# Tests
arr = [1, 2, 2, 2, 3, 4]
print(search_range(arr, 2))   # [1, 3]
print(search_range(arr, 1))   # [0, 0]
print(search_range(arr, 4))   # [5, 5]
print(search_range(arr, 5))   # [-1, -1]

# All duplicates
print(search_range([7, 7, 7, 7], 7))   # [0, 3]

# Target absent from middle
print(search_range([1, 1, 3, 3], 2))   # [-1, -1]

Binary Search on a Numeric Range

Binary search does not have to operate on an array at all. You can apply it to any monotonic condition over a numeric range.

Classic example: finding the integer square root of n — the largest integer k such that k² ≤ n.

Instead of a sorted array, the “array” is the range [0, n], and the condition k² ≤ n is True for small k and False for large k — a perfect sorted boundary to binary-search on.

graph LR
    subgraph "find_sqrt(26)"
        A["left=0, right=26\nmid=13, 13²=169 > 26 → right=12"]
        B["left=0, right=12\nmid=6, 6²=36 > 26 → right=5"]
        C["left=0, right=5\nmid=2, 2²=4 ≤ 26 → answer=2, left=3"]
        D["left=3, right=5\nmid=4, 4²=16 ≤ 26 → answer=4, left=5"]
        E["left=5, right=5\nmid=5, 5²=25 ≤ 26 → answer=5, left=6"]
        F["left > right → return 5 ✓"]
        A --> B --> C --> D --> E --> F
    end

def find_sqrt(n):
    """
    Return the integer square root of n (floor(sqrt(n))) using binary search.
    Time: O(log n)  Space: O(1)
    """
    if n < 2:
        return n

    left, right = 1, n // 2   # sqrt(n) <= n//2 for n >= 4
    answer = 1

    while left <= right:
        mid = left + (right - left) // 2

        if mid * mid <= n:
            answer = mid       # mid is a valid candidate; try larger
            left = mid + 1
        else:
            right = mid - 1   # mid² too big; try smaller

    return answer


# Verify against math.isqrt
import math
test_cases = [0, 1, 4, 8, 9, 26, 100, 1000, 999_999]

print(f"{'n':<12} {'find_sqrt':<14} {'math.isqrt':<14} {'match'}")
print("-" * 46)
for n in test_cases:
    result = find_sqrt(n)
    expected = math.isqrt(n)
    print(f"{n:<12} {result:<14} {expected:<14} {'✓' if result == expected else '✗'}")

The General Pattern

These three problems share the same skeleton — binary search on a condition:

def binary_search_condition(lo, hi, condition):
    """
    Find the LAST position where condition is True in the range [lo, hi].

    Assumes condition is True for some prefix [lo..k] and False for [k+1..hi]
    — i.e., the condition is monotonically non-increasing.

    Returns the boundary index k, or lo-1 if condition is False everywhere.
    """
    answer = lo - 1  # sentinel: condition never True

    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if condition(mid):
            answer = mid
            lo = mid + 1   # record hit, search right for a later hit
        else:
            hi = mid - 1   # search left

    return answer


# Example 1: find last index where nums[i] <= target (like upper_bound)
nums = [1, 3, 3, 3, 5, 7, 9]
target = 3
last_le = binary_search_condition(0, len(nums) - 1, lambda i: nums[i] <= target)
print(f"Last index where nums[i] <= {target}: {last_le}")   # 3

# Example 2: integer square root via the same template
n = 50
sqrt_n = binary_search_condition(1, n, lambda k: k * k <= n)
print(f"Integer sqrt of {n}: {sqrt_n}")   # 7  (7² = 49 ≤ 50, 8² = 64 > 50)

Complexity Summary

Problem	Time	Space
find_first	O(log n)	O(1)
find_last	O(log n)	O(1)
search_range	O(log n)	O(1)
find_sqrt	O(log n)	O(1)

Each problem performs at most two passes over log n positions, or a single pass over a numeric range of size n — so the complexity stays logarithmic.

Real-World Connections

Finding date ranges in logs — given millions of log lines sorted by timestamp, finding all lines between two timestamps is exactly a find_first / find_last pair on the timestamp column.

Python bisect module — bisect.bisect_left(a, x) returns the first index where x could be inserted to keep a sorted (equivalent to find_first), and bisect.bisect_right(a, x) returns the last such index. These are used internally by Python’s SortedList containers.

import bisect

events = [
    "2024-03-01 login",
    "2024-03-02 purchase",
    "2024-03-02 refund",
    "2024-03-02 logout",
    "2024-03-05 login",
]

# All events on 2024-03-02
lo = bisect.bisect_left(events,  "2024-03-02")
hi = bisect.bisect_right(events, "2024-03-02\xff")  # \xff sorts after all printable chars

print("Events on 2024-03-02:")
for e in events[lo:hi]:
    print(" ", e)

C++ STL lower_bound / upper_bound — C++’s <algorithm> header provides lower_bound (first position ≥ target) and upper_bound (first position > target) on any sorted range. They are the direct equivalents of find_first and the index after find_last, and they are used extensively in competitive programming and systems code.

Keyboard shortcuts