Merge Sort
Divide and conquer — split the problem in half until it’s trivial, then combine. Merge sort is the clearest real-world example of this technique: break the array in half, sort each half recursively, then merge the two sorted halves back together. The magic is in the merge step.
The Core Idea
A single-element array is already sorted. That’s the base case. Merge sort works backwards from there: if you have two sorted arrays, you can combine them into one sorted array in O(n) time by just comparing the front elements and taking the smaller one each time.
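This linear merge is useful enough that Python ships it in the standard library. A quick sketch using `heapq.merge`, which lazily yields the smaller front element each time, the same idea as the merge step described here:

```python
import heapq

# heapq.merge combines already-sorted inputs by repeatedly
# taking the smallest front element across all of them.
left = [3, 27, 38, 43]
right = [9, 10, 82]

merged = list(heapq.merge(left, right))
print(merged)  # [3, 9, 10, 27, 38, 43, 82]
```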
```mermaid
flowchart TD
    subgraph Strategy["Divide and Conquer Strategy"]
        D["DIVIDE — split in half until size = 1"]
        C["CONQUER — each size-1 array is trivially sorted"]
        M["MERGE — combine sorted pairs bottom-up"]
        D --> C --> M
    end
```
Full Split and Merge Tree for [38, 27, 43, 3, 9, 82, 10]
```mermaid
flowchart TD
    Root["[38, 27, 43, 3, 9, 82, 10]"]
    subgraph SplitLeft["Split left half"]
        L1["[38, 27, 43, 3]"]
        L2["[38, 27]"]
        L3["[43, 3]"]
        L4["[38]"]
        L5["[27]"]
        L6["[43]"]
        L7["[3]"]
    end
    subgraph SplitRight["Split right half"]
        R1["[9, 82, 10]"]
        R2["[9]"]
        R3["[82, 10]"]
        R4["[82]"]
        R5["[10]"]
    end
    subgraph MergeLeft["Merge left half back up"]
        M1["[27, 38]"]
        M2["[3, 43]"]
        M3["[3, 27, 38, 43]"]
    end
    subgraph MergeRight["Merge right half back up"]
        M4["[10, 82]"]
        M5["[9, 10, 82]"]
    end
    Done["[3, 9, 10, 27, 38, 43, 82]"]
    Root --> L1
    Root --> R1
    L1 --> L2 --> L4
    L2 --> L5
    L1 --> L3 --> L6
    L3 --> L7
    L4 --> M1
    L5 --> M1
    L6 --> M2
    L7 --> M2
    M1 --> M3
    M2 --> M3
    R1 --> R2
    R1 --> R3 --> R4
    R3 --> R5
    R4 --> M4
    R5 --> M4
    R2 --> M5
    M4 --> M5
    M3 --> Done
    M5 --> Done
```
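The tree above can be reproduced at the terminal by threading a depth counter through the recursion. A sketch using a hypothetical `merge_sort_traced` helper (not from the text) that prints each split and merge with indentation:

```python
def merge_sort_traced(nums, depth=0):
    """Merge sort that prints the recursion tree as it runs (illustrative sketch)."""
    pad = "  " * depth
    print(f"{pad}split {nums}")
    if len(nums) <= 1:
        return nums[:]
    mid = len(nums) // 2
    left = merge_sort_traced(nums[:mid], depth + 1)
    right = merge_sort_traced(nums[mid:], depth + 1)
    # Standard two-pointer merge of the sorted halves
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    print(f"{pad}merge {left} + {right} -> {merged}")
    return merged

merge_sort_traced([38, 27, 43, 3, 9, 82, 10])
```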
The Merge Step in Detail
The merge step is where the real work happens. Given two sorted arrays, we pick the smaller front element each time:
```mermaid
flowchart LR
    subgraph Input["Two sorted halves"]
        Left["Left: [3, 27, 38, 43]\n ^\n i"]
        Right["Right: [9, 10, 82]\n ^\n j"]
    end
    subgraph Steps["Merge steps"]
        S1["Compare 3 vs 9 → take 3 → result: [3]"]
        S2["Compare 27 vs 9 → take 9 → result: [3, 9]"]
        S3["Compare 27 vs 10 → take 10 → result: [3, 9, 10]"]
        S4["Compare 27 vs 82 → take 27 → result: [3, 9, 10, 27]"]
        S5["Compare 38 vs 82 → take 38 → result: [3, 9, 10, 27, 38]"]
        S6["Compare 43 vs 82 → take 43 → result: [3, 9, 10, 27, 38, 43]"]
        S7["Left exhausted → append 82 → result: [3, 9, 10, 27, 38, 43, 82]"]
    end
    Input --> S1 --> S2 --> S3 --> S4 --> S5 --> S6 --> S7
```
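This trace can be reproduced by adding a print statement to the comparison loop. A sketch with a hypothetical `merge_verbose` helper (a name chosen here for illustration):

```python
def merge_verbose(left, right):
    """Two-pointer merge that logs every comparison (illustrative sketch)."""
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        take = left[i] if left[i] <= right[j] else right[j]
        print(f"Compare {left[i]} vs {right[j]} -> take {take}")
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    # One side ran out; the rest of the other side is already sorted
    result.extend(left[i:])
    result.extend(right[j:])
    return result

print(merge_verbose([3, 27, 38, 43], [9, 10, 82]))
```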
Implementation
```python
def merge_sort(nums):
    # Base case: a list of 0 or 1 elements is already sorted
    if len(nums) <= 1:
        return nums[:]

    # DIVIDE — find the midpoint and split
    mid = len(nums) // 2
    left = merge_sort(nums[:mid])    # Recursively sort the left half
    right = merge_sort(nums[mid:])   # Recursively sort the right half

    # MERGE — combine two sorted halves into one sorted list
    return merge(left, right)


def merge(left, right):
    result = []
    i = 0  # Pointer into left
    j = 0  # Pointer into right

    # Compare front elements; always take the smaller one
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1

    # One side is exhausted — append the remainder of the other side
    result.extend(left[i:])
    result.extend(right[j:])
    return result


# --- Demo ---
data = [38, 27, 43, 3, 9, 82, 10]
print("Before:", data)
print("After: ", merge_sort(data))

# Merge sort handles all cases equally well
print("\nReverse sorted:", merge_sort([6, 5, 4, 3, 2, 1]))
print("Already sorted:", merge_sort([1, 2, 3, 4, 5, 6]))
print("Duplicates:    ", merge_sort([3, 1, 4, 1, 5, 9, 2, 6]))
```
Why O(n log n)?
The key insight is that the recursion tree has log n levels (because we halve the array each time), and each level does O(n) total work in the merge step.
```mermaid
flowchart TD
    subgraph Level0["Level 0 — 1 merge of n elements → n work"]
        L0["merge([3,27,38,43], [9,10,82]) → n comparisons"]
    end
    subgraph Level1["Level 1 — 2 merges of n/2 elements each → n work"]
        L1A["merge([27,38], [3,43])"]
        L1B["merge([9], [10,82])"]
    end
    subgraph Level2["Level 2 — 4 merges of n/4 elements each → n work"]
        L2A["merge([38],[27])"]
        L2B["merge([43],[3])"]
        L2C["merge([82],[10])"]
    end
    subgraph Summary["Total"]
        S["log₂(n) levels × n work per level = O(n log n)"]
    end
    Level2 --> Level1 --> Level0 --> Summary
```
For n=7: log₂(7) ≈ 3 levels × 7 work per level = ~21 operations. Compare that to insertion sort’s worst case of 7² = 49.
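The bound can be checked empirically by counting comparisons. A sketch with a hypothetical `merge_sort_counting` helper (a name chosen here for illustration) that returns the sorted list along with the total number of merge comparisons:

```python
import math
import random

def merge_sort_counting(nums):
    """Merge sort that also counts comparisons made during merges (illustrative sketch)."""
    if len(nums) <= 1:
        return nums[:], 0
    mid = len(nums) // 2
    left, cl = merge_sort_counting(nums[:mid])
    right, cr = merge_sort_counting(nums[mid:])
    merged, i, j, comps = [], 0, 0, 0
    while i < len(left) and j < len(right):
        comps += 1  # one comparison per loop iteration
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    return merged, cl + cr + comps

n = 1024
data = list(range(n))
random.shuffle(data)
result, comps = merge_sort_counting(data)
print(f"n={n}: {comps} comparisons, n*log2(n) = {int(n * math.log2(n))}")
```

For a random permutation of 1024 elements the count stays below n·log₂(n) = 10240, in line with the level-by-level argument above.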
Time and Space Complexity
| Case | Time | Why |
|---|---|---|
| Best | O(n log n) | Always splits and merges the same way |
| Average | O(n log n) | Same structure regardless of input |
| Worst | O(n log n) | Guaranteed — no bad pivot problem |
Space: O(n) — merge sort needs an auxiliary array of size n to hold the merged result. This is the trade-off vs. in-place algorithms.
Stable: Yes — when left[i] <= right[j], we take from the left. Equal elements preserve their original order.
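Stability is easiest to see by sorting records on one field and watching ties keep their input order. A sketch with a hypothetical `merge_sort_by_key` helper (the `<=` in the merge is exactly what preserves order on ties):

```python
def merge_sort_by_key(items, key):
    """Stable merge sort comparing key(item) only (illustrative sketch)."""
    if len(items) <= 1:
        return items[:]
    mid = len(items) // 2
    left = merge_sort_by_key(items[:mid], key)
    right = merge_sort_by_key(items[mid:], key)
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        # <= takes from the left on ties, so equal keys keep input order
        if key(left[i]) <= key(right[j]):
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    return result + left[i:] + right[j:]

records = [("apple", 3), ("pear", 1), ("plum", 3), ("fig", 1)]
print(merge_sort_by_key(records, key=lambda r: r[1]))
# [('pear', 1), ('fig', 1), ('apple', 3), ('plum', 3)]
```

Note that `pear` stays before `fig`, and `apple` before `plum`, just as in the input.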
When to use Merge Sort
Use it when:
- You need a guaranteed O(n log n) worst case (Quick Sort can degrade to O(n²)).
- Stability matters — you need equal elements to stay in their original order.
- You are sorting a linked list (merge sort works naturally; Quick Sort does not).
- Data doesn’t fit in RAM — external merge sort splits a huge file into sorted chunks, writes them to disk, then merges the chunk files. This is how databases sort tables too large to fit in memory.
Trade-off: Requires O(n) extra space for the temporary merged array.
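For the linked-list case, the merge relinks existing nodes instead of copying into an auxiliary array, which is why merge sort fits lists so naturally. A sketch assuming a minimal `Node` class and helper names chosen here for illustration:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    val: int
    next: "Optional[Node]" = None

def sort_list(head):
    """Merge sort on a singly linked list; merging relinks nodes (illustrative sketch)."""
    if head is None or head.next is None:
        return head
    # Find the middle with slow/fast pointers, then cut the list in two
    slow, fast = head, head.next
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
    mid, slow.next = slow.next, None
    left, right = sort_list(head), sort_list(mid)
    # Merge by relinking nodes, no copying
    dummy = tail = Node(0)
    while left and right:
        if left.val <= right.val:
            tail.next, left = left, left.next
        else:
            tail.next, right = right, right.next
        tail = tail.next
    tail.next = left or right
    return dummy.next

def from_list(xs):
    head = None
    for x in reversed(xs):
        head = Node(x, head)
    return head

def to_list(head):
    out = []
    while head:
        out.append(head.val)
        head = head.next
    return out

print(to_list(sort_list(from_list([38, 27, 43, 3, 9, 82, 10]))))
```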
Real-world appearances
- Python’s Timsort (used by `sorted()` and `list.sort()`) is a hybrid of merge sort and insertion sort. It finds naturally ordered runs in the data and merges them — making it extremely fast on real-world data.
- External sorting — merge sort is the foundation of every large-scale sort that uses disk or distributed storage. MapReduce’s shuffle-and-sort phase is essentially a distributed merge sort.
- Inversion counting — a classic interview problem solved by modifying the merge step to count how many times we pick from the right before the left.
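The inversion-counting variant changes one line of the merge: whenever we take from the right, every element still waiting in the left half forms an inversion with it. A sketch with a hypothetical `count_inversions` helper:

```python
def count_inversions(nums):
    """Counts pairs (i, j) with i < j and nums[i] > nums[j] via merge sort (illustrative sketch)."""
    if len(nums) <= 1:
        return nums[:], 0
    mid = len(nums) // 2
    left, a = count_inversions(nums[:mid])
    right, b = count_inversions(nums[mid:])
    merged, i, j, cross = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
            cross += len(left) - i  # every remaining left element is bigger
    merged += left[i:] + right[j:]
    return merged, a + b + cross

_, inv = count_inversions([2, 4, 1, 3, 5])
print(inv)  # 3 inversions: (2,1), (4,1), (4,3)
```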