Edit Distance
Difficulty: Hard Source: NeetCode
Problem
Given two strings
word1andword2, return the minimum number of operations required to convertword1toword2. The three allowed operations are: Insert a character, Delete a character, or Replace a character.Example 1: Input:
word1 = "horse",word2 = "ros"Output:3Explanation: horse → rorse (replace h→r), rorse → rose (delete r), rose → ros (delete e)Example 2: Input:
word1 = "intention",word2 = "execution"Output:5Constraints:
0 <= word1.length, word2.length <= 500- Both strings consist of lowercase English letters.
Prerequisites
Before attempting this problem, you should be comfortable with:
- LCS — edit distance is closely related to the longest common subsequence
- 2D DP — the classic table-filling approach for two-string problems
1. Brute Force / Recursive
Intuition
Compare the last characters of both strings. If they match, no operation needed — just recurse on word1[:i-1] and word2[:j-1]. If they don’t match, we can insert, delete, or replace, and we take the minimum cost of all three options plus 1.
- Insert: insert
word2[j-1]at end ofword1→ nowword2[j-1]is matched, recurse on(i, j-1). - Delete: delete
word1[i-1]→ recurse on(i-1, j). - Replace: replace
word1[i-1]withword2[j-1]→ recurse on(i-1, j-1).
Algorithm
- Define
dfs(i, j)= min edits to convertword1[:i]toword2[:j]. - Base:
i == 0→ returnj(insert all of word2). - Base:
j == 0→ returni(delete all of word1). - If characters match:
dfs(i-1, j-1). - Else:
1 + min(dfs(i, j-1), dfs(i-1, j), dfs(i-1, j-1)).
Solution
def minDistance(word1, word2):
def dfs(i, j):
if i == 0:
return j
if j == 0:
return i
if word1[i - 1] == word2[j - 1]:
return dfs(i - 1, j - 1)
return 1 + min(dfs(i, j - 1), # insert
dfs(i - 1, j), # delete
dfs(i - 1, j - 1)) # replace
return dfs(len(word1), len(word2))
print(minDistance("horse", "ros")) # 3
print(minDistance("intention", "execution")) # 5
print(minDistance("", "abc")) # 3
print(minDistance("abc", "")) # 3
Complexity
- Time:
O(3^(m+n))— exponential without memoization - Space:
O(m+n)— recursion depth
2. Bottom-Up DP (Tabulation)
Intuition
Build a (m+1) x (n+1) table. dp[i][j] = minimum edit distance between word1[:i] and word2[:j].
Base cases:
dp[0][j] = j— convert empty string toword2[:j]requiresjinsertions.dp[i][0] = i— convertword1[:i]to empty string requiresideletions.
Transition (same as recursive):
- If
word1[i-1] == word2[j-1]:dp[i][j] = dp[i-1][j-1](no cost). - Else:
dp[i][j] = 1 + min(dp[i][j-1], dp[i-1][j], dp[i-1][j-1]).
Algorithm
- Create
(m+1) x (n+1)dp table. - Initialise first row and column with
0..nand0..m. - Fill row by row using the transition.
- Return
dp[m][n].
Solution
def minDistance(word1, word2):
m, n = len(word1), len(word2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(m + 1):
dp[i][0] = i
for j in range(n + 1):
dp[0][j] = j
for i in range(1, m + 1):
for j in range(1, n + 1):
if word1[i - 1] == word2[j - 1]:
dp[i][j] = dp[i - 1][j - 1]
else:
dp[i][j] = 1 + min(
dp[i][j - 1], # insert
dp[i - 1][j], # delete
dp[i - 1][j - 1] # replace
)
return dp[m][n]
print(minDistance("horse", "ros")) # 3
print(minDistance("intention", "execution")) # 5
print(minDistance("", "abc")) # 3
print(minDistance("abc", "abc")) # 0
# Space-optimised: two rows
def minDistance_opt(word1, word2):
m, n = len(word1), len(word2)
prev = list(range(n + 1))
for i in range(1, m + 1):
curr = [i] + [0] * n
for j in range(1, n + 1):
if word1[i - 1] == word2[j - 1]:
curr[j] = prev[j - 1]
else:
curr[j] = 1 + min(curr[j - 1], prev[j], prev[j - 1])
prev = curr
return prev[n]
print(minDistance_opt("horse", "ros")) # 3
Complexity
- Time:
O(m * n) - Space:
O(m * n)for 2D;O(n)for optimised version
Common Pitfalls
Confusing insert/delete direction. dp[i][j-1] is “insert into word1” (we matched one more char of word2 without advancing word1). dp[i-1][j] is “delete from word1”. The diagonal dp[i-1][j-1] is “replace”.
Not initialising the base cases. Row 0 and column 0 must be filled with 0, 1, 2, ..., n and 0, 1, 2, ..., m respectively. Leaving them at 0 will silently produce wrong answers for strings with empty prefixes.
Thinking replace costs 0 when characters match. When chars match, we use dp[i-1][j-1] directly (0 cost). When they don’t match, replace costs exactly 1 + dp[i-1][j-1]. Don’t add 1 for the matching case.