Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Distinct Subsequences

Difficulty: Hard Source: NeetCode

Problem

Given two strings s and t, return the number of distinct subsequences of s that equal t.

Example 1: Input: s = "rabbbit", t = "rabbit" Output: 3 Explanation: Three ways to choose which b to drop.

Example 2: Input: s = "babgbag", t = "bag" Output: 5

Constraints:

  • 1 <= s.length, t.length <= 1000
  • s and t consist of lowercase English letters.

Prerequisites

Before attempting this problem, you should be comfortable with:

  • Longest Common Subsequence — similar 2D DP table structure
  • Subsequences — understanding that elements must maintain relative order

1. Brute Force / Recursive

Intuition

Match t against s from left to right. At each position i in s, we decide whether to use s[i] to match the current character t[j]. If s[i] == t[j], we have two options: use this character (advance both i and j) or skip it (advance only i). If they don’t match, we can only skip. We count all ways that exhaust t completely.

Algorithm

  1. Define dfs(i, j) = number of ways to match t[j:] using characters from s[i:].
  2. Base case: j == len(t) → return 1 (matched all of t).
  3. Base case: i == len(s) → return 0 (ran out of s).
  4. If s[i] == t[j]: return dfs(i+1, j+1) + dfs(i+1, j).
  5. Else: return dfs(i+1, j).

Solution

def numDistinct(s, t):
    def dfs(i, j):
        if j == len(t):
            return 1
        if i == len(s):
            return 0
        if s[i] == t[j]:
            return dfs(i + 1, j + 1) + dfs(i + 1, j)
        return dfs(i + 1, j)

    return dfs(0, 0)


print(numDistinct("rabbbit", "rabbit"))  # 3
print(numDistinct("babgbag", "bag"))     # 5
print(numDistinct("a", "a"))            # 1
print(numDistinct("a", "b"))            # 0

Complexity

  • Time: O(2^m) — exponential without memoization
  • Space: O(m+n) — recursion depth

2. Bottom-Up DP (Tabulation)

Intuition

Build a (m+1) x (n+1) table where dp[i][j] = number of distinct subsequences of s[:i] that equal t[:j]. Base cases: dp[i][0] = 1 for all i (empty t is a subsequence of any prefix of s exactly once). dp[0][j] = 0 for j > 0 (can’t match non-empty t with empty s).

Transition:

  • If s[i-1] == t[j-1]: dp[i][j] = dp[i-1][j-1] + dp[i-1][j] (use or skip this character).
  • Else: dp[i][j] = dp[i-1][j] (must skip, character doesn’t match).

Algorithm

  1. Create (m+1) x (n+1) dp table.
  2. Set dp[i][0] = 1 for all i in 0..m.
  3. Fill row by row, applying the transition.
  4. Return dp[m][n].

Solution

def numDistinct(s, t):
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    for i in range(m + 1):
        dp[i][0] = 1  # empty t is always a subsequence

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = dp[i - 1][j]  # always carry down (skip s[i-1])
            if s[i - 1] == t[j - 1]:
                dp[i][j] += dp[i - 1][j - 1]  # also use s[i-1]

    return dp[m][n]


print(numDistinct("rabbbit", "rabbit"))  # 3
print(numDistinct("babgbag", "bag"))     # 5
print(numDistinct("a", "a"))            # 1
print(numDistinct("a", "b"))            # 0
print(numDistinct("", ""))              # 1


# Space-optimised: 1D rolling array
def numDistinct_1d(s, t):
    m, n = len(s), len(t)
    dp = [0] * (n + 1)
    dp[0] = 1

    for i in range(m):
        for j in range(n - 1, -1, -1):  # iterate right-to-left to avoid overwriting
            if s[i] == t[j]:
                dp[j + 1] += dp[j]

    return dp[n]


print(numDistinct_1d("rabbbit", "rabbit"))  # 3
print(numDistinct_1d("babgbag", "bag"))     # 5

Complexity

  • Time: O(m * n)
  • Space: O(m * n) for 2D; O(n) for 1D rolling array

Common Pitfalls

Mixing up rows and columns. dp[i][j] represents s[:i] vs t[:j]. Row i is indexed into s, column j into t. Getting this backwards gives wrong results.

Forgetting the “carry down” part. Even when s[i-1] == t[j-1], we still add dp[i-1][j] (the case where we skip s[i-1]). Only adding dp[i-1][j-1] undercounts.

Wrong 1D iteration direction. In the space-optimised version, iterate j from right to left. If you go left to right, dp[j] will already be updated for the current i when you compute dp[j+1], causing overcounting.