Alien Dictionary

Difficulty: Hard Source: NeetCode

Problem

There is a new alien language that uses the English alphabet. However, the order among the letters is unknown to you.

You are given a list of strings words from the alien language’s dictionary, where the strings in words are sorted lexicographically by the rules of this new language.

Return a string of the unique letters in the new alien language sorted in the order as they appear in the alien language. If there is no solution, return "". If there are multiple solutions, return any of them.

Example 1: Input: words = ["wrt","wrf","er","ett","rftt"] Output: "wertf"

Example 2: Input: words = ["z","x"] Output: "zx"

Example 3: Input: words = ["z","x","z"] Output: "" (invalid — cycle detected)

Constraints:

1 <= words.length <= 100

1 <= words[i].length <= 100

All characters are lowercase English letters

The dictionary is supposedly sorted

Prerequisites

Before attempting this problem, you should be comfortable with:

Topological Sort — ordering nodes in a DAG such that all edges point forward
Cycle Detection in Directed Graphs — using DFS with color states
Graph Building from Constraints — inferring edges from pairwise comparisons

1. Brute Force (BFS Topological Sort — Kahn’s Algorithm)

Compare adjacent words to discover ordering constraints: the first character where two consecutive words differ tells us which letter comes first in the alien alphabet. Build a directed graph from these constraints, then run a topological sort. If the graph has a cycle, no valid ordering exists.

Algorithm

Collect all unique characters.
Compare each pair of adjacent words: find the first differing character, add directed edge (first[i] → second[i]).
If words[i] is a prefix of words[i-1] (and longer), return "" — invalid.
Run Kahn’s BFS topological sort (by in-degree).
If all nodes are in the result, return the order; else return "" (cycle).

Solution

from collections import defaultdict, deque

def alienOrder_kahn(words):
    # Collect all unique characters
    adj = defaultdict(set)
    in_degree = {c: 0 for word in words for c in word}

    for i in range(len(words) - 1):
        w1, w2 = words[i], words[i + 1]
        min_len = min(len(w1), len(w2))
        # Edge case: w1 is longer and is a prefix of w2 — invalid
        if len(w1) > len(w2) and w1[:min_len] == w2[:min_len]:
            return ""
        for j in range(min_len):
            if w1[j] != w2[j]:
                if w2[j] not in adj[w1[j]]:
                    adj[w1[j]].add(w2[j])
                    in_degree[w2[j]] += 1
                break

    # Kahn's BFS topological sort
    queue = deque([c for c in in_degree if in_degree[c] == 0])
    result = []
    while queue:
        c = queue.popleft()
        result.append(c)
        for neighbor in adj[c]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    return "".join(result) if len(result) == len(in_degree) else ""


print(alienOrder_kahn(["wrt","wrf","er","ett","rftt"]))  # "wertf"
print(alienOrder_kahn(["z","x"]))                         # "zx"
print(alienOrder_kahn(["z","x","z"]))                     # ""
print(alienOrder_kahn(["abc","ab"]))                      # ""

Complexity

Time: O(C) where C is total number of characters across all words
Space: O(U + E) where U is unique characters and E is edges

2. DFS Topological Sort with Cycle Detection

Intuition

Same graph-building step, but use DFS for the topological sort. DFS-based topological sort appends a node to the result after all its descendants are processed (post-order). To detect cycles, track three states: unvisited, currently in stack (being visited), and fully processed.

If we ever reach a node that is “currently in stack”, we found a cycle — return "".

Algorithm

Build the same directed graph as above.
For each unvisited node, run DFS.
Color nodes: False = in current path (visiting), True = done.
If we visit a node that is in the current path, there’s a cycle.
Reverse the post-order result for the correct topological order.

Solution

from collections import defaultdict

def alienOrder(words):
    adj = defaultdict(set)
    chars = {c for word in words for c in word}

    for i in range(len(words) - 1):
        w1, w2 = words[i], words[i + 1]
        min_len = min(len(w1), len(w2))
        if len(w1) > len(w2) and w1[:min_len] == w2[:min_len]:
            return ""
        for j in range(min_len):
            if w1[j] != w2[j]:
                adj[w1[j]].add(w2[j])
                break

    # visited: None = not seen, False = in current DFS path, True = done
    visited = {}
    result = []

    def dfs(node):
        if node in visited:
            return visited[node]  # True = no cycle, False = cycle found
        visited[node] = False  # mark as in current path
        for neighbor in adj[node]:
            if not dfs(neighbor):
                return False
        visited[node] = True
        result.append(node)
        return True

    for c in chars:
        if not dfs(c):
            return ""

    return "".join(reversed(result))


print(alienOrder(["wrt","wrf","er","ett","rftt"]))  # "wertf"
print(alienOrder(["z","x"]))                         # "zx"
print(alienOrder(["z","x","z"]))                     # ""
print(alienOrder(["abc","ab"]))                      # ""
print(alienOrder(["z"]))                             # "z"

Complexity

Time: O(C) — building graph and DFS both visit each character once
Space: O(U + E) — graph, visited map, result

Common Pitfalls

Comparing the wrong pair of words. Only adjacent words in the sorted list give ordering information. Non-adjacent words don’t directly tell you anything.

Missing the invalid prefix case. If words[i] is longer than words[i+1] and words[i+1] is a prefix of words[i], the input is invalid. For example, ["abc", "ab"] should return "".

Only one edge per adjacent pair. Stop comparing characters after the first difference — subsequent differences are not meaningful ordering constraints.

Keyboard shortcuts

Note Book