Group Anagrams
Difficulty: Medium Source: NeetCode
Problem
Given an array of strings
strs, group the anagrams together. You can return the answer in any order.An anagram is a word formed by rearranging the letters of a different word using all the original letters exactly once.
Example 1: Input:
strs = ["eat","tea","tan","ate","nat","bat"]Output:[["bat"],["nat","tan"],["ate","eat","tea"]]Example 2: Input:
strs = [""]Output:[[""]]Example 3: Input:
strs = ["a"]Output:[["a"]]Constraints:
1 <= strs.length <= 10^40 <= strs[i].length <= 100strs[i]consists of lowercase English letters.
Prerequisites
Before attempting this problem, you should be comfortable with:
- Hash Maps — grouping items by a shared key
- Sorting — using sorted strings as canonical keys
- Tuples as dictionary keys — immutable sequences that can be hashed
1. Brute Force (Sort Each String as Key)
Intuition
Two strings are anagrams if and only if their sorted versions are identical. For example, "eat", "tea", and "ate" all sort to "aet". We can use the sorted string as a dictionary key and group all strings that share the same key together.
This is actually quite efficient in practice — sorting each individual word is cheap, and the overall grouping work is linear in total characters.
Algorithm
- Create a hash map
groupswherekey → list of strings. - For each string
sinstrs:- Compute
key = "".join(sorted(s)). - Append
stogroups[key].
- Compute
- Return the values of
groups.
Solution
from collections import defaultdict
def groupAnagrams(strs):
groups = defaultdict(list)
for s in strs:
key = "".join(sorted(s))
groups[key].append(s)
return list(groups.values())
print(groupAnagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
print(groupAnagrams([""])) # [['']]
print(groupAnagrams(["a"])) # [['a']]
Complexity
- Time:
O(n * k log k)wherenis the number of strings andkis the maximum string length (sorting each string) - Space:
O(n * k)— storing all strings in the map
2. Character Count Tuple as Key
Intuition
Instead of sorting (which costs O(k log k) per string), we can count the frequency of each of the 26 letters in the string. Two strings are anagrams if and only if they have the same character frequencies. Represent these 26 counts as a tuple and use that as the dictionary key. Tuple creation is O(k), which is better than sorting.
Algorithm
- Create a hash map
groupswherekey → list of strings. - For each string
sinstrs:- Create a count array of 26 zeros.
- For each character
cins, incrementcount[ord(c) - ord('a')]. - Convert
countto a tuple (tuples are hashable; lists are not). - Append
stogroups[tuple(count)].
- Return the values of
groups.
flowchart LR
A(["'eat' → count[e]=1, a=1, t=1 → tuple (1,0,0,...,1,...,1,...)"])
B(["'tea' → same counts → same tuple"])
C(["'tan' → count[t]=1, a=1, n=1 → different tuple"])
A --> G["same group"]
B --> G
C --> H["different group"]
Solution
from collections import defaultdict
def groupAnagrams(strs):
groups = defaultdict(list)
for s in strs:
count = [0] * 26
for c in s:
count[ord(c) - ord('a')] += 1
groups[tuple(count)].append(s)
return list(groups.values())
print(groupAnagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
print(groupAnagrams([""])) # [['']]
print(groupAnagrams(["a"])) # [['a']]
Complexity
- Time:
O(n * k)— for each ofnstrings, we do O(k) work to build the count - Space:
O(n * k)— storing all strings and keys
Common Pitfalls
Using a list as a dictionary key. Lists are mutable and therefore not hashable in Python. You must convert the count array to a tuple before using it as a key — this is a very common mistake.
Not using defaultdict. With a regular dict, you need to check if key not in groups: groups[key] = [] before appending. Using defaultdict(list) handles this automatically and makes the code cleaner.
Assuming the output order matters. The problem says the answer can be in any order — so do not worry about sorting the groups or the strings within each group unless explicitly asked.