import logging
from typing import List
Problem Description
Difficulty: Medium
Given an array of strings strs
, group all anagrams together into sublists. You may return the output in any order.
Example 1:
Input: strs = ["act","pots","tops","cat","stop","hat"]
Output: [["hat"],["act", "cat"],["stop", "pots", "tops"]]
Example 2:
Input: strs = ["x"]
Output: [["x"]]
Example 3:
Input: strs = [""]
Output: [[""]]
Initial Solution
The key to this problem was to recognize that when anagrams are sorted alphabetically, they become the same ‘word’. With that in mind, a decent solution seemed relatively easy to come up with. Let’s just sort the words and place the original word in a dictionary with the keys being the unique sorted/garbled word.
There are a few parts that could potentially trip up a beginner:
Firstly, calling sorted()
on a string returns a list of the sorted characters and not a whole sorted word:
word = 'hello'
sorted_word = sorted(word)
print(sorted_word)
>>> ['e', 'h', 'l', 'l', 'o']
So we must join()
the list back together afterwards.
Secondly, adding values to a pre-existing key in the dictionary, instead of overwriting them, takes a little care, but there are a few options noted in this post. For example, using append()
and extend()
. The TLDR is to use extend()
when you have multiple values to append, rather than using append()
multiple times. E.g.
a = {}
a.setdefault('abc', []).append(1) # {'abc': [1]}
a.setdefault('abc', []).extend([2, 3]) # a is now {'abc': [1, 2, 3]}
setdefault()
will avoid a KeyError when we try to access a key that does not exist yet. It inserts the key with the specified default: []
Let’s put this all together in our initial attempt at solving the anagram problem.
class InitialSolution:
def __init__(self):
# Set up logging config
=logging.DEBUG, format="%(levelname)s: %(message)s")
logging.basicConfig(levelself.logger = logging.getLogger(__name__)
def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
# works fine without the if statement, but let's save some later computation:
if strs == [""]:
return [[""]]
= {}
words for word in strs:
self.logger.debug(f"Processing word: {word}")
= "".join(sorted(word))
sorted_word # n⋅log(n) character sorting * m strings in the list
self.logger.debug(f"Sorted 'word': {sorted_word}")
words.setdefault(sorted_word, []).append(word)self.logger.debug(
f"Updated dict value(s): '{sorted_word}': {words[sorted_word]}"
)
return list(words.values())
# Example for testing purposes.
= ["act", "pots", "tops", "cat", "stop", "hat"]
strs # Output: [["hat"],["act", "cat"],["stop", "pots", "tops"]]
# Note that output order does not matter.
= InitialSolution()
solution solution.groupAnagrams(strs)
DEBUG: Processing word: act
DEBUG: Sorted 'word': act
DEBUG: Updated dict value(s): 'act': ['act']
DEBUG: Processing word: pots
DEBUG: Sorted 'word': opst
DEBUG: Updated dict value(s): 'opst': ['pots']
DEBUG: Processing word: tops
DEBUG: Sorted 'word': opst
DEBUG: Updated dict value(s): 'opst': ['pots', 'tops']
DEBUG: Processing word: cat
DEBUG: Sorted 'word': act
DEBUG: Updated dict value(s): 'act': ['act', 'cat']
DEBUG: Processing word: stop
DEBUG: Sorted 'word': opst
DEBUG: Updated dict value(s): 'opst': ['pots', 'tops', 'stop']
DEBUG: Processing word: hat
DEBUG: Sorted 'word': aht
DEBUG: Updated dict value(s): 'aht': ['hat']
[['act', 'cat'], ['pots', 'tops', 'stop'], ['hat']]
Initial Results
IntialSolution passes NeetCode submission (if we comment out the logging).
- Time complexity: \(O(m⋅n⋅log(n))\) due to sorting each string.
- Space complexity: \(O(m⋅n)\) due to storing the strings in the dictionary keys and values.
NeetCode Solution
from collections import defaultdict
class Solution:
def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
= defaultdict(list)
result
for s in strs:
= [0] * 26 # one for each character in a-z
count for c in s:
# use ord to get unicode value of the character
ord(c) - ord("a")] += 1
count[# convert count list to tuple, as lists can't be dict keys
tuple(count)].append(s)
result[return result.values()
= Solution()
solution list(solution.groupAnagrams(strs))
[['act', 'cat'], ['pots', 'tops', 'stop'], ['hat']]
I used list()
above to display the result nicely in the notebook. The code seems to work fine in NeetCode both with and without the conversion to a list; no TypeError is given.
Conclusion
Neetcode’s solution uses a hashmap (defaultdict) called result
that stores lists of anagrams, where the key is a tuple representing the character counts of the strings. Note that a defaultdict never raises a KeyError, but instead provides a default value for a key that does not exist.
- Time complexity \(O(m⋅n)\) where m is the total number of strings and n is the length of each string (multiplied by 26 possible letters)
- Space complexity: \(O(m⋅n)\) due to the space needed for storing the strings in the defaultdict values.
Given an extremely long string, the above solution would be optimal: \(O(m⋅n⋅log(n))\) vs. \(O(m⋅n⋅26)\)