collections.namedtuple
namedtuple
can make your code a lot more readable. In an interview, that's
helpful for a few reasons. First, it can help you demonstrate a good
understanding of some of Python's standard libraries. Second, it helps you show
off that you place importance on writing readable code. Third, it makes writing
your code easier. If you're passing around tuples, it can be easy to forget
what the object at each index into the tuple is. Using a namedtuple
can help
you avoid that.
Consider the case where you need to represent colors. You could choose to do so
with a 3-tuple of the form (i, j, k)
(where i
, j
, and k
are integers on
the range 0-255). This representation seems intuitive and natural enough. i
could be the value for red, j
for green, and k
for blue. A problem with this
approach is that you may forget which of the three numbers represents which
primary color of light. Using a namedtuple
could help with this:
Color = namedtuple('Color', ['red', 'green', 'blue'])
What does this change? Well, building a Color
is almost the same as building
that tuple you were previously building. Instead of doing (i, j, k)
, you'll
now write Color(i, j, k)
. This is perhaps a little more readable, and it adds
some more semantic meaning to your code. We're no longer just building a tuple;
we're building a Color
.
The real win for namedtuple
is in access to its elements. Before, to get the
red value for a color c
, we would use brackets: c[0]
. By comparison, if we
have a Color
called c
, we could use a more friendly dot syntax: c.red
. In
my experience, while not having to remember the index of the red element is
nice, the real win is in how much more readable c.red
is in contrast to
c[0]
.
collections.defaultdict
and collections.Counter
Suppose your interviewer asks you to find the most common string in a list of
strings. We can solve this problem using a defaultdict
(let's call it d
). We
could loop through the list, incrementing d[elem]
for each element. Then, we
just find the one we saw most. The implementation would look like this:
def most_common_dd(lst):
d = defaultdict(int)
for e in lst:
d[e] += 1
return max(d.iteritems(), key=lambda t: t[1])
Apparently, users and maintainers of Python saw this pattern enough that they
decided to create Counter
. Counter
lets us write a much more succinct
version of this function, because Counter
encapsulates the process of counting
the number of ocurrences of elements in an iterable. Implementing this
functionality with a `Counter object would look like this:
def most_common_ctr(lst):
return Counter(lst).most_common(1)[0]
These both have the same result:
from collections import Counter, defaultdict
strings = ['bear', 'down', 'you', 'bears', 'of', 'old', 'baylor', 'u', "we're",
'all', 'for', 'you', "we're", 'gonna', 'show', 'dear', 'old', 'baylor',
'spirit', 'through', 'and', 'through', 'come', 'on', 'and', 'fight',
'them', 'with', 'all', 'your', 'might', 'you', 'bruins', 'bold', 'and',
'win', 'all', 'our', 'victories', 'for', 'the', 'green', 'and', 'gold']
'''
definitions for most_common_ctr and most_common_dd
'''
assert most_common_dd(strings) == most_common_ctr(strings)
But the version using Counter
is more concise.
I love list comprehensions. They can make code much more concise and readable. Consider a problem where we have a start point and an end point on a grid:
|S|_|_|_|
|_|_|_|_|
|_|_|_|_|
| | | |E|
Let's further say that from a given cell, you can travel up, down, left or right into another cell (but not diagonally). We may want to do a bread-first search to find the minimum cost to get from the start to the end. At some point, we'll need to push the neighbors of the current cell onto the queue we're using for the BFS. This could look something like this:
for neigh in neighbors(cell):
# validate neigh
queue.append(neigh)
How should neighbors(cell)
work? Well, we could use a double for loop to
generate the neighbors:
def neighbors(cell):
for i in range(-1, 2):
for j in range(-1, 2):
if i == 0 and j == 0 or abs(i) + abs(j) > 1:
continue
yield (cell[0] + i, cell[0] + j)
This works, but it's ugly. Instead, we could use a list comprehension:
DIRS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
def neighbors(cell):
return [(cell[0] + d[0], cell[1] + d[1]) for d in DIRS]
We're also probably going to want to keep track of which cells we've already
visited (so we don't try to go back through them). We could create a matrix of
bool
s the same size as our original grid (let's call it visited
) and set
visited[r][c]
when we visit the cell located at row r
and column c
. But
how should we initialize this matrix? We could do something like this:
visited = []
for i in range(n):
visited.append([])
for j in range(n):
visited[i].append(False)
But list comprehensions can make this much more concise:
visited = [[False for _ in range(n)] for _ in range(n)]
The possibilities with list comprehensions are just about endless, so I'll leave it at that!