Are you familiar with the __call__
method in Python? If a class defines this method, its instances can be called
as though they were functions. Here's a contrived example solely to demonstrate how it works:
class Foo:
    def __init__(self, name):
        self.name = name
        self.call_ct = 0
        print(self.name, 'initialized')

    def __call__(self):
        self.call_ct += 1
        print(self.name, self.call_ct)
bar = Foo('bar')
bar()
baz = Foo('baz')
baz()
bar()
Output:
bar initialized
bar 1
baz initialized
baz 1
bar 2
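Because an instance with __call__ behaves like a function, it can go anywhere a plain function can, such as map or a callback argument, and the built-in callable() reports it as callable. Here is a small sketch (the Greeter class is a hypothetical example, not from the code above):

```python
class Greeter:
    """Instances act like functions once __call__ is defined."""
    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, name):
        return f'{self.greeting}, {name}!'

hello = Greeter('Hello')
print(callable(hello))                   # True: the instance is callable
print(hello('world'))                    # Hello, world!
print(list(map(hello, ['Ada', 'Bob']))) # usable anywhere a function is
```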
A more interesting use case comes from Brandon Rhodes:
swapping out an http_get(url)
function for an object that caches pages. Say, for instance, that we are maintaining
a project that includes the following web-crawling code:
import urllib.request
from urllib.error import HTTPError, URLError
def http_get(url):
    with urllib.request.urlopen(url) as page:
        return page.getcode(), page.read().decode('utf-8')

def get_links(page_content):
    loc = page_content.find('href')
    while loc != -1:
        start = loc + len('href="')
        quote_char = page_content[start - 1]
        end = page_content.find(quote_char, start)
        yield page_content[start:end]
        loc = page_content.find('href', loc + 1)

def crawl(start_page, max_depth, on_page):
    stack = [(0, start_page)]
    while stack:
        depth, url = stack.pop()
        try:
            code, content = http_get(url)
        except (ValueError, HTTPError, URLError):
            continue
        on_page(url, code, content)
        if depth < max_depth and code == 200:
            stack.extend((depth + 1, u) for u in get_links(content))

if __name__ == '__main__':
    crawl('https://www.python.org', max_depth=1, on_page=lambda url, code, content: print(code, url))
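Before going further, the naive get_links parser can be sanity-checked on a small HTML snippet without any network access. A quick sketch (get_links is copied from above so the snippet runs on its own; the sample HTML is made up):

```python
# get_links copied from the listing above so this sketch is self-contained
def get_links(page_content):
    loc = page_content.find('href')
    while loc != -1:
        start = loc + len('href="')
        quote_char = page_content[start - 1]
        end = page_content.find(quote_char, start)
        yield page_content[start:end]
        loc = page_content.find('href', loc + 1)

html = '<a href="https://example.com">one</a> <a href=\'/about\'>two</a>'
links = list(get_links(html))
print(links)  # ['https://example.com', '/about']
```

Note that the quote_char lookup handles both double- and single-quoted attributes, since href=" and href=' are the same length.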
Now say we become unsatisfied with the performance of this code and want to stop fetching the same page multiple times.
The standard library provides a caching decorator, functools.lru_cache, that we could
apply to our http_get
function:
from functools import lru_cache
# ...
@lru_cache()
def http_get(url):
    with urllib.request.urlopen(url) as page:
        return page.getcode(), page.read().decode('utf-8')
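To see the decorator at work without hitting the network, here is lru_cache on a stand-in function (square and call_log are hypothetical names for this sketch); cache_info() reports the hits and misses:

```python
from functools import lru_cache

call_log = []

@lru_cache()
def square(n):
    call_log.append(n)  # records real executions, not cached hits
    return n * n

square(3)
square(3)  # served from the cache; the function body does not run again
square(4)
print(call_log)             # [3, 4]
print(square.cache_info())  # hits=1, misses=2
```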
But another option is an object that implements __call__
. What might that look like?
# ...
class CachedHttpGet:
    def __init__(self):
        self.cache = {}

    def __call__(self, url):
        if url not in self.cache:
            with urllib.request.urlopen(url) as page:
                self.cache[url] = (page.getcode(), page.read().decode('utf-8'))
        return self.cache[url]

http_get = CachedHttpGet()
# ...
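One advantage of the class-based version is that the cache is ordinary instance state we can inspect, pre-seed, or clear. A minimal sketch of the same shape with the fetch step injected, so it runs without a network (CachedCall and fake_fetch are hypothetical names, not part of the code above):

```python
class CachedCall:
    """Same shape as CachedHttpGet, but the fetch step is injected so
    the sketch runs offline; the cache is plain instance state."""
    def __init__(self, fetch):
        self.fetch = fetch
        self.cache = {}

    def __call__(self, key):
        if key not in self.cache:
            self.cache[key] = self.fetch(key)
        return self.cache[key]

fetched = []
def fake_fetch(url):  # stand-in for the urllib.request.urlopen call
    fetched.append(url)
    return f'<page for {url}>'

http_get = CachedCall(fake_fetch)
http_get('a.example')
http_get('a.example')  # second call never reaches fake_fetch
print(fetched)         # ['a.example']
print(http_get.cache)  # inspectable, unlike lru_cache's internal storage
```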
While lru_cache
is probably better in this contrived example, I hope this article gives you another tool for your
toolbox. The official docs are here.
Keep this in mind the next time you're refactoring something; it may be the right choice.