Advanced Topic: Design Patterns
Overview
Teaching: 30 min
Exercises: 30 minQuestions
How can we avoid re-inventing the wheel when designing code?
How can we transfer known solutions to our code?
Objectives
Recognize much-used patterns in existing code
Re-purpose existing patterns as solutions to new problems
What is a design pattern?
software design patterns: typical solutions to common problems
Pros
- It is easier to reuse known solutions than to invent them
- Makes the code easier to understand for collaborators
- Makes the code easier to maintain since patterns ought to be best-in-class solutions
Cons
- Shoehorning: not all patterns fit everywhere
- Patterns paper-over inadequacies that exist in one language but not another
- the visitor pattern is popular in Java or C++, but useless in Julia, thanks to multiple dispatch
- Haskell’s functional style makes the strategy pattern so obvious, it’s not longer a pattern, it’s the way things are done
Examples:
- Iterator pattern (see below) separates how to loop from what to do in a loop
- Dependency injection (see below) makes it easier to create modular algorithms
- Resource allocation is acquisition idiom promotes creating fully functional objects in one go
- Factory pattern separates creating object from the class of the object. It is used to unify or simplify the creation of similar objects into a single function.
- Adapter pattern interfaces one class to be used in an another
- More and more and more patterns
- And let’s not forget anti-patterns, i.e. patterns that should not be used
Resources
There are plenty of resources out there that you can follow to better understand design patterns. Here you have a couple that we have found useful:
- Refactoring.guru: Page with general description of patterns, with pseudocode, applicable to any programming language. It also has a section on (surprise!) refactoring your code.
- Design patterns in Python: Series of blog posts on the application of several design patterns in Python specifically, with example code.
Iterator Pattern
Iterators separates generating items over which to loop from doing the body of the loop. It separates looping from computing.
For instance, we want to loop over all items in xs
and ys
:
for x in xs:
for y in ys:
compute(x, y)
The code above can be transformed to:
from itertools import product
for x, y in product(xs, ys):
compute(x, y)
Behind the scenes,
itertool’s product
returns an
iterator, i.e.
something we can loop over.
Using product
is both simpler and more general. Now the number of nested loops
can be determined at runtime. It is no longer hard-coded.
>>> from itertools import product
>>> list_of_lists = [[1, 2], [3, 4]]
>>> for args in product(*list_of_lists):
... print(args)
(1, 3)
(1, 4)
(2, 3)
(2, 4)
>>> list_of_lists = [[1, 2], [3, 4], [5, 6]]
>>> for args in product(*list_of_lists):
... print(args)
(1, 3, 5)
(1, 3, 6)
(1, 4, 5)
(1, 4, 6)
(2, 3, 5)
(2, 3, 6)
(2, 4, 5)
(2, 4, 6)
Generators: iterators made easy
Generators create iterators using syntax that is similar to the standard loop:
for x in xs:
for y in ys:
print(f"some complicated calculation with {x} and {y}")
We can lift the loops out of the code and create a generator:
def my_generator(xs, ys):
for x in xs:
for y in ys:
yield x, y
print(f"I am in my_generator {x}, {y}")
And then loop with an iterator which python creates auto-magically:
>>> for x, y in my_generator([1, 2], ["a", "b"]):
... print(f"some complicated calculation with {x} and {y}")
some complicated calculation with 1 and a
I am in my_generator 1, a
some complicated calculation with 1 and b
I am in my_generator 1, b
some complicated calculation with 2 and a
I am in my_generator 2, a
some complicated calculation with 2 and b
I am in my_generator 2, b
In practice, Python runs through the code in my_generator
and returns a new
element each time it hits yield. The next time it is called, it restarts from
the line right after yield
.
When to use generators?
- To separate complex loops from complex calculations in the loop
- Complex loops that occur multiple times in the code (copy-paste is a foot gun).
- When running out of memory, generators allow you be lazy
- Use iterators when the language does not allow for generators (e.g. c++).
Exercise
Power of two iterator
What does this print out?
def powers_of_two(xs): for x in xs: yield 2**x for x in powers_of_two([2, 4, 3, 2]): print(x)
Solution
4 16 9 4
Interleaved power of two and squares
How about this? Where does the function return two on after the first element, what about after the second element?
def interleaved(xs): for x in xs: yield 2 ** x yield x ** 2 for x in interleaved([2, 4, 3, 2]): print(x)
Solution
4 4 16 16 8 9 4 4
Sequential power of two and squares
What is the output sequence in this case?
def sequential(xs): for x in xs: yield 2 ** x for x in xs: yield x ** 2 for x in sequential([2, 4, 3, 2]): print(x)
Solution
4 16 8 4 4 16 9 4
How to use the iterator pattern
What to look for:
- Complex loops
- Complex loops that occur multiple times in the code
- Complex loops that occur multiple times in the code with only slight variations
What to do:
- Lift the loop into a generator function and make any parameter an input argument of the generator function.
Other languages
- c++: create a class with the following methods:
class Iterator { // pre-increment // note: const is optional, but a priori, its a good idea. Iterator operator++() const; // or post-increment. or both. // note: const is optional, but a priori, its a good idea. Iterator operator++(int) const; // dereference, to access the "yielded" values T operator*(); // optionally, const dereference const T operator*() const; // comparison to other iterator bool operator!=(const Iterator &) const; }
Then use in loop:
for(Iterator i(...), end(...); i != end; ++i) compute(*i);
- R: CRAN package iterators
- Julia: iterator interface
- Fortran: of course not
Dependency Injection, or how to make algorithms tweakable
It’s not unusual to want to change an algorithm, but only in one or two places:
def my_algorithm(some_input):
return [
uncornify(webby)
for webby in deconforbulate(some_input)
]
Say that rather than loop over deconforbulate
objects, you need to loop over
undepolified
objects.
Here’s one bad solution:
def my_awful_copy_paste_non_solution(some_input):
return [
uncornify(webby)
for webby in undepolified(some_input)
]
Here’s a slightly better one:
def my_somewhat_better_solution(some_input, is_deconfobulated: bool = True):
if is_deconfobulated:
generator = deconforbulate
else:
generator = undepolified
return [
uncornify(webby) for webby in generator(some_input)
]
But it doesn’t scale!
Your supervisor just popped in and wants you to try and loop over
unsoupilate
d, resoupilate
d, and gunkifucate
d objects, and probably others
as well. Using the pattern above, we have to modify the algorithm each and every
time we add a new tweak.
This scenario is based on multiple past and future real-life stories.
Thankfully, the dependency-injection design pattern can come to the rescue!
from typing import Callable, Optional
def the_bees_knees_solution(some_input, generator: Optional[Callable] = None):
if generator is None:
generator = deconforbulate
return [
uncornify(webby) for webby in generator(some_input)
]
Now the algorithm is independent of the exact generator (Yay! Separation of
concerns!). It can be used without modification with any generator that takes
obstrucated
-like objects.
Other languages
- c++: The tweakable element is any function object or function. The algorithm takes std::function as argument (simpler code, faster compilation, possibly slower performance). Or use a template argument (slower compilation, no performance hit, often leads to complicated code).
- R: just pass the function, like in Python.
- Julia: just pass the function, like in Python.
- Fortran: The Fortran 2003 standard introduced procedure pointers. Since it has only been 17 years, Fortran 2003 compilers can be patchy, buggy, and under-performant. Use at your own risk.
How to use dependency-injection
What to look for:
- algorithms that have been copy-pasted
- slight variations of the same algorithm
What to do:
- Separate levels of details and concerns in the algorithm
- Write the algorithm as seperate functions/class for each level of details and concern
- Identify the “tweakable” concerns that need to be changed easily
- Make the “tweakable” concerns an argument of the algorithm
Iterating over points in a ring
Create an iterator and/or a generator that lifts the loop over points in a two-dimensional ring:
Take this code:
from math import sqrt points = [[1, 2], [0, 0], [-2, 0], [-2, 3], [-3, -4], [4, 0], [5, 5]] radius = 3.5 width = 1.0 inner = radius - 0.5 * width outer = radius + 0.5 * width for point in points: distance = sqrt(point[0] * point[0] + point[1] * point[1]) if distance >= inner and distance <= outer: print(f"Some complicated calculation at {point}")
and create a generator function
points_in_ring
:for point in points_in_ring(points, radius, width): print(f"Some complicated calculation at {point}")
At this point,
points_in_ring
uses the Euclidian distance to figure out what is in the ring. Using dependency-injection, makepoints_in_ring
capable of using any sensible distance (say the Manhattan norm):def manhattan(point: List[float]) -> float: return sum(abs(x) for x in point) for point in points_in_ring(points, radius, width, norm=manhattan): print(f"Some complicated calculation at {point}")
NOTE:
points_in_ring
can be used over and over across the code- it is parametrized by radius and width
- it make the loop self-descriptive
- it is more memory efficient (lazy evaluation) than a list
- it is almost always better than creating and keeping in sync a second list holding only points in a ring (compute is cheap)
- it makes it possible to test/debug the loop alone, without worrying about the compute inside the loop
Ring generator
from math import sqrt from typing import Iterable def points_in_ring(points: Iterable, radius: float, width: float) -> Iterable: inner = radius - 0.5 * width outer = radius + 0.5 * width for point in points: distance = sqrt(point[0] * point[0] + point[1] * point[1]) if distance >= inner and distance <= outer: yield point points = [[1, 2], [0, 0], [-2, 0], [-2, 3], [-3, -4], [4, 0], [5, 5]] for point in points_in_ring(points, radius=3.5, width=1.0): print(f"Some complicated calculation at {point}")
Ring generator with tweakable norm
from math import sqrt from typing import Iterable, List, Optional, Callable def euclidean(point: List[float]) -> float: return sqrt(sum((x * x) for x in point)) def manhattan(point: List[float]) -> float: return sum(abs(x) for x in point) def points_in_ring( points: Iterable, radius: float, width: float, norm: Optional[Callable] = None, ) -> Iterable: if norm is None: norm = euclidean inner = radius - 0.5 * width outer = radius + 0.5 * width for point in points: distance = norm(point) if distance >= inner and distance <= outer: yield point points = [[1, 2], [0, 0], [-2, 0], [-2, 3], [-3, -4], [4, 0], [5, 5]] for point in points_in_ring(points, radius=3.5, width=1.0, norm=manhattan): print(f"Some complicated calculation at {point}")
Key Points
Many coders have come before
Transferable solutions to common problems have been identified
It is easier to apply a known pattern than to reinvent it, but first you have to spend some time learning about patterns.
Iterators and generators are convenient patterns to separate loops from compute and to avoid copy-pasting.
Dependency injections is a pattern to create modular algorithms