Essential Software Engineering for Researchers

Tools I: Packaging and virtual-environments


Teaching: 4 min
Exercises: 6 min
  • How to use a package manager to install third party tools and libraries

  • Use conda to install a reproducible environment

Python packages

Python virtual environments

Package managers

Package managers help you install packages. Some help you manage virtual environments as well. Better known python package managers include conda, pip and poetry.

  conda pip poetry
audience research all developers
manage python packages
manage non-python packages
choose python version
manage virtual envs
easy interface

Rules for choosing a package manager

  1. Choose one
  2. Stick with it

We chose conda because it is the de facto standard in science, and because it can natively install libraries such as fftw, vtk, or even Python, R, and Julia themselves.

It is also the de facto package manager on Imperial’s HPC cluster systems.


Installing and using an environment

  1. If you haven’t already, see the setup guide for instructions on how to install conda, Visual Studio Code and Git.

  2. Create a new folder to use for this course. Avoid giving it a name that includes spaces. If you’re using an ICT managed PC the folder must be located in your user area on the C: drive i.e. C:\Users\UserName (Note that files placed here are not persistent so you must remember to take a copy before logging out). Start Visual Studio Code and select “Open folder…” from the welcome screen. Navigate to the folder you just created and press “Select Folder”.

  3. Press “New file” and copy the text below into it. Save the file as environment.yml; the location should default to your newly created folder.

    name: course
    dependencies:
      - python>=3.6
      - flake8
      - pylint
      - black
      - mypy
      - requests
      - pip
      - pip:
        - -e git+
  4. Create a new virtual environment using conda:

    Windows users will want to start the app Anaconda Prompt from the Start Menu.

    Linux and Mac users should use a terminal app of their choice. You may see a warning with instructions. Please follow the instructions.

    conda env create -f [path to environment.yml]

    You can obtain [path to environment.yml] by right clicking the file tab near the top of Visual Studio Code and selecting “Copy Path” from the drop-down menu. Right click on the window for your command line interface to paste the path.

  5. We can now activate the environment:

    conda activate course
  6. And check python knows about the installed packages. Start a Python interpreter with the command python then:

    import requests

    We expect this to run and not fail. You can see the location of the installed package with:


    The file path you see will vary but note that it is within a directory called course that contains the files for the virtual environment you have created. Exit the Python interpreter:

  7. Finally, feel free to remove requests from environment.yml, then run

    conda env update -f [path to environment.yml]

    and see whether the package has been updated or removed.
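
The check in step 6 can also be written as a small script. What follows is only a sketch: the helper name package_location is hypothetical, and it relies on the standard library's importlib to report where a package would be imported from, without actually importing it.

```python
from importlib.util import find_spec
from typing import Optional


def package_location(name: str) -> Optional[str]:
    """Return the file a top-level package would be imported from, or None."""
    spec = find_spec(name)  # None when the package is not installed
    if spec is None:
        return None
    return spec.origin


# "json" ships with Python, so a location is always found for it;
# "requests" is only found once the course environment is active.
print(package_location("json"))
print(package_location("requests"))
```

If the course environment is active, the second path should sit inside a directory called course. After removing requests and updating the environment, the same script shows whether the package is still there.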

Selecting an environment in Visual Studio Code

If you haven’t already, see the setup guide for instructions on how to install Visual Studio (VS) Code.

On Linux and Mac, one option is to first activate conda, and then start VS Code:

> conda activate name_of_environment
> code .

The simplest option for all platforms is to set the interpreter via the Command Palette, using the “Python: Select Interpreter” command.

If you already have a Python file open then it’s also possible to set the interpreter using the toolbar at the bottom of the window.

Installing an editable package

Editable packages are packages that you can modify for development and have python immediately recognize your changes.

Look at the last few lines of environment.yml. It installs r2t2 in editable mode. The package is automatically downloaded from the web and installed next to environment.yml in the subfolder src/r2t2.

Try and add print("Hello!") to src/r2t2/r2t2/

Then start python and do

import r2t2

Your greeting should appear: python did indeed take the modified file into account.

Note that r2t2 was set up as a python package with a standard directory structure and a file. It’s well worth investing 10 minutes into transforming a python script into a package just to make it a shareable development environment.

Choosing the installation directory for R2T2

It would be nice if we could choose the directory where the editable package goes, i.e. rather than have r2t2 install in src/r2t2 we might want to install it directly in an r2t2 subfolder.

Nominally, pip does allow us to do that with --src.

However, it is not (yet) possible to tell conda to tell pip to use a given option, as highlighted in this issue. But that’s where the fun begins: because conda is an open-source effort, you could pitch in and try to add a feature or a fix. There is a lot to learn just from lurking around issues of open-source projects, whether it is about the project itself, or even about language design. There is even more to learn from participating.

Key Points

  • There are tens of thousands of Python packages

  • The choice is between reinventing the square wheel or reusing existing work

  • The state of an environment can be stored in a file

  • This stored environment is then easy to audit and recreate

Data Structures


Teaching: 15 min
Exercises: 15 min
  • How can data structures simplify codes?

  • Understand why different data structures exist

  • Recall common data structures

  • Recognize where and when to use them

What is a data structure?

data structure: represents information and how that information can be accessed

Choosing the right representation for your data can vastly simplify your code and increase performance.

Choosing the wrong representation can melt your brain. Slowly.


The number 2 is represented with the integer 2

>>> type(2)
<class 'int'>

Acceptable behaviors for integers include +, -, *, /

>>> 1 + 2
3
>>> 1 - 2
-1

On the other hand, text is represented by a string

>>> type("text")
<class 'str'>

It does not accept all the same behaviours as an integer:

>>> "a" + "b"
'ab'
>>> "a" * "b"
TypeError: can't multiply sequence by non-int of type 'str'

Integers can be represented as strings, but their behaviours would be unexpected:

>>> "1" + "2"
'12'

With integers, + is addition; for strings it is concatenation.
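
A short sketch of how this plays out in practice, assuming numbers that arrive as text (for instance from a file or from user input). The variable names are made up for the illustration.

```python
# Numbers read from a file or typed by a user arrive as strings
left, right = "1", "2"

# "+" on strings concatenates
concatenated = left + right

# converting to integers first gives an arithmetic sum
total = int(left) + int(right)

print(repr(concatenated), total)
```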

Health impact of choosing the wrong data structure

Stay healthy. Stop and choose the right data structure for you!

Basic data structures


Lists are containers of other data:

# List of integers
[1, 2, 3]
# List of strings
["a", "b", "b"]
# List of lists of integers
[[1, 2], [2, 3]]

Lists make sense when:

Beware! The following might indicate a list is the wrong data structure:

Other languages

  • C++:
    • std::vector, fast read-write to element i and fast iteration operations. Slow insert operation.
    • std::list. No direct access to element i. Fast insert, append, splice operations.
  • R: list
  • Julia: Array, also equivalent to numpy arrays.
  • Fortran: array, closer to numpy arrays than python lists


Tuples are short and immutable containers of other data.

(1, 2)
("a", "b")

Immutable means that once created, elements cannot be added, removed or replaced:

>>> shape = 2, 4
>>> shape
(2, 4)
>>> shape[1] = 4
TypeError: 'tuple' object does not support item assignment

Modifying a tuple vs modifying the element of a tuple

>>> something = ["a", "b"], 4
>>> something[0].append("c")
>>> something
(['a', 'b', 'c'], 4)
>>> something[0] = ["a", "b", "c"]
TypeError: 'tuple' object does not support item assignment

The tuple itself cannot be modified, but its elements can be if they themselves are mutable. The container is immutable, but the contents might not be.

Tuples make sense when:

Beware! The following might indicate a tuple is the wrong data structure:

Other languages


Sets are containers where each element is unique:

>>> set([1, 2, 2, 3, 3])
{1, 2, 3}
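
As a rough illustration of what uniqueness buys you, the sketch below runs a few standard set operations. The data is invented for the example.

```python
measured = [1, 2, 2, 3, 3]
expected = {2, 3, 4}

unique = set(measured)       # duplicates are dropped
both = unique & expected     # intersection: elements present in both sets
either = unique | expected   # union: elements present in at least one set
missing = expected - unique  # difference: expected but never measured

print(unique, both, either, missing)
print(2 in unique)           # membership test
```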

They make sense when:

Other languages


Dictionaries are mappings between a key and a value (e.g. a word and its definition).

# mapping of animals to legs
{"horse": 4, "kangaroo": 2, "millipede": 1000}
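
A minimal sketch of the usual operations on such a mapping: lookup, update, lookup with a default, and iteration. The extra animals are assumptions added for the example.

```python
legs = {"horse": 4, "kangaroo": 2, "millipede": 1000}

# look up a value by its key
horse_legs = legs["horse"]

# add or replace an entry
legs["spider"] = 8

# a missing key raises KeyError; .get returns a default instead
snake_legs = legs.get("snake", 0)

# loop over keys and values together
many_legged = [animal for animal, count in legs.items() if count > 4]
print(horse_legs, snake_legs, many_legged)
```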

They make sense when:

Beware! The following might indicate a dict is the wrong data structure:

Other languages

Advanced data structures

Custom data structures: Data classes

Python (>= 3.7) makes it easy to create custom data structures.

>>> from typing import List, Text
>>> from dataclasses import dataclass
>>> @dataclass
... class MyData:
...   a_list: List[int]
...   b_string: Text = "something something"
>>> data = MyData([1, 2])
>>> data
MyData(a_list=[1, 2], b_string='something something')
>>> data.a_list
[1, 2]

Data classes make sense when:

Beware! The following might indicate a dataclass is the wrong data structure:

Exploring data structures

In your own time, find out all the things data structures can do for you:

As well as other standard data structures:

All modern languages will have equivalents, outside of “Modern Fortran”.

Don’t reinvent the square wheel.

Digital Oxford Dictionary, the wrong way and the right way

  1. Implement an Oxford dictionary with two lists, one for words, one for definitions:

     barf: (verb) eject the contents of the stomach through the mouth
     morph: (verb) change shape as via computer animation
     scarf: (noun) a garment worn around the head or neck or shoulders for warmth
       or decoration
     snarf: (verb) make off with belongings of others
     sound: |
       (verb) emit or cause to emit sound.
       (noun) vibrations that travel through the air or another medium
     surf: |
       (verb) switch channels, on television
       (noun) waves breaking on the shore
  2. Given a word, find and modify its definition
  3. Do the same with a dict
  4. Create a subset dictionary (including definitions) of words rhyming with “arf” using either the two-list or the dict implementation
  5. If now we want to also encode “noun” and “verb”, what data structure could we use?
  6. What about when there are multiple meanings for a verb or a noun?

Dictionary implemented with lists

from typing import List, Text, Tuple

def modify_definition(
    word: Text, newdef: Text, words: List[Text], definitions: List[Text]
) -> List[Text]:
    from copy import copy

    index = words.index(word)
    definitions = copy(definitions)
    definitions[index] = newdef
    return definitions

def find_rhymes(
    rhyme: Text, words: List[Text], definitions: List[Text]
) -> Tuple[List[Text], List[Text]]:
    result_words = []
    result_definitions = []
    for word, definition in zip(words, definitions):
        if word.endswith(rhyme):
            # keep the two lists in step: same index, same entry
            result_words.append(word)
            result_definitions.append(definition)
    return result_words, result_definitions

def testme():

    words = ["barf", "morph", "scarf", "snarf", "sound", "surf"]
    definitions = [
        "(verb) eject the contents of the stomach through the mouth",
        "(verb) change shape as via computer animation",
        (
            "(noun) a garment worn around the head or neck or shoulders for "
            "warmth or decoration"
        ),
        "(verb) make off with belongings of others",
        (
            "(verb) emit or cause to emit sound. "
            "(noun) vibrations that travel through the air or another medium"
        ),
        (
            "(verb) switch channels, on television "
            "(noun) waves breaking on the shore"
        ),
    ]

    newdefs = modify_definition("morph", "aaa", words, definitions)
    assert newdefs[1] == "aaa"

    rhymers = find_rhymes("arf", words, definitions)
    assert set(rhymers[0]) == {"barf", "scarf", "snarf"}
    assert rhymers[1][0] == definitions[0]
    assert rhymers[1][1] == definitions[2]
    assert rhymers[1][2] == definitions[3]

if __name__ == "__main__":

    # this is one way to include tests.
    # the second session will introduce a better way.
    testme()

Dictionary implemented with a dictionary

from typing import Dict, Mapping, Text

def modify_definition(
    word: Text, newdef: Text, dictionary: Mapping[Text, Text]
) -> Dict[Text, Text]:
    # copy into a new dict so the caller's dictionary is left untouched
    result = dict(dictionary)
    result[word] = newdef
    return result

def find_rhymes(
    rhyme: Text, dictionary: Mapping[Text, Text]
) -> Dict[Text, Text]:
    return {
        word: definition
        for word, definition in dictionary.items()
        if word.endswith(rhyme)
    }

def testme():

    dictionary = {
        "barf": "(verb) eject the contents of the stomach through the mouth",
        "morph": "(verb) change shape as via computer animation",
        "scarf": (
            "(noun) a garment worn around the head or neck or shoulders for "
            "warmth or decoration"
        ),
        "snarf": "(verb) make off with belongings of others",
        "sound": (
            "(verb) emit or cause to emit sound. "
            "(noun) vibrations that travel through the air or another medium"
        ),
        "surf": (
            "(verb) switch channels, on television "
            "(noun) waves breaking on the shore"
        ),
    }

    newdict = modify_definition("morph", "aaa", dictionary)
    assert newdict["morph"] == "aaa"

    rhymers = find_rhymes("arf", dictionary)
    assert set(rhymers) == {"barf", "scarf", "snarf"}
    for word in {"barf", "scarf", "snarf"}:
        assert rhymers[word] == dictionary[word]

if __name__ == "__main__":
    testme()

More complex data structures for more complex dictionary

There can be more than one good answer. It will depend on how the dictionary is meant to be used later throughout the code.

Below we show three possibilities. The first is more deeply nested. It groups all definitions together for a given word, whether that word is a noun or a verb. If, more often than not, the category of a word matters little, then it might be a good solution. The second example flattens the dictionary by making “surf” the noun different from “surf” the verb. As a result, it is easier to access a word with a specific semantic category, but more difficult to access all definitions irrespective of their semantic category.

One pleasing aspect of the second example is that it groups together things that are unlikely to change on one side (word and semantic category), and a less precise datum on the other (definitions are more likely to be refined).

The third possibility is a pandas DataFrame with three columns. It’s best suited to big-data problems where individual words (rows) are seldom accessed one at a time. Instead, most operations are carried out over subsets of the dictionary.

from typing import Text
from enum import Enum, auto
from dataclasses import dataclass

class Category(Enum):
    NOUN = auto()
    VERB = auto()

@dataclass
class Definition:
    category: Category
    text: Text

first_example = {
    "barf": [
        Definition(
            Category.VERB, "eject the contents of the stomach through the mouth"
        )
    ],
    "morph": [
        Definition(Category.VERB, "change shape as via computer animation")
    ],
    "scarf": [
        Definition(
            Category.NOUN,
            "a garment worn around the head or neck or shoulders for "
            "warmth or decoration",
        )
    ],
    "snarf": [Definition(Category.VERB, "make off with belongings of others")],
    "sound": [
        Definition(Category.VERB, "emit or cause to emit sound."),
        Definition(
            Category.NOUN,
            "vibrations that travel through the air or another medium",
        ),
    ],
    "surf": [
        Definition(Category.VERB, "switch channels, on television"),
        Definition(Category.NOUN, "waves breaking on the shore"),
    ],
}

# frozen makes Word immutable (the same way a tuple is immutable)
# One practical consequence is that dataclass will make Word work as a
# dictionary key: Word is hashable
@dataclass(frozen=True)
class Word:
    word: Text
    category: Category

second_example = {
    Word(
        "barf", Category.VERB
    ): "eject the contents of the stomach through the mouth",
    Word("morph", Category.VERB): "change shape as via computer animation",
    Word("scarf", Category.NOUN): (
        "a garment worn around the head or neck or shoulders for "
        "warmth or decoration"
    ),
    Word("snarf", Category.VERB): "make off with belongings of others",
    Word("sound", Category.VERB): "emit or cause to emit sound.",
    Word(
        "sound", Category.NOUN
    ): "vibrations that travel through the air or another medium",
    Word("surf", Category.VERB): "switch channels, on television",
    Word("surf", Category.NOUN): "waves breaking on the shore",
}

# Do conda install pandas first!!!
import pandas as pd

third_example = pd.DataFrame(
    {
        "words": [
            "barf", "morph", "scarf", "snarf", "sound", "sound", "surf", "surf",
        ],
        "definitions": [
            "eject the contents of the stomach through the mouth",
            "change shape as via computer animation",
            (
                "a garment worn around the head or neck or shoulders for "
                "warmth or decoration"
            ),
            "make off with belongings of others",
            "emit or cause to emit sound.",
            "vibrations that travel through the air or another medium",
            "switch channels, on television",
            "waves breaking on the shore",
        ],
        "category": pd.Series(
            ["verb", "verb", "noun", "verb", "verb", "noun", "verb", "noun"],
            dtype="category",
        ),
    }
)
Key Points

  • Data structures are the representation of information the same way algorithms represent logic and workflows

  • Using the right data structure can vastly simplify code

  • Basic data structures include numbers, strings, tuples, lists, dictionaries and sets.

  • Advanced data structures include numpy arrays and pandas dataframes

  • Classes are custom data structures

Tools II: Code Formatters


Teaching: 3 min
Exercises: 2 min
  • How to format code with no effort on the part of the coder?

  • Know how to install and use a code formatter

Why does formatting matter?

Rules to choose a code formatter

  1. Choose one
  2. Stick with it

We chose black because it has very few options with which to fiddle.

Formatting example

Using Visual Studio Code:

  1. Put the following into a file and save it. If you are prompted to install the Python extension then be sure to do so.

    x = {  'a':37,'b':42,
    'c':927}
    y = 'hello '+       'world'
    class foo  (     object  ):
       def f    (self   ):
           return       y **2
       def g(self, x :int,
           y : int=42
           ) -> int:
           return x--y
    def f  (   a ) :
       return      37+-a[42-a :  y*3]
  2. Ensure that you have activated your “course” conda environment using the selector in the bottom panel of VS Code
  3. Open Settings
    • macOS via ⌘ + , or menus: Code > Preferences > Settings
    • Windows/Linux via Ctrl + , or menus: File > Preferences > Settings
  4. Search for “python formatting provider” and choose “black”
  5. Search for “format on save” and check the box to enable
  6. Save the file again: it should be reformatted automagically
  7. Now paste the code again but before saving delete a ‘:’ somewhere. When saving, the code will likely not format. It is syntactically invalid. The formatter cannot make sense of the code and thus can’t format it.


After saving, the code should be automatically formatted to:

x = {"a": 37, "b": 42, "c": 927}
y = "hello " + "world"

class foo(object):
    def f(self):
        return y ** 2

    def g(self, x: int, y: int = 42) -> int:
        return x - -y

def f(a):
    return 37 + -a[42 - a : y * 3]

Ah! much better!

Still, the sharp-eyed user will notice at least one issue with this code. Formatting code does not make it less buggy!
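
One such issue, shown here as an illustration: y is a string, so the expression y ** 2 cannot be evaluated. The sketch below reproduces only that part of the example; the class is renamed Foo and the try/except wrapper is added purely to display the error.

```python
y = "hello " + "world"


class Foo:
    def f(self):
        # y is a string: raising it to a power is not defined
        return y ** 2


try:
    Foo().f()
except TypeError as error:
    message = str(error)
    print(f"Nicely formatted, still broken: {message}")
```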

Key Points

  • Code formatting means how the code is typeset

  • It influences how easily the code is read

  • It has no impact on how the code runs

  • Almost all editors and IDEs have some means to set up an automatic formatter

  • 5 minutes to set up the formatter is redeemed across the time of the project i.e. the cost is close to nothing

Structuring code


Teaching: 10 min
Exercises: 20 min
  • How can we create simpler and more modular codes?

  • Explain the expression “separation of concerns”

  • Explain the expression “levels of abstractions”

  • Explain the expression “dataflow”

  • Analyze an algorithm for levels of abstractions, separable concerns and dataflow

  • Create focussed, modular algorithms

Intelligible code?

Intelligible: Able to be understood; comprehensible

Code is meant to be read by an audience who has infinite knowledge and an infinite capacity for misunderstanding:

This audience is future-you reading past-you’s code.

Intelligible code aims to:

Levels of abstraction

Scientific papers come with separate levels of details:

Similarly, code should be written with a structure separating different levels of details:

def abstract():
  result_I = body_section_I()
  result_II = body_section_II(result_I)

  result_IIa = appendix_IIa(result_II)  # BAD!! do not mix levels of details
  return conclusion(result_IIa)

def body_section_I():
  low_level_thing = appendix_Ia()

  other_thing = body_section_III(low_level_thing)  # BAD!! do not mix levels of details

Separable concerns

Scientific papers come with separate sections dealing with separate concerns, e.g.:

Similarly, code should be written with a structure separating separable concerns:

def main() -> None:
  # reading input files is one thing
  config = read_input(filename)

  # creating complex objects is another
  section_I = SectionI(config['some_array'])

  # compute is still something else
  result_I =

  # Saving data is another
  save(result_I, config['section_I save path'])

def read_input(filename: Text) -> Dict:
  """ Reads input data from file. """

def save(data: tf.Tensor, filename: Text = "saveme.h5") -> None:
  """ Saves output data to file. """

class SectionI:
  """ Runs experiment """
  def __init__(self, some_array: tf.Tensor):
      self.some_array = some_array

  def run(self, y: tf.Tensor) -> tf.Tensor:
      """ Just does compute, nothing more, nothing less. """

In the code above we have made some choices:

A few examples of what not to do.

Mixing reading files and creating objects

Objects that do not need files to be created are easy to create and re-create, especially during testing.

class SectionI:
    def __init__(self, some_array: tf.Tensor):
        self.some_array = some_array
        # BAD!! Now you need to carry this file around every time you want to
        # instantiate SectionI = read_aaa("")

Mixing computing stuff and IO

Below, the compute has hidden baggage: it can’t operate without reading from file. It’s not a pure function of self, b, and filename. Run it twice with exactly the same self, the same b and the same filename, and the results might still be different.

It creates file artifacts. Littering is a crime and hidden files are litter.

class SectionI:
    def run(self, b: tf.Tensor, filename: Text = "") -> tf.Tensor:
        # BAD!! hidden dependency on the content of the file
        aaa = read_aaa(filename)
        # BAD! Compute functions should not litter.
        save(result, "somefile")
        return result

Splitting concerns too far

It’s unfortunately common to find objects that need to be created and set up in several steps. This is an anti-pattern (we’ll discuss patterns in a bit), i.e. something that should be avoided. Creating an object is one thing and one thing only. It should be fully usable right after creation.

sectionI = SectionI(...)

# BAD! just put the initialization in SectionI.__init__
results =
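
By contrast, here is a minimal sketch of an object that is fully usable right after creation. It uses plain lists instead of tensors to stay self-contained; the scale parameter and the arithmetic inside run are assumptions made up for the illustration.

```python
from typing import List


class SectionI:
    """Hypothetical experiment, fully set up by its constructor."""

    def __init__(self, some_array: List[float], scale: float = 1.0):
        # everything run() needs is stored here, so no extra setup step exists
        self.some_array = list(some_array)
        self.scale = scale

    def run(self, y: List[float]) -> List[float]:
        """Just does compute: scaled element-wise sum of the two inputs."""
        return [self.scale * (a + b) for a, b in zip(self.some_array, y)]


section_I = SectionI([1.0, 2.0], scale=2.0)
results = section_I.run([3.0, 4.0])  # usable straight after creation
print(results)
```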

Paper vs code

Code should be organized the same way as the paper it will produce. If a concept X is described in one section, and concept Y in another, then X should be one function or class, and Y another.

That’s because the primary purpose of paper and code is to communicate with other people, including your future self.


A code is a sequence of transformations on data, e.g.:

  1. data measurements is read from input
  2. data a and b are produced from measurements, independently
  3. data result is produced from a and b

The code should reflect that structure:

def read_experiment(filename: Text) -> Dict:
    return measurements

def compute_a(measurements) -> np.ndarray:
    return a

def compute_b(measurements) -> np.ndarray:
    return b

def compute_result(a, b) -> np.ndarray:
    return result
  1. One function to read Experiment
  2. One function/class to compute a: It takes measurements as input and returns the result a
  3. One function/class to compute b: It takes measurements as input and returns the result b
  4. One function/class to compute Result: It takes a and b as input and returns the result result
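
The same dataflow can be sketched as a small runnable script. The formulas chosen for a, b and result (mean, range and their ratio), and the text-based read_experiment standing in for file reading, are assumptions made up for the example.

```python
from typing import Dict, List


def read_experiment(text: str) -> Dict[str, List[float]]:
    """Parse measurements from comma-separated text (stands in for a file)."""
    return {"measurements": [float(value) for value in text.split(",")]}


def compute_a(measurements: List[float]) -> float:
    """a depends on the measurements only: here, their mean."""
    return sum(measurements) / len(measurements)


def compute_b(measurements: List[float]) -> float:
    """b depends on the measurements only: here, their range."""
    return max(measurements) - min(measurements)


def compute_result(a: float, b: float) -> float:
    """result depends on a and b alone, never on the measurements."""
    return b / a


experiment = read_experiment("1.0, 2.0, 3.0")
a = compute_a(experiment["measurements"])
b = compute_b(experiment["measurements"])
result = compute_result(a, b)
print(a, b, result)
```

Each signature states exactly what the function depends on, so the order of the calls to compute_a and compute_b can be swapped without changing the outcome.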

Here are things that should not happen:

Unnecessary arguments and tangled dependencies.

Did we not say the result depends on a and b alone? The next programmer to look at the code (e.g. future you) won’t know that. Computing result is no longer separate from measurements.

# BAD! Unnecessary arguments. Now result depends on a, b, experiment
def compute_result(a, b, experiment) -> np.ndarray:

Modifying an input argument

It’s often a bad idea for functions to modify their arguments.

def compute_a(measurements) -> np.ndarray:
    measurements[1] *= 2

def compute_b(measurements) -> np.ndarray:
    measurements[1] *= 0.5

Now compute_a has to take place before compute_b because compute_a chose to modify its argument, and thus compute_b was hacked to undo the damage. To get b the data is now forced to flow first through compute_a.
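
One way out, sketched under the assumption that measurements is a plain list of numbers, is to work on a copy and leave the input untouched. The computations themselves are placeholders.

```python
from typing import List


def compute_a(measurements: List[float]) -> List[float]:
    # work on a modified copy; the caller's list is left alone
    doubled = list(measurements)
    doubled[1] *= 2
    return doubled


def compute_b(measurements: List[float]) -> List[float]:
    # no need to undo anything: the input is still the original data
    return [value + 1 for value in measurements]


measurements = [1.0, 2.0, 3.0]
b = compute_b(measurements)  # the order of the two calls no longer matters
a = compute_a(measurements)
print(measurements, a, b)
```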

Global variables

NEVER EVER EVER use global variables. They make the dataflow complex by essence.


def compute_a(measurements):

    return result

Now the result of compute_a has a hidden dependency. It’s never clear whether calling it twice with the same input (measurements) will yield the same result.
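
A small sketch of the problem and of one possible fix. The global SCALE and both function names are hypothetical, chosen only to make the hidden dependency visible.

```python
SCALE = 2.0  # module-level (global) state


def compute_a_hidden(measurements):
    # BAD: the result silently depends on SCALE
    return [SCALE * value for value in measurements]


def compute_a(measurements, scale):
    # better: every input is visible in the signature
    return [scale * value for value in measurements]


data = [1.0, 2.0]
first = compute_a_hidden(data)
SCALE = 3.0  # somewhere else in the code base...
second = compute_a_hidden(data)
print(first == second)  # same input, different output

print(compute_a(data, 2.0) == compute_a(data, 2.0))
```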

Dear Fortran 90 users

Module variables are global. They can be modified anywhere, anytime. It’s best not to use them.

Disentangling a recipe

Disentangle the recipe below into separable concerns and level of details. Ensure the flow of ingredients from transform to transform to final dish is clear.

  1. Write the solution as a recipe. Be sure to delete steps and information irrelevant to the recipe. Deleting code is GOOD! (if it’s under version control)
  2. Can you identify different levels of abstractions?
  3. Can you identify different concerns?
  4. Can you identify what is data and what are transformations of the data?
  5. Write the solution as pseudo-code in your favorite language. Pseudo-code doesn’t have to run, but it has to make sense. Or write a solution as a diagram, if that’s your thing (e.g. UML, Sequence diagram).
  6. Can you spot inconsistencies in the original recipe? That’s what happens when code is copy-pasted. Invariably, versions diverge until each has its own set of unique bugs, as well as bugs in common.
### Tarte au Nutella

Pour the preparation onto the dough. Wait for it to cool. Once it is cool,
sprinkle with icing sugar.

The dough is composed of 250g of flour, 125g of butter and one egg yolk. It
should be blind-baked at 180°C until golden brown.

The preparations consists of 200g of Nutella, 3 large spoonfuls of crème
fraîche, 2 egg yolks and 20g of butter.

Add the Nutella and the butter to a pot where you have previously mixed the
crème fraîche, the egg yolk and the vanilla extract. Cook and mix on low heat
until homogeneous.

The dough, a pâte sablée, is prepared by lightly mixing by hand a pinch of
salt, 250g of flour and 125g of diced butter until the butter is mostly all
absorbed. Then add the egg yolk and 30g of cold water. The dough should look
like coarse sand, thus its name.

Spread the dough into a baking sheet. If using it for a pie with a pre-cooked
or raw filling, first cook the dough blind at 200°C until golden brown. If the
dough and filling will be cooked together you can partially blind-cook the
dough at 180°C for 15mn for added crispiness.

Enjoy! It's now ready to serve!

Key Points

  • Code is read more often than written

  • The structure of the code should follow a well written explanation of its algorithm

  • Separate levels of abstraction (or details) in a problem must be reflected in the structure of the code, e.g. with separate functions

  • Separable concerns (e.g. reading an input file to create X, creating X, computing stuff with X, saving results to file) must be reflected in the structure of the code

Tools III: Linters


Teaching: 10 min
Exercises: 5 min
  • How to make the editor pro-actively find errors and code-smells

  • Remember how to install and use a linter in vscode without sweat

What is linting?

Linters enforce style rules on your code such as:

Consistent styles make code more consistent and easier to read, whether or not you agree with the style. Using an automated linter avoids bike-shedding since the linter is the final arbiter.

Why does linting matter?

Rules for choosing linters

  1. Choose a few
  2. Stick with them

We chose:

  • flake8
  • pylint
  • mypy


Set up VS Code:

  1. Open Settings (see previous exercise if you’re not sure how):
    • Search for “linting enable” and check the box for “Python > Linting: Enabled”.
    • Search for “pylint enabled” and check the box for “Python > Linting: Pylint Enabled” (you may need to scroll down for this one).
    • Search for “pylint use” and uncheck the box for “Python > Linting: Pylint Use Minimal Checkers”.
    • Search for “flake8 enabled” and check the box for “Python > Linting: Flake8 Enabled”.
    • Search for “mypy enable” and check the box for “Python > Linting: Mypy Enabled”.
  2. Create a file with the following code and save it:

    from typing import List

    class printer:
        pass

    def ActionatePrinters(printers: List[printer]):
        # pylint: disable=missing-docstring
        printing_actions = []
        for p in printers:
            if p == None:
                continue

            def action():
                print(p)

            printing_actions.append(action)
            p = "something"
        for action in printing_actions:
            action()

    ActionatePrinters([1, 2, 2])
  3. Check the current errors (click on errors in status bar at the bottom)
  4. Try and correct them
  5. Alternatively, try and disable them (but remember: with great power…). We’ve already disabled one at the function scope level. Check what happens if you move it to the top of the file at the module level.
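
One of the subtler problems a linter can flag in code like this is a function defined inside a loop that uses the loop variable (pylint reports it as cell-var-from-loop). The sketch below, with hypothetical function names, shows why it matters and one common workaround.

```python
def make_actions_late(items):
    actions = []
    for item in items:
        def action():
            # "item" is looked up when action() runs, not when it is defined
            return item
        actions.append(action)
    return actions


def make_actions_bound(items):
    actions = []
    for item in items:
        def action(item=item):
            # the default argument freezes the current value of "item"
            return item
        actions.append(action)
    return actions


late = [action() for action in make_actions_late([1, 2, 3])]
bound = [action() for action in make_actions_bound([1, 2, 3])]
print(late, bound)
```

In the first version every action sees the last value taken by the loop variable, which is rarely what was intended.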

Key Points

  • Linting is about discovering errors and code-smells before running the code

  • It shortcuts the “edit-run-debug and repeat” workflow

  • Almost all editors and IDEs have some means to set up automatic linting

  • 5 minutes to set up a linter is redeemed across the time of the project i.e. the cost is close to nothing

Design Patterns


Teaching: 15 min
Exercises: 15 min
  • How can we avoid re-inventing the wheel when designing code?

  • How can we transfer known solutions to our code?

  • Recognize much-used patterns in existing code

  • Re-purpose existing patterns as solutions to new problems

What is a design pattern?

software design patterns: typical solutions to common problems


  1. It is easier to reuse known solutions than to invent them
  2. Makes the code easier to understand for collaborators
  3. Makes the code easier to maintain since patterns ought to be best-in-class solutions


  1. Shoehorning: not all patterns fit everywhere
  2. Patterns paper-over inadequacies that exist in one language but not another


Iterator Pattern

Iterators separate generating items over which to loop from doing the body of the loop. They separate looping from computing.

For instance, we want to loop over all items in xs and ys:

for x in xs:
    for y in ys:
        compute(x, y)

The code above can be transformed to:

from itertools import product

for x, y in product(xs, ys):
    compute(x, y)

Behind the scenes, itertools’ product returns an iterator, i.e. something we can loop over.

Using product is both simpler and more general. Now the number of nested loops can be determined at runtime. It is no longer hard-coded.

>>> from itertools import product
>>> list_of_lists = [[1, 2], [3, 4]]
>>> for args in product(*list_of_lists):
...     print(args)
(1, 3)
(1, 4)
(2, 3)
(2, 4)
>>> list_of_lists = [[1, 2], [3, 4], [5, 6]]
>>> for args in product(*list_of_lists):
...     print(args)
(1, 3, 5)
(1, 3, 6)
(1, 4, 5)
(1, 4, 6)
(2, 3, 5)
(2, 3, 6)
(2, 4, 5)
(2, 4, 6)

Generators: iterators made easy

Generators create iterators using syntax that is similar to the standard loop:

for x in xs:
    for y in ys:
        print(f"some complicated calculation with {x} and {y}")

We can lift the loops out of the code and create a generator:

def my_generator(xs, ys):
  for x in xs:
      for y in ys:
          yield x, y
          print(f"I am in my_generator {x}, {y}")

And then loop with an iterator which python creates auto-magically:

>>> for x, y in my_generator([1, 2], ["a", "b"]):
...     print(f"some complicated calculation with {x} and {y}")
some complicated calculation with 1 and a
I am in my_generator 1, a
some complicated calculation with 1 and b
I am in my_generator 1, b
some complicated calculation with 2 and a
I am in my_generator 2, a
some complicated calculation with 2 and b
I am in my_generator 2, b

In practice, Python runs through the code in my_generator and returns a new element each time it hits yield. The next time it is called, it restarts from the line right after yield.
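The pause-and-resume behaviour can be observed by stepping through a generator by hand with next(). The sketch below is a variant of my_generator that records messages in a list (the log argument is an addition made for illustration) instead of printing them.

```python
def my_generator(xs, ys, log):
    for x in xs:
        for y in ys:
            yield x, y
            # This line only runs when the generator is resumed.
            log.append(f"resumed after yielding {x}, {y}")


log = []
gen = my_generator([1], ["a", "b"], log)

first = next(gen)  # runs up to the first yield, then pauses
log_after_first = list(log)

second = next(gen)  # resumes after the first yield, pauses at the second
log_after_second = list(log)

finished = next(gen, None)  # resumes once more, then the loops end
```

Nothing after yield executes until the next item is requested, which is why the messages in the output above appear one step "late".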

When to use generators?

  1. To separate complex loops from complex calculations in the loop
  2. Complex loops that occur multiple times in the code (copy-paste is a foot gun).
  3. When running out of memory, generators allow you to be lazy
  4. Use iterators when the language does not allow for generators (e.g. C++).
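For comparison, here is a sketch of what the same pairwise loop might look like as a hand-written iterator class, which is roughly the bookkeeping a generator saves you. PairIterator is a hypothetical name, not part of the lesson code.

```python
class PairIterator:
    """Hand-written equivalent of a generator yielding (x, y) pairs."""

    def __init__(self, xs, ys):
        self._xs = list(xs)
        self._ys = list(ys)
        self._i = 0  # position in xs
        self._j = 0  # position in ys

    def __iter__(self):
        return self

    def __next__(self):
        if self._i >= len(self._xs) or not self._ys:
            raise StopIteration
        item = (self._xs[self._i], self._ys[self._j])
        # Advance the inner index, rolling over to the next x when needed.
        self._j += 1
        if self._j == len(self._ys):
            self._j = 0
            self._i += 1
        return item


pairs = list(PairIterator([1, 2], ["a", "b"]))
```

The loop state that a generator keeps implicitly has to be stored and updated explicitly here, which is more error-prone.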


Power of two iterator

What does this print out?

def powers_of_two(xs):
    for x in xs:
        yield 2**x

for x in powers_of_two([2, 4, 3, 2]):
    print(x)



Interleaved power of two and squares

How about this? Where does the function return to after the first element? What about after the second element?

def interleaved(xs):
    for x in xs:
        yield 2 ** x
        yield x ** 2

for x in interleaved([2, 4, 3, 2]):
    print(x)



Sequential power of two and squares

What is the output sequence in this case?

def sequential(xs):

  for x in xs:
      yield 2 ** x

  for x in xs:
      yield x ** 2

for x in sequential([2, 4, 3, 2]):
    print(x)



How to use the iterator pattern

What to look for:

What to do:

Other languages

Dependency Injection, or how to make algorithms tweakable

It’s not unusual to want to change an algorithm, but only in one or two places:

def my_algorithm(some_input):
    return [
        uncornify(webby)
        for webby in deconforbulate(some_input)
    ]

Say that rather than loop over deconforbulate objects, you need to loop over undepolified objects.

Here’s one bad solution:

def my_awful_copy_paste_non_solution(some_input):
    return [
        uncornify(webby)
        for webby in undepolified(some_input)
    ]

Here’s a slightly better one:

def my_somewhat_better_solution(some_input, is_deconfobulated: bool = True):
    if is_deconfobulated:
        generator = deconforbulate
    else:
        generator = undepolified

    return [
        uncornify(webby) for webby in generator(some_input)
    ]

But it doesn’t scale!

Your supervisor just popped in and wants you to try and loop over unsoupilated, resoupilated, and gunkifucated objects, and probably others as well. Using the pattern above, we have to modify the algorithm each and every time we add a new tweak.

This scenario is based on multiple past and future real-life stories.

Thankfully, the dependency-injection design pattern can come to the rescue!

from typing import Callable, Optional

def the_bees_knees_solution(some_input, generator: Optional[Callable] = None):
    if generator is None:
        generator = deconforbulate

    return [
        uncornify(webby) for webby in generator(some_input)
    ]

Now the algorithm is independent of the exact generator (Yay! Separation of concerns!). It can be used without modification with any generator that takes obstrucated-like objects.
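The functions above are deliberately fictional, so here is a runnable sketch with toy stand-ins. The bodies of deconforbulate, undepolified and uncornify below are invented purely for illustration; only the shape of the_bees_knees_solution follows the lesson.

```python
from typing import Callable, Optional


def deconforbulate(some_input):
    """Toy stand-in: yield the items unchanged."""
    yield from some_input


def undepolified(some_input):
    """Toy stand-in: yield the items in reverse order."""
    yield from reversed(list(some_input))


def uncornify(webby):
    """Toy stand-in for the per-item computation."""
    return webby * 10


def the_bees_knees_solution(some_input, generator: Optional[Callable] = None):
    if generator is None:
        generator = deconforbulate
    return [uncornify(webby) for webby in generator(some_input)]


default_result = the_bees_knees_solution([1, 2, 3])
reversed_result = the_bees_knees_solution([1, 2, 3], generator=undepolified)
# A brand new tweak needs no change to the algorithm itself.
evens_result = the_bees_knees_solution(
    [1, 2, 3, 4], generator=lambda xs: (x for x in xs if x % 2 == 0)
)
```

Each new way of looping is passed in from the outside, so the algorithm never needs another if branch.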

Other languages

How to use dependency-injection

What to look for:

What to do:

Iterating over points in a ring

Create an iterator and/or a generator that lifts the loop over points in a two-dimensional ring:

Take this code:

from math import sqrt

points = [[1, 2], [0, 0], [-2, 0], [-2, 3], [-3, -4], [4, 0], [5, 5]]
radius = 3.5
width = 1.0

inner = radius - 0.5 * width
outer = radius + 0.5 * width
for point in points:
    distance = sqrt(point[0] * point[0] + point[1] * point[1])
    if distance >= inner and distance <= outer:
        print(f"Some complicated calculation at {point}")

and create a generator function points_in_ring:

for point in points_in_ring(points, radius, width):
    print(f"Some complicated calculation at {point}")

At this point, points_in_ring uses the Euclidean distance to figure out what is in the ring. Using dependency-injection, make points_in_ring capable of using any sensible distance (say the Manhattan norm):

from typing import List

def manhattan(point: List[float]) -> float:
    return sum(abs(x) for x in point)

for point in points_in_ring(points, radius, width, norm=manhattan):
    print(f"Some complicated calculation at {point}")


  • points_in_ring can be used over and over across the code
  • it is parametrized by radius and width
  • it makes the loop self-descriptive
  • it is more memory efficient (lazy evaluation) than a list
  • it is almost always better than creating and keeping in sync a second list holding only points in a ring (compute is cheap)
  • it makes it possible to test/debug the loop alone, without worrying about the compute inside the loop

Ring generator

from math import sqrt
from typing import Iterable

def points_in_ring(points: Iterable, radius: float, width: float) -> Iterable:
    inner = radius - 0.5 * width
    outer = radius + 0.5 * width
    for point in points:
        distance = sqrt(point[0] * point[0] + point[1] * point[1])
        if distance >= inner and distance <= outer:
            yield point

points = [[1, 2], [0, 0], [-2, 0], [-2, 3], [-3, -4], [4, 0], [5, 5]]

for point in points_in_ring(points, radius=3.5, width=1.0):
    print(f"Some complicated calculation at {point}")

Ring generator with tweakable norm

from math import sqrt
from typing import Iterable, List, Optional, Callable

def euclidean(point: List[float]) -> float:
    return sqrt(sum((x * x) for x in point))

def manhattan(point: List[float]) -> float:
    return sum(abs(x) for x in point)

def points_in_ring(
    points: Iterable,
    radius: float,
    width: float,
    norm: Optional[Callable] = None,
) -> Iterable:
    if norm is None:
        norm = euclidean

    inner = radius - 0.5 * width
    outer = radius + 0.5 * width
    for point in points:
        distance = norm(point)
        if distance >= inner and distance <= outer:
            yield point

points = [[1, 2], [0, 0], [-2, 0], [-2, 3], [-3, -4], [4, 0], [5, 5]]

for point in points_in_ring(points, radius=3.5, width=1.0, norm=manhattan):
    print(f"Some complicated calculation at {point}")

Key Points

  • Many coders have come before

  • Transferable solutions to common problems have been identified

  • It is easier to apply a known pattern than to reinvent it, but first you have to spend some time learning about patterns.

  • Iterators and generators are convenient patterns to separate loops from compute and to avoid copy-pasting.

  • Dependency injection is a pattern to create modular algorithms

Testing Overview


Teaching: 15 min
Exercises: 5 min
  • Why test my software?

  • How can I test my software?

  • How much testing is ‘enough’?

  • Appreciate the benefits of testing research software

  • Understand what testing can and can’t achieve

  • Describe various approaches to testing, and relevant trade-offs

  • Understand the concept of test coverage, and how it relates to software quality and sustainability

  • Appreciate the benefits of test automation

Why Test?

There are a number of compelling reasons to properly test a research code:

Whilst testing might seem like an intimidating topic, the chances are you’re already doing testing in some form. No matter the level of experience, no programmer ever just sits down and writes some code, is perfectly confident that it works and proceeds to use it straight away in research. Instead, development is in practice more piecemeal - you generally think about a simple input and the expected output, then write some simple code that works. Then, iteratively, you think about more complicated example inputs and outputs and flesh out the code until those work as well. When developers talk about testing, all this means is formalising the above process and making it automatically repeatable on demand.

This has numerous advantages over a more ad hoc approach:

As you’re performing checks on your code anyway it’s worth putting in the time to formalise your tests and take advantage of the above.

A Hypothetical Scenario

Your supervisor has tasked you with implementing an obscure statistical method to use for some data analysis. Wanting to avoid unnecessary work you check online to see if an implementation exists. Success! Another researcher has already implemented and published the code.

You move to hit the download button, but a worrying thought occurs. How do you know this code is right? You don’t know the author or their level of programming skill. Why should you trust the code?

Now turn this question on its head. Why should your colleagues or supervisor trust any implementation of the method that you write? Why should you trust work you did a year ago? What about a reviewer for a paper?

This scenario illustrates the sociological value of automated testing. If published code has tests then you have instant assurance that its authors have invested time in checking the correctness of their code. You can even see exactly the tests they’ve tried and add your own if you’re not satisfied. Conversely, any code that lacks tests should be viewed with suspicion as you have no record of what quality assurance steps have been taken.

Types of Testing

This isn’t a topic we will discuss in much detail, but it is worth mentioning as the jargon here can be another intimidating factor. In fact there are entire websites dedicated to explaining the different types of testing. Ultimately, however, there are only a few types of testing that we need to worry about.

Unit Testing

The main kind of testing we will be dealing with in this course. Unit testing refers to taking a component of a program and testing it in isolation. Generally this means testing an individual class or function. This is part of the reason that testing encourages more modular and sustainable code development. You’re encouraged to write your code as functionally independent components that can be easily unit tested.

Functional Testing

Unlike unit testing that focuses on independent parts of the system, functional testing checks the compliance of the system overall against a defined set of criteria. In other words does the software as a whole do what it’s supposed to do?

Regression Testing

This refers to the practice of running previously written tests whenever a new change is introduced to the code. This is good to do even when making seemingly insignificant changes. Carrying out regression testing allows you to remain confident that your code is functioning as expected even as it grows in complexity and capability.

Testing Done Right

It’s important to be clear about what software tests are able to provide and what they can’t. Unfortunately it isn’t possible to write tests that completely guarantee that your code is bug free or provides a one hundred percent faithful implementation of a particular model. In fact it’s perfectly possible to write an impressive looking collection of tests that have very little value at all. What should be the aim therefore when developing software tests?

In practice this is difficult to define universally, but one useful mantra is that good tests thoroughly exercise critical code. One way to achieve this is to design test examples of increasing complexity that cover the most general case the unit should encounter. Also try to consider examples of special or edge cases that your function needs to handle specially.

A useful quantitative metric to consider is test coverage. Using additional tools it is possible to determine, on a line-by-line basis, the proportion of a codebase that is being exercised by its tests. This can be useful to ensure, for instance, that all logical branching points within the code are being used by the test inputs.

Testing and Coverage

Consider the following Python function:

def recursive_fibonacci(n):
    """Return the n'th number of the fibonacci sequence"""
    if n <= 1:
        return n
    else:
        return recursive_fibonacci(n - 1) + recursive_fibonacci(n - 2)

Try to think up some test cases of increasing complexity. There are four distinct cases worth considering. What input value would you use for each case and what output value would you expect? Which lines of code will be exercised by each test case? How many cases would be required to reach 100% coverage?

For convenience, some initial terms from the Fibonacci sequence are given below:
0, 1, 1, 2, 3, 5, 8, 13, 21


Case 1 - Use either 0 or 1 as input

Correct output: Same as input
Coverage: First section of if-block
Reason: This represents the simplest possible test for the function. The value of this test is that it exercises only the special case tested for by the if-block.

Case 2 - Use a value > 1 as input

Correct output: Appropriate value from the Fibonacci sequence
Coverage: All of the code
Reason: This is a more fully fledged case that is representative of the majority of the possible range of input values for the function. It covers not only the special case represented by the first if-block but the general case where recursion is invoked.

Case 3 - Use a negative value as input

Correct output: Depends…
Coverage: First section of if-block
Reason: This represents the case of a possible input to the function that is outside of its intended usage. At the moment the function will just return the input value, but whether this is the correct behaviour depends on the wider context in which it will be used. It might, however, be better for this type of input value to cause an error to be raised. The value of this test case is that it encourages you to think about this scenario and what the behaviour should be. It also demonstrates to others that you’ve considered this scenario and the function behaviour is as intended.

Case 4 - Use a non-integer input e.g. 3.5

Correct output: Depends…
Coverage: Whole function
Reason: This is similar to case 3, but may not arise in more strongly typed languages. What should the function do here? Work as is? Raise an error? Round to the nearest integer?
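As a rough sketch, the four cases could be written down as plain assertions. The values for cases 3 and 4 simply record what the function currently returns, which is not necessarily what it should return.

```python
def recursive_fibonacci(n):
    """Return the n'th number of the fibonacci sequence"""
    if n <= 1:
        return n
    else:
        return recursive_fibonacci(n - 1) + recursive_fibonacci(n - 2)


# Case 1: only the first branch of the if-block is exercised
case_1 = [recursive_fibonacci(0), recursive_fibonacci(1)]

# Case 2: the recursive branch (and, through recursion, the first branch too)
case_2 = recursive_fibonacci(7)

# Case 3: a negative input is returned unchanged (current behaviour)
case_3 = recursive_fibonacci(-3)

# Case 4: a non-integer input recurses down to values <= 1 (current behaviour)
case_4 = recursive_fibonacci(3.5)
```

Writing cases 3 and 4 down forces a decision: either these return values are intended, or the function should raise an error instead.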


The importance of automated testing for software development is difficult to overstate. As testing on some level is always carried out there is relatively low cost in formalising the process and much to be gained. The rest of this course will focus on how to carry out unit testing.

Key Points

  • Testing is the standard approach to software quality assurance

  • Testing helps to ensure that code performs its intended function: well-tested code is likely to be more reliable, correct and malleable

  • Good tests thoroughly exercise critical code

  • Code without any tests should arouse suspicion, but it is entirely possible to write a comprehensive but practically worthless test suite

  • Testing can contribute to performance, security and long-term stability as the size of the codebase and its network of contributors grows

  • Testing can ensure that software has been installed correctly, is portable to new platforms, and is compatible with new versions of its dependencies

  • In the context of research software, testing can be used to validate code i.e. ensure that it faithfully implements scientific theory

  • Unit (e.g. a function); Functional (e.g. a library); and Regression (e.g. a bug) are three commonly used types of tests

  • Test coverage can provide a coarse- or fine-grained metric of comprehensiveness, which often provides a signal of code quality

  • Automated testing is another such signal: it lowers friction; ensures that breakage is identified sooner and isn’t released; and implies that machine-readable instructions exist for building the code and running the tests

  • Testing ultimately contributes to sustainability i.e. that software is (and remains) fit for purpose as its functionality and/or contributor-base grows, and its dependencies and/or runtime environments change

Writing unit tests


Teaching: 40 min
Exercises: 60 min
  • What is a unit test?

  • How do I write and run unit tests?

  • How can I avoid test duplication and ensure isolation?

  • How can I run tests automatically and measure their coverage?

  • Implement effective unit tests using pytest

  • Execute tests in Visual Studio Code

  • Explain the issues relating to non-deterministic code

  • Implement fixtures and test parametrisation using pytest decorators

  • Configure git hooks and pytest-coverage

  • Apply best-practices when setting up a new Python project

  • Recognise analogous tools for other programming languages

  • Apply testing to Jupyter notebooks


Unit testing validates, in isolation, functionally independent components of a program.

In this lesson we’ll demonstrate how to write and execute unit tests for a simple scientific code.

This involves making some technical decisions…

Test frameworks

We’ll use pytest as our test framework. It’s powerful but also user friendly.

For comparison: testing using assert statements:

from temperature import to_fahrenheit

assert to_fahrenheit(30) == 86

Testing using the built-in unittest library:

from temperature import to_fahrenheit
import unittest

class TestTemperature(unittest.TestCase):
    def test_to_fahrenheit(self):
        self.assertEqual(to_fahrenheit(30), 86)

Testing using pytest:

from temperature import to_fahrenheit

def test_answer():
    assert to_fahrenheit(30) == 86

Why use a test framework?

Projects that use pytest:

Learning by example

Reading the test suites of mature projects is a good way to learn about testing methodologies and frameworks

Code editors

We’ve chosen Visual Studio Code as our editor. It’s free, open source, cross-platform and has excellent Python (and pytest) support. It also has built-in Git integration, can be used to edit files on remote systems (e.g. HPC), and handles Jupyter notebooks (plus many more formats).

Demonstration of pytest + VS Code + coverage

  • Test discovery, status indicators and ability to run tests inline
  • Code navigation (“Go to Definition”)
  • The Test perspective and Test Output
  • Maximising coverage (assert recursive_fibonacci(7) == 13)
  • Test-driven development: adding and fixing a new test (test_negative_number)

A tour of pytest

Checking for exceptions

If a function invocation is expected to throw an exception it can be wrapped with a pytest raises block:

def test_non_numeric_input():
    with pytest.raises(TypeError):
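As a minimal, self-contained sketch of this idea: the call goes inside the with block, and the test passes only if the named exception is raised there. The argument "seven" is an assumption chosen for illustration; comparing a string with an integer raises TypeError in Python 3.

```python
import pytest


def recursive_fibonacci(n):
    """Return the n'th number of the fibonacci sequence"""
    if n <= 1:
        return n
    else:
        return recursive_fibonacci(n - 1) + recursive_fibonacci(n - 2)


def test_non_numeric_input():
    # "seven" <= 1 is not a valid comparison, so TypeError is raised
    with pytest.raises(TypeError):
        recursive_fibonacci("seven")
```

If the call inside the block does not raise the expected exception, pytest reports the test as failed.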


Similar test invocations can be grouped together to avoid repetition. Note how the parameters are named, and “injected” by pytest into the test function at runtime:

@pytest.mark.parametrize("number,expected", [(0, 0), (1, 1), (2, 1), (3, 2)])
def test_initial_numbers(number, expected):
    assert recursive_fibonacci(number) == expected

This corresponds to running the same test with different parameters, and is our first example of a pytest decorator (@pytest). Decorators are a special syntax used in Python to modify the behaviour of the function, without modifying the code of the function itself.
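To illustrate what a decorator is in plain Python, independent of pytest, here is a small sketch. count_calls is a hypothetical decorator written for this example, and the body of to_fahrenheit is assumed from the usual conversion formula.

```python
import functools


def count_calls(func):
    """Hypothetical decorator: count how often func is called."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32


result = to_fahrenheit(30)
```

The code of to_fahrenheit is untouched, yet calling it now also updates to_fahrenheit.calls. pytest's decorators work on the same principle, attaching extra behaviour or metadata to test functions.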

Skipping tests and ignoring failures

Sometimes it is useful to skip tests (conditionally or otherwise), or ignore failures (for example if you’re in the middle of refactoring some code).

This can be achieved using other @pytest.mark annotations e.g.

@pytest.mark.skipif(sys.platform == "win32", reason="does not run on windows")
def test_linux_only_features():
    ...

@pytest.mark.xfail
def test_unimplemented_code():
    ...


Code refactoring is “the process of restructuring existing computer code without changing its external behavior” and is often required to make code more amenable to testing.


If multiple tests require access to the same data, or a resource that is expensive to initialise, then it can be provided via a fixture. These can be cached by modifying the scope of the fixture. See this example from Devito:

def grid(shape=(11, 11)):
    return Grid(shape=shape)

def test_forward(grid):[0, :] = 1.

def test_backward(grid):[-1, :] = 7.

This corresponds to providing the same parameters to different tests.
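A minimal sketch of a fixture, independent of Devito, might look as follows. The fixture name and data are made up for illustration; scope="module" asks pytest to create the value once per test module rather than once per test.

```python
import pytest


@pytest.fixture(scope="module")
def initial_temperatures():
    # Imagine this being expensive to compute or load
    return [0.0, 1.0, 1.0, 0.0]


def test_boundaries_are_zero(initial_temperatures):
    assert initial_temperatures[0] == 0.0
    assert initial_temperatures[-1] == 0.0


def test_interior_is_hot(initial_temperatures):
    assert all(t == 1.0 for t in initial_temperatures[1:-1])
```

pytest matches the argument name of each test to the fixture of the same name. One caveat with wider scopes: if a test modifies a cached object, later tests will see the modification, so such fixtures are best treated as read-only.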


It’s common for scientific codes to perform estimation by simulation or other means. pytest can check for approximate equality:

def test_approximate_pi():
    assert 22/7 == pytest.approx(math.pi, abs=1e-2)
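Approximate comparison also matters for exact-looking arithmetic, because floating-point rounding can break ==. A short sketch (the test names are illustrative):

```python
import pytest


def test_float_sum():
    total = 0.1 + 0.2
    # Exact comparison fails because of floating-point rounding ...
    assert total != 0.3
    # ... whereas approx compares within a small tolerance.
    assert total == pytest.approx(0.3)


def test_sequence_of_floats():
    # approx also works element-wise on sequences
    assert [0.1 + 0.2, 0.875] == pytest.approx([0.3, 0.875])
```

Without abs or rel arguments, approx falls back to a small default tolerance. For simulation output, an explicit tolerance like the one in the example above is usually more appropriate.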

Random numbers

If your simulation or approximation technique depends on random numbers then consistently seeding your generator can help with testing. See random.seed() for an example or the pytest-randomly plugin for a more comprehensive solution.
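As a sketch of the seeding idea, the hypothetical estimate_pi function below takes a seed so that a test can reproduce exactly the same sequence of random numbers. It uses a dedicated random.Random instance rather than the module-level random.seed(), which keeps the seeding local to the function.

```python
import math
import random


def estimate_pi(n_samples, seed=None):
    """Monte Carlo estimate of pi from points in the unit square."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4 * inside / n_samples


first = estimate_pi(10_000, seed=42)
second = estimate_pi(10_000, seed=42)
```

With the same seed both calls return exactly the same estimate, so a test can compare against a fixed tolerance without failing intermittently.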


pytest has automatic integration with Python’s standard doctest module when invoked with the --doctest-modules argument. This is a nice way to provide examples of how to use a library, via interactive examples in docstrings:

def recursive_fibonacci(n):
    """Return the nth number of the fibonacci sequence

    >>> recursive_fibonacci(7)
    13
    """
    return n if n <= 1 else recursive_fibonacci(n - 1) + recursive_fibonacci(n - 2)
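The same docstring examples can also be checked with the standard library alone. The sketch below is one possible way to do so programmatically, using doctest's parser and runner classes; in everyday use, pytest --doctest-modules or python -m doctest is simpler.

```python
import doctest


def recursive_fibonacci(n):
    """Return the nth number of the fibonacci sequence

    >>> recursive_fibonacci(7)
    13
    """
    return n if n <= 1 else recursive_fibonacci(n - 1) + recursive_fibonacci(n - 2)


# Extract the interactive examples from the docstring and run them
parser = doctest.DocTestParser()
test = parser.get_doctest(
    recursive_fibonacci.__doc__,
    {"recursive_fibonacci": recursive_fibonacci},
    "recursive_fibonacci",
    None,
    0,
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
```

results.attempted counts the examples found and results.failed counts those whose output did not match the docstring.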

Hands-on unit testing

Getting started

Setting up our editor

  1. If you haven’t already, see the setup guide for instructions on how to install Visual Studio Code and conda.
  2. Download and extract this zip file. If using an ICT managed PC please be sure to do this in your user area on the C: drive i.e. C:\Users\your_username
    • Note that files placed here are not persistent so you must remember to take a copy before logging out
  3. In Visual Studio Code go to File > Open Folder… and find the files you just extracted.
  4. If you see an alert “This workspace has extension recommendations.” click Install All and then switch back to the Explorer perspective by clicking the top icon on the left-hand toolbar
  5. Open Anaconda Prompt (Windows), or a terminal (Mac or Linux) and run:

    conda env create --file [path to environment.yml]

    The [path to environment.yml] can be obtained by right-clicking the file name in the left pane of Visual Studio Code and choosing “Copy Path”. Right click on the command line interface to paste.

  6. Important: After the environment has been created go to View > Command Palette in VS Code, start typing “Python: Select interpreter” and hit enter. From the list select your newly created environment “diffusion”

Running the tests

  1. Open
  2. You should now be able to click on Run Test above the test_heat() function and see a warning symbol appear, indicating that the test is currently failing. You may have to wait a moment for Run Test to appear.
  3. Switch to the Test perspective by clicking on the flask icon on the left-hand toolbar. From here you can Run All Tests, and Show Test Output to view the coverage report (see Lesson 1 for more on coverage)
  4. Important: If you aren’t able to run the test then please ask a demonstrator for help. It is essential for the next exercise.

Introduction to your challenge

You have inherited some buggy code from a previous member of your research group: it has a unit test but it is currently failing. Your job is to refactor the code and write some extra tests in order to identify the problem, fix the code and make it more robust.

The code solves the heat equation, also known as the “Hello World” of Scientific Computing. It models transient heat conduction in a metal rod i.e. it describes the temperature at a distance from one end of the rod at a given time, according to some initial and boundary temperatures and the thermal diffusivity of the material:

Metal Rod

The function heat() attempts to implement a step-wise numerical approximation via a finite difference method:

u_i^{t} = r u_{i+1}^{t-1} + (1 - 2r) u_i^{t-1} + r u_{i-1}^{t-1}

This relates the temperature u at a specific location i and time point t to the temperature at the previous time point and neighbouring locations. r is defined as follows, where α is the thermal diffusivity: r=\frac{\alpha \Delta t}{\Delta x^2}

The test_heat() function compares this approximation with the exact (analytical) solution for the boundary conditions (i.e. the temperature at both ends of the rod being fixed at zero). The test is correct but failing - indicating that there is a bug in the code.

Testing (and fixing!) the code

Work by yourself or with a partner on these test-driven development tasks. Don’t hesitate to ask a demonstrator if you get stuck!

Separation of concerns

First we’ll refactor the code, increasing its modularity. We’ll extract the code that performs a single time step into a new function that can be verified in isolation via a new unit test:

  1. Move the logic that updates u within the loop in the heat() function to a new top-level function:

    def step(u, dx, dt, alpha):

    Hint: the loop in heat() should now look like this:

    for _ in range(nt - 1):
        u = step(u, dx, dt, alpha)
  2. Run the existing test to ensure that it executes without any Python errors. It should still fail.
  3. Add a test for our new step() function:

    def test_step():
        assert step() == 

    It should call step() with suitable values for u (the temperatures at time t), dx, dt and alpha. It should assert that the resulting temperatures (i.e. at time t+1) match those suggested by the equation above. Use approx if necessary. Hint: step([0, 1, 1, 0], 0.04, 0.02, 0.01) is a suitable invocation. It will return an array of the form [0, ?, ?, 0]. You’ll need to calculate the missing values manually using the equation in order to compare the expected and actual values.

  4. Assuming that this test fails, fix it by changing the code in the step() function to match the equation - correcting the original bug. Once you’ve done this all the tests should pass.

Solution 1

Your test should look something like this:

def test_step():
    assert step([0, 1, 1, 0], 0.04, 0.02, 0.01) == [0, 0.875, 0.875, 0]
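The expected value of 0.875 can be worked out by hand from the update rule: with these inputs r = 0.01 × 0.02 / 0.04² = 0.125, so each interior point becomes 0.125 × 1 + 0.75 × 1 + 0.125 × 0 = 0.875. The sketch below repeats that arithmetic in Python; the variable names are illustrative.

```python
import math

u = [0, 1, 1, 0]
dx, dt, alpha = 0.04, 0.02, 0.01

r = alpha * dt / dx ** 2

# Apply the update rule to the interior points only;
# the two boundary values stay fixed.
interior = [
    r * u[i + 1] + (1 - 2 * r) * u[i] + r * u[i - 1]
    for i in range(1, len(u) - 1)
]

# Compare with a tolerance rather than ==, since r is computed in floating point
r_matches = math.isclose(r, 0.125)
interior_matches = all(math.isclose(value, 0.875) for value in interior)
```

Because r is the result of floating-point arithmetic, comparing with a tolerance (for example via approx in the test) is more robust than exact equality.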

Your final (fixed!) step() function should look like this. The original error was a result of some over-zealous copy-and-pasting.

def step(u, dx, dt, alpha):
    r = alpha * dt / dx ** 2

    return (
        u[:1]
        + [
            r * u[i + 1] + (1 - 2 * r) * u[i] + r * u[i - 1]
            for i in range(1, len(u) - 1)
        ]
        + u[-1:]
    )

Now we’ll add some further tests to ensure the code is more suitable for publication.

Testing for exceptions

We want the step() function to raise an Exception when the following stability condition isn’t met: r\leq\frac{1}{2}. Add a new test test_step_unstable, similar to test_step but that invokes step with an alpha equal to 0.1 and expects an Exception to be raised. Check that this test fails before making it pass by modifying step() to raise an Exception appropriately.

Solution 2

def test_step_unstable():
    with pytest.raises(Exception):
        step([0, 1, 1, 0], 0.04, 0.02, 0.1)

def step(u, dx, dt, alpha):
    r = alpha * dt / dx ** 2

    if r > 0.5:
        raise Exception


Adding parametrisation

Parametrise test_heat() to ensure the approximation is valid for some other combinations of L and tmax (ensuring that the stability condition remains true).

Solution 3

@pytest.mark.parametrize("L,tmax", [(1, 0.5), (2, 0.5), (1, 1)])
def test_heat(L, tmax):
    nt = 10
    nx = 20
    alpha = 0.01


After completing these two steps check the coverage of your tests via the Test Output panel - it should be 100%.

The full, final versions of the code and its tests are available on GitHub.

Bonus task(s)

  • Write a doctest-compatible docstring for step() or heat()
  • Write at least one test for our currently untested linspace() function

Advanced topics

More pytest plugins

def test_fibonacci(benchmark):
    result = benchmark(fibonacci, 7)
    assert result == 13

pytest-benchmark example

Demonstration of performance regression via recursive and formulaic approaches to Fibonacci calculation (output)

from fibonacci import recursive_fibonacci
from hypothesis import given, strategies

def test_recursive_fibonacci(n):
    phi = (5 ** 0.5 + 1) / 2
    assert recursive_fibonacci(n) == int((phi ** n - -phi ** -n) / 5 ** 0.5)

Taking testing further

Testing in other languages

Further resources

Key Points

  • Testing is not only standard practice in mainstream software engineering, it also provides distinct benefits for any non-trivial research software

  • pytest is a powerful testing framework, with more functionality than Python’s built-in features while still catering for simple use cases

  • Testing with VS Code is straightforward and encourages good habits (writing tests as you code, and simplifying test-driven development)

  • It is important to have a set-up you can use for every project - so that it becomes as routine in your workflow as version control itself

  • pytest has a myriad of extensions that are worthy of mention such as Jupyter, benchmark, Hypothesis etc

  • Adding unit tests can verify the correctness of software and improve its structure: isolating logically distinct code for testing often involves untangling complex structures