Tools I: Packaging and virtual-environments

Overview

Teaching: 5 min
Exercises: 10 min
Questions
  • How to use a package manager to install third party tools and libraries

Objectives
  • Use conda to install a reproducible environment

Python packages

Virtual environments

Package managers

Package managers help you install packages. Some help you install virtual environments as well. Better known python package managers include conda, pip, poetry

  conda pip poetry
audience research all developers
manage python packages
manage non-python packages
choose python version
manage virtual envs
easy interface
fast

Package managers in other languages

Rules for choosing a package manager

  1. Choose one
  2. Stick with it

We chose conda because it is the de facto standard in science, and because it can natively install libraries such as fftw, vtk, or even Python, R, and Julia themselves.

It is also the de facto package manager on Imperial’s HPC cluster systems.

Example (10 min)

Installing and using an environment

  1. If you haven’t already, see the setup guide for instructions on how to install Conda and Visual Studio Code.

  2. Download this zip archive and extract it. Start Visual Studio Code and select “Open folder…” from the welcome screen. Navigate to the folder created from the archive and press “Select Folder”.

  3. From the side-bar select the file environment.yml. If you are prompted to install the Visual Studio Code Python extension then do so. The contents of environment.yml should match:

    name: course
    dependencies:
      - python==3.11
      - ruff
      - mypy
      - requests
    
  4. Create a new virtual environment using conda:

    Windows users will want to open the app Anaconda Prompt from the Start Menu.

    Linux and Mac users should use a terminal app of their choice. You may see a warning with instructions. Please follow the instructions.

    conda env create -f "path_to_environment.yml"
    

    You can obtain path_to_environment.yml by right clicking the file tab near the top of Visual Studio Code and selecting “Copy Path” from the drop-down menu. Make sure to include the quotation marks in the command. Right click on the window for your command line interface to paste the path.

  5. We can now activate the environment:

    conda activate course
    
  6. And check Python knows about the installed packages. Start a Python interpreter with the command python then:

    import requests
    

    We expect this to run without any messages. You can see the location of the installed package with:

    requests.__file__
    
    'C:\\ProgramData\\Anaconda3\\envs\\course\\lib\\site-packages\\requests\\__init__.py'
    

    The file path you see will vary but note that it is within a directory called course that contains all of the files for the virtual environment you have created. Exit the Python interpreter:

    exit()
    
  7. Finally, feel free to remove requests from environment.yml, then run

    conda env update -f "path_to_environment.yml"
    

    and see whether the package has been updated or removed.

Solution

After completing step 7 you will find that the package has not been removed from the environment even after being removed from environment.yml! If we run conda env update --help we get:

usage: conda-env update [-h] [-n ENVIRONMENT | -p PATH] [-f FILE] [--prune]
                        [--json] [-v] [-q]
                        [remote_definition]

Update the current environment based on environment file

Options:

positional arguments:
  remote_definition     remote environment definition / IPython notebook

optional arguments:
  -h, --help            Show this help message and exit.
  -f FILE, --file FILE  environment definition (default: environment.yml)
  --prune               remove installed packages not defined in environment.yml

Notice the --prune option. From the documentation it looks like we would need to run conda env update --prune -f "path_to_environment.yml". That flag is necessary to really uninstall unnecesary packages and update the environment. After running this command, import resquest will fail.

Note: If you are running conda<23.9.0, import request will still work as there was a bug before that. You can check your conda version with conda --version.

Selecting an environment in Visual Studio Code

If you haven’t already, see the setup guide for instructions on how to install Visual Studio (VS) Code.

The simplest option for all platforms is to set the interpreter via the Command Palette:

An entry should be present with the name course. It may take a few minutes to appear however.

If you already have a Python file open then it’s also possible to set the interpreter using the toolbar at the bottom of the window.

Key Points

  • There are tens of thousands of Python packages

  • The choice is between reinventing the square wheel or reusing existing work

  • The state of an environment can be stored in a file

  • This stored environment is then easy to audit and recreate