Research Computing Tips: Getting started with Conda

Conda is our recommended tool for managing research software on the RCS compute service. It’s complementary to the module system and provides a means to install and run a broad range of software - particularly if you’re working with Python or R. It also integrates well with our Jupyter service and editors such as Visual Studio Code.

We recommend using a fresh Conda environment for each of your projects - this helps to resolve a whole range of issues relating to reproducibility and portability.

To use Conda you’ll need to load it (the same applies inside job scripts):

module load anaconda3/personal

followed by a one-time initialisation step if you’re a new user:

anaconda-setup

Having loaded Conda you’ll typically create a new environment:

conda create --name myenv

then activate it (or any existing environment):

conda activate myenv

and then install some packages e.g. for Python:

conda install python=3 pytorch

or for R (note that R packages are prefixed with r-):

conda install r-base=4 r-dplyr

Wherever possible we recommend installing packages using Conda, rather than pip install for Python or install.packages for R.
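As a rough sketch of how the steps above might look inside a job script, the snippet below writes a small script to a file. It assumes a PBS-style scheduler; the resource requests, the file name `conda_job.sh` and the `eval "$(conda shell.bash hook)"` line (a cautious way to make `conda activate` available in a non-interactive shell) are illustrative assumptions rather than RCS requirements.

```shell
# Write a hypothetical job script; the resource values below are placeholders
cat > conda_job.sh <<'EOF'
#!/bin/bash
#PBS -l select=1:ncpus=1:mem=4gb
#PBS -l walltime=00:30:00

# Load Conda, exactly as in an interactive session
module load anaconda3/personal

# Assumption: set up `conda activate` for this non-interactive shell
eval "$(conda shell.bash hook)"
conda activate myenv

# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"
python main.py
EOF

# Submit it (assuming PBS):
# qsub conda_job.sh
```

If `conda activate` still misbehaves in a batch job, replacing the last two Conda-related steps with `conda run --name myenv python main.py` is another option.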

Here are a few other useful or more advanced commands:

# Export an environment to a YAML file in order to document/share your project's dependencies
conda env export --from-history --no-builds > environment.yml

# Create/import an environment from a YAML file
conda env create --file environment.yml

# Create a Python environment from a traditional `requirements.txt` file (this only works if every listed package is available from your Conda channels)
conda create --name myenv --file requirements.txt python=3

# List your environments
conda info --envs

# Completely remove an environment
conda remove --name myenv --all

# Clean up unused packages/downloads (this can free up a lot of disk space)
conda clean --all

# Run a single command in an environment - this can help where `conda activate` doesn't work e.g. in CI scripts
conda run --name myenv python main.py
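For reference, a minimal `environment.yml` of the kind used by the export and create commands above might look roughly like this. The environment name and package list simply reuse the examples from earlier in this post; a real export will typically also include `channels` and `prefix` entries.

```shell
# Write a minimal, illustrative environment.yml (contents are an example, not a real export)
cat > environment.yml <<'EOF'
name: myenv
dependencies:
  - python=3
  - pytorch
EOF

# With Conda loaded, the environment could then be created from it:
# conda env create --file environment.yml
```

Keeping a file like this under version control alongside your code is one simple way to document a project's dependencies.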

Further resources

Some example job scripts using Conda:
- rcs-snakemake-tutorial (Python)
- rcs-pacemakers (Python + Jupyter)
- minGPT (Python + Jupyter)
- Reticulate (R)
- covid-sim (Python + R)
More tips on Conda
The Research Computing Service’s Essential Software Engineering for Researchers course explains more about virtual environments and package management, including Conda.