Imperial College Research Software Community Newsletter - June 2022

Coming to you this month from the Department of Metabolism, Digestion and Reproduction, where you find us in the throes of a relocation from SAF in South Kensington to the Hammersmith hospital campus. All packed up and waiting for the removal van, I am perched atop a wobbling pyramid of Teacrates™, optimistically labelled Burlington Danes E311, in the hope that this elegant-sounding destination does indeed exist somewhere. If not, then three GPU servers, two desktops, six monitors, a sackful of peripherals and my bodyweight in Post-Its and staples will doubtless find their way to landfill sooner rather than later.

So under the circumstances, it’s comforting to remember that in all other respects life goes on much the same, and there may even be fun things happening. Read on…

Dates for your diary

Research Software of the Month

Despite its old age (it can be traced back to 1972!), CSV is still a very popular format for storing tabular data used in many disciplines. Metadata concerning the contents of the file is often included in the header - lines commented with a particular character - but it rarely follows a format that is machine readable, and sometimes it is not even human readable! In some cases, such information is provided in a separate file, which is not ideal as it is easy for data and metadata to get separated.

PyCSVY is a small Python package to handle CSV files in which the metadata in the header is formatted in YAML. It supports reading/writing tabular data contained in Numpy arrays, Pandas data frames and nested lists, as well as metadata using a standard Python dictionary, ensuring that both data and metadata always stay together and are human and machine readable. Under the hood, it uses PyYAML and the mature and extensively tested functionality of Numpy and Pandas for reading/writing files.

While still a work in progress, PyCSVY aims to fully comply with the frontmatter specification described in CSVY, incorporating information about the CSV dialect used and a Table Schema specifying the contents of each column to aid the reading and interpretation of the data. A package supporting this same functionality already exists for R.
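To make the idea concrete, here is a minimal sketch of the layout described above: a metadata block delimited by `---` lines, followed by ordinary CSV rows. It is not PyCSVY's API. The helper names `dump_csvy` and `load_csvy` are hypothetical, and the sketch uses only the standard library, handling flat `key: value` string metadata rather than full YAML. PyCSVY itself relies on PyYAML, Numpy and Pandas for the real work.

```python
import csv
import io

MARKER = "---"  # delimiter line around the metadata block (CSVY-style frontmatter)


def dump_csvy(metadata, rows):
    """Serialise flat string metadata and tabular rows into one CSVY-style string.

    Hypothetical helper for illustration only; not part of PyCSVY.
    """
    buffer = io.StringIO()
    buffer.write(MARKER + "\n")
    for key, value in metadata.items():
        # Flat "key: value" lines are a small, valid subset of YAML.
        buffer.write(f"{key}: {value}\n")
    buffer.write(MARKER + "\n")
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def load_csvy(text):
    """Split CSVY-style text back into (metadata dict, list of rows).

    Hypothetical helper for illustration only; values are returned as strings.
    """
    lines = text.splitlines()
    if not lines or lines[0] != MARKER:
        raise ValueError("text does not start with a metadata block")
    end = lines.index(MARKER, 1)  # position of the closing delimiter
    metadata = {}
    for line in lines[1:end]:
        key, _, value = line.partition(": ")
        metadata[key] = value
    rows = list(csv.reader(lines[end + 1:]))
    return metadata, rows


# Example data is made up purely for the demonstration.
example_metadata = {"author": "A. Researcher", "units": "metres"}
example_rows = [["x", "y"], ["1", "2.5"], ["2", "3.5"]]

text = dump_csvy(example_metadata, example_rows)
loaded_metadata, loaded_rows = load_csvy(text)
```

Because the metadata travels in the same file as the data, a round trip through `dump_csvy` and `load_csvy` returns both together, which is the property the package is designed to guarantee.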

RSE Bytes

News

RSE Code Surgeries have arrived! The RSE Team is trialling this new service for the Imperial research community with slots available every other Monday over the summer, so if you have an issue you want to discuss, book an appointment and help shape the service! Please provide relevant information (e.g. a link to your software repository) so the RSE Team can review the material, prepare an appropriate response, and compile resources you can use to improve your software.

Unlike the existing HPC Clinics, meant to provide immediate support in relation to the HPC and the RDS, the purpose of these Surgeries is to provide long-term impact for custom code bases. In particular, some of the topics they will cover are:

As this is a trial, your feedback will be invaluable in helping to shape a useful service when Code Surgeries are launched in full later in the year.

On the 26th and 27th July 2022, we’re planning to bring the “Reproducible computational environments using containers” course, covering Docker and Singularity, to Imperial. This course is organised in collaboration with the ARCHER2 training team at EPCC, University of Edinburgh. Full details are still to be finalised, but if you’d like to attend the course, or if you’re experienced at using Docker and/or Singularity and are available to help on either the 26th or 27th July, contact Jeremy Cohen to express your interest.

Blog posts, tools & more

Some reminders…

RS Community coffee

…continues weekly via Teams - normally on Friday afternoons at 3pm but check our Slack workspace for exact times and connection details.

RS Community Slack

The Imperial Research Software Community Slack workspace is a place for general community discussion as well as featuring channels for individuals interested in particular tools or topics. If you’re an OpenFOAM user, why not join the #OpenFOAM channel where regular code review sessions are announced (amongst other CFD-related discussions…). Users of the Nextflow workflow tool can find other Imperial Nextflow users in #nextflow. You can find other R developers in #r-users and there is the #DeepLearners channel for our new AI/ML group. Take a look at the other available channels by clicking the “+” next to “Channels” in the Slack app and selecting “Browse channels”.

If you want to start your own group around a tool, programming language or topic not currently represented, feel free to create a new channel and advertise it in #general.

Research Computing Tips

See the Research Computing Service’s Research Computing Tips series for a variety of helpful tips for using RCS resources and related tools and services.

Research Software Directory

Imperial’s Research Software Directory provides details of a range of research software and tools developed by groups and individuals at the College. If you’d like to see your software included in the directory, you can open a pull request in the GitHub repository or get in touch with the Research Software Community Committee.

Get in Touch, Get Involved!

Drop us a line at rse-committee@imperial.ac.uk with anything you’d like included in the newsletter, ideas about how it could be improved, or even an offer to guest-edit a future edition!

If you’re reading this on the web and would like to receive the next newsletter directly to your inbox then please subscribe to our Research Software Community Mailing List.


This issue of the Research Software Community Newsletter was edited by Jazz Mack Smith. All previous newsletters are available in our online archive.