Imperial College Research Software Community Newsletter - January 2022

Hello everyone and welcome to January’s newsletter. This month we are highlighting SMolESY from Imperial’s National Phenome Centre, and continuing our Research Computing at Imperial series with a personal introduction from Michael Bearpark, Director of User Engagement in the Research Computing Service’s Academic Leadership Team.

Dates for your diary
RSE Bytes
- News
- Blog posts, tools & more
Research Software of the Month
Research Computing at Imperial
Some reminders…
Get in Touch, Get Involved!

Dates for your diary

Registration is open until the 1st February for an Open Research London event on FAIR data hosted by the Francis Crick Institute.
Registration is open for the SSI’s Collaborations Workshop 2022, which takes place online on the 4th-7th April. There’s also a call for submissions for mini-workshops and demo sessions, with a deadline of 4th February.
18th February: deadline for submission of papers to the Software Engineering for Computational Science (SE4Science’22) workshop which will be held in London this year on the 21st-22nd June. See the call for papers for further details.
From 28 February to 4 March 2022: an invited Lecture Series on the Mathematics of Deep Learning, held at the Isaac Newton Institute in Cambridge and online. Registration is £25 - £50.

RSE Bytes

News

The Software Sustainability Institute have announced their 2022 Fellows!

Blog posts, tools & more

An article from the StackOverflow Blog on nanopore sequencing and the benefits of open sourcing your code: Sequencing your DNA with a USB dongle and open source code.
In past newsletters we’ve mentioned the Chan Zuckerberg Initiative’s funding of mainly healthcare-related “Essential Open Source Software for Science”. The list of projects they chose to fund in the last round makes interesting reading.
In our September 2021 newsletter, we highlighted a blog post by Matthew Bluteau reviewing the SE4Science workshop at ICCS2021. In a follow-up post focusing on Software Testing, Matthew continues his review of SE4Science, looking at the speed blogging session that focused on testing, and then summarising a discussion on testing that took place at the later SeptembRSE conference.
Dave Horsfall’s article on Festive stress: a research software engineer’s perspective, which was published on the SSI blog just before the break, is still very much worth a read. The article contains some great advice and guidance.
Scaling up unit testing using parameterisation.
How to call Julia code from Python
From Perl to parallel Python, or why “the investment in RSE is worth it”. Christopher Woods publishes a case study on speeding up a search for genetic markers x5000, and reducing the environmental impact for good measure.
Can I safely assume we’re all playing Wordle? Kaggle are upping the ante with this statistical approach to nailing it in one, but if that’s too much like work, you can always just inspect the JavaScript for all the answers in chronological order. (Just remember you’ll only be cheating yourself. Ed.)
A new season of Code for Thought just started, and January’s episode introduces seven of the new EPSRC RSE fellows.
An overview of the RSE movement by Vanessa Sochat.

Research Software of the Month

Our top pick this month is SMolESY, developed by Dr Pantelis Takis at Imperial’s National Phenome Centre. SMolESY (Small Molecule Enhancement SpectroscopY) is a computational solution for the assignment and integration of 1H nuclear magnetic resonance (NMR) signals from metabolites in biofluids that contain macromolecules, such as blood plasma and serum.

For more than 15 years, the standard experimental procedure for NMR analysis of biofluids such as these has consisted of a standard 1H-NMR experiment (e.g. 1D-NOESY), a pseudo-2D J-Res experiment to support small-molecule assignment, and a spin-echo experiment (e.g. CPMG) to suppress broad and often dominant macromolecule-derived signals. This last experiment is the only commonly applied method for mitigating the strongly confounding signals from proteins and other large biochemical structures in order to reveal the underlying signals from small molecules. Yet despite its ubiquity, it is surprisingly ineffective at reliably suppressing broad signals.

SMolESY is designed to functionally supplant the traditional spin-echo experiment. It produces a spectrum that addresses the weaknesses of methods such as CPMG, providing complete suppression of the macromolecular signals and enhanced spectral resolution while retaining the quantitative nature of the original 1H-NMR data. The resulting spectra expedite the chemical assignment and quantification of several metabolites. More importantly, the approach shows no loss of metabolic information in commonly used biofluids with highly abundant macromolecules, and exhibits the reproducibility required for metabolomics and other analytical studies. This improves both quantitative and qualitative NMR analysis while dramatically reducing the instrument time and resources required. For example, replacing the traditional spin-echo experiment with SMolESY in the analysis of plasma samples from the UK Biobank 500k patient cohort would save nearly four years of instrument time.
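As a rough sanity check on that last figure, the short Python sketch below works out how many instrument minutes per sample a four-year saving would imply across 500,000 samples. It is a back-of-envelope illustration only: it assumes one spin-echo acquisition per sample and an instrument running around the clock, neither of which is stated in the article, and the function names are our own.

```python
# Back-of-envelope check of the "nearly four years" figure.
# Assumptions (ours, not the article's): one spin-echo acquisition per
# sample, and an instrument that runs continuously, 24 hours a day.

SAMPLES = 500_000                     # cohort size quoted in the text
YEARS_SAVED = 4.0                     # upper bound on the quoted saving
MINUTES_PER_YEAR = 365.25 * 24 * 60   # 525,960 minutes


def minutes_per_sample(years: float, samples: int) -> float:
    """Instrument minutes per sample implied by a total saving in years."""
    return years * MINUTES_PER_YEAR / samples


def years_saved(minutes_each: float, samples: int) -> float:
    """Total instrument years saved if each sample needs minutes_each fewer minutes."""
    return minutes_each * samples / MINUTES_PER_YEAR


if __name__ == "__main__":
    implied = minutes_per_sample(YEARS_SAVED, SAMPLES)
    print(f"{implied:.1f} minutes per sample")  # prints "4.2 minutes per sample"
```

In other words, under these assumptions a spin-echo experiment of around four minutes per sample is enough to account for a saving of just under four years.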

The source (MATLAB) is available under the GPL, and the repository also includes Windows and Mac binaries.

For a video demonstration, see page 2 of the Supporting Information in this paper from Analytical Chemistry.

Research Computing at Imperial

We continue our series of introductions from key members of the College community helping to run, manage and support research computing and research software services. This month we have an introduction from Michael Bearpark, Director of User Engagement within the recently created Academic Leadership Team for the Research Computing Service (RCS).

Mike writes:

I’ve used IC’s HPC service since it was set up in something like its present form over 15 years ago. I’ve been more involved in governance over the past five years as the service has continued to grow, into what would be a central national facility in many parts of the world. It’s easy to lose that perspective, and part of the reason for creating an academic leadership team in 2021 was to make sure the value of the service as a whole is recognised, both internally and externally, and that it can continue to develop with a secure foundation.

Part of my role is simply to ask, of any proposed changes to the service, ‘how is this going to benefit the user experience?’. It’s difficult to anticipate how things will actually work out, of course, but that’s why there’s a leadership team, and it’s been encouraging to see this develop from a half-thought-out suggestion less than a year ago.

My research background is in computational chemistry and I’m one of the many authors of the Gaussian electronic structure code. Before the pandemic I travelled regularly with a team delivering workshops on the code and its scientific applications, which helped shape my thinking on the computing resources that non-specialists increasingly need, and on the importance of a service offered as a whole, beyond the hardware. I’ve also helped support RSE projects on MPI parallelisation for quantum dynamics and on redeveloping a non-specialist user portal based on Open OnDemand, and we’re just about to start a third project to pilot FAIR data curation for Imperial.

Before the acronym ‘RSE’ was widely adopted, the central HPC team helped me to connect with an experienced technical expert with excellent communication skills and judgment who could be paid as a software consultant, without feeling under constant pressure to publish research papers. I won’t pretend this was a seed for RSE as any kind of career choice - an idea which was already developing elsewhere - but the project was a success on several levels, and it did eventually help make the case for IC’s central Research Computing Service to set up its own RSE team.

Some reminders…

RS Community coffee

…continues weekly via Teams - normally on Friday afternoons at 3pm but check our Slack workspace for exact times and connection details.

RS Community Slack

The Imperial Research Software Community Slack workspace is a place for general community discussion as well as featuring channels for individuals interested in particular tools or topics. If you’re an OpenFOAM user, why not join the #OpenFOAM channel where regular code review sessions are announced (amongst other CFD-related discussions…). Users of the Nextflow workflow tool can find other Imperial Nextflow users in #nextflow. You can find other R developers in #r-users and there is the #DeepLearners channel for our new AI/ML group. Take a look at the other available channels by clicking the “+” next to “Channels” in the Slack app and selecting “Browse channels”. If you want to start your own group around a tool, programming language or topic not currently represented, feel free to create a new channel and advertise it in #general.

Research Computing Tips

See the Research Computing Service’s Research Computing Tips series for a variety of helpful tips for using RCS resources and related tools and services.

Research Software Directory

Imperial’s Research Software Directory provides details of a range of research software and tools developed by groups and individuals at the College. If you’d like to see your software included in the directory, you can open a pull request in the GitHub Repository or get in touch with the Research Software Community Committee.

Get in Touch, Get Involved!

Drop us a line at rse-committee@imperial.ac.uk with anything you’d like included in the newsletter, ideas about how it could be improved… or even an offer to guest-edit a future edition!

If you’re reading this on the web and would like to receive the next newsletter directly to your inbox then please subscribe to our Research Software Community Mailing List.

This issue of the Research Software Community Newsletter was edited by Jazz Mack Smith. All previous newsletters are available in our online archive.