Imperial College Research Software Community Newsletter - June 2023

Hello everyone and welcome to this month’s edition of the Imperial College Research Software Community Newsletter. With summer underway and scorching temperatures, it’s natural to feel the desire to step away from the computer screen for a while.

So, how about connecting with fellow members of the larger London Research Software community at the upcoming RSLondonSouthEast 2023 workshop? It’s a great chance to expand your network and exchange ideas. Alternatively, why not explore how Machine Learning works by building a functional system, all without the need for a computer? Well, at least if you have a number of matchboxes around…

Has your curiosity been tickled? These are just a taste of the events, news, and highlights featured in this issue. So, grab a refreshing drink, find a comfortable spot, and immerse yourself in the content that awaits you.

Dates for your diary

Research Computing at Imperial

In this month’s edition of our Research Computing at Imperial segment, we are delighted to introduce two more members of the Research Software Champions team. As mentioned previously, these Champions are actively involved in a project dedicated to fostering a thriving research software culture. The project is also developing an updated Research Software Directory to promote the software that is developed at Imperial.

Hubert Mohr-Daurat:

I am a second-year PhD student in the Department of Computing in the Large-Scale Data and Systems Group. My interest is in designing systems that extend database technologies to a wider range of data processing applications. I have been working on a data management system that can store and efficiently execute data imputation (e.g., for data cleaning in ML pipelines) and on CPU/GPU co-processing through the composition of existing systems and an efficient data and query exchange format.

I spent much time coding in C++ when developing these systems. As a former programmer in the video game industry, I have been working in large teams on large codebases and learned how caring about code quality demands time and effort but is rewarding in the long term.

Code reliability, availability and reproducibility are essential in research but should not impede the research work too much. I believe that knowing the right tools and applying good practices, such as coding rules, code reviews, unit tests, benchmarks and static analysis, helps minimize this effort. This is why I joined the project as a Research Software Champion: we all have our own experience and resources. I want to share my knowledge about good software practices and hope to learn from others to grow our research software culture together.

Anthony Onwuli:

I am a 3rd year PhD student in the Department of Materials. I am a member of the Walsh Materials Design Group (https://wmd-group.github.io/). My research focuses on computational materials discovery by trying to develop ways in which we can find new materials through machine learning and first-principles quantum chemistry calculations (i.e., Density Functional Theory).

My experience with research software has primarily been taking over the development and maintenance of one of our group codes, SMACT (https://github.com/WMD-group/SMACT). This code enables us to generate compositions through chemical heuristics and combinatorics. More recently, it has been expanded to assign a crystal structure to the generated compositions based on chemical similarity with databases of known materials.

The Research Software Champions scheme has provided a chance to delve deeper into the field of research software problems.

Research Software of the Month

This month we thought we’d highlight an open source research tool that is not linked to Imperial but which we found very interesting and thought the community might be interested to check out: MapReader, a free, open-source software library written in Python for analysing large map collections.

According to the project’s GitHub repository:

MapReader was developed in the Living with Machines project to analyze large collections of historical maps but is a generalizable computer vision pipeline which can be applied to any images in a wide variety of domains.

The MapReader pipeline consists of a linear sequence of tasks which, together, can be used to train a computer vision (CV) classifier to recognise visual features within maps and identify patches containing these features across entire map collections.

MapReader allows users with little computer vision expertise to

  1. retrieve maps via web-servers
  2. preprocess and divide them into patches
  3. annotate patches
  4. train, fine-tune, and evaluate deep neural network models; and
  5. create structured data about map content.
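To give a flavour of what step 2 involves, here is a minimal sketch in plain Python of dividing an image into square patches. It is an illustration only and does not use MapReader's actual API: the function name `split_into_patches`, the `patch_size` parameter and the toy image are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical representation: an image is a list of rows of grayscale values.
Image = List[List[int]]


def split_into_patches(image: Image, patch_size: int) -> List[Tuple[int, int, Image]]:
    """Divide an image into square patches.

    Returns (top, left, patch) tuples, where top/left are the pixel
    coordinates of the patch's upper-left corner. Patches on the right
    and bottom edges are smaller when the image size is not a multiple
    of patch_size.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    height = len(image)
    width = len(image[0]) if image else 0
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Slice the rows first, then the columns within each row.
            patch = [row[left:left + patch_size] for row in image[top:top + patch_size]]
            patches.append((top, left, patch))
    return patches


# A toy 4x6 "map" whose pixel values are simply their index.
toy_map = [[r * 6 + c for c in range(6)] for r in range(4)]
patches = split_into_patches(toy_map, patch_size=2)
print(f"{len(patches)} patches; first starts at {patches[0][:2]}")
```

In a real pipeline each patch would then be annotated and passed to a classifier, with the stored coordinates making it possible to map predictions back onto the original sheet.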

The authors provide extensive documentation, including a section on input guidance that is of paramount importance when dealing with this type of technology. They also include details about project members, maintainers, and ways to collaborate. The Living with Machines project, through which the tool has been developed, is funded by UK Research and Innovation (UKRI).

The tool is released under the MIT License.

RSE Bytes

News

Blog posts, tools & more

Some reminders…

RS Community Slack

The Imperial Research Software Community Slack workspace is a place for general community discussion and also features channels for individuals interested in particular tools or topics. If you’re an OpenFOAM user, why not join the #OpenFOAM channel, where regular code review sessions are announced (amongst other CFD-related discussions…)? Users of the Nextflow workflow tool can find other Imperial Nextflow users in #nextflow. You can find other R developers in #r-users and there is the #DeepLearners channel for AI/ML-related questions and discussion. Take a look at the other available channels by clicking the “+” next to “Channels” in the Slack app and selecting “Browse channels”.

If you want to start your own group around a tool, programming language or topic not currently represented, feel free to create a new channel and advertise it in #general.

Research Software Engineering support

If you need support with your code, look no further! The Central RSE Team, within the Research Computing Service, is here to help. Have a look at the variety of ways the team can work with you:

HPC documentation and tips

All the documentation, tutorials and how-tos for using Imperial’s HPC are available in the HPC Wiki pages. See also the Research Computing Service’s Research Computing Tips series for a variety of helpful tips for using RCS resources and related tools and services.

Research Software Directory

Imperial’s Research Software Directory provides details of a range of research software and tools developed by groups and individuals at the College. If you’d like to see your software included in the directory, you can open a pull request in the GitHub repository or get in touch with the Research Software Community Committee.

Get in Touch, Get Involved!

Drop us a line at rse-committee@imperial.ac.uk with anything you’d like included in the newsletter or ideas about how it could be improved. You could even offer to guest-edit a future edition!

If you’re reading this on the web and would like to receive the next newsletter directly to your inbox then please subscribe to our Research Software Community Mailing List.


This issue of the Research Software Community Newsletter was edited by Stefano Galvan. All previous newsletters are available in our online archive.