Imperial College Research Software Community Newsletter - February 2022

Hello everyone and welcome to February’s newsletter. This month, in addition to the regular dates for your diary, news and details of interesting blog posts and articles you might like to read, we are highlighting a group of Julia array and matrix packages in our Research Software of the Month feature. We are also continuing our Research Computing at Imperial series with an introduction from Paul Aylin, Co-Director of Research Data Strategy in the Research Computing Service’s Academic Leadership Team.

Dates for your diary

Research Computing at Imperial

We continue our series of introductions from key members of the College community helping to run, manage and support research computing and research software services. This month we have an introduction from Paul Aylin, Co-Director of Research Data Strategy within the recently created Academic Leadership Team for the Research Computing Service (RCS).

Paul writes:

I trained first as a medical doctor, and then in public health, and first came to Imperial College in 1997 to work in what is now the School of Public Health within the Faculty of Medicine. I have used routinely collected clinical and administrative data to examine variations in quality and safety in healthcare for nearly 25 years. My early research on measuring and understanding healthcare outcomes informed two national public inquiries into the GP Harold Shipman, who murdered over 200 of his patients, and an investigation into a children’s heart surgery unit at Bristol Royal Infirmary. Working with industry to develop a national surveillance tool, I have helped support NHS Trusts in understanding variations in outcomes across a range of diagnoses and surgical procedures. My mortality alerting system was pivotal in alerting the national regulator to problems at the Mid Staffordshire NHS Foundation Trust, prompting a public inquiry and a raft of recommendations on healthcare organisation and delivery.

I share my role of Director of Research Data Strategy with David Colling, whom I have no doubt you will read about in a subsequent issue. Having made use of sensitive healthcare records to carry out this work, I am acutely aware of the need for strong information governance in research. I am also aware of the need for researchers to be able to store and analyse sensitive data sets in a secure and well-regulated environment, which assures data providers, researchers, the College and ultimately the patients or members of the public who provide their data that their personal information is safe and protected. The consequences of a data breach could be very damaging for the College, both reputationally and financially.

As part of my role, I am therefore leading a College initiative to establish a secure environment for the storage and analysis of sensitive data of all kinds, including personal data, commercially sensitive data and data relating to matters of national security. I am enjoying working closely with other academics, ICT, the College Data Protection Officer and, of course, colleagues from RCS.

Research Software of the Month

This month we’re highlighting a group of modules for the Julia programming language that focus on representing a variety of structured array and matrix data structures and associated linear algebra. The modules can be found in the JuliaArrays and JuliaMatrices GitHub organisations. The lead developer of all of the modules listed below, Sheehan Olver, is based in Imperial’s Department of Mathematics. The two groups covering arrays and matrices include a significant number of modules along with documentation. Here we highlight a small set of those available, just to provide a flavour of the type of capabilities these modules provide to Julia developers - there are many more to explore. If you’ve never programmed in Julia before, maybe we can inspire you to take a look at the language and explore some of these interesting, high performance libraries. Where equivalent modules are available in Julia’s core library, these implementations can offer higher performance and greater flexibility.

LazyArrays.jl: Provides more memory efficient and performant “lazy” implementations of array operations such as concatenation and multiplication. The functionality provided in this library can, for example, be useful when working with efficient matrix-free methods in the context of linear solvers, especially when handling very large matrices.
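To give a flavour of what “lazy” means here, the following is a minimal sketch of lazy concatenation. It is written in Python rather than Julia purely for illustration, and the `LazyConcat` class is a hypothetical helper, not part of LazyArrays.jl or its API. The idea it shows is that the concatenated result is never copied into new memory: each lookup is redirected to whichever input holds the requested element.

```python
from bisect import bisect_right
from itertools import accumulate


class LazyConcat:
    """Concatenation of sequences that is never materialised (hypothetical helper)."""

    def __init__(self, *parts):
        self.parts = parts
        # Cumulative end offsets, e.g. part lengths (3, 2) give [3, 5].
        self.ends = list(accumulate(len(p) for p in parts))

    def __len__(self):
        return self.ends[-1] if self.ends else 0

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        k = bisect_right(self.ends, i)        # which part holds index i
        start = self.ends[k - 1] if k else 0  # global index where that part begins
        return self.parts[k][i - start]


a = [1, 2, 3]
b = range(10, 10**9)   # far too large to copy into a list
c = LazyConcat(a, b)   # constructed instantly, no elements copied
```

Only two small offsets are stored regardless of how large the inputs are, which is the property that makes this style of operation attractive for very large arrays.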

BlockArrays.jl: This module offers both an abstract interface for handling different block array types and an implementation of two different block array types - BlockArray and PseudoBlockArray. Building code that accesses block arrays via the AbstractBlockArray interface provides flexibility for using different underlying block array implementations, potentially dynamically, and provides a common interface that other block array implementations can choose to implement, reducing the barrier to adopting new implementations in existing code.
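The benefit of coding against an abstract interface can be sketched as follows. This is a Python illustration restricted to block vectors, and every name in it (`AbstractBlockVector`, `SeparateBlocks`, `ContiguousBlocks`, `block_sums`) is hypothetical rather than taken from BlockArrays.jl. It assumes two storage strategies, one keeping each block as its own list and one overlaying block sizes on a single flat list, and shows a function that works unchanged with both.

```python
from abc import ABC, abstractmethod
from itertools import accumulate


class AbstractBlockVector(ABC):
    """Hypothetical interface: a vector partitioned into consecutive blocks."""

    @abstractmethod
    def nblocks(self):
        """Return the number of blocks."""

    @abstractmethod
    def block(self, k):
        """Return the k-th block as a list."""


class SeparateBlocks(AbstractBlockVector):
    """Each block is stored as its own list."""

    def __init__(self, blocks):
        self.blocks = [list(b) for b in blocks]

    def nblocks(self):
        return len(self.blocks)

    def block(self, k):
        return self.blocks[k]


class ContiguousBlocks(AbstractBlockVector):
    """One flat list plus block sizes that overlay a block structure on it."""

    def __init__(self, data, sizes):
        if sum(sizes) != len(data):
            raise ValueError("block sizes must add up to len(data)")
        self.data = list(data)
        self.offsets = [0, *accumulate(sizes)]  # start index of each block

    def nblocks(self):
        return len(self.offsets) - 1

    def block(self, k):
        return self.data[self.offsets[k]:self.offsets[k + 1]]


def block_sums(v):
    """Uses only the interface, so it accepts either storage strategy."""
    return [sum(v.block(k)) for k in range(v.nblocks())]


x = SeparateBlocks([[1, 2], [3, 4, 5]])
y = ContiguousBlocks([1, 2, 3, 4, 5], sizes=[2, 3])
```

Because `block_sums` never touches the underlying storage directly, a third implementation could be added later without changing it.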

InfiniteArrays.jl: Building on the concept of “lazy” handling of arrays, InfiniteArrays provides the ability to efficiently represent arrays that are infinite in size along one or more dimensions. Because such a structure can never be stored in full, processing has to be scoped to some manageable subset of the represented data, often only calculating additional content when it is actually accessed, or when nearby values are accessed. The previously highlighted LazyArrays module is leveraged in the implementation of InfiniteArrays to provide this lazy behaviour and ensure effective management of memory use when handling infinite structures. InfiniteLinearAlgebra.jl builds on this package to add support for infinite analogues of QR and other decompositions.
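As a rough illustration of computing entries only on access, here is a small Python sketch of an infinitely long vector defined by a rule. The `LazyInfiniteVector` class and its caching strategy are assumptions made for this example and do not reflect how InfiniteArrays.jl is implemented.

```python
class LazyInfiniteVector:
    """Infinitely long vector whose entries are computed on first access (hypothetical)."""

    def __init__(self, rule):
        self.rule = rule   # function mapping an index n to the n-th entry
        self.cache = {}    # holds only the entries that have been requested

    def __getitem__(self, key):
        if isinstance(key, slice):
            if key.stop is None:
                raise ValueError("only finite windows can be materialised")
            return [self[i] for i in range(key.start or 0, key.stop, key.step or 1)]
        if key < 0:
            raise IndexError("an infinite vector has no last element to count back from")
        if key not in self.cache:
            self.cache[key] = self.rule(key)
        return self.cache[key]


squares = LazyInfiniteVector(lambda n: n * n)
window = squares[2:5]    # a finite window: entries 2, 3 and 4
far = squares[10**6]     # a distant entry, computed without its predecessors
```

After these two accesses the cache holds just four values, even though the vector conceptually has no end.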

BandedMatrices.jl: This is the first of three banded matrix-related modules we’re highlighting which provide support for efficient storage and processing of sparse banded matrices. The library leverages Julia’s concise and elegant syntax to easily define and create a range of banded matrix layouts.
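The storage saving behind banded matrices can be sketched in a few lines. The Python example below assumes a LAPACK-style band layout, in which each diagonal inside the band is stored as one row; the `Banded` class is hypothetical and is not the BandedMatrices.jl API. Entries outside the band are never stored and simply read as zero.

```python
class Banded:
    """n x n matrix storing only the diagonals inside the band (hypothetical sketch)."""

    def __init__(self, n, lower, upper):
        self.n, self.lower, self.upper = n, lower, upper
        # One stored row per diagonal: entry (i, j) lives at rows[upper + i - j][j].
        self.rows = [[0.0] * n for _ in range(lower + upper + 1)]

    def in_band(self, i, j):
        return -self.upper <= i - j <= self.lower

    def __getitem__(self, ij):
        i, j = ij
        return self.rows[self.upper + i - j][j] if self.in_band(i, j) else 0.0

    def __setitem__(self, ij, value):
        i, j = ij
        if not self.in_band(i, j):
            raise IndexError("entry lies outside the band")
        self.rows[self.upper + i - j][j] = value

    def matvec(self, x):
        """Multiply by a vector, visiting only the entries inside the band."""
        y = [0.0] * self.n
        for i in range(self.n):
            lo = max(0, i - self.lower)
            hi = min(self.n, i + self.upper + 1)
            for j in range(lo, hi):
                y[i] += self[i, j] * x[j]
        return y


# A 5 x 5 tridiagonal matrix: 2 on the diagonal, -1 on the two neighbouring diagonals.
A = Banded(5, lower=1, upper=1)
for i in range(5):
    A[i, i] = 2.0
    if i > 0:
        A[i, i - 1] = -1.0
        A[i - 1, i] = -1.0
```

For this small case 15 numbers are stored instead of 25; for an n x n tridiagonal matrix the storage grows as 3n rather than n squared, and the matrix-vector product costs a comparable amount of work.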

BlockBandedMatrices.jl: BlockBandedMatrices builds on the banded matrix structure supported in BandedMatrices to provide support for creating and working with block banded matrices. A further specialisation provided in this module - BandedBlockBandedMatrix - supports the creation of block banded matrices whose blocks are themselves banded. Some support for distributed storage is also provided.

LazyBandedMatrices.jl: Finally, we highlight the LazyBandedMatrices module, which applies lazy evaluation to banded matrices. This offers the potential to work effectively and efficiently with very large data structures that may be impractical to handle with the standard banded and block banded matrix modules.

RSE Bytes


Blog posts, tools & more

Some reminders…

RS Community Slack

The Imperial Research Software Community Slack workspace is a place for general community discussion and also features channels for individuals interested in particular tools or topics. If you’re an OpenFOAM user, why not join the #OpenFOAM channel where regular code review sessions are announced (amongst other CFD-related discussions…). Users of the Nextflow workflow tool can find other Imperial Nextflow users in #nextflow. You can find other R developers in #r-users and there is the #DeepLearners channel for our new AI/ML group. Take a look at the other available channels by clicking the “+” next to “Channels” in the Slack app and selecting “Browse channels”. If you want to start your own group around a tool, programming language or topic not currently represented, feel free to create a new channel and advertise it in #general.

Research Computing Tips

See the Research Computing Service’s Research Computing Tips series for a variety of helpful tips for using RCS resources and related tools and services.

Research Software Directory

Imperial’s Research Software Directory provides details of a range of research software and tools developed by groups and individuals at the College. If you’d like to see your software included in the directory, you can open a pull request in the GitHub repository or get in touch with the Research Software Community Committee.

Get in Touch, Get Involved!

Drop us a line with anything you’d like included in the newsletter, ideas about how it could be improved, or even offer to guest-edit a future edition!

If you’re reading this on the web and would like to receive the next newsletter directly to your inbox then please subscribe to our Research Software Community Mailing List.

This issue of the Research Software Community Newsletter was edited by Yasel Quintero. All previous newsletters are available in our online archive.