Hello everyone and welcome to February’s newsletter. This month, in addition to the regular dates for your diary, news and details of interesting blog posts and articles you might like to read, we are highlighting a group of Julia array and matrix packages in our Research Software of the Month feature. We are also continuing our Research Computing at Imperial series with an introduction from Paul Aylin, Co-Director of Research Data Strategy in the Research Computing Service’s Academic Leadership Team.
The Alan Turing Institute is having a launch event for their “Data science education interest group” on Wednesday 2nd March, 14:00-15:00. Further details about how to join the group and register for the event can be found on the event page.
There are still a few days remaining to make lightning talk submissions to the Collaborations Workshop 2022. The call closes on 4th March 2022. The workshop takes place on the 4th-7th April.
If you’re planning to use ARCHER2, the UK’s National Supercomputing Service, as part of your research or software work, or are keen to find out more about the platform, the ARCHER2 team are running an online session “The Hitchhikers’ Guide to ARCHER2” on Wednesday 2nd March, 15:00-16:00. You can find further details of this and other upcoming training sessions on the ARCHER2 website.
Do you work with genomic data? Do you struggle to process your data with resources that are available to you locally? The Cloud-SPAN collaboration are running a couple of multi-day courses during March covering the use of High Performance Computing / Cloud for working with genomic data. See details of the Prenomics and Genomics courses.
A series of online Documentation Workshops is taking place on the 10th, 17th and 24th of March. The workshops are organized as part of a 2021 Software Sustainability Institute fellowship and are open to all.
The Alan Turing Institute is hosting the UK’s national showcase of artificial intelligence and data science research and collaboration on the 22nd-23rd March. The virtual event will be an in-depth exploration of how AI and data science can be used to solve real-world challenges. Registration is open to anyone within the AI and data science community.
STUOD-MPE CDT Hackathon - 28th-31st March 2022. This online hackathon is being hosted by the ERC Stochastic transport in upper ocean dynamics (STUOD) project and the EPSRC Centre for Doctoral Training in the Mathematics of Planet Earth (MPE CDT). Submit an expression of interest to participate by 18th March 2022 via the link on the event web page.
We continue our series of introductions from key members of the College community helping to run, manage and support research computing and research software services. This month we have an introduction from Paul Aylin, Co-Director of Research Data Strategy within the recently created Academic Leadership Team for the Research Computing Service (RCS).
Paul writes:
I trained first as a medical doctor, and then in public health, and first came to Imperial College in 1997 to work in what is now the School of Public Health within the Faculty of Medicine. I’ve routinely collected clinical and administrative data to examine variations in quality and safety in healthcare for nearly 25 years. My early research on measuring and understanding healthcare outcomes informed two national public inquiries into the GP, Harold Shipman, who murdered over 200 of his patients, and an investigation into a children’s heart surgery unit at Bristol Royal Infirmary. Working with industry to develop a national surveillance tool, I have helped support NHS Trusts in understanding variations in outcomes across a range of diagnoses and surgical procedures. My mortality alerting system was pivotal in alerting the national regulator to problems at the Mid Staffordshire NHS Foundation Trust, prompting a public inquiry and a raft of recommendations on healthcare organisation and delivery.
I share my role of Director of Research Data Strategy with David Colling, whom I have no doubt you will read about in a subsequent issue. Having made use of sensitive healthcare records to carry out this work, I am acutely aware of the need for strong information governance in research. I am also aware of the need for researchers to be able to store and analyse sensitive data sets in a secure and well-regulated environment, which assures data providers, researchers, the College and ultimately the patients or members of the public who provide their data that their personal information is safe and protected. The consequences of a data breach could be very damaging for the College, both reputationally and financially.
As part of my role, I am therefore leading a College initiative to establish a secure environment for the storage and analysis of sensitive data of all kinds, including personal data, commercially sensitive data and data relating to matters of national security. I am enjoying working closely with other academics, ICT, the College Data Protection Officer and of course colleagues from RCS.
This month we’re highlighting a group of modules for the Julia programming language that focus on representing a variety of structured array and matrix data structures and associated linear algebra. The modules can be found in the JuliaArrays and JuliaMatrices GitHub organisations. The lead developer of all of the modules listed below, Sheehan Olver, is based in Imperial’s Department of Mathematics. The two groups covering arrays and matrices include a significant number of modules along with documentation. Here we highlight a small set of those available, just to provide a flavour of the type of capabilities these modules provide to Julia developers - there are many more to explore. If you’ve never programmed in Julia before, maybe we can inspire you to take a look at the language and explore some of these interesting, high performance libraries. Where Julia’s core library already offers equivalent functionality, the implementations in these modules can offer higher performance and greater flexibility.
LazyArrays.jl: Provides more memory efficient and performant “lazy” implementations of array operations such as concatenation and multiplication. The functionality provided in this library can, for example, be useful when working with efficient matrix-free methods in the context of linear solvers, especially when handling very large matrices.
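To give a feel for what “lazy” concatenation means, here is a minimal conceptual sketch in Python. It is not the LazyArrays.jl API: the class name `LazyConcat` is hypothetical and exists only for this illustration. The idea is that the concatenated array is never materialised; each lookup is simply redirected to the part that holds the requested element.

```python
class LazyConcat:
    """Read-only view over several sequences laid end to end, without copying.

    Hypothetical illustration only; not part of any library.
    """

    def __init__(self, *parts):
        # Keep references to the original sequences; nothing is copied.
        self.parts = parts

    def __len__(self):
        return sum(len(p) for p in self.parts)

    def __getitem__(self, i):
        # Support negative indices in the usual Python way.
        if i < 0:
            i += len(self)
        if i < 0:
            raise IndexError("index out of range")
        # Walk the parts until we find the one containing index i.
        for p in self.parts:
            if i < len(p):
                return p[i]
            i -= len(p)
        raise IndexError("index out of range")


# Two large ranges "concatenated" without allocating two million elements.
a = range(1_000_000)
b = range(1_000_000, 2_000_000)
v = LazyConcat(a, b)

print(len(v), v[0], v[1_500_000], v[-1])
```

The trade-off is typical of lazy structures: memory use stays small and construction is instant, while each element access does a little extra work to locate the right part.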
BlockArrays.jl: This module offers both an abstract interface for handling different block array types and an implementation of two different block array types - BlockArray and PseudoBlockArray. Building code that accesses block arrays via the AbstractBlockArray interface provides flexibility for using different underlying block array implementations, potentially dynamically, and provides a common interface that other block array implementations can choose to implement, reducing the barrier to adopting new implementations in existing code.
InfiniteArrays.jl: Building on the concept of “lazy” handling of arrays, InfiniteArrays provides the ability to efficiently represent arrays that are infinite in size along one or more dimensions. The nature of representing data structures with infinite size means that processing has to be scoped to some manageable set of the represented data, often only calculating additional content when it is actually accessed, or when nearby values are accessed. The previously highlighted LazyArrays module is leveraged in the implementation of InfiniteArrays to provide this lazy behaviour and ensure effective management of memory use when handling infinite structures. InfiniteLinearAlgebra.jl builds on this package to add support for infinite analogues of QR and other decompositions.
BandedMatrices.jl: This is the first of three banded matrix-related modules we’re highlighting which provide support for efficient storage and processing of sparse banded matrices. The library leverages Julia’s concise and elegant syntax to easily define and create a range of banded matrix layouts.
BlockBandedMatrices.jl: BlockBandedMatrices builds on the banded matrix structure supported in BandedMatrices to provide support for creating and working with block banded matrices. A further specialisation provided in this module - BandedBlockBandedMatrix - supports the creation of banded block matrices which, in turn, contain banded blocks. Some support for distributed storage is also provided.
LazyBandedMatrices.jl: Finally we highlight the LazyBandedMatrices module which applies the approach of lazy evaluation in the context of banded matrices. This offers the potential to work effectively and efficiently with very large data structures that may be impractical to handle with the standard banded and block-banded matrix modules.
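For readers new to banded matrices, the storage idea behind the three banded-matrix packages above can be sketched in a few lines of Python. This is a conceptual illustration only, not how BandedMatrices.jl is implemented or used: the class `Banded` and its `matvec` method are hypothetical names. Only the diagonals inside the band are stored, so memory use and matrix-vector multiplication scale with the number of stored entries instead of with the full n × n size.

```python
class Banded:
    """Square banded matrix that stores only the diagonals within the band.

    Hypothetical illustration only; not part of any library.
    """

    def __init__(self, n, diagonals):
        # diagonals maps an offset k (0 = main diagonal, +1 = first
        # superdiagonal, -1 = first subdiagonal) to the n - |k| entries
        # on that diagonal, listed from top-left to bottom-right.
        for k, d in diagonals.items():
            if len(d) != n - abs(k):
                raise ValueError(f"diagonal {k} must have {n - abs(k)} entries")
        self.n = n
        self.diagonals = diagonals

    def __getitem__(self, idx):
        i, j = idx
        d = self.diagonals.get(j - i)
        # Anything outside the stored band is an implicit zero.
        return 0 if d is None else d[min(i, j)]

    def matvec(self, x):
        """Multiply by a vector, touching only the stored entries."""
        y = [0] * self.n
        for k, d in self.diagonals.items():
            for t, value in enumerate(d):
                i = t if k >= 0 else t - k  # row of the t-th entry on diagonal k
                j = i + k                   # column of that entry
                y[i] += value * x[j]
        return y


# A 4 x 4 tridiagonal second-difference matrix: 2 on the main diagonal,
# -1 on the diagonals either side. Ten numbers are stored instead of sixteen.
n = 4
A = Banded(n, {0: [2] * n, 1: [-1] * (n - 1), -1: [-1] * (n - 1)})

print(A[0, 0], A[1, 0], A[0, 3])
print(A.matvec([1, 2, 3, 4]))
```

For a matrix this small the saving is trivial, but for a tridiagonal matrix with a million rows the same layout stores about three million numbers instead of a million squared.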
The weekly RS Community coffee is now open to attendees from the wider RSLondon community. Join us every Friday at 3pm for a chance to chat with RSEs from institutions around London and the South East region as well as your Imperial colleagues. Check our Slack workspace for connection details.
A new committee page has recently gone live on the Imperial Research Software Community web pages. Take a look to find out more about the community committee who support the running of the community and keep these newsletters arriving in your inbox each month.
A new platform called ResearchEquals for publishing your research step-by-step, as you carry it out, has been launched. The platform supports free open access publishing. Find more details in the Software Sustainability Institute’s recent news item highlighting this new platform.
Do you have experience with AI, genomics or the intersection between the two? The Ada Lovelace Institute has an open call for input for their new AI and genomics project. The project aims to better understand how the two fields interact and what are the societal and ethical ramifications of these interactions.
The first hidden REF competition took place last year and members of the Imperial community were involved in several submissions. An article about the hidden REF - Time to celebrate science’s ‘hidden’ contributors - has recently been published in Nature’s careers column.
Developing and distributing in-house R-packages - a recording of the Nordic-RSE community’s January 2021 seminar.
The latest episode of the RSE Stories podcast is out. “I’m invisible, but I can see the differences” is an interview with Peter Vaillancourt, a Computational Scientist at the Cornell Center for Advanced Computing.
Three more episodes of the Code for Thought podcast were released this month. Listen to find out, among other things, how praying mantises could help improve the 3D vision of robots!
The new season of Red Hat’s Command Line Heroes podcast just dropped. Tune in to listen to the stories of cybersecurity experts who disinfect computers from viruses, defuse logic bombs and dismantle botnets.
Open source creates value, but how do you measure it? - this blog post by the GitHub Policy Team describes some of the questions that are still unanswered when it comes to measuring the value of open source projects.
It is now possible to include diagrams in your Markdown files on GitHub.
RS Community Slack
The Imperial Research Software Community Slack workspace is a place for general community discussion as well as featuring channels for individuals interested in particular tools or topics. If you’re an OpenFOAM user, why not join the #OpenFOAM channel where regular code review sessions are announced (amongst other CFD-related discussions…). Users of the Nextflow workflow tool can find other Imperial Nextflow users in #nextflow. You can find other R developers in #r-users and there is the #DeepLearners channel for our new AI/ML group. Take a look at the other available channels by clicking the “+” next to “Channels” in the Slack app and selecting “Browse channels”. If you want to start your own group around a tool, programming language or topic not currently represented, feel free to create a new channel and advertise it in #general.
Research Computing Tips
See the Research Computing Service’s Research Computing Tips series for a variety of helpful tips for using RCS resources and related tools and services.
Research Software Directory
Imperial’s Research Software Directory provides details of a range of research software and tools developed by groups and individuals at the College. If you’d like to see your software included in the directory, you can open a pull request in the GitHub Repository or get in touch with the Research Software Community Committee.
Drop us a line with anything you’d like included in the newsletter or ideas about how it could be improved, or even offer to guest-edit a future edition! rse-committee@imperial.ac.uk.
If you’re reading this on the web and would like to receive the next newsletter directly to your inbox then please subscribe to our Research Software Community Mailing List.
This issue of the Research Software Community Newsletter was edited by Yasel Quintero. All previous newsletters are available in our online archive.