HPC Story: Interview with K. Hoste on EasyBuild: Part 2


Don't forget to check out the first part of this interview here, if you haven't already.

How do you experience working on a project of this scale, having to spend time managing people, suggestions, issues, etc. instead of improving the software?

 

“Be nice”: the implicit code of conduct for the EasyBuild community

K. Hoste

I don’t feel like I’m managing people at all. Everyone who contributes to EasyBuild, and helps with processing incoming contributions and questions, basically does so voluntarily. I’ve never “forced” anyone to do a particular task. I do ask specific people to look into something now and then, but if they can’t or won’t for whatever reason, that’s fine, and then we’ll try and find someone else, or tackle it ourselves (time permitting). The EasyBuild community is very much run as a democracy, where everyone can speak up and share their view, and which is welcoming to anyone, regardless of their background or expertise.

 

 Group picture from 5th EasyBuild User Meeting in Barcelona (Jan'20).

I have always enjoyed helping people, and sharing my expertise where I can. I get to do this not only when working on user support tickets for the HPC-UGent and VSC infrastructure, but also in a slightly different context in EasyBuild. I often feel I’m playing detective when trying to figure out a puzzling problem that presents itself. Usually, we end up working out the underlying cause and coming up with a solution or a workaround, often with the help of people who we know have particular expertise (in C++ compilation errors, GPUs, etc.), which is very rewarding.

How do you experience the wide adoption of EasyBuild?

Making EasyBuild a project that is used all over the world was never a specific goal for us; it kind of happened by accident. It feels very rewarding that a tool that was built from scratch by the HPC-UGent team has seen such wide adoption, and that people feel it’s worth their time to contribute back to its development, and to actively help out with processing incoming contributions.

 

 

Evolution of the number of unique contributors per year on the easyconfigs GitHub repository.

 

The popularity of EasyBuild has helped a lot to grow our network in the HPC community, and it has resulted in collaborations with some of the largest HPC sites in Europe that otherwise would not have happened.

Personally, I have also become friends with people that I would never have met if it weren’t for EasyBuild. Maybe “met” is a strong word, since I have never actually physically met some of the EasyBuild maintainers or frequent contributors, yet I do feel there’s a personal connection with them.

Is there a moment in the project's history of which you are particularly proud?

There are many, actually, like the first release. The most recent one is that EasyBuild was chosen to manage the software stack on LUMI (i.e., the latest European Tier-0 pre-exascale supercomputer). This is a sign of the maturity and popularity of EasyBuild.

I was also very happy with the large attendance at the most recent physical EasyBuild User Meeting (see picture above), where we had over 50 people attending the event. I'm still a bit surprised by so many people being willing to travel for a niche event like this...

How do you think the way software is packaged and distributed has evolved over the years?

There’s growing interest in using containers to make it easier for scientific researchers to self-manage their software stack and take it with them to whatever compute infrastructure they are working on (personal workstations, own servers, VSC infrastructure, cloud, etc.). Most people don’t consider that they’re often sacrificing performance for mobility though… By building a container image that works “anywhere”, you’re implicitly making trade-offs that can result in significantly lower performance for the workloads you are running.

"I often say that containers are a workaround, not a solution to the problem. More and more people seem to be agreeing with me on that, especially now that alternative system architectures beyond the traditional Intel and AMD CPUs, like Arm and RISC-V, are quickly gaining adoption. A container image that was built for Intel/AMD CPUs is totally useless on a system with Arm or RISC-V processors."

K. Hoste

Another trend I’m seeing is that there’s explosive growth in the number of scientific software projects that researchers are interested in using. For the last couple of years, we have received about 250 software installation requests per year for the HPC-UGent Tier-2 infrastructure. A significant part of that is for new software, which we’ve never installed before. The expansion of scientific domains that can employ powerful computing infrastructure, like bioinformatics and machine learning/AI, is part of the reason behind this trend.

Is there any competition with other tools?

I don’t feel that the availability of other tools that are similar to EasyBuild, like Spack [1], is a problem. Having some form of competition is good: it keeps us on our toes, pushing us to regularly question decisions we have made in the past and to listen to the needs and desires of our user base. In my view, Spack is also mostly targeting a different use case, since it has interesting features that are very useful to developers of large scientific software projects. EasyBuild on the other hand has always been more geared toward HPC teams who need to manage a central software stack, and to researchers who want to self-manage a software stack that mostly consists of “off the shelf” software. Neither tool is limited to these use cases, but there’s a clear bias (which is also confirmed in the user surveys that have been done in both the EasyBuild and Spack communities).

"I don’t see containers as a threat to EasyBuild at all. To me, they’re for a different use case: an easy short-term escape, but not a long-term viable solution. Containers are too static in my view, and challenges remain in terms of accessing specialized hardware like high-speed interconnects and accelerators like GPUs, because the disconnect from the host operating system is too strong."

K. Hoste

 

How do you see the future of EasyBuild?

I see various challenges ahead. The work on a project like EasyBuild never stops, since there will always be new software projects and versions to support, and the HPC landscape is ever evolving in terms of system architectures (new CPU families like Arm and RISC-V, increasing diversity in accelerators, the rise of the cloud, etc.). Sometimes this is relatively easy (like supporting Arm), since most of the burden there is actually in the scientific software packages themselves, but for other aspects (like new types of GPUs) more work will be needed, and often the maturity of those software ecosystems is lacking and things are evolving quickly.

I’m also not happy with the way the EasyBuild documentation is written. This is mostly because of the format we are using (reStructuredText), which is not very intuitive. I feel this discourages people from contributing to our documentation. To mitigate this, we are now porting the documentation to Markdown [2] + MkDocs [3] in our dedicated GitHub repository (https://github.com/easybuilders/easybuild-docs), which should make the process of maintaining and extending the documentation a lot easier going forward.

To some extent, I even think EasyBuild is not sufficient anymore. It’s a great tool, and without it, managing the central software stack on the HPC-UGent Tier-2 and VSC Tier-1 systems would be way more difficult and time-consuming, but we need to take the next step and collaborate even more with other HPC sites. Many people, ourselves included, often run into problems with installing software that are specific to a particular system. Very often that’s due to something specific in the system configuration, for example which OS packages are installed, or which (generation of) high-speed interconnect is available on that system.

What would you propose to tackle this last issue?

Recently, we got involved in a new project called the European Environment for Scientific Software Installations (EESSI), which has grown out of the EasyBuild community. The main goal of EESSI is to build a central software stack together that can be employed on a variety of systems, ranging from personal workstations and HPC systems to cloud infrastructure, across different Linux distributions and system architectures, without compromising on performance. The software installations included in EESSI are set up in such a way that they can be employed on any Linux distribution, by providing a so-called compatibility layer, and there is support for a large variety of different CPU architectures. The intention is to provide a set of software installations that just work, with good performance, regardless of the system you’re using them on.
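To make the idea of serving several CPU architectures from one shared stack more concrete, here is a minimal Python sketch. It is not how EESSI is implemented: the directory names, the `AVAILABLE_TARGETS` table, and the `pick_target` function are all hypothetical, chosen only to illustrate preferring an optimized build when one matches the host and falling back to a generic build otherwise.

```python
# Hypothetical layout: one set of optimized builds per CPU target, plus a
# "generic" fallback per CPU family. These names are illustrative only and
# are not taken from EESSI itself.
AVAILABLE_TARGETS = {
    "x86_64": ["x86_64/generic", "x86_64/intel/haswell", "x86_64/amd/zen2"],
    "aarch64": ["aarch64/generic", "aarch64/graviton2"],
}


def pick_target(machine, detected=None):
    """Return the most specific available target for a host.

    machine  -- CPU family string, e.g. "x86_64" or "aarch64"
    detected -- optional, more specific target identified by other means
    """
    targets = AVAILABLE_TARGETS.get(machine)
    if targets is None:
        # No builds at all for this CPU family.
        raise ValueError(f"no software stack available for {machine!r}")
    if detected in targets:
        # An optimized build exists for exactly this CPU.
        return detected
    # Otherwise fall back to the generic build for the CPU family.
    return f"{machine}/generic"
```

On a real system, the `machine` argument could come from Python's `platform.machine()`; how a more specific CPU target would be detected is left open here.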

We currently have a proof-of-concept setup available for experimenting that includes complex software applications like TensorFlow and OpenFOAM, and we’re gradually working towards making EESSI production-ready and making sure there is funding available to provide a service that researchers can rely on. In fact, the European funding we applied for was approved. We hope to see EESSI adopted across HPC systems, in Europe and beyond, soon, since that would be a huge jump forward both for HPC support teams and scientific researchers.

For more information, people can visit https://eessi.github.io/docs and check out the open-access paper that was published recently at https://doi.org/10.1002/spe.3075. Another way to get introductory materials is to check the talks of our first EESSI Community Meeting (held in Amsterdam in September 2022) at https://eessi.github.io/docs/meetings/2022-09-amsterdam/.

You recently updated the EasyBuild logo for a more modern one. Can you explain the process behind this choice?

In the process of publicly releasing EasyBuild in 2012, we came up with a logo ourselves. After a lot of discussion (and a couple of strong Belgian beers), we ended up with a logo "design" on a whiteboard. One of our team members took a picture of it, and traced it to obtain a vectorized version of it. There, we had our logo.

 

On the left: the logo of EasyBuild in the period 2012-2022. On the right: the new logo.

After 10 years, it was time for a new logo, to refresh an aspect of EasyBuild that was due for a refresh and make the project look a bit more professional. We didn't just want any new EasyBuild logo: we wanted a fitting one, one that was easy to recognize. And if it didn't look like it was drawn by a 5-year-old, that was a plus, of course. So, this time we hired some professional help. A couple of weeks ago, an initial design of a new logo was presented to us. After iterating over the initial design, we ended up with a new EasyBuild logo that had our full support. It has several characteristics (that are explained on https://easybuild.io/new-logo-2022.html). For example, it features 3 rows of blocks that are stacked on top of each other (and on top of 'EasyBuild', and our slogan), which matches both the 3-level design of EasyBuild and the fact that it is used to install software stacks. Moreover, the rows of blocks can also be interpreted as progress bars that show how far an installation has progressed, and as a hint at the software performance that is central to the EasyBuild project. Everything combined, this new logo is a great fit for EasyBuild, and it respects both the key aspects and the history of the project.