News from RCIC and elsewhere

HPC3/HPC2 Downtime June 15, 2022 and Aug TBD, 2022

Thursday, May 19 2022 by Philip Papadopoulos

We periodically need to perform maintenance on HPC3 where all users are logged out and all jobs are stopped. We have two planned outages for this summer. The highlights of upcoming changes addressed by these two outages include:

  • DUO two-factor authentication will become standard on HPC2/3 login

  • Parallel File System and ZFS updated to latest stable releases

  • CentOS 7 will be sunset and Enterprise Linux 8 (EL8, Rocky Linux) will be the new OS. The entire application software stack will be rebuilt for EL8. Older versions will be retained where possible, and new application versions will be added

  • The version of Slurm will be updated to the latest stable release

While we normally prefer to have only a single downtime, the changes are large enough that we will handle the changes in two distinct phases.

Phase 1 (June 15, 2022, all day)

  • BeeGFS and ZFS file system updates

  • Turn on Two-Factor (Duo) authentication

Phase 2 (Aug TBD, 2022, all day)

  • Reinstall all nodes with EL8

  • Reinstall all rebuilt applications

  • Update Slurm

The only apparent effect of the first downtime (June 15) on users should be that Duo authentication will now be required for password-based login to hpc2 and hpc3.

The second downtime (Aug) will be much more impactful. In general, any user-compiled code will need to be rebuilt. Some conda environments may also need to be rebuilt to work properly with new system libraries. A few older applications simply will not build on EL8. Some widely used versions of software (e.g., R version 3 and older versions of R 4) are not buildable with all R modules under EL8. As the summer progresses, we will keep a list of sunset software.
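For users who want to confirm which OS generation a node is running before deciding whether to rebuild, a minimal Python sketch follows. It is an illustration only, not an RCIC-provided tool: the function names are hypothetical, and the "rebuild when the major version differs" rule is a simplifying assumption based on the guidance above. It reads the standard /etc/os-release file, which on CentOS 7 reports VERSION_ID="7" and on Rocky Linux 8 reports a value such as "8.6".

```python
# Sketch only: report the Enterprise Linux major version of the current node.
# Function names are hypothetical; this is not an RCIC-provided utility.
from pathlib import Path


def parse_os_release(text):
    """Parse KEY=value lines (os-release format) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def el_major_version(fields):
    """Return the major OS version as an int, or None if it is unavailable."""
    major = fields.get("VERSION_ID", "").split(".")[0]
    return int(major) if major.isdigit() else None


def needs_rebuild(built_on_major, running_major):
    """Assumed rule of thumb: rebuild when the major OS version differs."""
    return built_on_major != running_major


if __name__ == "__main__":
    path = Path("/etc/os-release")
    if path.exists():
        fields = parse_os_release(path.read_text())
        print(fields.get("ID"), el_major_version(fields))
```

Code compiled under CentOS 7 would correspond to needs_rebuild(7, 8) once the nodes are reinstalled with EL8. Conda environments are a separate case and may or may not need rebuilding, depending on which system libraries they rely on.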

Is keeping CentOS 7 an option? Not really. It will be end-of-life in mid-2024. We are already seeing commercial vendors ending support for CentOS 7. The viability of CentOS 7 as a functional OS will progressively diminish over the next 24 months.

Research Infrastructure Symposium - June 4, 2021

Monday, May 24 2021 by Philip Papadopoulos

You are invited to participate in the 2021 virtual symposium of UCI’s Research Cyberinfrastructure Center (RCIC). The symposium will take place on June 4th, 2021, via Zoom (10:00am - 2:30pm). This event aims to bring together students, researchers, staff, instructors, and outreach partners who use or would like to use and/or contribute to the shared campus-wide hardware and software resources as well as the human expertise provided by RCIC and the UCI libraries.

Participation is free and no registration is required, but only Zoom users with a .uci.edu email address can participate.

Please see the details and agenda

We look forward to seeing you online!

Filipp Furche, Professor of Chemistry

Phil Papadopoulos, RCIC Director

HPC3 Production and HPC Shutdown on 5 Jan 2021

Wednesday, Nov 04 2020 by Philip Papadopoulos

We are pleased to announce that HPC3 is in production. All existing HPC users have accounts on HPC3 and can get started right away. A short presentation answers some of the key questions up front.

The existing HPC cluster will run until 5 Jan 2021. On that day, RCIC will shut down the queuing system, kill all running jobs, and begin the process of physically dismantling HPC, moving some hardware to HPC3, and starting the transition of selected "mid-life" nodes to a cluster called HPC2.

Users should begin their transition to HPC3 now. Please note that any files in your current HPC home area will be discarded sometime in January.

HPC3 Production Ramp Up

Wednesday, Jul 15 2020 by Philip Papadopoulos

We are pleased to announce that HPC3 will enter its production ramp up on 20 July 2020. A short presentation describes this phase of HPC3.

To handle the transition of a large number of users to HPC3 during the ramp up, we’re asking that research groups/labs submit a single request to hpc-support@uci.edu (please see the presentation for what to include).

Friendly users on HPC3 have consumed over 1 million core hours on 1+ million jobs.

We expect the production ramp up to last about two months.