Posts tagged maintenance
15:00 All maintenance tasks have been completed ahead of time and the cluster is back online. The job queue has resumed and all users can now log in and submit new jobs.
We are happy to announce that the move of Hydra to the new VUB data center on the Etterbeek campus is complete. The cluster is open again to VSC users.
The operation to move the Hydra cluster to the VUB Etterbeek campus is about to start. Hydra will be closed to access from Monday, April 11th at 00:00 (CEST). The move is planned to last 1 week. We will continue to be reachable at email@example.com and we will communicate the progress of the move on this post.
Hydra will very soon move from the ULB Solbosch campus to the recently upgraded VUB data center on the Etterbeek campus. This change will bring improved power supply, cooling and security measures to our HPC cluster. The move is scheduled for the week of April 11-15.
Warning: We are carrying out a routine update of the operating system on Hydra. The nodes will be progressively updated without impacting running jobs, but queue times during the rollout might be longer.
The HPC team is proud to announce one of the biggest improvements for Hydra in recent years and certainly one of the most impactful for its users. Hydra is replacing its job scheduler with the Slurm Workload Manager.
The job scheduler in Hydra will be paused on Thursday, 16th September from 20:00 to 22:00. We will perform minor updates to the MPI stack in the intel toolchains. Running jobs will not be affected and users will be able to continue submitting jobs to the queue during the pause.
Login nodes updated successfully. The system update for the rest of the nodes continues as planned.
Warning: On Tuesday, June 15 at 20:00 the login nodes will be unavailable for 30 minutes.
The maintenance on the storage is done and jobs are again running as normal. We are still (automatically) updating all worker nodes. Until that is done, the queuing time will be slightly longer than normal.
Due to maintenance work, it will not be possible to submit new jobs or view the status of submitted jobs on Saturday 29/8 between 20:00 and 23:00. Running jobs will not be impacted.
From 2 to 3 June 2020, work will be carried out on an electrical board of the SISC cooling system. The operations will require stopping the cooling system for short periods of time. The computing power of the HPC clusters Hydra and Vega will be reduced to a minimal level to avoid heat problems in the data center. Therefore, waiting times for jobs in the queue will be longer than normal.
Due to the recent wave of cyberattacks on HPC centers throughout Europe, the SISC HPC team will take security hardening measures on Hydra, in consultation with the VSC.
There are a couple of hardware issues with the Hydra storage which we cannot safely fix while the cluster is running. We are going to shut down Hydra on Monday 18 May, starting at 09:00, to perform the needed repairs. We expect to resume normal operations by late afternoon.