Kickoff of operation to move Hydra into campus

../../../_images/hpc-cables-01.jpg

The start of the operation to move the Hydra cluster to VUB Etterbeek campus is upon us. Hydra will be closed to access from Monday, April 11th at 00:00 (CEST). The move is planned to last 1 week. We will continue to be reachable at hpc@vub.be and we will communicate the progress of the move in this post.

If you are not yet familiar with the implications of this move, you can find all the information on the announcement page: Hydra moves to VUB Etterbeek campus

During the week of the move, users can still contact us at VUB-HPC Support but user support will be limited:

  • Login to Hydra and other VSC sites (except Tier-1 Hortense) will be closed

  • Requests requiring access to the HPC cluster (e.g. software installations, data transfers) will only be possible once all systems are back online

  • Creation of new VSC user accounts and VOs will be paused until the new storage is fully integrated with the VSC account page

Timeline of events

The timeline below shows the main milestones of this operation and will be updated regularly. We will also notify you of any changes to the planned schedule.

Timeline of move operation

Date   Time   Action                                               Status
11/04  00:00  Access of VUB users to all VSC clusters closed       ✓ Completed
11/04  08:00  Final data sync to new storage                       ✓ Completed
12/04  08:00  Move of hardware to Etterbeek campus                 ✓ Completed
19/04  17:00  Login to non-VUB VSC clusters re-established         ↩ Updated ETA ✓ Completed
19/04  17:00  Login to Hydra re-established                        ↩ Updated ETA ✓ Completed
25/04  18:00  Creation of new VSC accounts and VOs re-established  ✓ Completed

Updated on 19/04/2022

13:00 We are happy to announce that all issues have been solved. Not only is the network working, but we have also completed the integration of the new storage. We are now wrapping up the last bits before we open the cluster to the public. The re-opening is scheduled for today at 17:00.

../../../_images/move-09.jpg

The new home of Hydra in the VUB data center.

Updated on 15/04/2022

20:00 We deeply regret to announce that the re-opening of Hydra has to be postponed. Outstanding issues with the network connection of some critical systems (e.g. the login nodes) pushed us to cancel the re-establishment of services.

On the positive side, today we solved the main hardware issue encountered during the move and we recovered the non-operational InfiniBand network switch. As of this writing all hardware parts/nodes in Hydra are functional. Hence, we will be able to re-open the cluster to our users (even if partially) fairly quickly once all core systems have full access to the networks of Hydra.

Our apologies for the inconvenience.

Updated on 14/04/2022

20:00 Cabling of all the nodes is almost complete. We have already powered up and tested the core systems in Hydra and no major problems have occurred. We currently have 2 active issues that are causing changes in our planning:

  1. Network access to the new storage is limited

  2. One InfiniBand network switch is non-operational

Tomorrow we plan to do the first global stress-test of the complete cluster. If there are no critical failures, Hydra will be brought online as planned. However, due to the aforementioned issues, access to other VSC sites will only be possible when Hydra is online. If these issues persist, the re-establishment of services might be partial and the kickoff of the new storage postponed to a later date.

../../../_images/move-08.jpg

Sam busy with the cabling. Those yellow cables behind are long fibers used in the InfiniBand network.

../../../_images/move-05.jpg

All Skylake nodes with InfiniBand are already cabled. The blue glow light in the data center is for enhanced coolness.

../../../_images/move-06.jpg

Nodes usually have a regular network connection for management and a high-speed interconnect. In this case, the white cables are 1 Gbps Ethernet and the black ones are fast InfiniBand.

../../../_images/move-07.jpg

Yellow fibers for the 10 Gbps Ethernet already connected.

Updated on 13/04/2022

19:00 All hardware parts of Hydra are now in place in their new location in the VUB data center. User data was successfully transferred to the new storage, and $VSC_HOME and $VSC_DATA are in a good state. We are still working on re-opening access to clusters at other VSC sites. This requires making the storage accessible from external VSC sites, which is in progress.

Updated on 13/04/2022

09:00 All hardware parts of Hydra have arrived at VUB Etterbeek campus. Installation of the nodes is currently ongoing. The focus today will be on bringing the new storage for $VSC_HOME and $VSC_DATA online. This will allow all our users to use any of the other VSC clusters.

../../../_images/move-04.jpg

Some nodes are already placed in the new data center of VUB. On the left side, the improvement in power and cooling is clearly visible.

../../../_images/move-03.jpg

All racks are already empty in the old data center. Goodbye Solbosch and thanks for having hosted us all these years.

Updated on 12/04/2022

09:00 All necessary preparations to move the hardware of the HPC cluster have been carried out successfully and on schedule. Today Hydra will be transported piece by piece to its new location on the VUB Etterbeek campus.

../../../_images/move-01.jpg

The HPC team hard at work removing and classifying the 500+ cables that hold Hydra together.

../../../_images/move-02.jpg

Each node is labelled with color-coded stickers that point to its new location.