HPC: Emergency Shutoff

including Housing in L5|08


Late in the evening of 25 January, the cooling system for all cold-water-cooled components failed completely.

To avoid overheating (and to mitigate the risk of fire), the HPC cluster and all servers in the adjacent “housing” area had to be shut down without advance notice.

Wherever possible, running jobs have been set to requeue so that they can restart cleanly once the cluster becomes available again.