Hardware of Cluster Stage 1 of Lichtenberg II

This system has been in regular operation since 1 December 2020.

Login (8 nodes)

  • 2x “Intel® Xeon® Platinum 9242 Processor” (Cascade Lake)
  • 96 CPU cores and 2x Intel® AVX-512 units per core
  • 768 GByte Main memory (DDR4-2933)
  • HPC interconnect: InfiniBand HDR100 (100 GBit/s)
  • Hostnames: logc0001 … logc0008
  • Accessible from outside as: lcluster13 … lcluster20.hrz.tu-darmstadt.de
  • 48 cores per processor – Hyperthreading is off
    • CPU clock 2.3 GHz, Turbo 3.8 GHz
    • AVX-512 (Advanced Vector eXtensions, 512 bits)
    • VNNI (Vector Neural Network Instructions)
    • TSX-NI (Transactional Synchronization eXtensions)

MPI – Section (630 nodes)

  • 2x “Intel® Xeon® Platinum 9242 Processor” (Cascade Lake)
  • 96 cores and 2x Intel® AVX-512 units per core
  • 384 GByte Main memory (DDR4-2933)
  • HPC interconnect: InfiniBand HDR100 (100 GBit/s)
  • Hostnames: mpsc0001 … mpsc0630
  • 48 cores per processor – Hyperthreading is off
    • CPU clock 2.3 GHz, Turbo 3.8 GHz
    • AVX-512 (Advanced Vector eXtensions, 512 bits)
    • VNNI (Vector Neural Network Instructions)
    • TSX-NI (Transactional Synchronization eXtensions)
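
As a rough illustration of what "2x AVX-512 units per core" means for floating-point throughput, the following sketch estimates the theoretical FP64 peak of one MPI-section node. It is a back-of-the-envelope calculation, not a measured or official figure. It assumes that each AVX-512 unit completes one fused multiply-add per cycle, counted as 2 flops, and that the 2.3 GHz base clock is sustained under AVX-512 load. Neither assumption comes from the list above, and real AVX-512 clocks are typically lower.

```python
# Theoretical peak FP64 throughput of one MPI-section node.
# Assumptions (not from the hardware list): one FMA counts as 2 flops,
# and the 2.3 GHz base clock is sustained under AVX-512 load.

CORES_PER_NODE = 96               # 2 processors x 48 cores
AVX512_UNITS_PER_CORE = 2         # from the hardware list
DOUBLES_PER_VECTOR = 512 // 64    # 8 FP64 lanes in a 512-bit register
FLOPS_PER_FMA = 2                 # assumption: multiply + add
BASE_CLOCK_GHZ = 2.3              # from the hardware list


def peak_gflops_per_node(clock_ghz: float = BASE_CLOCK_GHZ) -> float:
    """Return the theoretical FP64 peak of one node in GFlop/s."""
    flops_per_cycle = (
        CORES_PER_NODE
        * AVX512_UNITS_PER_CORE
        * DOUBLES_PER_VECTOR
        * FLOPS_PER_FMA
    )
    return flops_per_cycle * clock_ghz


if __name__ == "__main__":
    print(f"{peak_gflops_per_node():.1f} GFlop/s per node")
```

Under these assumptions the estimate comes to roughly 7 TFlop/s per node. Passing a lower `clock_ghz` gives a more conservative figure.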

MEM – Section (2 nodes)

  • 2x “Intel® Xeon® Platinum 9242 Processor” (Cascade Lake)
  • 96 cores and 2x Intel® AVX-512 units per core
  • 1536 GByte Main memory (DDR4-2933)
  • HPC interconnect: InfiniBand HDR100 (100 GBit/s)
  • Hostnames: mpqc0001 … mpqc0002
  • 48 cores per processor – Hyperthreading is off
    • CPU clock 2.3 GHz, Turbo 3.8 GHz
    • AVX-512 (Advanced Vector eXtensions, 512 bits)
    • VNNI (Vector Neural Network Instructions)
    • TSX-NI (Transactional Synchronization eXtensions)

ACC – Section GPUs (8 nodes)

  • 4x “Intel® Xeon® Platinum 8260 Processor” (Cascade Lake)
  • 96 cores and 2x Intel® AVX-512 units per core
  • 384 GByte Main memory (DDR4-2933)
  • 4 nodes, each with 4x “NVIDIA® Tesla® V100” (Volta generation, GV100 chip)
    Hostnames: gvqc0001 … gvqc0004
  • 4 nodes, each with 4x “NVIDIA® A100” (Ampere generation, GA100 chip)
    Hostnames: gaqc0001 … gaqc0004

| | NVIDIA® Volta 100 | NVIDIA® Ampere 100 |
|---|---|---|
| CUDA cores | 5120 | 6912 |
| Tensor cores | 640 | 432 |
| Memory / G-RAM | 32 GB CoWoS-HBM2 ECC RAM | 40 GB CoWoS-HBM2 ECC RAM |
| Memory bandwidth | 900 GByte/s | 1600 GByte/s |
| Performance (Double Precision, FP64) | 7 TFlop/s | 9.7 TFlop/s (19.5 TFlop/s non Std.) |
| Performance (Single Precision, FP32) | 14 TFlop/s | 19.5 TFlop/s (156 TFlop/s non Std.) |
| Tensor performance | 112 TFlop/s | 312 TFlop/s (624 TFlop/s with Sparsity) |
| Hostnames | gvqc0001 … gvqc0004 | gaqc0001 … gaqc0004 |

ACC – Section DGX A100 (3 nodes)

  • 2x “AMD EPYC™ 7742” Processor
  • 128 cores and 2x AVX-2 units per core
  • 1024 GByte Main memory (DDR4-3200)
  • HPC interconnect: 2x InfiniBand HDR200 (200 GBit/s)
  • Hostnames: gaoc0001 … gaoc0003
  • 8x “NVIDIA® A100 Tensor Core GPUs” (Ampere generation)
  • 64 cores per processor
    • CPU clock 2.25 GHz, Boost 3.4 GHz
    • AVX-2 (Advanced Vector eXtensions, 256 bits)
    • PCIe® 4.0, 128 lanes
  • Accelerator cards Nvidia A100:
    • 6912 CUDA cores
    • 432 Tensor cores
    • 40 GB CoWoS-HBM2 ECC RAM
    • 1600 GByte/s Memory bandwidth
    • Double-Precision (64bit): 9.7 TFlop/s (19.5 TFlop/s non Std.)
    • Single-Precision (32bit): 19.5 TFlop/s (156 TFlop/s non Std.)
    • Tensor Performance: 312 TFlop/s (624 TFlop/s with Sparsity)

Totals (all compute and login nodes)

  • 62,592 cores
  • 257 TByte RAM
  • 16x Nvidia Volta V100
  • 40x Nvidia Ampere A100
  • 4x Nvidia Tesla T4
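
The core, RAM, and V100/A100 totals can be reproduced from the per-section figures listed above. The following sketch does that arithmetic. The `SECTIONS` table and the `totals` helper are illustrative names, not part of any cluster tooling, and the values are copied from the sections above. The Tesla T4 cards are not attributed to any section in this list, so they are left out of the calculation.

```python
# Each entry: (section, nodes, cores per node, RAM per node in GByte,
#              GPUs per node, GPU type). Values copied from the sections above.
SECTIONS = [
    ("Login",        8,   96,  768,  0, None),
    ("MPI",          630, 96,  384,  0, None),
    ("MEM",          2,   96,  1536, 0, None),
    ("ACC V100",     4,   96,  384,  4, "V100"),
    ("ACC A100",     4,   96,  384,  4, "A100"),
    ("ACC DGX A100", 3,   128, 1024, 8, "A100"),
]


def totals(sections):
    """Sum cores, RAM (GByte), and GPUs per type over all sections."""
    cores = sum(nodes * c for _, nodes, c, _, _, _ in sections)
    ram_gbyte = sum(nodes * r for _, nodes, _, r, _, _ in sections)
    gpus = {}
    for _, nodes, _, _, per_node, kind in sections:
        if kind is not None:
            gpus[kind] = gpus.get(kind, 0) + nodes * per_node
    return cores, ram_gbyte, gpus


if __name__ == "__main__":
    cores, ram_gbyte, gpus = totals(SECTIONS)
    print(f"{cores} cores, {ram_gbyte / 1000:.0f} TByte RAM, GPUs: {gpus}")
```

The sums agree with the listed totals: 62,592 cores, 257,280 GByte (about 257 TByte) of RAM, 16 V100 and 40 A100 cards.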