Baskerville System
The Baskerville system is a Lenovo® cluster solution supplied by OCF. It was deployed and integrated by Advanced Research Computing at the University of Birmingham, which manages the system.
Compute nodes
The system comprises 57 SD650-N V2 liquid-cooled compute trays, each with:
- 2x Intel® Xeon® Platinum 8360Y CPUs, each with 36 cores at 2.4GHz (with boost to 3.5GHz)
- 512GB RAM (16x 32GB DDR4)
- 1TB NVMe M.2 device (used for the OS and available as /scratch-local)
- 1x 25GbE NVIDIA® Mellanox® on-planar ConnectX-4 port
- 1x HDR (200Gbps) NVIDIA Mellanox Infiniband port (ConnectX-6 PCIe gen4 adapter)
- NVIDIA HGX-100 GPU planar with 4x NVIDIA A100 GPUs
The GPUs on 11 nodes have 80GB RAM; those on the remaining 46 nodes have 40GB RAM.
The GPUs are meshed using NVIDIA NVLink. Full details of the architecture are provided in the Lenovo documentation.
The compute trays are all direct liquid-cooled using Lenovo Neptune™, enabling dense computing.
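As a quick check from within a job, the minimal sketch below (assuming a CUDA-enabled PyTorch build is available on the node, which this page does not confirm) reports the GPUs visible on a node, their memory size (distinguishing 40GB from 80GB nodes) and whether peer-to-peer access, as used over NVLink, is enabled between device pairs.

```python
# Minimal sketch, assuming a CUDA-enabled PyTorch build is available on the node:
# report the A100 GPUs visible to the job, their memory size (40GB vs 80GB nodes)
# and whether peer-to-peer access (used over NVLink) is enabled between device pairs.
import torch

def describe_gpus() -> None:
    n = torch.cuda.device_count()
    for i in range(n):
        props = torch.cuda.get_device_properties(i)
        mem_gb = props.total_memory / 1024**3
        print(f"GPU {i}: {props.name}, {mem_gb:.0f} GB")
    for i in range(n):
        for j in range(n):
            if i != j:
                p2p = torch.cuda.can_device_access_peer(i, j)
                print(f"peer access {i} -> {j}: {p2p}")

if __name__ == "__main__":
    describe_gpus()
```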
Global storage
The system is equipped with Lenovo DSS-G storage systems running IBM® Spectrum Scale™:
- 1x DSS-G250 equipped with 418x 16TB HDD
- 1x DSS-G204 equipped with 48x 7.68TB SSD
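In raw terms (before any RAID or Spectrum Scale overhead), the enclosures above work out to roughly 418 x 16TB ≈ 6.7PB of HDD capacity and 48 x 7.68TB ≈ 369TB of SSD capacity.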
Two file systems are deployed:
- /bask - general storage for home directories and bulk project data
- /scratch-global - transient storage on the SSD enclosures, available on all compute systems (see the usage sketch below)
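The hypothetical sketch below illustrates this intended pattern of use: stage transient working data on /scratch-global while a job runs, then copy only the final results back to project space on /bask. The project path used here is a placeholder, not a real allocation.

```python
# Hypothetical sketch of the intended usage pattern: stage transient data on
# /scratch-global while a job runs, then copy results back to project space on /bask.
# The project path below is a placeholder, not a real allocation.
import getpass
import shutil
from pathlib import Path

PROJECT_DIR = Path("/bask/projects/example_project/results")   # placeholder path
SCRATCH_DIR = Path("/scratch-global") / getpass.getuser() / "workdir"

def run_with_scratch() -> None:
    SCRATCH_DIR.mkdir(parents=True, exist_ok=True)

    # ... produce intermediate and final outputs under SCRATCH_DIR ...
    (SCRATCH_DIR / "result.txt").write_text("example output\n")

    # Copy only the final results back to the (quota-controlled) project space,
    # then clean up the transient scratch area.
    PROJECT_DIR.mkdir(parents=True, exist_ok=True)
    shutil.copy2(SCRATCH_DIR / "result.txt", PROJECT_DIR / "result.txt")
    shutil.rmtree(SCRATCH_DIR)

if __name__ == "__main__":
    run_with_scratch()
```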
Quota
User home directories have a hard limit of 20GB. Home directory space is provided for login-type scripts, and it is expected that all code and data will be placed in a project space.
Project space is allocated at 1TB by default; additional quota for projects can be requested and is allocated based on justified need.
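A rough way to keep an eye on the 20GB home limit is to total file sizes under the home directory. The sketch below does a simple du-style walk in Python; it is not a query of the actual Spectrum Scale quota.

```python
# Rough check of home directory usage against the 20GB hard limit.
# This is a simple du-style walk, not a query of the real Spectrum Scale quota.
import os
from pathlib import Path

HOME_LIMIT_BYTES = 20 * 1024**3  # 20GB hard limit on home directories

def home_usage_bytes() -> int:
    total = 0
    for root, _dirs, files in os.walk(Path.home()):
        for name in files:
            path = os.path.join(root, name)
            try:
                total += os.lstat(path).st_size
            except OSError:
                pass  # file removed or unreadable while walking
    return total

if __name__ == "__main__":
    used = home_usage_bytes()
    print(f"home usage: {used / 1024**3:.2f} GB of {HOME_LIMIT_BYTES / 1024**3:.0f} GB")
```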
Network
Baskerville uses three networks:
- isolated 1GbE management network (not user accessible)
- 25GbE high speed Ethernet network
- HDR fat tree Infiniband network
Infiniband network
The HDR fat tree Infiniband network is built using NVIDIA Mellanox Quantum HDR switches (QM8790).
All compute systems use ConnectX-6 PCIe gen4 adapters, which provide a full 200Gb/s network connection. Architecturally, the adapter is connected to Socket 1 on the system planar.
Login, management and storage systems are attached via PCIe gen3 and provide HDR-100 connectivity to the fabric. The storage nodes each use multiple HDR-100 ports, enabled via the Spectrum Scale verbsPorts option, and RDMA is enabled to the Spectrum Scale storage systems.
IPoIB is also deployed on the Infiniband network.
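As an illustrative check (assuming the standard infiniband-diags tools are installed on the node, which this page does not state), the sketch below parses ibstat output for the reported link rate; an HDR-connected compute node should report 200Gb/s.

```python
# Sketch: confirm the Infiniband link rate on a node by parsing `ibstat` output.
# Assumes the infiniband-diags tools are installed; an HDR-connected compute
# node should report a rate of 200 (Gb/s).
import re
import subprocess

def reported_link_rates() -> list[int]:
    out = subprocess.run(["ibstat"], capture_output=True, text=True, check=True).stdout
    return [int(m) for m in re.findall(r"Rate:\s*(\d+)", out)]

if __name__ == "__main__":
    for rate in reported_link_rates():
        print(f"link rate: {rate} Gb/s")
```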
25GbE high speed Ethernet network
The high speed network is built using NVIDIA Mellanox Spectrum®-2 switches running NVIDIA Cumulus® Linux.
Trademarks and registered trademarks are owned by their respective companies.