OFED and GPUDirect

Technology use

The information on this page is provided to assist with the technologies available within Baskerville. We are unable to assist with code migration to utilise the technologies outlined.

Mellanox OFED

Baskerville uses Mellanox OFED (MOFED) for the InfiniBand and high-speed network drivers. This page tracks the versions of MOFED deployed over time.

June 2021

# ofed_info -s
MLNX_OFED_LINUX-5.3-1.0.0.1:

GPUDirect

NVIDIA GPUDirect® is a family of technologies to enhance data movement and access for GPUs. The following components are made available on Baskerville GPU nodes:

Mellanox OFED GPUDirect RDMA

Current release: 1.1

Mellanox OFED GPUDirect RDMA is an addition to MOFED and provides a direct peer-to-peer path between GPU memory and the Mellanox InfiniBand adapter.

Note that each compute tray has a single HCA and, whilst the GPUs are meshed using NVLink, not all GPUs may be directly accessible from the socket attached to the HCA. Details of the architecture are linked in the Lenovo documentation.
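The GPU-to-HCA affinity on a node can be inspected with nvidia-smi, and the presence of the GPUDirect RDMA kernel module can be checked in the same way as the other components below. The module name shown here (nv_peer_mem) is an assumption based on the nvidia-peer-memory packaging and may differ (e.g. nvidia_peermem) depending on the driver and MOFED release:

$ nvidia-smi topo -m         # shows GPU/NIC affinity per socket and NVLink connectivity
$ lsmod | grep nv_peer_mem   # GPUDirect RDMA kernel module (name may vary by release)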

GPUDirect Storage

Current release: nvidia-gds-0.95.1

GPUDirect Storage (part of Magnum IO) enables a direct data path between storage and GPU memory, utilising RDMA for data transfers.

Open Beta

GPUDirect Storage is currently an Open Beta from NVIDIA. IBM Spectrum Scale support was added in the 0.95.1 release. Full testing of GPUDirect Storage has not been completed on Baskerville.

Checking the GPUDirect Storage kernel module (nvidia_fs) is loaded:

$ lsmod | grep nvidia_fs
nvidia_fs
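Beyond the module check, further diagnostics are available; the paths and tool names below are assumptions for the beta packaging and may differ between GDS releases. The nvidia_fs driver exposes runtime statistics under /proc, and the GDS package ships a gdscheck utility that reports the detected filesystems and configuration:

$ cat /proc/driver/nvidia-fs/stats       # nvidia_fs driver statistics (assumes the driver is loaded)
$ /usr/local/cuda/gds/tools/gdscheck.py  # install path and tool name vary between GDS releases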

GDRCopy

Current release: 2.2

GDRCopy is a low-latency GPU memory copy library based on NVIDIA GPUDirect RDMA technology.

Checking the GDRCopy kernel module (gdrdrv) is loaded:

$ lsmod  | grep gdr
gdrdrv

Testing GDRCopy is working:

$ module load bask-apps/live GDRCopy/2.1-GCCcore-10.2.0-CUDA-11.1.1
$ GDRCOPY_ENABLE_LOGGING=1 GDRCOPY_LOG_LEVEL=0 copylat
GPU id:0; name: A100-SXM4-40GB; Bus id: 0000:31:00
GPU id:1; name: A100-SXM4-40GB; Bus id: 0000:4b:00
GPU id:2; name: A100-SXM4-40GB; Bus id: 0000:ca:00
GPU id:3; name: A100-SXM4-40GB; Bus id: 0000:e3:00
selecting device 0
device ptr: 0x7f267e000000
allocated size: 16777216
DBG:  wc_mapping=1
map_d_ptr: 0x7f26999ef000
info.va: 7f267e000000
info.mapped_size: 16777216
info.page_size: 65536
info.mapped: 1
info.wc_mapping: 1
page offset: 0
user-space pointer: 0x7f26999ef000
gdr_copy_to_mapping num iters for each size: 10000
WARNING: Measuring the issue overhead as observed by the CPU. Data might not be ordered all the way to the GPU internal visibility.
Test              Size(B)      Avg.Time(us)
DBG:  sse4_1=1 avx=1 sse=1 sse2=1
DBG:  using AVX implementation of gdr_copy_to_bar
gdr_copy_to_mapping             1           0.3448
gdr_copy_to_mapping             2           0.3433
gdr_copy_to_mapping             4           0.3433
gdr_copy_to_mapping             8           0.3429
gdr_copy_to_mapping            16           0.3434
gdr_copy_to_mapping            32           0.3433
gdr_copy_to_mapping            64           0.3419
gdr_copy_to_mapping           128           0.3621
gdr_copy_to_mapping           256           0.3683
gdr_copy_to_mapping           512           0.3815
gdr_copy_to_mapping          1024           0.4214
gdr_copy_to_mapping          2048           0.5269
gdr_copy_to_mapping          4096           0.7187
gdr_copy_to_mapping          8192           1.0892
gdr_copy_to_mapping         16384           1.1099
gdr_copy_to_mapping         32768           1.7705
gdr_copy_to_mapping         65536           3.6171
gdr_copy_to_mapping        131072           7.0432
gdr_copy_to_mapping        262144          13.9050
gdr_copy_to_mapping        524288          27.6336
gdr_copy_to_mapping       1048576          55.1509
gdr_copy_to_mapping       2097152         108.0527
gdr_copy_to_mapping       4194304         219.2105
gdr_copy_to_mapping       8388608         439.5790
gdr_copy_to_mapping      16777216         878.3777
gdr_copy_from_mapping num iters for each size: 100
Test              Size(B)      Avg.Time(us)
DBG:  using SSE4_1 implementation of gdr_copy_from_bar
gdr_copy_from_mapping             1           1.2106
gdr_copy_from_mapping             2           1.6792
gdr_copy_from_mapping             4           1.6770
gdr_copy_from_mapping             8           1.6771
gdr_copy_from_mapping            16           0.8376
gdr_copy_from_mapping            32           1.2744
gdr_copy_from_mapping            64           1.5003
gdr_copy_from_mapping           128           1.6954
gdr_copy_from_mapping           256           1.6985
gdr_copy_from_mapping           512           1.6938
gdr_copy_from_mapping          1024           2.5872
gdr_copy_from_mapping          2048           3.6041
gdr_copy_from_mapping          4096           6.9445
gdr_copy_from_mapping          8192          12.0247
gdr_copy_from_mapping         16384          23.7481
gdr_copy_from_mapping         32768          46.9334
gdr_copy_from_mapping         65536          91.3752
gdr_copy_from_mapping        131072         191.5826
gdr_copy_from_mapping        262144         403.2905
gdr_copy_from_mapping        524288         813.1163
gdr_copy_from_mapping       1048576        1614.5567
gdr_copy_from_mapping       2097152        3231.4384
gdr_copy_from_mapping       4194304        6765.2056
gdr_copy_from_mapping       8388608       14080.6063
gdr_copy_from_mapping      16777216       28266.7212
unmapping buffer
unpinning buffer
closing gdrdrv
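Assuming the same GDRCopy module also installs the copybw benchmark (it accompanies copylat in GDRCopy 2.x releases), bandwidth rather than latency of the GPU BAR mapping can be measured in the same way:

$ module load bask-apps/live GDRCopy/2.1-GCCcore-10.2.0-CUDA-11.1.1
$ copybw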