Optimized primitives for collective multi-GPU communication.
Versions available on Anvil: cuda-11.0_2.11.4, cuda-11.2_2.8.4, cuda-11.4_2.11.4
To load the module, run:
module load modtree/gpu
module load nccl
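To pin one of the listed builds rather than the default, a minimal sketch follows. It assumes the module system accepts a version-qualified name of the form nccl/<version>, which is common but not confirmed here. The variable NCCL_BUILD is hypothetical and used only for illustration.

```shell
# Assumption: one of the version strings listed for Anvil above.
NCCL_BUILD="cuda-11.4_2.11.4"

# Only call `module` where it is defined (for example on an Anvil node).
if command -v module >/dev/null 2>&1; then
    module load modtree/gpu
    # Assumption: version-qualified names follow the nccl/<version> pattern.
    module load "nccl/${NCCL_BUILD}"
    # Show the loaded modules to confirm the result.
    module list
else
    echo "module command not found; run this on a system that provides it"
fi
```

If the version-qualified name is rejected, `module avail nccl` should list the exact names the system accepts.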