NVIDIA Announces 18-port GPU Switch & "World's Largest GPU" at 32GB HBM2

Posted on March 27, 2018

NVidia today announced what it calls "the world's largest GPU," the gold-painted and reflective Quadro GV100, a finish undoubtedly meant as a nod to its ray-tracing target market. The Quadro GV100 combines two V100 GPUs via NVLink2, with 32GB of HBM2 per GPU and 10,240 CUDA cores in total. NVidia advertises 236 TFLOPS of Tensor Core performance in addition to the power afforded by those 10,240 CUDA cores.
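As a quick sanity check on the dual-GPU figures above, the totals divide cleanly back into single-V100 specs. A minimal sketch (variable names are our own, not NVIDIA's):

```python
# Back-of-the-envelope check of the dual-GPU Quadro GV100 figures.
gpus = 2
total_cuda_cores = 10240
total_tensor_tflops = 236
hbm2_per_gpu_gb = 32

cores_per_gpu = total_cuda_cores // gpus            # 5,120 - a single V100's core count
tensor_tflops_per_gpu = total_tensor_tflops / gpus  # 118 TFLOPS per GPU
total_hbm2_gb = hbm2_per_gpu_gb * gpus              # 64GB across the whole card

print(cores_per_gpu, tensor_tflops_per_gpu, total_hbm2_gb)  # 5120 118.0 64
```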

Additionally, NVidia has upgraded its Tesla V100 products to 32GB by adding to the HBM2 stacks on the interposer. The V100 is NVidia's accelerator card, primarily meant for scientific and machine learning workloads, and its architecture later trickled down to the Titan V(olta). The V100 was the first GPU to use NVidia's Volta architecture, initially shipping with 16GB – just like the Titan V – but with more targeted use cases. NVidia's first big announcement for GTC was to add another 16GB of VRAM to the V100, bringing it to 32GB, alongside a new "NVSwitch" (no, not that one) to increase the coupling capabilities of Tesla V100 accelerators. The V100 can now be bridged with a 2-billion transistor switch, offering 18 ports to scale up the GPU count per system.

Each of the 18 ports runs at 50GB/s, with NVidia citing just under 1TB/s of aggregate bandwidth. The ports are fully connected, so every attached device can communicate with every other, enabling more efficient deep learning and machine learning data processing. NVidia noted that, in the instance of oil and gas industry signal processing, 1k x 1k x 1k FFTs are now about 50% faster, coupled with lower error rates. The company also noted easy virtualization and scaling for users who may not fully utilize all 18 ports on the switch.
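The "just under 1TB/s" figure follows directly from the port math. A minimal sketch of the arithmetic:

```python
# NVSwitch aggregate bandwidth from the per-port figures quoted above.
ports = 18
gb_per_s_per_port = 50

aggregate_gb_per_s = ports * gb_per_s_per_port  # 900 GB/s - "just under 1TB/s"
print(aggregate_gb_per_s)  # 900
```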

Related to this, NVidia announced that its DGX-1 has been upgraded with the 32GB GPUs.

The V100 supports 300GB/s of bandwidth for training. Aside from the memory overhaul, the rest of the architecture and GPU remain the same; this is entirely a 2x memory capacity upgrade, with the NVSwitch shipping as a separate, new product.

NVidia's combined "largest GPU" features 16 Tesla V100s with 32GB each, all connected by the NVSwitch, capable of leveraging 512GB of HBM2 and 14.4TB/s of aggregate bandwidth across 81,920 CUDA cores, with 2,000 TFLOPS of combined Tensor Core performance.
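Those aggregate numbers also check out as straight multiples of single-V100 specs. A minimal sketch, assuming the per-GPU memory bandwidth of roughly 900GB/s implied by the 14.4TB/s aggregate:

```python
# Aggregate specs of the 16-GPU system as multiples of one V100 32GB.
gpus = 16
hbm2_per_gpu_gb = 32
cuda_cores_per_gpu = 5120
per_gpu_bandwidth_gb_s = 900  # assumed; 14.4TB/s / 16 GPUs works back to this

total_hbm2_gb = gpus * hbm2_per_gpu_gb                  # 512GB
total_cuda_cores = gpus * cuda_cores_per_gpu            # 81,920
total_bandwidth_tb_s = gpus * per_gpu_bandwidth_gb_s / 1000  # 14.4TB/s

print(total_hbm2_gb, total_cuda_cores, total_bandwidth_tb_s)  # 512 81920 14.4
```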

We'll have more GTC coverage as the event continues.

Editorial: Steve Burke