Server GPUs for HPC (high-performance computing) applications aren't exactly within the realm of our gaming coverage, but the Tesla K80 is worthy of note purely from a technological standpoint. Nvidia's new Tesla accelerator hosts dual GK210 Kepler GPUs capable of roughly 2.91 TFLOPs of double-precision floating-point performance, or a staggering 8.74 TFLOPs single-precision. Strictly for reference – because the Tesla is not comparable to gaming cards – the GTX 980's GM204 pushes about 4.6 TFLOPs of single-precision FP performance.
The new Tesla K80 is intended for high-end servers that run scientific calculations or models demanding high-precision processing. In the world of gaming, double-precision performance is entirely unnecessary and can actually drag down a game's framerate due to the excess processing; it's as if the GPU over-processes the game to a degree of accuracy that is (1) not really used by the software and (2) unnecessary. DP cards see implementation in projections, simulations, and modeling / render farms that handle intensive, parallel calculations.
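To make the single- vs. double-precision distinction concrete, here's an illustrative sketch (not K80-specific code) of the kind of rounding error that makes FP64 matter for simulation and modeling workloads:

```python
import numpy as np

# A 32-bit float carries ~7 significant decimal digits; a 64-bit float
# carries ~16. Add a small term to a large one and the difference shows.
big = 1e8
small = 1.0

# In float32, 1e8 + 1 rounds back to 1e8 -- the small term vanishes.
sp = (np.float32(big) + np.float32(small)) - np.float32(big)

# In float64, the same sum retains the small term.
dp = (np.float64(big) + np.float64(small)) - np.float64(big)

print(sp, dp)  # 0.0 1.0
```

Scale that lost `1.0` across billions of iterative timesteps in a physics simulation and the single-precision answer drifts away from reality, which is why HPC buyers pay for DP throughput.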
NVIDIA Tesla K80 Specs

| Features | Tesla K80¹ | Tesla K40 |
|---|---|---|
| GPU | 2x Kepler GK210 | 1x Kepler GK110B |
| Peak double-precision floating-point performance | 2.91 TFLOPs (GPU Boost clocks), 1.87 TFLOPs (base clocks) | 1.66 TFLOPs (GPU Boost clocks), 1.43 TFLOPs (base clocks) |
| Peak single-precision floating-point performance | 8.74 TFLOPs (GPU Boost clocks), 5.6 TFLOPs (base clocks) | 5 TFLOPs (GPU Boost clocks), 4.29 TFLOPs (base clocks) |
| Memory bandwidth (ECC off) | 480 GB/s (240 GB/s per GPU) | 288 GB/s |
| Memory size (GDDR5) | 24 GB (12 GB per GPU) | 12 GB |
| CUDA cores | 4992 (2496 per GPU) | 2880 |

1. Tesla K80 specifications are shown as the aggregate of two GPUs.
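The table's peak figures fall straight out of core count and clock speed. As a sketch, assuming Nvidia's published K80 clocks of 875 MHz (boost) and 560 MHz (base), which the table above doesn't list, and GK210's FP64 rate of one-third the FP32 rate:

```python
# Peak FLOPs = CUDA cores x 2 FP ops per core per cycle (one fused
# multiply-add) x clock speed. Clocks here are Nvidia's published K80
# figures, an assumption not stated in the spec table itself.
CUDA_CORES = 4992            # both GK210 GPUs combined
FLOPS_PER_CORE_CYCLE = 2     # one FMA counts as two FP operations

def peak_tflops(clock_mhz):
    return CUDA_CORES * FLOPS_PER_CORE_CYCLE * clock_mhz * 1e6 / 1e12

sp_boost = peak_tflops(875)  # single precision at boost clock
sp_base = peak_tflops(560)   # single precision at base clock
dp_boost = sp_boost / 3      # GK210 runs FP64 at 1/3 the FP32 rate

print(round(sp_boost, 2))    # 8.74 -- matches the table
print(round(sp_base, 2))     # 5.59 -- the table's "5.6"
print(round(dp_boost, 2))    # 2.91 -- matches the table
```

The same arithmetic recovers the K40 column from its 2880 cores and its own clocks, which is a useful sanity check when vendors quote peak numbers.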
This is really the only reason we're writing about the K80 today – it has massively powerful tech under the hood. The GK210 has already seen implementation in HPC cards, but the K80 takes two and mashes them onto a single board. In the process, Nvidia doubled the VRAM to 24GB total on-board (still 12GB per GPU, though) and pushed total non-ECC memory bandwidth to 480GB/s (240GB/s per GPU).
Each GPU hosts 2496 CUDA cores for a combined total of 4992, fully CUDA- and DP-enabled for precision computing. Nvidia said of its new card:
“From energy exploration to machine learning, data scientists can crunch through petabytes of data with Tesla accelerators, up to 10x faster than with CPUs. For computational scientists, Tesla accelerators deliver the horsepower needed to run bigger simulations faster than ever.”
Read more here: http://www.nvidia.com/object/tesla-servers.html
- Steve "Lelldorianx" Burke.