Consoles have long touted the phrase “close to the metal” as a means to explain that game developers have fewer software-side obstacles between their application and the hardware. One of the largest obstacles and enablers faced by PC gaming has been DirectX, an API that enables wide-sweeping compatibility (and better backwards compatibility), but also throttles performance with its tremendous overhead. Mantle, an effort of debatable value, first marketed itself as a replacement for Dx11, proclaiming DirectX to be dead. Its primary advantage was along the lines of console development: Removing overhead to allow greater software-hardware performance. Then DirectX 12 showed up.
DirectX is a Microsoft API that has been the dominant programming interface for PC games for years. Mantle 1.0 is AMD's API, now being deprecated as developers shift to adopt Dx12. The remnants of Mantle's codebase are being adapted into Vulkan, the Khronos Group's successor to OpenGL, a graphics API that asserts minimal dominance in the desktop market.
Although often associated with GPU performance, DirectX 12 and Mantle most directly benefit lower-end CPUs by allowing more draw calls than previous APIs. A “draw call” is a transaction wherein the CPU commands the GPU to render a new object on the screen. Games demand thousands of draw calls for each frame – effectively one every time a unique piece of geometry is observed on-screen – and this heavily loads the hardware, creating a CPU bottleneck with lower-end SKUs. Intel's affordable G3258 and AMD APUs are examples of devices that would throttle on draw calls, effectively binding high-end GPUs to the CPU. DirectX 12 and Mantle shift slightly more load to the GPU as a means to reduce CPU throttling, ultimately allowing greater draw call counts with more efficient parallelization and multi-threading.
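To make the bottleneck concrete, here is a minimal sketch of a frame-time model in Python. It assumes CPU submission and GPU rendering overlap, that the slower of the two sets the frame time, and that submission scales perfectly across threads. The function names and every number in it are hypothetical illustrations, not measurements from this test.

```python
def frame_time_ms(draw_calls, cpu_us_per_call, gpu_ms, threads=1):
    """Estimate frame time when CPU submission and GPU rendering overlap.

    All inputs are hypothetical; perfect scaling across threads is assumed.
    """
    cpu_ms = draw_calls * cpu_us_per_call / 1000.0 / threads
    # The slower of the two processors sets the frame time.
    return max(cpu_ms, gpu_ms)


def fps(draw_calls, cpu_us_per_call, gpu_ms, threads=1):
    """Convert the modeled frame time into frames per second."""
    return 1000.0 / frame_time_ms(draw_calls, cpu_us_per_call, gpu_ms, threads)


# Illustrative values only: 20,000 draw calls and a GPU that needs 10 ms per frame.
slow_api = fps(20_000, cpu_us_per_call=2.0, gpu_ms=10.0)             # CPU-bound: 40 ms per frame
fast_api = fps(20_000, cpu_us_per_call=0.5, gpu_ms=10.0, threads=2)  # GPU-bound: 10 ms per frame
print(f"high-overhead API: {slow_api:.0f} FPS, low-overhead API: {fast_api:.0f} FPS")
```

In this toy model the same GPU goes from CPU-bound to GPU-bound purely because each draw call costs the CPU less and the work is split across two threads.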
We tested DirectX 12 vs. DirectX 11 and Mantle using 3DMark's latest benchmark on a G3258, i7-4790K, Titan X, and 290X.
The Test: 3DMark's API Overhead Benchmark
3DMark freshly launched its API Overhead Benchmark, available now as part of the 3DMark suite. The overhead benchmark works by gradually incrementing draw call count until a point at which the test “fails,” defined as dropping below 30 frames per second. The test counts the number of draw calls made per frame until this point, then logs the maximum number of draw calls before failure. At that point, the next API test is run and logged, and the results are compared at the end of the benchmark.
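The ramp-until-failure procedure described above can be sketched as a short loop. This is not 3DMark's implementation: the step size, the `find_max_draw_calls` helper, and the toy cost model are all assumptions made for illustration.

```python
def find_max_draw_calls(frame_time_ms, start=1_000, step=1_000,
                        min_fps=30.0, max_calls=10_000_000):
    """Raise the per-frame draw call count until FPS drops below min_fps.

    frame_time_ms is a caller-supplied function (hypothetical here) that maps
    a draw call count to a frame time in milliseconds.
    Returns (draw_calls_per_frame, draw_calls_per_second) for the last passing
    step, or (0, 0.0) if even the starting count fails.
    """
    best = (0, 0.0)
    calls = start
    while calls <= max_calls:  # guard against a model that never fails
        fps = 1000.0 / frame_time_ms(calls)
        if fps < min_fps:
            break
        best = (calls, calls * fps)
        calls += step
    return best


# Hypothetical cost model: 1 ms of fixed work plus 1.5 microseconds per draw call.
def toy_frame_time_ms(calls):
    return 1.0 + calls * 1.5 / 1000.0


per_frame, per_second = find_max_draw_calls(toy_frame_time_ms)
print(f"{per_frame} draw calls per frame, ~{per_second:,.0f} per second")
```

The per-second figure is simply the last passing per-frame count multiplied by the frame rate at that step, which is how a per-frame ramp can yield the draw-calls-per-second numbers quoted in the results.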
3DMark's API overhead benchmark tests DirectX 11 single-threaded, Dx11 multi-threaded, Dx12, and Mantle. Because this test requires DirectX 12 to run, Dx12 testing is only available using the Windows 10 Technical Preview, an early development version of Windows that can be had for free. Note that Windows 10 is not yet final, the drivers for Windows 10 are not final, and DirectX 12 is not final; all of these together mean that users who'd like to run the same test may experience difficulty with device compatibility.
Note that results cannot be compared between GPUs. This is not like a standard FPS or frame-time benchmark. This is purely an API test, and so any delta between nVidia and AMD hardware should not be regarded as superiority or inferiority. We tested using a Titan X – just because it's new and we thought it'd be interesting to see how many draw calls it can pull off – and the 290X for Mantle testing.
As stated above, the API differences will be most apparent when testing lower-end CPUs. For this, we dug up our G3258 Intel CPU as it should heavily throttle the GPU and show the biggest performance gains with newer APIs. For the sake of high-end representation, we also tested the APIs using an i7-4790K to show the high-end capabilities.
All tests were conducted at 1080p. To reduce load on other aspects of the GPU, this test does not apply any lighting or visual effects to the geometry. In that sense, it is not a real-world test, but it isolates API overhead well enough to serve as a tool for objective analysis.
Note: nVidia GPUs are not compatible with AMD's Mantle technology.
2015 Test Bench Hardware
G3258 Performance – A 10x Improvement with Dx12
The G3258 equipped with an R9 290X was able to produce 727,110 draw calls per second using the current-issue Dx11, which pales in comparison to Dx12's 10x improvement at 7,441,163 draw calls per second. AMD's Mantle drags just slightly behind at 6,870,569, but would likely outpace Dx12 if we were using a four- or six-core CPU.
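For readers who want to check the arithmetic, the short Python sketch below recomputes the speedups from the three figures above. The "microseconds per draw call" value is only the inverse of the measured throughput, included as a rough way to picture the overhead; it is not something the benchmark reports.

```python
# Draw calls per second reported above for the G3258 + R9 290X.
results = {
    "Dx11": 727_110,
    "Dx12": 7_441_163,
    "Mantle": 6_870_569,
}

baseline = results["Dx11"]
for api, calls_per_sec in results.items():
    speedup = calls_per_sec / baseline
    # Average wall-clock time per draw call at the measured throughput.
    us_per_call = 1_000_000 / calls_per_sec
    print(f"{api:>6}: {speedup:5.2f}x vs Dx11, ~{us_per_call:.3f} us per draw call")
```

This works out to roughly 10.2x for Dx12 and roughly 9.4x for Mantle over Dx11 on this configuration.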
Using a Titan X, we saw substantially improved Dx11 performance but slightly lower Dx12 performance. Despite this, keep in mind that results between the two devices should not be directly compared in a fashion that determines superiority. The test is not built for this type of comparison. The Titan X, which assuredly forces a CPU throttle, pushed 1.3 million draw calls per second with Dx11 and 6.6 million draw calls per second with Dx12. At roughly 5x, it's not quite a 10x improvement, but still massive and exciting for users of lower-end CPUs.
4790K Performance with Dx12 vs. Mantle
Intel's i7-4790K is a hyper-threaded CPU with eight threads, almost guaranteeing that bottlenecking will be minimal and potentially relegated to the application or API layer. Using a 290X, Dx11 MT performance hovered around 1,095,996 draw calls per second – a ~40% gain over the G3258 – and hit 1.125 million with single-thread performance. The 290X pushed 18.7 million draw calls per second with Mantle and nearly 16 million with DirectX 12, a marked improvement over the current-gen Dx11 API.
Using a Titan X – and again, these results can't be directly compared / contrasted against the 290X – the system produced 2.7 million draw calls per second using Dx11 and 14.6 million draw calls per second using Dx12.
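One way to read the two sets of results together is to look at how each API scales when moving from the G3258 to the i7-4790K on the same GPU, which stays within the rule against cross-GPU comparisons. The sketch below uses only figures quoted in this article; several of them are rounded in the text, so treat the ratios as approximate.

```python
# (G3258, i7-4790K) draw calls per second, as reported in this article.
# Several figures are rounded in the text, so the ratios are approximate.
pairs = {
    "Titan X / Dx11": (1.3e6, 2.7e6),
    "Titan X / Dx12": (6.6e6, 14.6e6),
    "290X / Mantle": (6_870_569, 18.7e6),
}

# Ratio of i7-4790K throughput to G3258 throughput for each GPU and API pair.
scaling = {name: fast / slow for name, (slow, fast) in pairs.items()}
for name, ratio in scaling.items():
    print(f"{name}: {ratio:.2f}x from G3258 to i7-4790K")
```

On these numbers, every API at least doubles its draw call throughput on the faster CPU, with Mantle showing the largest gain from the extra threads.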
First off, despite Mantle's apparent lead in some instances, it's worth noting that effectively no games use the API and AMD itself has urged developers to instead adopt Dx12. Further on this front, Dx12 is still in development and may improve with time, as will the video drivers.
Analyzing GPU load during the API test reveals that load is overall greater with DirectX 12, if a bit sporadic at times. This means that the API is functioning as intended, shifting load to the more powerful parallel processor and reducing overhead on the CPU as a result.
It appears that DirectX 12 offers greater performance than Mantle in specific use-case scenarios, like using a limited-thread CPU (G3258 or similar).
DirectX 12 is incredibly promising for the future of video game graphics and will enable developers to further embellish titles without the same hardware-software limitations. PC-targeted titles will exhibit the greatest gains, but anything that adopts Dx12 will directly benefit users of lower-powered CPUs, freeing up budget to invest in better GPUs and other components.
- Steve “Lelldorianx” Burke.