Liquid Metal vs. Thermal Paste Benchmarks: Intel’s Thermal Problem, Pt1
Posted on September 24, 2017
There are many reasons that Intel may have opted for TIM (thermal interface material, used here to mean thermal paste rather than solder) with their CPUs, and given that the company hasn’t offered a statement of substance, we really have no exact idea of why different materials are selected. Using TIM could be a matter of cost, as seems to be the default assumption; it could be an undisclosed engineering challenge to do with yields (with solder); it could be for government or legal grants pertaining to environmental conscientiousness; it could be related to conflict-free advertisements; or it could be any number of other things. We don’t know. What we do know, and what we can test, is the efficacy of the TIM as opposed to alternatives. Intel’s statement pertaining to usage of TIM on HEDT (or any) CPUs effectively paraphrases as “as this relates to manufacturing process, we do not discuss it.” Intel sees this as a proprietary process, and so considers the subject too sensitive to discuss.
With an i7-7700K, TIM is perhaps more defensible – it’s certainly cheaper, and that’s a cheaper part. Once we start looking at the 7900X and other CPUs of a similar class, the ability to argue in favor of Dow Corning’s TIM weakens. To the credit of both Intel and Dow Corning, the TIM selected is highly durable to thermal cycling – it’ll last a long time, won’t need replacement, and shouldn’t exhibit any serious cracking or aging issues in any meaningful amount of time. In essence, the usable life of the platform will run out before the CPU stops working.
But that doesn’t mean there aren’t better solutions. Intel has used solder before – there’s precedent for it – and certainly there exist thermal solutions with greater transfer capabilities than what’s used on most of Intel’s CPUs.
Part of encouraging a change is proving that there’s reason to change. If the Dow Corning TIM is “good enough,” then there’s really no reason to change it. It’s not, though, and that’s what we’re setting forth to prove today. This is part one of a two-part series that’s dedicated to exploring the efficacy of Intel’s TIM on HEDT CPUs, starting with the i9-7900X. By deploying liquid metal as an analog for “anything better,” we can illustrate the potential for (1) improved operation of the CPU through reduced thermal load on the die, (2) cheaper cooling solutions by removing the brute force requirement presently in place, and lower noise emission as a result, and (3) increased overclocking headroom.
In this thermal analysis of Intel’s i9-7900X, we’ll benchmark liquid metal application on a delidded CPU versus a stock CPU with thermal paste, then determine whether the delid was “worth it.” The goal then, of course, is to explore whether Intel should offer better thermal solutions on their HEDT CPUs. We are not asking Intel to switch to liquid metal – mostly because that’s logistically insane and completely unsustainable – but we do ask that Intel consider alternatives somewhere between the current thermal paste and our Thermal Grizzly Conductonaut solution.
Delidding, Liquid Metal, & Test Methodology
After a few trial runs with liquid metal application and consultation with Der8auer and VSG, we eventually got the method down and improved thermals significantly. Delidding was done with the Delid DieMate X (we later learned that the vertical clamp is used for resealing, so is unnecessary to the delid process), after which we cleaned the IHS and die. We scraped off the silicone adhesive on the IHS side to permit better contact and lower the copper plate closer to the die, leaving a thin layer of adhesive as a placement guide on the double-substrate Intel i9-7900X. According to Der8auer, there is minimal thermal headroom to be gained by removing 100% of the silicone adhesive, and the delidding expert suggested that we leave a thin layer to better place the IHS. Going into these results, know that there is another 1-3C that could be gained from removing all the adhesive, according to Der8auer.
We eventually got liquid metal application down to a better science: A single dot of the stuff was placed on the IHS and CPU die, then spread into a thin layer across each surface. Nail polish was used for its nitrocellulose and alkyl acetate, adding a protective layer over the SMDs on the substrate. We did not re-glue the IHS, but rather carried the CPU horizontally to its test bed, then clamped it down in the socket under the cooler.
Thermal Grizzly Conductonaut compound was used with the Delid DieMate X. All testing was performed on Intel’s CPU with its stock TIM first, so there was never any process of re-applying thermal paste. The testing was completely planned and conducted prior to delidding, then repeated once delidded. We have a part 2 of this content scheduled for publication in the immediate future.
Prime95 28.5's automated torture test was executed through scripting and allowed to run for ~1380 seconds in most cases, preceded by an idle period of ~180 seconds. We logged data from a current clamp, a thermocouple reader (for second-to-second ambient temperature), a wall meter, and software. All core temperatures were averaged for the below readings. We took averages from the hottest steady-state load period (in the event of power cycling), with an additional 10-second high taken to look for potential extreme hot periods. Liquid temperature was collected from the X62 sensor, but could not be collected for the Floe without a thermocouple mod.
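For readers who want to reproduce the number crunching, here is a minimal Python sketch of the averaging described above. It is not our actual tooling: the function names, the one-sample-per-second log layout, and the `steady_start`/`steady_end` indices are all assumptions for illustration, and it treats the "10-second high" as the hottest 10-second rolling average, which is one reasonable reading of that step.

```python
from statistics import mean

IDLE_SECONDS = 180  # idle period that precedes the load, per the methodology


def core_average(samples):
    """Collapse per-core readings into one averaged temperature per second."""
    return [mean(cores) for cores in samples]


def rolling_high(series, window=10):
    """Highest mean over any `window` consecutive one-second samples."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    return max(
        mean(series[i:i + window])
        for i in range(len(series) - window + 1)
    )


def summarize(samples, steady_start, steady_end):
    """Return the steady-state average and the 10-second high for one run.

    `samples` is a list of per-second rows, each a list of core temperatures.
    `steady_start` and `steady_end` are hypothetical sample indices that an
    analyst would pick by eye from the plot of the run.
    """
    averaged = core_average(samples)
    load_only = averaged[IDLE_SECONDS:]  # drop the idle lead-in
    return {
        "steady_state_avg": mean(averaged[steady_start:steady_end]),
        "ten_second_high": rolling_high(load_only),
    }
```

Calling `summarize(log_rows, 300, 1000)` on a run's rows returns both figures in one dictionary. Picking the steady-state window by hand is a deliberate simplification; an automated version would need some rule for when the temperature has stopped climbing.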
Prime95 28.5 LFFT Torture at 4.5GHz / 1.175VID (Liquid Metal vs. Thermal Paste)
We’re starting with the most abusive and exaggerated test, then we’ll move on to Blender and lower frequency testing. With Prime95 running at 4.5GHz with a 1.175VID, we end up throttling on the CPU that’s cooled with Intel’s stock thermal paste. Some cores are hitting the TjMax of 105C, causing power throttling that we can show on our current clamp – and will momentarily. Keep in mind that our thermal differences here would be shown to an even greater degree if we hadn’t been throttling on the TIM CPU. Still, we’re at 99C peak steady-state temperature with TIM and 85-86C with liquid metal. Going by those figures, that’s a reduction of 13-14C at minimum, and the gap would be larger if thermal throttling weren’t holding the TIM result down. We’ll show that later. Moving to a 360mm Thermaltake Floe radiator with 3x Maglev fans doesn’t appear to help us much when using TIM, as we’re still throttling and hitting 100C temperatures, but the throttling isn’t nearly as bad. The CPU is still at 100C AVG, but we’ve got a couple degrees headroom from TjMax on most cores. The 360mm radiator helps in other instances – we’ll show those more in part 2 of this content – but it’s just not enough here. Prime is abusive. We can survive with Blender, but that’s about it.
Liquid temperatures are sort of a question mark for us: Our original hypothesis was that liquid temperatures would increase with greater thermal transfer ability between the IHS and die, as the transfer of energy would be facilitated by the greater conductivity of liquid metal. Contrary to our belief, we’re not measuring that here. This is only one test of dozens – many of which will be in part 2 – that has such a noteworthy disparity in die and liquid temperatures; we generally saw liquid temperatures within error tolerances (effectively equal) in scenarios which did not involve heavy overclocking and overvolting. Let’s run a plot versus time to try and understand this behavior:
Temperature vs. Time – P95 4.5GHz at 1.175VID 7900X
Shown here, the CPUs are thermal stepping along with Prime’s power cycling, with each relatively aligned. Data alignment is handled prior to number averaging, with averaged highs taken at peak consumption, but you can see the stepping. The liquid temperature ramps faster on the 7900X with TIM, as does core temperature. When we look at the power chart, we’re showing severe throttling on the unit with TIM. Applying liquid metal brought down our temperatures enough to avoid throttling as a result of thermals, and so we see these boosted power consumption figures, as measured at the EPS12V cables.
Above: A sure sign of thermals throttling the CPU.
Intel TIM Causing Downclocking vs. Liquid Metal
This shows the severity of the problem. We’re able to achieve higher clocks and hold them with liquid metal, which speaks to the uselessness of Intel’s TIM when overclocking. It’s unfortunate, too, because Intel has the best overclocking candidates right now – and performance jumps as much as 30% when running our benchmarks with high OCs and liquid metal. We can easily get another 200MHz out of the clock by switching thermal interfaces, sometimes 300MHz, so it’s a real shame to see Intel squander their advantages in an increasingly competitive market. These gains are from reducing thermals below throttle territory, as we’d otherwise trip TjMax threshold protections. Overclocking is one of the areas where Intel really starts to compete with AMD’s parts fiercely, but they’re needlessly limiting their own advantages with bad thermal interface material.
Blender at 4.5GHz, 1.175VID on 7900X – TIM vs. Liquid Metal
Our 4.5GHz overclock with a 1.175VID couldn’t reasonably pass the Blender render without the help of maglev fans and a 360mm cooler; without them, we entered throttle territory with our Kraken X62 and NZXT fans. The X62 was therefore discarded for the TIM run of this test, leaving us with the more effective 360mm radiator and maxed maglev fans. Throttling becomes a serious concern when considering that we’re testing on an open-air bench around 24C, and testing in a case can easily bring internal ambient temperatures up to 40C. That’s why this is important – that’s what we’re trying to convey to Intel about using better TIM. The CPU has all this potential in it, but it’ll thermal throttle quickly if using the stock TIM. It’s squandered, and we’re not sure why. It could be process or it could be money, but Intel has used better materials in the past.
Looking at the numbers, we’re at 63C with liquid metal and a 280mm cooler with NZXT’s stock fans. With TIM, the CPU was hitting 73C on a 360mm cooler with 3x maglev fans at max RPM. These are some of the best fans you can buy without going enterprise-grade, and the cooling setup was nearing 60dBA during testing. Even when it passed, it was unacceptably loud. That’s a 10-degree difference that favors the liquid metal mod when using a worse cooler with worse fans. That’s what this is about. Intel is creating a hidden cost to its CPUs: More noise and more money to get the things cooled under overclocks, and it’s awfully unfortunate. Even without overclocks, noise becomes a concern. The CPUs overclock exceptionally well – to the point that they start getting fiercely competitive – and Intel throws it all away. Given that Intel now has actual, real competition from AMD, it’s time to start paying attention to these “easy” points of improvement, relatively speaking.
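To put the 24C bench versus 40C case concern into numbers, here is a rough Python sketch. It assumes the temperatures above are absolute readings taken at roughly 24C ambient, and that the rise over ambient stays constant when ambient changes. That second assumption is a first-order simplification, not something we measured: leakage power climbs with temperature, so real in-case numbers would likely be somewhat worse. The function name is hypothetical.

```python
TJMAX_C = 105          # throttle point cited for the Prime95 test
BENCH_AMBIENT_C = 24   # approximate open-air bench ambient
CASE_AMBIENT_C = 40    # internal case ambient we have seen in some units


def estimate_at_ambient(measured_c, measured_ambient_c, target_ambient_c):
    """First-order estimate that holds the rise over ambient constant."""
    return measured_c + (target_ambient_c - measured_ambient_c)


blender_results_c = {
    "TIM, 360mm + 3x maglev fans": 73,
    "Liquid metal, 280mm + stock fans": 63,
}

for label, temp_c in blender_results_c.items():
    in_case_c = estimate_at_ambient(temp_c, BENCH_AMBIENT_C, CASE_AMBIENT_C)
    print(f"{label}: ~{in_case_c}C in a {CASE_AMBIENT_C}C case, "
          f"{TJMAX_C - in_case_c}C below TjMax")
```

Under those assumptions, the TIM result lands at about 89C in a warm case and the liquid metal result at about 79C, and that is for Blender rather than Prime95.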
Prime95 with Fixed 3.6GHz / 1.15VID Settings
Just to sort of prove a point, let’s lower clocks closer to stock. We’re now locking frequency to 3.6GHz and locking voltage to 1.15VID – it’s a little higher than necessary, but it won’t change. That’s important. Although we also tested auto and saw improvements there, auto moves voltage around based on need, and so fixing voltage to 1.15VID means we can completely control the environment.
We’re showing liquid metal dragging down temperatures to around 68C, with liquid temperatures now more evenly matched. We think this has to do with some sort of non-linear tipping point for either the cooler or the CPU, where CPU temperatures of 100C cause some sort of runaway scenario.
Regardless, the difference is 80C on the TIM unit versus 68C on the liquid metal unit and with the X62, and that’s without any overclocking. That’s a reduction of 12 degrees Celsius without overclocking, and as we saw earlier, the temperatures further scale with higher power throughput. Liquid temperatures here are within margin of test variance and error.
Blender – 3.6GHz / 1.15VID Fixed
This trend continues to the lower-clocked 3.6GHz and 1.15VID tests with Blender, where we see about a 10C improvement on the liquid metal version. Liquid temperatures are within error, so are effectively equal. We’re at about 51-52C on the liquid metal version and 61C on the TIM version.
Here’s a look at the power consumption for this test. We’re consistently drawing about 212-214W down the EPS12V cables with TIM, and running about 205W on the liquid metal mod. We’d need to do more tests to understand if this is just normal variance and error or if this is repeatable. This could be a power leakage reduction and efficiency improvement or could be margin of error. We’re not sure right now.
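For scale, the gap works out to 7-9W, or a few percent. The short Python sketch below does that arithmetic on the figures quoted above; the helper name is made up for illustration, and the calculation says nothing about whether the gap is real or just run-to-run variance.

```python
def percent_lower(baseline_w, other_w):
    """How much lower `other_w` is than `baseline_w`, as a percentage."""
    return (baseline_w - other_w) / baseline_w * 100


LIQUID_METAL_W = 205  # approximate EPS12V draw with liquid metal

for tim_w in (212, 214):  # the range observed with stock TIM
    delta = percent_lower(tim_w, LIQUID_METAL_W)
    print(f"TIM {tim_w}W vs liquid metal {LIQUID_METAL_W}W: {delta:.1f}% lower")
```

A difference of roughly 3-4% is small enough that repeated runs would be needed before calling it a leakage or efficiency improvement.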
Here’s the test over time. Results are consistent: We’re at about 50-52C steady state with liquid metal, and about 60-62C steady state with TIM.
Conclusion: Liquid Metal vs. Thermal Paste on Delidded CPUs
We want to make clear – primarily for Intel – that this isn’t just looking at CPU thermal performance from the perspective of overclocking. Yes, a higher frequency can be had more easily by driving down temperatures, but that’s not the core of this. Intel has a few classes of users with these HEDT CPUs, one of which includes professional workstation users, another includes enthusiasts, and then some smaller grouping of “I want the best and have lots of money” users.
For almost all of these, perhaps excluding overclocking enthusiasts, noise is a concern. Cost of the cooling solution is a concern. The near-necessity to purchase high-end 240 & 280mm coolers and run them at max or near-max fan speeds means that there is a hidden cost to these CPUs, and it’s in the cooler. Overclocking starts demanding exotic solutions, custom loops, or 360-420mm radiators with high-end fans. Prices are high on all of these components and noise is high, and yet, not one of these solutions is remotely as efficient at improving thermal performance as a $5 liquid metal application. We are not asking Intel to use liquid metal, but we’d ask that the company consider something between the current Dow Corning TIM and our liquid metal stand-in for “anything better.” Solder has been done in the past, but it may be off the table depending on the real reason for sticking to TIM – this could be a matter of grants and environmental impact, it could be engineering challenges (that somehow exist now and not previously, granted), it could be cost, or it could be something else altogether. We don’t know.
What we do know is that, just from the above testing, Intel isn’t doing the best it can, and the company is failing to exploit its biggest advantage over AMD – significant overclocking headroom given controlled thermals. For non-overclocking workstation users, we defer to this chart from one of our previous noise tests on CLCs:
Above: Taken from one of our CPU cooler reviews, the higher dBA units are what would be deployed for keeping X299 CPUs reasonably cool.
Intel HEDT parts would put you, if we’re being generous, in the range of the ~50dBA CLCs at max RPMs. Operating in a warmer ambient environment than ours (24C), like a case (we’ve seen up to 40C internal case ambient in some units), means that the cooler requirement rises along with noise output.
There are more arguments for Intel to consider than just “we want to overclock higher because we’re enthusiasts,” and those arguments must be made to convince a giant like Intel to listen. Enthusiast overclocking is insignificant. Noise emissions, higher cooler costs, OEM fear of high liquid temperatures that could breach Asetek specification – these are all strong arguments against Intel’s present HEDT TIM practices. Just looking at some of the liquid temperatures hitting the 50-55C range, it’s clear that we’re rapidly approaching the 60C limiter before tripping Asetek’s out-of-spec concerns. Inside of a case, that’s easily done in heavy load scenarios (like AVX workloads). We’d implore OEM giants like Dell and HP to perform internal testing of CLC-enabled HEDT products under various workloads, particularly AVX, to determine if Intel’s TIM is forcing those companies to border on the cooler spec or forcing higher-end cooler purchases, thus potentially losing competitive edge.
If Intel is going to listen to anyone, it’s going to be OEMs.