Explaining Meltdown, Spectre Vulnerabilities & What Happens Next
Posted on January 4, 2018
There’s been a lot of talk of an “Intel bug” lately, which got our attention when our Twitter, email, and YouTube accounts exploded. The “bug” that has been discussed most commonly refers to a new attack vector, named “Meltdown,” that can break the isolation boundaries of virtual environments, including virtual machines and virtual memory. This attack is known primarily to affect Intel at this time, with indeterminate effect on AMD and ARM. Another attack, “Spectre,” works through side channels in speculative execution and branch prediction, and is capable of fetching sensitive user information that is stored in physical memory. Both attacks are severe, and between the two of them, nearly every CPU on the market is affected in at least some capacity. The severity of the impact remains to be seen, and will largely be revealed when the embargo lifts on January 9th, at which time the companies will all be discussing solutions and shortcomings.
For this content piece, we’re focusing on coverage from a strict journalism and reporting perspective, as security and low-level processor exploits are far outside of our area of expertise. That said, a lot of you wanted to know our opinions or thoughts on the matter, so we decided to compile a report of research from around the web. Note that we are not providing opinion here, just facts, as we are not knowledgeable enough in the subject matter to hold strong opinions (well, outside of “this is bad”).
The Short Version
The shortest version of the issue is this: Linux and Windows operating systems are undergoing major reworks to cope with security vulnerabilities that are present on the last ten years of Intel CPUs, with the Spectre exploit discovered on AMD, ARM, and Intel CPUs. Everyone is affected in at least some capacity, but the exploits affecting each vendor vary. The hardware itself is not physically compromised, but software changes are required to close the security holes that are present. The concern from the community has been how those security updates will impact performance, as initial reports from GR Security indicated a performance deficit of between 5% and 29% in some tasks. Note that the commonly-cited 29% number was specifically derived from an i7-6700 test bench with page-table isolation, one of the proposed hardening techniques for increasing security. This was specifically found with a RAP Linux benchmark, and that number is not a blanket number for all performance changes. We’ll get to that momentarily.
What is the vulnerability?
Project Zero, a Google team, reports that there is a system call issue with the kernel, which could lead to security vulnerabilities when virtual memory allocation is read. Project Zero reported this issue to not only Intel, but also AMD and ARM, back in June of 2017. There are two separate attacks that have been developed around this security vulnerability, codenamed “Meltdown” and “Spectre.” Meltdown is a breakout attack, capable of exiting the confines of virtual environments, and Spectre is a speculative execution attack. Meltdown is the worse of the two attacks, and is known presently to affect Intel, with indeterminate effect on AMD and ARM. The team is still researching. Spectre affects Intel, AMD, and ARM alike.
Both attacks are capable of intercepting user data that is currently being read, particularly when involving virtual memory allocation. These attacks give access to data stored in memory, which could include passwords, usernames, and other data that is actively moving between memory and the CPU. This is particularly concerning from an enterprise or server standpoint, as one of the attacks leverages branch prediction exploits to, as a KVM guest, read the memory of a host kernel. This has severe implications for virtual machine users, primarily those who may slice servers into virtualized environments for customer use. Meltdown, for example, is capable of giving an attacker read access to all contents of the physical memory on the machine, breaking out of the boundaries of virtual machines.
Project Zero notes that Meltdown, quote, “breaks the mechanism that keeps applications from accessing arbitrary system memory,” and that Spectre “tricks other applications into accessing arbitrary memory locations in their memory,” stating further that “both attacks use side channels to obtain information from the accessed memory location.” The researchers behind Meltdown and Spectre have published papers on these exploits, and have also published an FAQ for consumers. The very first question is simple: “Am I affected by the bug?” The answer is even simpler: “Most certainly, yes.”
The researchers note that the Meltdown exploit has worked on Intel CPUs dating back to 2011, and have also noted that they are not yet clear on whether the Meltdown exploit will work on ARM and AMD processors. When we asked Intel for a statement, we were sent this page, where the company alleges that these exploits may also affect AMD and ARM. We have asked AMD for a statement, but are currently in a holding pattern. AMD did, however, publish its own short note about these exploits. As of now, AMD has mostly noted that they’re aware of the vulnerabilities, and that the company is investigating further. Critically, AMD notes that they do not think these exploits have been used in “the public domain,” though the Meltdown researchers state that they are uncertain whether Meltdown-like attacks have been deployed.
As for the Spectre attack, the team notes that this exploit has been verified on Intel, AMD, and ARM processors, and notes that it will work against nearly every type of computer – including smartphones and cloud servers.
Google has confirmed that Android is affected and has issued a security advisory about the attack. Intel and AMD have each issued a statement as well. Neither of the latter two has much information, while Google has published some of the most detail on the subject, if you are interested in further reading.
Why Did This Happen?
The Meltdown whitepaper points to a root cause in how CPUs handle branch prediction, particularly speculative execution, which is the foundation for Spectre’s name. Speculative execution is something we have talked about before, primarily with GPU architectures. With branch prediction, a CPU or GPU guesses which commands are coming and executes them before those commands are explicitly issued. The idea is to reduce wait time: maybe the command never comes, and so the work is wasted, but the potential upside is worth it, as pipelines are sped up and tasks can execute with lower latency. In the instance of this attack, some data that should be protected can get left behind in L1 cache. The exploit is able to gain access to this orphaned information, giving attackers access to potentially sensitive data.
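As a rough illustration of the idea above, the Python sketch below models speculative work leaving a trace in a cache. It is a toy simulation under our own assumptions, not exploit code and not how any real CPU is implemented: there is no real cache or timing measurement, and the names `ToyCPU`, `speculative_read`, and `recover_secret` are hypothetical.

```python
# Toy model only: no real CPU, cache, or timing is involved.
SECRET = 42  # hypothetical protected byte the attacker must not read directly


class ToyCPU:
    def __init__(self):
        self.cache = set()  # probe lines that are currently "cached"

    def speculative_read(self, allowed):
        """Model a guarded read that starts before the permission check resolves."""
        # Speculation: the secret-dependent load runs first and touches the cache.
        self.cache.add(SECRET)
        if not allowed:
            # The check fails, so the architectural result is thrown away,
            # but the cache line touched above is not evicted.
            return None
        return SECRET

    def access_time(self, line):
        """Pretend timing: cached lines are fast, everything else is slow."""
        return 1 if line in self.cache else 100


def recover_secret(cpu):
    """Attacker probes all 256 possible lines and picks the fastest one."""
    timings = {line: cpu.access_time(line) for line in range(256)}
    return min(timings, key=timings.get)


cpu = ToyCPU()
result = cpu.speculative_read(allowed=False)  # the read itself is denied
leaked = recover_secret(cpu)                  # the cache state still reveals it
print(result, leaked)  # prints: None 42
```

The point of the sketch is that the denied read returns nothing, yet the "orphaned" cache state is enough to infer the value through a side channel.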
Spectre is interesting: This one is able to attack userspace in virtual machines. The whitepaper details an example where JavaScript code running inside of a Google Chrome browser could be leveraged to read data sent through Chrome, like reading field or form inputs on webpages. This attack can be deployed through JavaScript download, as we understand it, which would mean an ad network compromise could have a disastrous effect. This has already been tested as successful on the Chrome browser, and could theoretically work on other browsers.
This issue doesn’t come down to sacrificing security for speed, either: Kernel-level security would indicate that memory should be protected by other models, like address-space layout randomization. GN Patreon backer Steve Streza was able to provide a great example of this. Quoting Steve: “With address-space layout randomization, or ASLR, [the system] basically re-links a program at random locations at launch time, so you can’t just say ‘give me the memory at 0xDEADBEEF because I know that’s the user’s password,’ as well as Kernel ASLR, or KASLR, which does the same thing for the Kernel’s memory. Since the kernel’s memory isn’t protected anymore, it can be read at will by the attacker. It’s not a speed versus security thing; the security was supposed to be handled elsewhere.”
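To make the ASLR part of that quote more concrete, here is a minimal Python sketch of why a hard-coded address stops being useful once the load address is randomized. Everything in it is an assumption for illustration: the page size, the fixed base `0x400000`, the offset, the placeholder password, and the `launch` and `read` helpers are hypothetical and do not reflect how any operating system actually lays out memory.

```python
import random

PAGE = 0x1000            # assumed 4 KiB alignment for the toy layout
FIXED_BASE = 0x400000    # assumed load address when randomization is off
PASSWORD_OFFSET = 0x2A0  # hypothetical offset of a secret inside the program


def launch(rng, aslr=True):
    """Return a toy memory map {address: value} for one program launch."""
    if aslr:
        # Pick a random page-aligned base for this launch.
        base = rng.randrange(0x100, 0x100000) * PAGE
    else:
        # Without ASLR the program always loads at the same address.
        base = FIXED_BASE
    return {base + PASSWORD_OFFSET: "hunter2"}


def read(memory, address):
    """Attacker's read at a hard-coded address; None means nothing useful there."""
    return memory.get(address)


rng = random.Random(0)  # seeded so runs are repeatable
known_address = FIXED_BASE + PASSWORD_OFFSET

hits_without_aslr = sum(
    read(launch(rng, aslr=False), known_address) is not None for _ in range(1000)
)
hits_with_aslr = sum(
    read(launch(rng, aslr=True), known_address) is not None for _ in range(1000)
)
print(hits_without_aslr, hits_with_aslr)
```

In this toy model, the fixed-address guess works on every launch without randomization and should almost never work with it. The concern raised in the quote is that an attacker who can read kernel memory at will no longer needs to guess.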
What Happens Next?
Expect a large dump of information on January 9th, which is when the embargo lifts on everything that’s been kept behind doors. This will be the next major milestone, and will be a point at which the more general consumer community should obtain a better understanding of what’s going on and if it changes the way their PCs perform. The most immediate steps are being taken by the hardware and software manufacturers, with Microsoft fast-tracking updates to Windows for security. This is a software-level solution, as the hardware itself is not physically compromised. If you own affected CPUs, and you do – even if it’s Spectre affecting AMD – you can expect software-level patches to help resolve some of these concerns.
ARM has already developed software solutions that can mitigate the effect of Spectre. Linux kernel virtual memory systems are already being overhauled. Microsoft is working toward a January 9th patch, and has already issued improvements to fast-ring users. The question is whether or not any of these fixes will impact performance.
Performance Claims
Early claims by GR Security have gone through a game of telephone. Initial reports showed potential for 5% to 30% performance deficit in some specific tasks, with different tasks suffering in different ways. The performance loss comes by way of introducing more latency, done by nature of adding more layers of security. What this does not mean, however, is that every CPU in every application will instantly be 30% slower. Most of the major performance slowdowns have been reported with enterprise-level software, not necessarily consumer-level software. Phoronix, for instance, has already published some preliminary gaming benchmarks on Linux, and has shown performance deltas that are largely within margin of error. This is only one operating system with a few games, of course, so there is room for other games or for Windows to be impacted in greater ways.
What we’re most curious about is the impact to workstation and production-type applications, as they straddle the line between consumer and enterprise. We’re also unclear on whether these performance losses can be mitigated with more time; for example, if these are panicked fixes that were thrown in place, they may be inefficient solutions.
This isn’t to downplay the significance of the exploit, nor the significance of lost performance, but to restore some sanity to the discussion: When citing numbers, know what they represent – it’s not just a flat 30% hit to every application on every CPU, for instance.
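One way to see why a single benchmark number does not generalize is a back-of-the-envelope estimate like the Python sketch below. It assumes, purely for illustration, that the added latency only applies to the share of runtime an application spends crossing into the kernel; the `overall_slowdown` helper and the workload fractions are hypothetical, not measured data, and "slowdown" here means increase in runtime.

```python
def overall_slowdown(kernel_fraction, kernel_overhead):
    """Estimate whole-application runtime increase from a kernel-side overhead.

    kernel_fraction: share of runtime spent in kernel transitions (0.0 to 1.0)
    kernel_overhead: extra cost added to that share (0.30 means 30% longer)
    """
    # Time outside the kernel is unchanged; only the kernel share gets slower.
    new_runtime = (1.0 - kernel_fraction) + kernel_fraction * (1.0 + kernel_overhead)
    return new_runtime - 1.0


# Hypothetical workloads; the fractions are illustrative, not measured.
workloads = {
    "syscall-heavy benchmark": 0.90,
    "database server": 0.40,
    "game": 0.02,
}

for name, fraction in workloads.items():
    print(f"{name}: {overall_slowdown(fraction, 0.30):.1%} slower")
```

Under these assumed fractions, the same 30% kernel-side overhead works out to roughly 27%, 12%, and 0.6% respectively, which is why the workload matters as much as the headline figure.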
We are waiting on further performance testing and information. The updates will coincide with CES, which makes it difficult to get hands-on until after the show.
Editorial: Steve Burke
Video: Andrew Coleman