Security products nowadays tend to focus on threats that operate at the high levels of the software stack, and they achieve reasonably good results. However, this leaves them unable to see what’s going on in the dark waters of firmware. If an attacker has already gained privileged access to the system and installed a firmware implant, these products are useless.
Very few security products examine firmware, and those that do only do so from the operating system level, detecting the presence of implants only after they’ve successfully installed and compromised the system. More complex implants can also use their privileged position in the system to avoid detection and subvert OS-level security products.
For these reasons, firmware rootkits and implants are one of the most dangerous threats to PCs, and they pose an even bigger threat to modern cloud platforms, where a single misconfigured or compromised guest operating system endangers all other guests, exposing their memory to malicious manipulation.
Detecting firmware anomalies is a difficult technical challenge for many reasons. The UEFI firmware codebases provided by various vendors are all different, and the existing methods of detecting anomalies aren’t effective in every case. Attackers can also use both the false positives and false negatives of a detection scheme to their advantage, and they can even take over the interfaces that OS-level detection algorithms use to access and examine the firmware.
The only viable way to protect against firmware rootkits is to prevent their installation: detection and other after-the-fact mitigations aren't enough, so we have to block the possible infection vectors. Solutions for detecting or preventing firmware threats are fully effective only when the developer controls both the software and hardware stacks, as Apple and Microsoft do. Third-party solutions will always have blind spots.
In this chapter, we’ll outline most of the known vulnerabilities and exploitation vectors used for infecting UEFI firmware. We’ll first examine the vulnerable firmware, classify types of firmware weaknesses and vulnerabilities, and analyze existing firmware security measures. We will then describe vulnerabilities in Intel Boot Guard, SMM modules, the S3 Boot Script, and the Intel Management Engine.
We’ll begin by going over the specific firmware that attackers could target with a malicious update, since updates are the most effective method of infection.
Vendors will typically describe UEFI firmware updates broadly as BIOS updates, because the BIOS is the main firmware included, but a typical update also delivers many other kinds of embedded firmware to the various hardware units inside the motherboard, or even the CPU.
A compromised BIOS update destroys the integrity guarantees for all other firmware updates managed by the BIOS (some of these updates, like Intel microcode, have additional authentication methods and don’t rely solely on the BIOS), so any vulnerability that bypasses authentication for a BIOS update image also opens the door for the delivery of malicious rootkits or implants to any of these units.
Figure 16-1 shows the typical firmware units managed by the BIOS, all of which are susceptible to malicious BIOS updates.
Figure 16-1: Overview of different firmware in modern x86-based computers
Here are brief descriptions of each type of firmware:
Power Management Unit (PMU) A microcontroller that controls the power functions and transitions of a PC between different power states, such as sleep and hibernate. It contains its own firmware and a low-power processor.
Intel Embedded Controller (EC) A microcontroller that is always on. It supports multiple features, such as turning the computer on and off, processing signals from the keyboard, calculating thermal measurements, and controlling the fan. It communicates with the main CPU over ACPI, SMBus, or shared memory. The EC, along with the Intel Management Engine described shortly, can function as a security root of trust when the System Management Mode is compromised. The Intel BIOS Guard technology (vendor-specific implementations), for example, uses the EC to control the read/write access to SPI flash.
Intel Integrated Sensor Hub (ISH) A microcontroller responsible for sensors, such as device rotation detectors and automatic backlight adjustors. It can also be responsible for some low-power sleep states for those sensors.
Graphics Processing Unit (GPU) An integrated graphics processor (iGPU) that is part of the Platform Controller Hub (PCH) design in most modern Intel x86-based computers. GPUs have their own advanced firmware and computing units focused on generating graphics, such as shaders.
Intel Gigabit Network Intel-integrated ethernet network cards for x86-based computers are represented as PCIe devices connected to PCH and contain their own firmware, delivered via BIOS update images.
Intel CPU Microcode The CPU’s internal firmware: the interpretive layer that implements the programmer-visible instruction set architecture (ISA). Most instructions are implemented in microcode, though some are wired more deeply into the hardware. Intel microcode is a layer of hardware-level instructions that implements higher-level machine code instructions and the internal state machine sequencing in many digital processing elements.
Authenticated Code Module (ACM) A signed binary blob that Intel microcode loads and executes within protected internal CPU memory, called Authenticated Code RAM (ACRAM), or Cache-as-RAM (CAR). This fast memory is initialized early in the boot process and functions as regular RAM before the main RAM is activated; it is later repurposed for general-purpose caching. Early boot ACM code (such as the Intel Boot Guard ACM) runs there before the reset-vector BIOS code, and additional ACMs can be loaded later in the boot process. An ACM is an RSA-signed binary blob with a header that defines its entry point. Modern Intel computers can have multiple ACMs for different purposes, but they are mostly used to support additional platform security features.
Intel Management Engine (ME) A microcontroller that provides the root-of-trust functionality for multiple security features developed by Intel, including the software interface to the firmware Trusted Platform Module, or fTPM (usually the TPM is a specialized chip on an endpoint device for hardware-based authentication that also contains separate firmware of its own). Since the sixth generation of Intel CPUs, the Intel ME has been an x86-based microcontroller.
Intel Active Management Technology (AMT) The hardware and firmware platform used for managing personal computers and servers remotely. It provides remote access to monitors, keyboards, and other devices. AMT is essentially Intel’s chipset-based counterpart to the Baseboard Management Controller technology (discussed next) for client-oriented platforms, integrated into the Intel ME.
Baseboard Management Controller (BMC) A set of computer interface specifications for an autonomous computer subsystem that provides management and monitoring capabilities independently of the host system’s CPU, UEFI firmware, and real-time operating system. The BMC is usually implemented on a separate chip with its own ethernet network interface and firmware.
System Management Controller (SMC) A microcontroller on the logic board that controls the power functions and sensors. It’s most commonly found in computers produced by Apple.
Every firmware unit is an opportunity for an attacker to store and execute code, and all units depend on one another to maintain their integrity. As an example, Alex Matrosov identified an issue in recent Gigabyte hardware wherein the ME allowed its memory regions to be written to and read from the BIOS. When combined with a weak Intel Boot Guard configuration, this issue allowed him to bypass the hardware’s Boot Guard implementation completely. (See CVE-2017-11313 and CVE-2017-11314 for more information about this vulnerability, which the vendor has since confirmed and patched.) We’ll discuss implementations of Boot Guard and possible ways to bypass them later in this chapter.
The primary objective of a BIOS rootkit is to maintain a persistent and stealthy infection, just like the kernel-mode rootkits and MBR/VBR bootkits described in the book so far. However, a BIOS rootkit may have additional interesting goals. It might, for instance, try to temporarily gain control of the System Management Mode (SMM) or nonprivileged Driver Execution Environment (DXE; executed outside of SMM) to conduct hidden operations with memory or the filesystem. Even a nonpersistent attack executed from the SMM can bypass security boundaries in modern Windows systems, including virtualization-based security (VBS) and instances of virtual machine guests.
Before digging into the vulnerabilities, let’s classify the kinds of security flaws a BIOS implant installation might target. All the classes of vulnerabilities shown in Figure 16-2 can help an attacker violate security boundaries and install persistent implants.
Intel researchers first attempted to classify UEFI firmware vulnerabilities according to the potential impact of an attack on that vulnerability. They presented their classifications at Black Hat USA 2017 in Las Vegas in their talk “Firmware Is the New Black—Analyzing Past Three Years of BIOS/UEFI Security Vulnerabilities” (https://www.youtube.com/watch?v=SeZO5AYsBCw), which covered different classes of security issues as well as some mitigations. One of its most important contributions is the statistics on the growth in the total number of security issues processed by Intel PSIRT.
We have a different classification of security issues related to UEFI firmware that focuses on the impact of firmware rootkits, shown in Figure 16-2.
NOTE
The threat model represented in Figure 16-2 covers only flows related to UEFI firmware, but the scope of security issues for Intel ME and AMT is increasing significantly. Additionally, in the past few years, the BMC has emerged as a very important security asset for remote management server platforms and is getting a lot of attention from researchers.
Figure 16-2: A classification of BIOS vulnerabilities useful for installing BIOS implants
We can categorize the vulnerability classes proposed in Figure 16-2 by how they are used, giving us two major groups: post-exploitation and compromised supply chain.
Post-exploitation vulnerabilities are usually used as the second stage in delivering malicious payloads (this exploitation scheme is explained in Chapter 15). This is the main category of vulnerabilities that attackers take advantage of to install both persistent and non-persistent implants after they’ve successfully exploited earlier stages of the attack. The following are the main classes of implants, exploits, and vulnerabilities in this category.
Secure Boot bypass Attackers compromise the Secure Boot process by exploiting the root of trust (that is, achieving a full compromise) or another vulnerability in one of the boot stages. Secure Boot bypasses can occur at different boot stages, and the attacker can leverage them against all subsequent layers and their trust mechanisms.
SMM privilege escalation SMM has a lot of power on x86 hardware, and almost all privilege escalation issues for SMM end up as code execution issues. Privilege escalation to SMM is often one of the final stages of a BIOS implant installation.
UEFI firmware implant A UEFI firmware implant is the final stage of a persistent BIOS implant installation. The attacker can install the implant on various levels of the UEFI firmware, either as a modified legitimate module or as a stand-alone DXE or PEI driver, as we’ll discuss later.
Persistent implant A persistent implant is one that can survive full reboot and shutdown cycles. In some cases, in order to survive the post-update process, it can modify BIOS update images before those updates are installed.
Non-persistent implant A non-persistent implant is one that doesn’t survive full reboot and shutdown cycles. These implants can still provide privilege escalation and code execution inside an OS protected by hardware virtualization (such as Intel VT-x) and trusted execution layers (such as Microsoft VBS). They can also be used as covert channels to deliver malicious payloads to the kernel mode of the operating system.
Compromised supply chain attacks take advantage of mistakes made by the BIOS development team or the OEM hardware vendor, or they involve deliberate misconfigurations of the target software that provide attackers with a deniable bypass of the platform’s security features.
In supply chain attacks, an attacker gets access to the hardware during its production and manufacturing processes and injects malicious modifications to the firmware or installs malicious peripheral devices before the hardware ever gets to the consumer. Supply chain attacks can also happen remotely, as when an attacker gains access to the firmware developer’s internal network (or sometimes a vendor website) and delivers malicious modifications directly into the source code repository or build server.
Supply chain attacks with physical access involve covertly meddling with the target platform; they resemble evil maid attacks in that the attackers have physical access for a limited time, during which they exploit a supply chain vulnerability. These attacks take advantage of situations in which the hardware’s owner can’t monitor physical access to the hardware—such as when the owner leaves a laptop in a checked bag, surrenders it for a foreign customs inspection, or simply forgets it in a hotel room. An attacker can use these opportunities to misconfigure hardware and firmware to deliver BIOS implants, or to simply flash malicious firmware directly to the SPI flash chip.
Most of the following issues apply to supply chain and evil maid attack scenarios.
Misconfigured protections By attacking the hardware or firmware during the development process or post-production stage, an attacker can misconfigure protection technologies so that they can be easily bypassed later.
Nonsecure root of trust This vulnerability involves compromising the root of trust from the operating system via its communication interfaces with firmware (SMM, for example).
Malicious peripheral devices This kind of attack involves implanting peripheral devices during the production or delivery stages. Malicious devices can be used in multiple ways, such as for Direct Memory Access (DMA) attacks.
Implanted BIOS updates An attacker may compromise a vendor website or another remote update mechanism and use it to deliver an infected BIOS update. The points of compromise can include the vendor’s build servers, developer systems, or stolen digital certificates with the vendor’s private keys.
Unauthenticated BIOS update process Vendors may break the authentication process for BIOS updates, whether intentionally or not, allowing attackers to apply any modifications they want to the update images.
Outdated BIOS with known security issues BIOS developers might continue to build on older, vulnerable versions of the BIOS codebase even after the underlying code has been patched upstream, leaving the firmware open to known attacks. An outdated BIOS version originally delivered by the hardware vendor is also likely to persist, without updates, on users’ PCs and data center servers. This is one of the most common security failures involving BIOS firmware.
It’s very hard to mitigate risks related to supply chains without making radical changes to the development and production lifecycles. The typical production client or server platform includes a lot of third-party components, in both software and hardware. Most companies that don’t own their full production cycle don’t care too much about security, nor can they really afford to.
The situation is exacerbated by the general lack of information and resources related to BIOS and chipset security configuration. The NIST 800-147 (“BIOS Protection Guidelines”) and NIST 800-147B (“BIOS Protection Guidelines for Servers”) publications are a useful starting point, but they have grown increasingly outdated since their initial release in 2011 and the server update in 2014.
Let’s dive into the details of some UEFI firmware attacks to fill some of these gaps in widespread knowledge.
In this section, we’ll go over some classes of vulnerabilities that allow an attacker to bypass Secure Boot; we’ll discuss specific Secure Boot implementation details in the next chapter.
Previously, any security issue that allowed the attacker to execute code in the SMM environment could bypass Secure Boot. Though some modern hardware platforms, even with recent hardware updates, are still vulnerable to SMM-based Secure Boot attacks, most enterprise vendors have shifted to using the newest Intel security features, which make these attacks harder. Today’s Intel technologies, such as Intel Boot Guard and BIOS Guard (both of which will be discussed later in this chapter), move the boot process’s root of trust from SMM to a more secure environment: the Intel ME firmware/hardware.
The first version of UEFI Secure Boot was introduced in 2012. Its main components included a root of trust implemented in the DXE boot phase (one of the latest stages in UEFI firmware boot, just before the OS receives control). That meant this early implementation of Secure Boot only really ensured the integrity of the OS bootloaders, not the BIOS itself.
Soon the weaknesses of this design became clear, and in the next implementation, the root of trust was moved to PEI, an early platform initialization stage, where it was locked before DXE. That security boundary also proved weak. Since 2013, with the release of the Intel Boot Guard technology, the root of trust has been anchored in hardware: the hash of the OEM public key is locked into field-programmable fuses (FPFs) located in the motherboard chipset (the PCH component, programmable via ME firmware), while boot measurements can be anchored in the TPM chip (or equivalent functionality implemented in ME firmware to reduce the cost of support).
Before we dig into the history of the relevant exploitations that motivated these redesigns, let’s discuss how basic BIOS protection technologies work.
Figure 16-3 shows a high-level view of the technologies used to protect persistent SPI flash storage. The SMM was originally allowed both read and write access to SPI flash storage as a means of implementing routine BIOS updates. This meant the integrity of the BIOS depended on the quality of every piece of code running in SMM, since any such code could modify the BIOS in SPI storage; the security boundary was therefore only as strong as the weakest code ever run in SMM with access to memory outside it. As a result, platform developers took steps to separate BIOS updates from the rest of the SMM functionality, introducing a series of additional security controls, such as Intel BIOS Guard.
Figure 16-3: High-level representation of BIOS security technologies
We discussed some of the controls shown in Figure 16-3 in “(In)Effectiveness of Memory Protection Bits” on page 263: the BIOS Control Bit Protection (BIOS_CNTL), the Flash Configuration Lock-Down (FLOCKDN), and the SPI flash Write Protection (PRx). However, the BIOS_CNTL protections are effective only against an attacker attempting to modify the BIOS from the OS, and they can be bypassed by any code execution vulnerability from SMM (SMI handlers accessible from outside), as SMM code can freely change these protection bits. Basically, BIOS_CNTL only creates an illusion of security.
The PRx control is more effective because its policies can’t be changed from the SMM. However, as we’ll discuss shortly, many vendors don’t use PRx protections—including Apple and, surprisingly, Intel, the inventor of this protection technology.
Table 16-1 summarizes the state of active protection technologies based on security lock bits on x86-based hardware used by popular vendors as of January 2018. Here, RP indicates read protections and WP write protections.
Table 16-1: Security Level of Popular Hardware Vendors
Vendor name | BLE        | SMM_BWP    | PRx        | Authenticated update
ASUS        | Active     | Active     | Not active | Not active
MSI         | Not active | Not active | Not active | Not active
Gigabyte    | Active     | Active     | Not active | Not active
Dell        | Active     | Active     | RP/WP      | Active
Lenovo      | Active     | Active     | RP         | Active
HP          | Active     | Active     | RP/WP      | Active
Intel       | Active     | Active     | Not active | Active
Apple       | Not active | Not active | WP         | Active
As you can see, vendors differ wildly in their approaches to BIOS security. Some of these vendors don’t even authenticate BIOS updates, which is a serious security concern because it makes installing implants far easier (unless the vendor enforces Intel Boot Guard policies).
Moreover, PRx protections must be configured correctly to be effective. Listing 16-1 shows an example of poorly configured flash regions with all PRx segment definitions set to zero, rendering them useless.
[*] BIOS Region: Base = 0x00800000, Limit = 0x00FFFFFF
SPI Protected Ranges
------------------------------------------------------------
PRx (offset) | Value | Base | Limit | WP? | RP?
------------------------------------------------------------
PR0 (74) | 00000000 | 00000000 | 00000000 | 0 | 0
PR1 (78) | 00000000 | 00000000 | 00000000 | 0 | 0
PR2 (7C) | 00000000 | 00000000 | 00000000 | 0 | 0
PR3 (80) | 00000000 | 00000000 | 00000000 | 0 | 0
PR4 (84) | 00000000 | 00000000 | 00000000 | 0 | 0
Listing 16-1: Poorly configured PRx access policies (dumped by Chipsec tool)
We’ve also seen some vendors configure policies for read protection only, which still allows the attacker to modify SPI flash. Furthermore, PRx doesn’t guarantee any type of integrity measurements on the actual contents of SPI, as it only implements bit-based locking of direct read/write access in the very early PEI stage of the boot process.
The reason vendors like Apple and Intel tend to disable PRx protections is that these protections force an immediate reboot, making BIOS updates less convenient. Without PRx protections, a vendor’s BIOS update tool can write the new BIOS image into a free region of physical memory using OS APIs and then trigger an SMI, so that helper code residing in the SMM can take the image from that region and write it into SPI flash. The updated SPI flash image takes control on the next reboot, but that reboot can occur in the future at the user’s convenience.
When PRx is enabled and configured correctly to protect the appropriate regions of the SPI from modifications made by SMM code, the BIOS update tool can no longer use the SMM to modify the BIOS. Instead, it must store the update image in dynamic random access memory (DRAM) and trigger an immediate reboot. The helper code that installs the update must be part of a special early boot-stage driver, which runs before the PRx protections are activated and transfers the update image from DRAM to SPI. This method of update requires either a reboot right when the tool runs or a direct call to the SMI handler without a reboot, which is a lot less convenient for the user.
No matter which route the BIOS updater takes, it’s critical that the helper code authenticate the update image before installing it. Otherwise, PRx or no PRx, reboot or no reboot, the helper code will happily install an altered BIOS image with an implant, so long as the attacker manages to modify it at some point before the helper runs. As Table 16-1 shows, some hardware vendors don’t authenticate firmware updates, making the attacker’s job as easy as tampering with the update image.
In September 2018, the antivirus company ESET released a research report about LOJAX, a rootkit that attacked UEFI firmware from the OS.1 All of the techniques used by the LOJAX rootkit were well-known at the time of the attack, having been used in other discovered malware over the previous five years. LOJAX used tactics similar to those of the Hacking Team’s UEFI rootkit: it abused the unauthenticated Computrace components stored in the NTFS, as we discussed in Chapter 15. Thus, the LOJAX rootkit doesn’t use any new vulnerabilities; its only novelty is in how it infects the targets—it checks the systems for unauthenticated access to the SPI flash and, finding it, delivers a modified BIOS update file.
Loose approaches to BIOS security present plenty of opportunities for attacks. An attacker can scan a system at runtime to find the right vulnerable targets and the right infection vector, both of which are plentiful. The LOJAX rootkit infector checked for several protections, including the BIOS Lock Bit (BLE) and the SMM BIOS Write Protection Bit (SMM_BWP). If the firmware hadn’t been authenticated, or if it hadn’t checked the integrity of a BIOS update image before transferring it to SPI storage, the attacker could deliver modified updates directly from the OS. LOJAX used the Speed Racer vulnerability (VU#766164, originally discovered by Corey Kallenberg in 2014) to bypass SPI flash protection bits via a race condition. You can detect this vulnerability and other weaknesses related to BIOS lock protection bits with the chipsec_main -m common.bios_wp command.
This example shows that a security boundary is only as strong as its weakest component. No matter what other protections the platform may have, Computrace’s loose handling of code authentication undermined them, reenabling the OS-side attack vector that the other protections sought to eliminate. It only takes one breach of a sea wall to flood the plains.
How does Secure Boot change this threat landscape? The short answer is, it depends on its implementation. Older versions, implemented before 2016 without Intel Boot Guard and BIOS Guard technologies, will be in danger, because in these old implementations, the root of trust is in the SPI flash and can be overwritten.
Boot Guard and BIOS Guard, more recent additions to Secure Boot, address this weakness: Boot Guard moved the root of trust from SPI into hardware, and BIOS Guard moved the task of updating the contents of the SPI flash from SMM to a separate chip (the Intel Embedded Controller, or EC) and removed the permissions that allowed the SMM to write to the SPI flash.
Another consideration for moving the root of trust earlier in the boot process, and into hardware, is minimizing the boot time of a trusted platform. You could imagine a boot protection scheme that would verify digital signatures over dozens of individual available EFI images rather than a single image that includes all the drivers. However, this would be too slow for today’s world, in which platform vendors look to shave milliseconds off the bootup time.
At this point, you might be asking: with so many moving parts involved in the Secure Boot process, how can we avoid situations in which a trivial bug destroys all of its security guarantees? (We’ll cover the full process of Secure Boot in Chapter 17.) The best answer, to date, is to have tools that make sure every component plays its appointed role and that every stage of the boot process takes place in the exact intended order. That is to say, we need a formal model of the process that automated code analysis tools can validate—and that means that the simpler the model, the more confidence we have that it will be checked correctly.
Secure Boot relies on a chain of trust: the intended execution path begins with the root of trust locked into the hardware or SPI flash storage and moves through the stages of the Secure Boot process, which can proceed only in a particular order and only if all of the conditions and policies at every stage are satisfied.
Formally speaking, we call this model a finite state machine, where different states represent different stages of the system boot process. If any of the stages has nondeterministic behavior—for example, if a stage can switch the boot process into a different mode or have multiple exits—our Secure Boot process becomes a nondeterministic finite state machine. This makes the task of automatically verifying the Secure Boot process significantly harder, because it exponentially increases the number of execution paths we must verify. In our opinion, nondeterministic behavior in Secure Boot should be regarded as a design mistake that is likely to lead to costly vulnerabilities, as in the case of the S3 Boot Script vulnerability discussed later in this chapter.
In this section, we’ll discuss how Intel Boot Guard technology works, then explore some of its vulnerabilities. Although Intel has no publicly available official documentation about Boot Guard, our research and that of others allow us to paint a coherent picture of this remarkable technology.
Boot Guard divides Secure Boot into two phases: in the first phase, Boot Guard authenticates everything located in the BIOS section of the SPI storage, and in the second phase, Secure Boot handles the rest of the boot process, including authentication of the OS bootloader (Figure 16-4).
Figure 16-4: The boot process with active Intel Boot Guard technology
The Intel Boot Guard technology spans several levels of the CPU architecture and the related abstractions. One benefit is that it doesn’t need to trust the SPI storage, so it avoids the vulnerabilities we discussed earlier in this chapter. Boot Guard separates integrity checking of the BIOS stored in the SPI flash from the BIOS itself by using the Authenticated Code Module (ACM), which is signed by Intel, to verify the integrity of the BIOS image before allowing it to execute. With Boot Guard activated on a platform, the root of trust moves inside the Intel microarchitecture: the CPU’s microcode parses the ACM, verifies its digital signature, and then executes the verification routines implemented in the ACM, which in turn check the BIOS signature.
By contrast, the original UEFI Secure Boot root of trust resided in the UEFI DXE phase, almost the last one before control is passed to the OS bootloader—which is, as we’ve mentioned before, very late in the game. If UEFI firmware is compromised at the DXE stage, an attacker can completely bypass or disable Secure Boot. Without hardware-assisted verification, there is no way to guarantee the integrity of the boot process stages that take place before the DXE phase (PEI implementation also has confirmed weaknesses), including the integrity of the DXE drivers themselves.
Boot Guard addresses this problem by moving the root of trust for Secure Boot from the UEFI firmware to the hardware itself. For example, Verified Boot—the Boot Guard mode that Intel introduced in 2013, which we’ll discuss in more detail in the next chapter—locks the hash of an OEM public key within the field-programmable fuse (FPF) store. The FPFs can be programmed only once, and the hardware vendor locks the configuration by the end of the manufacturing process (in some cases this can be revoked, but because these are edge cases, we won’t discuss them here).
Boot Guard’s efficacy depends on all of its components working together, with no layer containing any vulnerabilities for the attacker to execute code or to elevate privileges in order to interfere with other components of the multilayer Secure Boot scheme. Alex Matrosov’s “Betraying the BIOS: Where the Guardians of the BIOS Are Failing” (https://www.youtube.com/watch?v=Dfl2JI2eLc8), presented at Black Hat USA 2017, revealed that an attacker could successfully target the scheme by interfering with the bit flags set by the lower levels to pass the information about their state of integrity to the upper levels.
As has been demonstrated, firmware alone cannot be trusted, because most SMM attacks can compromise it. Even the Measured Boot scheme, which relies on the TPM as its root of trust, can be compromised: although SMM cannot change the keys stored in the TPM hardware, the measuring code itself runs in SMM and can in many cases be modified from SMM. Some attacks on the TPM chip itself are possible, but attackers wielding SMM privileges don’t need them; they can simply attack the firmware’s interfaces to the TPM. In 2013, Intel introduced Verified Boot, which we just mentioned, to address this Measured Boot weakness.
The Boot Guard ACM verification logic measures the initial boot block (IBB) and checks its integrity before passing control to the IBB entry point. If IBB verification fails, the boot process is generally interrupted, depending on the enforcement policy. The IBB part of the UEFI firmware (BIOS) executes on the main CPU in its normal mode (neither isolated nor authenticated). The IBB then continues the boot process, following the Boot Guard policies in verified or measured mode, into the platform initialization phase: the PEI drivers verify the integrity of the DXE drivers and transition the chain of trust to the DXE phase, which then continues the chain of trust to the operating system bootloader. Table 16-2 presents research data about the state of security in each of these stages across various hardware vendors.
Table 16-2: How Different Hardware Vendors Configure Security (as of January 2018)
Vendor name        | ME access          | EC access          | CPU debugging (DCI) | Boot Guard        | Forced Boot Guard ACM | Boot Guard FPF | BIOS Guard
ASUS VivoMini      | Disabled           | Disabled           | Enabled             | Disabled          | Disabled              | Disabled       | Disabled
MSI Cubi2          | Disabled           | Disabled           | Enabled             | Disabled          | Disabled              | Disabled       | Disabled
Gigabyte Brix      | Read/write enabled | Read/write enabled | Enabled             | Measured verified | Enabled (FPF not set) | Not set        | Disabled
Dell               | Disabled           | Disabled           | Enabled             | Measured verified | Enabled               | Enabled        | Enabled
Lenovo ThinkCenter | Disabled           | Disabled           | Enabled             | Disabled          | Disabled              | Disabled       | Disabled
HP Elitedesk       | Disabled           | Disabled           | Enabled             | Disabled          | Disabled              | Disabled       | Disabled
Intel NUC          | Disabled           | Disabled           | Enabled             | Disabled          | Disabled              | Disabled       | Disabled
Apple              | Read enabled       | Disabled           | Disabled            | Not supported     | Not supported         | Not supported  | Not supported
As you can see, catastrophic misconfigurations of these security options are not merely theoretical. For example, some vendors have not written their key hashes into the FPFs, or did so but didn’t subsequently disable the manufacturing mode that allows such a write. As a result, attackers can write FPF keys of their own and then lock the system, tying it forever to their own root and chain of trust (unless the hardware manufacturer has developed a revocation process that allows the fuses to be overwritten). More precisely, the FPFs are programmed through the ME’s memory regions while the ME is still in manufacturing mode, and in that mode the ME can be accessed from the OS for both reads and writes. In this way, the attacker really gets the keys to the kingdom.
Additionally, most of the researched Intel-based hardware had CPU debugging enabled, so all the doors were open to attackers with physical access to the CPU. Some of the platforms included support for the Intel BIOS Guard technology, but it was disabled in the manufacturing process to simplify BIOS updates.
Thus, Table 16-2 provides multiple excellent examples of supply chain security problems, wherein vendors trying to simplify hardware support have created critical security holes.
Let’s now look at another vector for exploiting UEFI firmware from the OS: leveraging mistakes in the SMM modules.
We’ve discussed SMM and SMI handlers in previous chapters, but we’ll review both concepts now as a refresher.
SMM is a highly privileged execution mode of x86 processors. It was designed to implement platform-specific management functions independently of the OS. These functions include advanced power management, secure firmware updates, and configuration of UEFI Secure Boot variables.
The key design feature of SMM is that it provides a separate execution environment, invisible to the OS. The code and data used in SMM are stored in a hardware-protected memory region, called SMRAM, that is accessible only to code running within SMM. The CPU enters SMM when it receives a System Management Interrupt (SMI), a special interrupt that can be triggered by hardware events or raised by software (for instance, by the OS writing to a designated I/O port).
SMI handlers are the platform firmware’s privileged services and functions, and the SMI serves as a bridge between the OS and these handlers. Once all the necessary code and data have been loaded into SMRAM, the firmware locks the region so that it can be accessed only by code running in SMM, preventing the OS from reading or modifying it.
Given SMM’s high privilege level, SMI handlers present a very interesting target for implants and rootkits. Any vulnerability in these handlers may present an opportunity for the attacker to elevate privileges to that of the SMM, the so-called Ring –2.
As with other multilayer models, such as the kernel-userland separation, the best way to attack the privileged code is to target any data it consumes from outside the isolated privileged memory region: for SMM, that means any memory outside SMRAM. In SMM’s security model, the attacker is the OS or privileged software (such as BIOS update tools); thus, any memory location outside SMRAM is suspect, because the attacker can manipulate it at any time (potentially even after it has been somehow checked). Potential targets include function pointers consumed by SMM code that can redirect execution to areas outside SMRAM, as well as any data buffers that SMM code reads or parses.
Nowadays, UEFI firmware developers try to reduce this attack surface by minimizing the number of SMI handlers communicating directly with the outside world (Ring 0—the kernel mode of the operating system), as well as by finding new ways to structure and check these interactions. But this work has only just started, and security problems with SMI handlers will likely persist for quite some time.
Of course, the code in SMM can receive some data from the OS to be useful. However, in order to remain secure, just as with other multilayer models, the SMM code must never act on the outside data unless it’s been copied and checked inside the SMRAM. Any data that’s been checked but left outside the SMRAM can’t be trusted, as the attacker could potentially race to change it between the point of check and the point of use. Moreover, any data that has been copied in shouldn’t reference any unchecked and uncopied outside data.
This sounds simple, but languages like C don’t natively help track the regions to which pointers point, and thus the all-important security distinction between the “inside” SMRAM memory locations and the “outside,” attacker-controlled, OS memory is not necessarily evident in the code. So the programmers are mostly on their own. (If you’re wondering how much of this problem can be solved with static analysis tools, read on—as it turns out, the SMI calling convention we discuss next makes it quite a challenge.)
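Before we get to the calling convention, a short sketch helps make this check-then-use hazard concrete. The structure layout, handler, and scratch buffer below are hypothetical, invented only for illustration; a safer pattern is sketched later in this section, after we introduce SmmIsBufferOutsideSmmValid().

#include <Base.h>
#include <Library/BaseMemoryLib.h>

// Hypothetical communication buffer shared with the OS (illustration only).
typedef struct {
  UINT64  DataSize;               // length claimed by the caller
  UINT8   Data[64];               // payload supplied by the caller
} OS_COMM_BUFFER;

#define MAX_DATA_SIZE  64

STATIC UINT8  mSmramScratch[MAX_DATA_SIZE];   // storage inside SMRAM

// Vulnerable pattern: DataSize stays in OS-controlled memory between the
// check and the use, so code running on another core can enlarge it after
// the check passes and overflow the scratch buffer inside SMRAM.
VOID
EFIAPI
UnsafeSmiHandler (OS_COMM_BUFFER  *CommBuf)   // CommBuf lies outside SMRAM
{
  if (CommBuf->DataSize > MAX_DATA_SIZE) {    // point of check
    return;
  }
  // ...the OS can change CommBuf->DataSize right here...
  CopyMem (mSmramScratch, CommBuf->Data, CommBuf->DataSize);   // point of use
}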
To understand how attackers can exploit SMI handlers, you need to understand their calling convention. Although, as Listing 16-2 shows, calls to the SMI handler from the Python side of the Chipsec framework look like regular function calls, the actual binary calling convention, shown in Listing 16-3, is different.
import chipsec.chipset
import chipsec.hal.interrupts
#SW SMI handler number
SMI_NUM = 0x25
#CHIPSEC initialization
cs = chipsec.chipset.cs()
cs.init(None, True)
#create instances of required classes
ints = chipsec.hal.interrupts.Interrupts(cs)
#call SW SMI handler 0x25
ints.send_SW_SMI(0, SMI_NUM, 0, 0, 0, 0, 0, 0, 0)
Listing 16-2: How to call an SMI handler from Python with the Chipsec framework
The code in Listing 16-2 calls the SMI handler with all the parameters zeroed out except for 0x25, the number of the called handler. Such a call may indeed pass no parameters, but it’s also possible that the SMI handler retrieves these parameters indirectly—via ACPI or UEFI variables, for example—once it gets control. When the operating system triggers SMI (for instance, as a software interrupt via I/O port 0xB2), it passes arguments to the SMI handler via general-purpose registers. In Listing 16-3, you can see what an actual call to the SMI handler looks like in assembly and how the parameters are passed. The Chipsec framework, of course, implements this calling convention under the hood.
mov rax, rdx ; rax_value
mov ax, cx ; smi_code_data
mov rdx, r10 ; rdx_value
mov dx, 0B2h ; SMI control port (0xB2)
mov rbx, r8 ; rbx_value
mov rcx, r9 ; rcx_value
mov rsi, r11 ; rsi_value
mov rdi, r12 ; rdi_value
; write smi data value to SW SMI control/data ports (0xB2/0xB3)
out dx, ax
Listing 16-3: An SMI handler call in assembly language
The most common SMI handler vulnerabilities of interest for BIOS implants fall into two major groups: SMI callout issues and arbitrary code execution issues (which, in many cases, build on a callout). In an SMI callout, SMM code unwittingly follows a function pointer or indirect jump whose address the attacker has overwritten, so arbitrary code under the attacker’s control executes outside SMRAM but with the privileges of SMM; the pointer can aim at an implant payload placed by the attacker (a good example of such an attack is VU#631788). In arbitrary code execution issues, SMM code consumes data from outside SMRAM that affects its control flow and can be leveraged for further control. The addresses involved are typically below the first megabyte of physical memory, because SMI handlers expect to use that range, which the OS leaves unused.
In newer BIOS versions from major enterprise vendors, such vulnerabilities are harder to find, but issues with accessing pointers outside the SMRAM range remain, despite the introduction of the standard function SmmIsBufferOutsideSmmValid(), which checks whether a memory buffer lies entirely outside SMRAM. The implementation of this generic check was introduced in the Intel EDK2 repository on GitHub (https://github.com/tianocore/edk2/blob/master/MdePkg/Library/SmmMemLib/SmmMemLib.c), and its declaration is shown in Listing 16-4.
BOOLEAN
EFIAPI
SmmIsBufferOutsideSmmValid (
IN EFI_PHYSICAL_ADDRESS Buffer,
IN UINT64 Length
)
Listing 16-4: Prototype of the function SmmIsBufferOutsideSmmValid() from Intel EDK2
The SmmIsBufferOutsideSmmValid() function accurately detects pointers to memory buffers outside the SMRAM range, with one exception: it’s possible for the Buffer argument to be a structure in which one of the fields is itself a pointer to another buffer outside SMRAM. If the security check happens only for the address of the structure itself, the SMM code may still be vulnerable despite the call to SmmIsBufferOutsideSmmValid(). Thus, SMI handlers have to validate each address or pointer—including offsets!—that they receive from the OS prior to reading from or writing to such memory locations, and that includes the locations where they return status and error codes. Likewise, any arithmetic performed inside SMM should validate all parameters coming from outside SMM or from less privileged modes.
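To make the nested-pointer pitfall concrete, here is a minimal EDK2-style sketch; the structure layout and handler are hypothetical and not taken from any particular vendor’s firmware. The handler validates the outer buffer, snapshots it into SMRAM, and only then validates the embedded pointer against the copied length, so that neither field can change between the check and the use.

#include <Uefi.h>
#include <Library/BaseMemoryLib.h>
#include <Library/SmmMemLib.h>      // SmmIsBufferOutsideSmmValid()

// Hypothetical communication structure passed in from the OS.
typedef struct {
  UINT64  Command;
  UINT64  PayloadSize;
  UINT8   *Payload;                 // nested pointer, also attacker controlled
} FLASH_UPDATE_COMM;

EFI_STATUS
EFIAPI
HandleUpdateRequest (FLASH_UPDATE_COMM  *Comm)
{
  FLASH_UPDATE_COMM  Local;

  // 1. The outer structure itself must lie entirely outside SMRAM.
  if (!SmmIsBufferOutsideSmmValid ((EFI_PHYSICAL_ADDRESS)(UINTN)Comm,
                                   sizeof (*Comm))) {
    return EFI_SECURITY_VIOLATION;
  }

  // 2. Snapshot it into SMRAM so its fields can't change after the checks.
  CopyMem (&Local, Comm, sizeof (Local));

  // 3. The nested pointer must be validated too; checking only the outer
  //    structure would leave this read/write primitive wide open.
  if (!SmmIsBufferOutsideSmmValid ((EFI_PHYSICAL_ADDRESS)(UINTN)Local.Payload,
                                   Local.PayloadSize)) {
    return EFI_SECURITY_VIOLATION;
  }

  // ...copy Local.Payload into SMRAM and operate only on that copy...
  return EFI_SUCCESS;
}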
Now that we’ve discussed the perils of SMI handlers taking data from the OS, it’s time to dig into a real case of SMI handler exploitation. We’ll look at the common workflow of a UEFI firmware update process used by Windows 10, among other operating systems. In this scenario, the update image is validated and authenticated inside SMM by DXE runtime drivers that may themselves be weak.
Figure 16-5 shows a high-level picture of the BIOS update process in this scenario.
Figure 16-5: High-level representation of the BIOS update process from the OS
As you can see, the userland BIOS update tool (Update App) communicates with its kernel-mode driver (Update Driver), which usually has direct access to physical memory via the Ring 0 API function MmMapIoSpace(). This access allows an attacker to map and modify the memory regions used to communicate with the BIOS update SMI handlers and their parsers (SmiFlash or SecSmiFlash). Usually, the parsing flow is complex enough to leave room for vulnerabilities, especially when the parsers are written in C, as they typically are. The attacker crafts a malicious data buffer and calls a vulnerable SMI handler by its number, as shown in Listing 16-3, using the __outbyte() intrinsic available in the MS Visual C++ compiler.
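To illustrate the attacker’s side of this flow, here is a hedged sketch of what a Ring 0 routine triggering such a handler might look like. The physical address, handler number, and buffer contents are invented for illustration; a real exploit must match the vendor-specific communication protocol that the targeted SMI handler expects.

#include <ntddk.h>
#include <intrin.h>

#define SW_SMI_PORT      0xB2      // software SMI command port
#define SMI_HANDLER_NUM  0x25      // hypothetical SW SMI handler number

// Sketch of a Ring 0 routine that plants a crafted buffer in physical memory
// and then rings the doorbell of a software SMI handler.
VOID
TriggerVulnerableSmiHandler (VOID)
{
    PHYSICAL_ADDRESS CommPhys;
    PUCHAR           CommVirt;

    CommPhys.QuadPart = 0x1000;    // hypothetical low-memory comm region

    // Map the physical communication region into kernel virtual address space.
    CommVirt = (PUCHAR) MmMapIoSpace (CommPhys, PAGE_SIZE, MmNonCached);
    if (CommVirt == NULL) {
        return;
    }

    // Place attacker-crafted data where the SMI handler expects its input.
    RtlFillMemory (CommVirt, PAGE_SIZE, 0x41);

    // Write the handler number to port 0xB2 to raise the software SMI.
    __outbyte (SW_SMI_PORT, SMI_HANDLER_NUM);

    MmUnmapIoSpace (CommVirt, PAGE_SIZE);
}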
The DXE drivers shown in Figure 16-5, SmiFlash and SecSmiFlash, are found across many SMM codebases. SmiFlash flashes a BIOS image without any authentication. Using an update tool based on this driver, the attacker can simply flash a maliciously modified BIOS update image without further ado (a good example of this type of vulnerability is VU#507496, found by Alex Matrosov). SecSmiFlash, by contrast, can authenticate the update by checking its digital signature, blocking this kind of attack.
In this section, we’ll give you an overview of vulnerabilities in the S3 Boot Script, the script that the BIOS uses to wake from sleep mode. Although the S3 Boot Script speeds up the waking process, incorrect implementations of it can have serious security impacts, as we’ll explore here.
The power transition states of modern hardware—such as working mode and sleep mode—are very complex and involve multiple DRAM manipulation stages. During sleep mode, or S3, DRAM is kept powered, although the CPU is not. When the system wakes from the sleep state, the BIOS restores the platform configuration, including the contents of the DRAM, and then transfers control to the operating system. You can find a good summary of these states in https://docs.microsoft.com/en-us/windows/desktop/power/system-power-states/.
The S3 boot script is stored in DRAM, preserved across the S3 state, and executed when resuming full function from S3. Although called a “script,” it is really a series of opcodes interpreted by the Boot Script Executor firmware module (https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Library/PiDxeS3BootScriptLib/BootScriptExecute.c). The Boot Script Executor replays every operation defined by these opcodes at the end of the PEI phase to restore the configuration of the platform hardware and the entire preboot state for the OS. After executing the S3 boot script, the BIOS locates and executes the OS waking vector to restore its software execution to the state it was in when it left off. This means the S3 boot script allows the platform to skip the DXE phase and reduces the time it takes to wake from the S3 sleep state. Yet this optimization comes with some risks, as we’ll discuss next.2
An S3 boot script is just another kind of program code stored in memory. An attacker who can gain access to it and alter the code can either add surreptitious actions to the boot script itself (staying within the S3 programming model so as not to ring alarm bells) or, if this doesn’t suffice, exploit the boot script’s interpreter by going beyond the opcodes’ intended functionality.
The S3 boot script has access to input/output (I/O) ports for read and write, PCI configuration read and write, direct access to the physical memory with read and write privileges, and other data that is critical for the platform’s security. Notably, an S3 boot script can attack a hypervisor to disclose otherwise isolated memory regions. All of this means that a rogue S3 script will have an impact similar to a code execution vulnerability inside the SMM, discussed earlier in this chapter.
As S3 scripts are executed early in the wake process, before various security measures are activated, the attacker can use them to bypass some security hardware configurations that would normally take effect during the boot process. Indeed, by design, most of the S3 boot script opcodes cause the system firmware to restore the contents of various hardware configuration registers. For the most part, this process isn’t any different from writing to these registers during the operating system runtime, except that write access is allowed for the S3 script but disallowed for the operating system.
Attackers can target the S3 boot script by altering a data structure called the UEFI boot script table, which saves the platform state during the Advanced Configuration and Power Interface (ACPI) specification’s S3 sleep stage, when most of the platform’s components are powered off. UEFI code constructs a boot script table during normal boot and interprets its entries during an S3 resumption, when the platform is waking up from sleep. Attackers able to modify the current boot script table’s contents from the OS kernel mode and then trigger an S3 suspend-resume cycle can achieve arbitrary code execution at the early platform wake stage, when some of security features are not yet initialized or locked in the memory.
The impact of an S3 boot script exploit is clearly huge. But how exactly does the attack work? First, the attacker must already have code execution in the kernel mode (Ring 0) of the operating system, as Figure 16-6 shows.
Figure 16-6: Step-by-step exploitation of an S3 boot script
Let’s dig into each step of this exploit.
Listing 16-5 lists all the S3 boot script opcodes documented by Intel, including EFI_BOOT_SCRIPT_DISPATCH_OPCODE, which the exploit uses to execute its malicious shellcode.
EFI_BOOT_SCRIPT_IO_WRITE_OPCODE = 0x00
EFI_BOOT_SCRIPT_IO_READ_WRITE_OPCODE = 0x01
EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE = 0x02
EFI_BOOT_SCRIPT_MEM_READ_WRITE_OPCODE = 0x03
EFI_BOOT_SCRIPT_PCI_CONFIG_WRITE_OPCODE = 0x04
EFI_BOOT_SCRIPT_PCI_CONFIG_READ_WRITE_OPCODE = 0x05
EFI_BOOT_SCRIPT_SMBUS_EXECUTE_OPCODE = 0x06
EFI_BOOT_SCRIPT_STALL_OPCODE = 0x07
EFI_BOOT_SCRIPT_DISPATCH_OPCODE = 0x08
EFI_BOOT_SCRIPT_MEM_POLL_OPCODE = 0x09
Listing 16-5: The S3 boot script opcodes
You can find a reference implementation of the S3 boot script developed by Intel in the EDKII repository on GitHub (https://github.com/tianocore/edk2/tree/master/MdeModulePkg/Library/PiDxeS3BootScriptLib/). This code is useful for understanding both the internals of the S3 boot script behavior on x86 systems and the mitigations implemented to prevent the vulnerability we just discussed.
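To give a sense of how these opcodes end up in the boot script table, the following sketch uses the S3BootScriptLib interfaces from that repository, as we understand them (consult the repository for the authoritative signatures); the register address and entry point are placeholders. During normal boot, a DXE driver records entries like these, and the Boot Script Executor replays them on S3 resume.

#include <Uefi.h>
#include <Library/S3BootScriptLib.h>

// Placeholder addresses, for illustration only.
#define EXAMPLE_MMIO_REGISTER  0xFED1F410ULL
#define EXAMPLE_ENTRY_POINT    ((VOID *)(UINTN)0x79078000)

RETURN_STATUS
SaveExampleS3Entries (VOID)
{
  UINT32         Value;
  RETURN_STATUS  Status;

  // Record a 32-bit memory write to be replayed on resume
  // (becomes an EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE entry).
  Value  = 0x00000001;
  Status = S3BootScriptSaveMemWrite (
             S3BootScriptWidthUint32,
             EXAMPLE_MMIO_REGISTER,
             1,                        // number of elements to write
             &Value
             );
  if (RETURN_ERROR (Status)) {
    return Status;
  }

  // Record a dispatch entry: on resume, the executor transfers control to
  // this entry point (an EFI_BOOT_SCRIPT_DISPATCH_OPCODE entry), exactly
  // the opcode the exploit abuses once it controls the table.
  return S3BootScriptSaveDispatch (EXAMPLE_ENTRY_POINT);
}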
To check whether a system is affected by the S3 boot script vulnerability, you can use Chipsec’s S3 Boot Script tool (chipsec/modules/common/uefi/s3bootscript.py). You can’t use this tool to exploit the vulnerability, however.
You could, however, use Dmytro Oleksiuk’s PoC of the exploit published on GitHub (https://github.com/Cr4sh/UEFI_boot_script_expl/) to deliver a payload. Listing 16-6 shows the successful result of this PoC exploitation.
[x][ =======================================================================
[x][ Module: UEFI boot script table vulnerability exploit
[x][ =======================================================================
[*] AcpiGlobalVariable = 0x79078000
[*] UEFI boot script addr = 0x79078013
[*] Target function addr = 0x790780b6
8 bytes to patch
Found 79 zero bytes at 0x0x790780b3
Jump from 0x79078ffb to 0x79078074
Jump from 0x790780b6 to 0x790780b3
Going to S3 sleep for 10 seconds ...
rtcwake: wakeup from "mem" using /dev/rtc0 at Mon Jun 6 09:03:04 2018
[*] BIOS_CNTL = 0x28
[*] TSEGMB = 0xd7000000
[!] Bios lock enable bit is not set
[!] SMRAM is not locked
[!] Your system is VULNERABLE
Listing 16-6: The result of successful S3 boot script exploitation
This vulnerability and its exploit are also useful for disabling some of the BIOS protection bits, such as BIOS Lock Enable (BLE) and BIOS Write Protection, as well as other settings locked by the FLOCKDN (Flash Configuration Lock-Down) bit. Importantly, an S3 exploit can also disable the PRx protected ranges by modifying their configuration. Also, as we mentioned before, you can use the S3 vulnerability to bypass virtualization memory isolation technologies, such as Intel VT-x. In fact, the following S3 opcodes can make direct memory accesses during recovery from the sleep state:
EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE = 0x02
EFI_BOOT_SCRIPT_MEM_READ_WRITE_OPCODE = 0x03
Those opcodes can write some value to a specified memory location on behalf of the UEFI firmware, which makes it possible to attack a guest VM. Even when the architecture includes a hypervisor more privileged than the host system, the host system can attack it via S3 and, through it, all the guests.
The S3 boot script vulnerability was one of the most impactful security vulnerabilities in UEFI firmware. It was easy to exploit and hard to mitigate, since an actual fix required multiple firmware architectural changes.
Mitigating the S3 boot script issue required integrity protection from Ring 0 modifications. One way to achieve this was to move the S3 boot script to the SMRAM (SMM memory range). But there’s another way: in a technique introduced in EDKII (edk2/MdeModulePkg/Library/SmmLockBoxLib), Intel architects designed a LockBox mechanism to protect the S3 boot script from any modifications outside of SMM.3
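As a rough sketch of that mitigation’s shape, based on the LockBoxLib interfaces in that EDKII directory as we understand them (the GUID and the region passed in are placeholders), firmware saves the boot script region into a lock box during normal boot and marks it for in-place restore, so that on S3 resume the SMM LockBox driver overwrites any modifications made from the OS.

#include <Uefi.h>
#include <Library/LockBoxLib.h>

// Placeholder GUID for the example lock box; real firmware defines its own.
STATIC GUID  mExampleBootScriptLockBoxGuid = {
  0x12345678, 0x1234, 0x1234,
  { 0x12, 0x34, 0x12, 0x34, 0x12, 0x34, 0x12, 0x34 }
};

// Sketch: protect a boot script region against Ring 0 tampering.
RETURN_STATUS
ProtectBootScriptRegion (VOID  *ScriptBase, UINTN  ScriptLength)
{
  RETURN_STATUS  Status;

  // Save a copy of the region into SMRAM-backed LockBox storage.
  Status = SaveLockBox (&mExampleBootScriptLockBoxGuid, ScriptBase, ScriptLength);
  if (RETURN_ERROR (Status)) {
    return Status;
  }

  // Ask the LockBox driver to copy the saved contents back over the original
  // location during S3 resume, discarding any modification made from the OS.
  return SetLockBoxAttributes (&mExampleBootScriptLockBoxGuid,
                               LOCK_BOX_ATTRIBUTE_RESTORE_IN_PLACE);
}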
The Intel Management Engine is an attractive target for attackers. This technology has tantalized hardware security researchers ever since its inception, because it’s both virtually undocumented and extremely powerful. Today, the ME uses a separate x86-based CPU (in the past, it used the boutique ARC CPU) and serves as the foundation for the Intel hardware root of trust and multiple security technologies, such as Intel Boot Guard, Intel BIOS Guard, and, partially, Intel Software Guard Extensions (SGX). Thus, compromising the ME provides a way to bypass Secure Boot.
Control of ME is a highly coveted goal for attackers, since ME has all the power of SMM but can also execute an embedded real-time OS on a separate 32-bit microcontroller that operates totally independently of the main CPU. Let’s look at some of its vulnerabilities.
In 2009, security researchers Alexander Tereshkin and Rafal Wojtczuk from Invisible Things Lab presented their research on abusing the ME in their talk “Introducing Ring –3 Rootkits” at the Black Hat USA conference in Las Vegas.4 They shared their discoveries about Intel ME internals and discussed ways of injecting code into the Intel AMT execution context—by co-opting the ME into a rootkit, for example.
The next advance in understanding ME vulnerabilities came a full eight years later. Researchers Maxim Goryachy and Mark Ermolov from Positive Technologies discovered code execution vulnerabilities (CVE-2017-5705, CVE-2017-5706, and CVE-2017-5707) in the newer version of the ME present in Intel’s sixth, seventh, and eighth generations of CPUs. These vulnerabilities allowed an attacker to execute arbitrary code inside the ME’s operating system context, resulting in a complete compromise of the affected platforms at the highest level of privilege. Goryachy and Ermolov presented these discoveries in “How to Hack a Turned-Off Computer, or Running Unsigned Code in Intel Management Engine” at Black Hat Europe 2017,5 where they showed how rootkit code could bypass or disable multiple security features, including Intel’s Boot Guard and BIOS Guard technologies, by compromising their root of trust. Whether any security technologies are resilient to a compromised ME remains an open research question. Among other capabilities, rootkit code executing in the Intel ME context allows the attacker to modify the BIOS image (and, partially, the Boot Guard root of trust) directly inside the SPI flash chip and thus to bypass most security features.
Even though ME code executes on its own chip, it communicates with other layers of the OS and can be attacked via these communications. As always, the communication boundary is a part of any computational environment’s attack surface, no matter how isolated the environment.
Intel created a special interface, called the Host-Embedded Controller Interface (HECI), so that ME applications can communicate with the operating system kernel. This interface can be used, for instance, to remotely manage a system over a network connection that terminates at the ME yet can capture the operating system’s GUI (via VNC, for example), or to perform operating system–aided configuration of the platform during manufacturing. It is also used to implement Intel vPro enterprise management services, including AMT (which we discuss in the next section).
Typically, UEFI firmware initializes HECI via a proxy DXE driver, HeciInitDxe, located inside the BIOS. This driver passes messages between the ME and the vendor-specific driver in the host OS; the traffic crosses the interconnect between the CPU and the Platform Controller Hub (PCH), where the ME resides.
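As a small illustration of the OS-visible end of this channel, the HECI function (also known as MEI) typically appears as a PCI device at bus 0, device 0x16, function 0 on Intel platforms. The following sketch assumes a Linux system and that conventional 00:16.0 location, and simply checks whether the interface is exposed; on some platforms the function is hidden, disabled, or located elsewhere.

#include <stdio.h>

int main(void)
{
    /* Conventional location of the Intel HECI/MEI PCI function. */
    const char *path = "/sys/bus/pci/devices/0000:00:16.0/vendor";
    unsigned int vendor = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL) {
        puts("No HECI/MEI function at 00:16.0 (hidden, disabled, or elsewhere).");
        return 1;
    }
    if (fscanf(f, "%x", &vendor) == 1 && vendor == 0x8086)
        puts("Intel HECI/MEI interface is exposed to the OS at 00:16.0.");
    else
        printf("Device at 00:16.0 has unexpected vendor ID 0x%04x.\n", vendor);
    fclose(f);
    return 0;
}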
Applications running inside the ME can register HECI handlers to accept communication from the host operating system (the ME should not trust any input from the OS). If an attacker takes over the OS kernel, these interfaces become part of the ME’s attack surface: an overly trusting parser inside an ME application that fails to fully validate messages coming from the OS side can be compromised by a crafted message, just as a weak network server can. This is why it’s important to reduce the attack surface of ME applications by minimizing the number of HECI handlers; indeed, Apple platforms permanently disable the HECI interfaces and minimize the number of ME applications as a deliberate security policy decision. Note, however, that compromising one ME application doesn’t mean the entire ME is compromised.
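Because the real ME application interfaces are undocumented, the following is only a generic sketch of the kind of defensive message validation an ME-resident HECI handler needs; the message layout and entry point are hypothetical.

#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAYLOAD 512

/* Hypothetical HECI message as received from the (untrusted) host OS. */
typedef struct {
    uint32_t command;              /* requested operation */
    uint32_t payload_len;          /* host-controlled length field */
    uint8_t  payload[MAX_PAYLOAD];
} heci_msg_t;

/* Returns 0 if the message is safe to process, -1 if it must be rejected. */
int handle_heci_message(const uint8_t *raw, size_t raw_len, heci_msg_t *out)
{
    const size_t header_len = offsetof(heci_msg_t, payload);

    if (raw_len < header_len)          /* too short to contain a header */
        return -1;
    memcpy(out, raw, header_len);

    /*
     * Never trust the embedded length: bound it both by the destination
     * buffer and by the number of bytes actually received from the host.
     */
    if (out->payload_len > MAX_PAYLOAD ||
        out->payload_len > raw_len - header_len)
        return -1;

    memcpy(out->payload, raw + header_len, out->payload_len);
    return 0;
}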
Let’s now consider vulnerabilities in two remote management technologies, one of which is built on the ME. To manage large data centers and massive enterprise workstation inventories, organizations often rely on technologies that embed the management endpoint and its logic into a platform’s main board. This allows them to control the platform remotely, even when the platform’s main CPU isn’t running. These technologies, which include Intel’s AMT and various baseboard management controller (BMC) chips, have inevitably become part of their platforms’ attack surface.
A full discussion of attacks on AMT and BMCs is outside the scope of this chapter. However, we still want to provide some pointers, since exploitation of these technologies is directly tied to UEFI vulnerabilities and has gotten a lot of attention lately, due to high-impact Intel AMT and BMC vulnerabilities revealed in 2017 and 2018. We’ll discuss these vulnerabilities next.
Intel’s AMT platform is implemented as an ME application, so it relates directly to the Intel ME execution environment. AMT leverages the ME’s ability to communicate with a platform over a network even when the main CPU is inactive or completely powered down, and it uses the ME to read and write DRAM at runtime, independently of the main CPU. AMT is an archetypal example of an ME firmware application intended to be updated via the BIOS update mechanism. For remote management, Intel AMT runs its own web server, which serves as the main entry point for an enterprise remote management console.
In 2017, after roughly a decade with a clean public security record, AMT had its first vulnerability reported, and it was a shocking one; given its nature, it is hardly the last we’ll see. Researchers from Embedi (a private security company) alerted Intel to the critical issue CVE-2017-5689 (INTEL-SA-00075), which allowed an authentication bypass and remote access. All Intel systems produced since 2008 that support the ME are affected. (This excludes the sizable population of Intel Atom systems, which did not include the ME; Intel’s server and workstation products, however, were likely vulnerable wherever they included the affected ME components. Officially, only Intel vPro systems have AMT.) The scope of this vulnerability is particularly interesting: it mostly affected systems designed to be reachable via a remote AMT management console even when turned off, meaning that such systems could also be attacked while turned off.
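Public write-ups of CVE-2017-5689 trace the bug to the way the AMT web server compared the client’s HTTP Digest response against the value it had computed, using the length of the client-supplied string. The sketch below reproduces that reported pattern in simplified form; it is not Intel’s actual code.

#include <string.h>

/*
 * Simplified illustration of the reported CVE-2017-5689 pattern.
 * 'expected' is the digest the server computed; 'supplied' comes straight
 * from the client's Authorization header.
 */
int digest_matches(const char *expected, const char *supplied)
{
    /*
     * BUG: the comparison length is attacker controlled. An empty
     * 'supplied' string makes strncmp compare zero bytes and return 0,
     * so authentication always succeeds.
     */
    return strncmp(expected, supplied, strlen(supplied)) == 0;
}

/* A correct check compares the full expected value (ideally in constant time). */
int digest_matches_fixed(const char *expected, const char *supplied)
{
    return strlen(supplied) == strlen(expected) &&
           strcmp(expected, supplied) == 0;
}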
AMT was typically marketed as part of the Intel vPro technology, but in their Black Hat USA 2017 presentation, Embedi researchers demonstrated that AMT could be enabled on non-vPro systems. They released the AMTactivator tool, which an operating system administrator could run to activate AMT even when it was not officially a part of the platform. The researchers showed that AMT was present in all current Intel CPUs powered by the ME, whether or not they were marketed as vPro-enabled; in the latter case, AMT was still present and could be activated, for good or bad. More details about this vulnerability can be found at https://www.blackhat.com/docs/us-17/thursday/us-17-Evdokimov-Intel-AMT-Stealth-Breakthrough-wp.pdf.
Intel has deliberately disclosed very little information about AMT, creating considerable difficulties for anyone outside the company attempting to research the security failings of this technology. Advanced attackers have nevertheless taken up the challenge and made significant progress in analyzing AMT’s hidden possibilities, and further nasty surprises for defenders may follow.
While Intel was developing its vPro offerings powered by AMT in the ME execution environment, other vendors were busy developing competing centralized remote management solutions for servers: BMC chips integrated into server boards. As products of this parallel evolution, BMC designs share many of AMT’s weaknesses.
BMCs are commonly found in server hardware and are ubiquitous in data centers. Major hardware vendors like Intel, Dell, and HP have their own BMC implementations, based primarily on ARM microcontrollers with integrated network interfaces and dedicated flash storage. This flash storage contains a real-time OS (RTOS) that powers a number of applications, such as a web server listening on the BMC’s own network interface (a separate management network interface).
If you’ve been reading attentively, this should scream “attack surface!” Indeed, a BMC’s embedded web server is typically written in C (CGI handlers included) and is thus a prime target for attackers in the market for input-handling vulnerabilities. A good example of such a vulnerability is CVE-2017-12542 in HP’s iLO BMC, which allowed authentication bypass and remote code execution in the BMC’s web server. This security issue was discovered by Airbus researchers Fabien Périgaud, Alexandre Gazet, and Joffrey Czarny; we highly recommend their detailed whitepaper “Subverting Your Server Through Its BMC: The HPE iLO4 Case” (https://bit.ly/2HxeCUS).
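The specifics of the iLO4 bug are covered in the whitepaper; the sketch below only illustrates the general class of input-handling flaw that embedded web servers written in C keep falling into, namely copying a client-supplied header value into a fixed-size buffer without bounding the copy. It is not HPE’s code.

#include <stdio.h>
#include <string.h>

/* Vulnerable pattern: no bound check on the client-supplied header value. */
void handle_connection_header(const char *value)
{
    char state[16];
    strcpy(state, value);             /* BUG: a long header overflows 'state' */
    printf("Connection: %s\n", state);
}

/* The fix is to bound the copy explicitly. */
void handle_connection_header_fixed(const char *value)
{
    char state[16];
    snprintf(state, sizeof(state), "%s", value);
    printf("Connection: %s\n", state);
}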
BMC vulnerabilities underscore the fact that, no matter what hardware separation techniques you employ, the overall measure of a platform’s attack surface is its communication boundary. The more functionality you expose at that boundary, the greater the risk to the platform’s overall security. A platform may feature a separate CPU with its own firmware, but if that firmware includes a rich target such as a web server, an attacker can leverage its weaknesses to install an implant. For example, a BMC firmware update process that doesn’t authenticate update images received over the network is just as vulnerable as any security-through-obscurity software installation scheme.
The trustworthiness of UEFI firmware and other system firmware for x86-based platforms is a hot topic today, worthy of an entire book of its own. In a sense, UEFI was meant to reinvent the BIOS, yet it inherited all the security-through-obscurity failings of the legacy BIOS and added plenty of its own.
We made some hard decisions about which vulnerabilities to include here and which to give more detailed coverage to in order to illustrate the larger architectural failings. In the end, we hope that this chapter has covered just enough background to give you a deeper understanding of the current state of UEFI firmware security through the prism of common design flaws, rather than merely regaling you with a hodgepodge of infamous vulnerabilities.
Nowadays, UEFI firmware is the cornerstone of platform security, even though vendors universally neglected it only a few years ago. The collaborative effort of the security research community made this change possible, and we hope that our book gives that effort its due and helps further its progress.