© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
S. Banik, V. Zimmer, Firmware Development, https://doi.org/10.1007/978-1-4842-7974-8_1

1. Spotlight on Future Firmware

Subrata Banik1   and Vincent Zimmer2
(1)
Bangalore, Karnataka, India
(2)
Issaquah, WA, USA
 

“In real open source, you have the right to control your own destiny.”

—Linus Torvalds

When purchasing a computing device, users are concerned with the hardware configuration of the device and whether it has the latest versions of the software and applications. Most computer and consumer electronics device users don’t realize that there are several layers of programs that run between the user pressing the power button of the device and when the operating system starts running. These programs are called firmware.

Firmware is responsible for bringing the device into its operational state and remains active while the OS is running, even when the device is in low-power mode. A computing system, whether consumer, server, or IoT, contains many different types of firmware. Firmware that runs on the host CPU is known as system firmware, and firmware that is specific to devices is called device firmware. In addition, other microcontrollers are used to manage the device, and firmware running on those controllers is called manageability firmware. The firmware code that runs on these devices has certain responsibilities prior to handing over control to the higher-level system software. For example, the system firmware upon platform reset is the main interface for initializing CPUs, configuring the physical memory, communicating with peripheral devices, and finally picking the OS loader to boot an operating system. Every computing device is equipped with peripherals such as input and output devices, block devices, and connectivity devices. While system firmware focuses on the host CPU and its associated interface initialization, device firmware starts its execution either by running a self-start program or by waiting for an initiation command from the system firmware to become operational. Figure 1-1 shows an overview of typical computing system (consumer or server) firmware.

[Figure: block diagram of a computing system built around an SoC, surrounded by nineteen firmware-bearing components including BMC firmware, a discrete graphics card, display panel, camera, battery, NVMe device, TCPC PD controller, and TBT firmware, among others.]

Figure 1-1

Typical computing system firmware inventory

Recent research from the LinuxBoot project on the server platform claims that the underlying firmware is at least 2.5 times bigger than the kernel. These firmware components are also highly capable; for example, they support the entire network stack, including the physical, data link, network, and transport layers, so firmware is complex as well. The situation becomes worse when the majority of the firmware is proprietary and remains unaudited. Along with end users, tech companies and cloud service providers (CSPs) may be at risk, because compromised firmware can do a great deal of harm that remains unnoticed by users due to its privileged operational level. For example, exploits in Baseboard Management Controller (BMC) firmware may create a backdoor into the server, so even if a server is reprovisioned, the attacker could still have access to it. Besides these security concerns, there are substantial concerns regarding performance and flexibility with closed source firmware.

This chapter will provide an overview of the future of the firmware industry, which is committed to overcoming such limitations by migrating to open source firmware. Open source firmware can bring trust to computing by ensuring more visibility into the source code that is running at the Ring 0 privilege level while the system is booting. The firmware discussed in this chapter is not a complete list of the firmware available on a computing system, but rather a spotlight on future firmware so you understand how different types of firmware could shape the future. Future firmware will make device owners aware of what is running on their hardware, provide more flexibility, and make users feel more in control of the system than ever.

Migrating to Open Source Firmware

Firmware is the most critical piece of software that runs on the platform hardware at boot, and it has direct access to hardware registers and system memory. Firmware is responsible for bringing the system into a state where higher-level software can take control of the system and the end user can make use of the peripheral devices. Prior to that, the user doesn't have any control of the system while the system is booting. A misconfiguration in firmware might make the system unusable or create security loopholes. Hence, it's important to know what is running at the lowest level of the platform hardware. Figure 1-2 shows the privilege level of software programs running on a computing system. Typically, computer users are more familiar with the system protection rings between Ring 0 and Ring 3, where Ring 0 is considered the most privileged level, where the kernel operates, and Ring 3 is considered the least privileged, where applications and programs run. Interestingly, underneath the kernel's Ring 0 layer there is firmware running, which operates in a more privileged mode than the kernel. In this chapter, these layers running beneath Ring 0 are referred to as Ring -1 to Ring -3.

[Figure: stacked concentric rings, from the center outward: platform hardware, manageability firmware, system management mode, system firmware, kernel, device drivers, and applications.]

Figure 1-2

System protection rings

Let’s take a look into these “minus” rings in more detail.

Ring -1: System Firmware

System firmware is a piece of code that resides inside the system boot device (typically SPI flash on most embedded systems) and is fetched by the host CPU upon release from reset. Depending on the underlying SoC architecture, there might be higher-privileged firmware that initiates the CPU reset (refer to the book System Firmware: An Essential Guide to Open Source and Embedded Solutions for more details).

Operations performed by system firmware are typically known as booting, a process that involves initializing the CPU and the chipset portions of the system on chip (SoC), enabling the physical memory (dynamic RAM, or DRAM) interface, and preparing the block or network devices to finally boot an operating system. During this booting process, the firmware drivers can access hardware registers, memory, and I/O resources directly, without any restrictions. Typically, services managed by system firmware are of two types: boot services, used for firmware drivers' internal communication, which vanish once the system firmware has booted to the OS; and runtime services, which provide access to system firmware resources for communicating with the underlying hardware. System firmware runtime services are still available after control has transferred to the OS, although there are ways to track such calls coming from the OS layer down to the lower-level firmware layer at runtime.
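The different lifetimes of the two service classes can be modeled with a simplified sketch; the structure and function names below are illustrative stand-ins, not the actual UEFI definitions (which are far richer):

```c
#include <stddef.h>

/* Illustrative model of the two service classes system firmware exposes.
 * Boot services are valid only while the firmware still owns the platform;
 * runtime services remain callable after the OS takes control. */
typedef struct {
    void *(*allocate_pages)(size_t n);   /* available only while booting */
} boot_services_t;

typedef struct {
    int (*get_time)(void);               /* still callable from the OS */
} runtime_services_t;

typedef struct {
    boot_services_t    *boot;            /* invalidated at OS handoff */
    runtime_services_t *runtime;         /* survives into OS runtime */
} system_table_t;

static void *alloc_stub(size_t n) { (void)n; return NULL; }
static int   time_stub(void)      { return 42; }

static boot_services_t    bs = { alloc_stub };
static runtime_services_t rs = { time_stub };

/* Firmware hands control to the OS: boot services vanish,
 * runtime services remain available. */
static void exit_boot_services(system_table_t *st)
{
    st->boot = NULL;
}
```

In real UEFI firmware, this handoff is the `ExitBootServices()` call; after it returns, only the runtime service table remains meaningful to the OS.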

System firmware belonging to the updatable region of SPI flash qualifies for in-field firmware updates and also supports firmware rollback protection, which prevents downgrading to versions with known vulnerabilities.
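Rollback protection usually reduces to a monotonic version comparison before an update is accepted; the following is a minimal sketch with illustrative names (a real implementation would read the stored version from tamper-resistant storage such as fuses or TPM-backed NVRAM):

```c
#include <stdint.h>
#include <stdbool.h>

/* Minimal anti-rollback check: an update is accepted only if its security
 * version number (SVN) is not lower than the version currently recorded
 * in tamper-resistant storage. The SVN is monotonic and never decremented. */
typedef struct {
    uint32_t security_version;
} fw_image_t;

static bool update_allowed(uint32_t stored_svn, const fw_image_t *candidate)
{
    return candidate->security_version >= stored_svn;
}
```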

Ring -2: System Management Mode

System Management Mode (SMM) is the highest privileged mode of operation on x86-based platforms. There are two widely used ways to make the system enter SMM:
  • Software-based method: Writing a command byte to the dedicated I/O port 0xB2 triggers a system management interrupt (SMI) with a unique SMI command number.

  • Hardware-based method: The chipset asserts an SMI, for example through a dedicated pin or an interrupt message delivered via the Advanced Programmable Interrupt Controller (APIC).

During initialization of the system firmware, a code block (program) can be registered with an SMI vector, which will get executed upon entering SMM. All other processors on the system are suspended, and the processor state is saved. The program executing in SMM has access to the entire system, i.e., the processor, memory, and all peripherals. Upon exiting SMM, the processor state is restored, and the processor resumes its operations as if no interruption had occurred. Higher-level software has no visibility into this mode of operation.
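On many Intel chipsets, a software SMI is raised by writing a command byte to I/O port 0xB2 (the APM control port). The sketch below is a hypothetical Linux user-space trigger that requires root privileges; the port constants follow common chipset convention, but the exact command numbers are platform specific:

```c
#include <stdint.h>
#if defined(__linux__) && defined(__GLIBC__) && \
    (defined(__i386__) || defined(__x86_64__))
#include <sys/io.h>   /* outb(), ioperm(); requires root */
#define HAVE_PORT_IO 1
#endif

/* APM control/status ports used for software SMIs on many Intel chipsets. */
#define APM_CNT 0xB2  /* write a command byte here to raise a software SMI */
#define APM_STS 0xB3  /* scratch/status byte, often used to pass extra data */

/* Raise a software SMI with the given command. Returns 0 on success,
 * -1 when I/O port access is unavailable. The write traps the CPU into
 * SMM, and the SMI handler registered by firmware decides what happens. */
static int trigger_software_smi(uint8_t command)
{
#ifdef HAVE_PORT_IO
    if (ioperm(APM_CNT, 2, 1) != 0)
        return -1;            /* need root / CAP_SYS_RAWIO */
    outb(command, APM_CNT);
    return 0;
#else
    (void)command;
    return -1;                /* port I/O not available on this build */
#endif
}
```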

SMM exploits are common attacks on computer systems, where hackers use SMI to elevate the privilege level, access the SPI control registers to disable the SPI write protection, and finally write BIOS rootkits into the SPI Flash.

The major concern with SMM is that it's completely invisible to the operating system, so one doesn't know what kind of operation is running in SMM.

Ring -3: Manageability Firmware

Ring -3 firmware runs on separate microcontrollers, each executing its own firmware and typically booting into a real-time operating system (RTOS). This firmware is always on and is capable of accessing all the hardware resources; it performs the manageability operations without which one might need physical access to these devices. For example, it allows IT admins to remotely administer enterprise laptops and servers: powering the device on or off, reprovisioning the hardware by installing the operating system, capturing serial logs to analyze failures, emulating special keys to trigger recovery, performing active thermal management such as controlling the system fans, and handling critical hardware failures such as a bad charger or a failing storage device.

Although this firmware has access to host system resources (the host CPU, and often unrestricted access to host system memory and peripherals, depending on how it is interfaced with the host CPU), the operations performed by these processors are "invisible" to the host processor. The code that runs on these processors is not publicly available. Moreover, this code is provided and maintained by the silicon vendors; hence, it is assumed to be trusted without being verified through any additional security layer such as verified boot or secure boot.

Consider that all of this code is developed by humans and reviewed by other humans. Bugs can exist regardless of which ring a piece of code executes in, and the concern is that the more privileged the layer, the more opportunity there is for attackers to exploit the system.

As per the National Vulnerability Database (NVD), several vulnerabilities are reported or detected by security researchers on production systems every year, and many of these security defects exist within the "minus" rings. For example, CVE-2017-3197, CLVA-2016-12-001, CLVA-2016-12-002, CVE-2017-3192, CVE-2015-3192, CVE-2012-4251, etc., are firmware vulnerabilities reported in the NVD.

Most of the firmware discussed here was developed using closed source, which means the documentation and source code needed to understand what's really running on a machine are not publicly available. When a firmware update is available, you may hesitate before clicking the accept button because you have no idea what this update will actually do on your machine. The user has a right to know what's really running on their device. The problem with the current firmware development model is not security; a study of 17 open source and closed source software projects showed that the number of vulnerabilities in a piece of software is not affected by its source availability model. The problem is lack of transparency.

Transparency is what is missing in closed source firmware, even if we set aside the argument about code quality arising from internal versus external code review. All these arguments point back to the need for visibility into what is really running on the device in use. Making the source code publicly available might help get rid of the problem of running several opaque "minus" rings.

Running the most vulnerable code at the highest privilege level makes the entire system vulnerable; by contrast, running that code at a lower privilege level can still meet the platform initialization requirements while mitigating the security concern. It might also help to reduce the attack surface.

Additionally, performance and flexibility are other concerns that can be improved with transparency. For example, closed source firmware development typically focuses on short-term problems, such as fixing bugs through new code, without worrying about any redundancy introduced. A case study presented at the Open Source Firmware Conference 2019 claimed that a system was still functional and met its booting criteria even after 214 of 424 firmware drivers and associated libraries were removed, which is about a 50 percent reduction. Having more maintainers of the code helps to create a better code-sharing model that overcomes such redundancy and can result in near-instant boot. Finally, coming back to the security concerns, a transparent system is more secure than a supposedly secure system that hides potential bugs in closed firmware.

This is a summary of the problems with the current firmware:
  • Firmware is the most critical piece of code running on the bare hardware with a privileged level that might allure attackers.

  • Compromised firmware is dangerous not only for the present hardware but also for all systems attached to it, even over a network.

  • Lower-level firmware operations are not visible to upper-level system software; hence, attacks remain unnoticed even if the operating system and drivers are freshly installed.

  • Modern firmware and its development models are less transparent, which leads to multiple “minus” rings.

  • A transparent firmware development model helps to restore trust in firmware because device owners are aware of what is running on the hardware. In addition, better design helps to reduce the "minus" rings, presents fewer vulnerabilities, and provides better maintenance with a smaller code size and higher performance.

Open source firmware (OSF) is the solution to overcome all of these problems. An OSF project performs a bare-minimum platform initialization and provides the flexibility to choose the correct OS loader based on the targeted operating system. Hence, it brings efficiency, flexibility, and improved performance. Allowing more eyes to review the code while firmware is being developed in an open source model provides a better chance of identifying feature defects, finding security flaws, and improving the system's security state by incorporating community feedback. For example, widely used cryptographic algorithm implementations are publicly available on GitHub. Finally, to address the code-quality question, a study conducted by Coverity Inc. found open source code to be of better quality. All these rationales are adequate to conclude that migrating to OSF is inevitable. Future firmware creators are definitely looking for opportunities to collaborate more using open source firmware development models.

This chapter will emphasize the future firmware development models of different firmware types such as system firmware, device firmware, and manageability firmware using open source firmware.

Open Source System Firmware Development

Most modern system firmware is built as proprietary firmware, where the producer of the source code restricts code access; hence, it allows only private modification, internal code reviews, and the generation of new firmware images for updates. This process might not work for a future firmware development strategy: proprietary firmware is opaque, and its functionality is limited to what a small group of firmware engineers, the only ones who know what is running on the device, is able to implement. Due to the heavy maintenance demands of closed source firmware, device manufacturers often defer regular firmware updates, even for critical fixes. Typically, OEMs commit to providing system firmware updates twice during the entire life of the product: once at launch and again six months later in response to an operating system update. System firmware development with an open source model would in the future provide more flexibility to users, ensuring that the device always has the latest configuration. For that to happen, future system firmware must adhere to the open source firmware development principle: universal access to the source code under an open source or free license that encourages the community to collaborate in the project development process.

This book provides the system architecture of several open source system firmware types, including the bootloader and payload. Most open source bootloaders strongly resist including any closed source firmware binaries, such as binary large objects (BLOBs), alongside open source firmware. Typically, any undocumented blob is dangerous for system integration, as it could contain a rootkit and might leave the system in a compromised state. But the industry recognizes that, in order to work on the latest processors and chipsets from the silicon vendors, the crucial piece of information is the silicon initialization sequence. In the majority of cases, this is considered restricted information prior to product launch, for innovation and business reasons, and may be available only under certain legal agreements (such as NDAs). Hence, to unblock open source product development using the latest SoCs, silicon vendors have come up with a proposal for binary distribution. Under this binary distribution model, the essential silicon initialization code is available as a binary, which unblocks platform initialization using open source bootloaders while abstracting the complexities of SoC programming and exposing a limited set of interfaces to allow the initialization of SoC components. This model is referred to in this book as the hybrid work model.

This section will highlight the future system firmware journey using the following operational models:
  • Hybrid system firmware model: The system firmware running on the host CPU might have at least one closed source binary as a blob integrated as part of the final ROM. Examples: coreboot, SBL on x86 platforms.

  • Open source system firmware model: The system firmware code is free from running any closed source code and has all the native firmware drivers for silicon initialization. Example: coreboot on RISC-V platforms.

Hybrid System Firmware Model

As defined earlier, the hybrid system firmware model relies on a silicon-provided binary for processor and chipset initialization; hence, it needs the following components as part of the underlying system firmware:
  • Bootloader: Boot firmware responsible for generic platform initialization, such as bus enumeration, device discovery, and creating tables for industry-standard specifications like ACPI and SMBIOS, and for calling into silicon-provided APIs to perform silicon initialization.

  • Silicon reference code binary: One or more binaries responsible for performing the silicon initialization in their execution order. On x86 platforms, the Firmware Support Package (FSP) is the specification used to let silicon initialization code perform the chipset and processor initialization. It allows dividing the monolithic blob into multiple sub-blocks so that each can be loaded into system memory in the associated bootloader phase, and it provides multiple APIs to let the bootloader configure the input parameters. This mode of FSP operation is known as API mode. Unlike other blobs, FSP comes with documentation, including the specification with the expectations for each API and a platform integration guide. This documentation clearly calls out the expectations of the underlying bootloader, such as the bootloader stack requirement, the heap size, and the meaning of each input parameter used to configure FSP.

Facts

Intel FSP Specification v2.1 introduces an optional FSP boot mode named dispatch mode to increase FSP adoption among PI spec bootloaders.

  • Payload: An OS loader or payload firmware that can be integrated as part of the bootloader or chosen separately, providing the additional OS boot logic.

The book System Firmware: An Essential Guide to Open Source and Embedded Solutions provides the detailed system architecture of the bootloader and payload and defines the working principle with hybrid firmware such as FSP. This section will focus on the working relationship between open source boot firmware and FSP:
  • coreboot using FSP for booting the IA-Chrome platform

  • EDKII Minimum Platform Firmware for Intel Platforms

coreboot Using Firmware Support Package

Firmware Support Package (FSP) provides key programming information for initializing the latest chipsets and processors and can be easily integrated with any standard bootloader. In essence, coreboot consumes FSP as a binary package, which makes enabling the latest chipsets easy, reduces time-to-market (TTM), and is economical as well.

FSP Integration
The FSP binary follows the UEFI Platform Initialization Firmware Volume Specification format. Hence, each firmware volume (FV) within FSP contains the initialization code for one phase. Typically, FSP is delivered as a single firmware device (FD) binary, but because it contains several FVs, each representing a different initialization phase and running at a different noncontiguous address, a monolithic binary wouldn't work here. Since the FSP 2.0 specification, the FSP binary can be split into three blobs: FSP-T (FSP for Temporary RAM Initialization), FSP-M (FSP for Memory Initialization), and FSP-S (FSP for Silicon Initialization). Here are the required steps for FSP integration:
  • Configuration: The FSP provides configuration parameters that can be customized based on target hardware and/or operating system requirements by the bootloader. These are inputs for FSP to execute silicon initialization.

  • eXecute-in-place and relocation: FSP is not position-independent code (PIC), and each FSP component has to be rebased if it needs to run at an address different from the preferred base address specified during the FSP build. The bootloader supports both modes: components that must execute at the address at which they were built are called eXecute-In-Place (XIP) components and are marked as --xip (for example, the FSP-M binary), while position-independent modules can be located anywhere in physical memory after relocation.

  • Interfacing: The bootloader needs to add code to set up the execution environment for the FSP, which includes calling the FSP with correct sets of parameters as inputs and parsing the FSP output to retrieve the necessary information returned by the FSP and consumed by the bootloader code.

FSP Interfacing
Since its origin, FSP has tried to provide a flexible interface between the bootloader and FSP to pass the correct sets of parameters required to perform silicon initialization. Although FSP has gone through significant specification changes since its first introduction, the basic input/output architecture remains unchanged across all these FSP versions: a data structure used to pass configuration parameters from the bootloader to FSP serves as the input parameters, and hand-off blocks (HOBs), a standard mechanism for passing FSP information back to the bootloader, serve as the output parameters. Figure 1-3 shows the evolution of FSP interfaces along with the specification.

[Figure: evolution of the Firmware Support Package interface between coreboot and FSP, across FSP 1.0, FSP 1.1, FSP 2.0, and FSP 2.2.]

Figure 1-3

Explaining FSP interfacing with coreboot boot-firmware

coreboot supports FSP Specification version 2.x (the latest as of this writing is 2.2).

FSP Configuration Data
Each FSP module contains a configuration data structure called the updatable product data (UPD), which is used by FSP for silicon initialization. Typically, UPD contains the default parameters for FSP initialization. The bootloader contains a separate UPD data structure for each FSP module, which allows the bootloader to override any of the default UPD parameters. As part of the FSP integration process, the bootloader is also required to keep FSP UPD data structures in the bootloader source code along with the corresponding FSP binary. See Figure 1-4.

[Figure: coreboot source tree, src/vendorcode/intel/fsp/fsp2_0/tigerlake, containing the FSP UPD structures for the region signature and the T, M, and S components.]

Figure 1-4

UPD data structure as part of coreboot source code

It is recommended that the bootloader copy the whole UPD structure from the FSP component to memory, update the parameters, and initialize the UPD pointer to the address of the updated UPD structure. The FSP API will then use this updated data structure instead of the default configuration region as part of the FSP binary blob while initializing the platform. In addition to the generic or architecture-specific data structure, each UPD data structure contains platform-specific parameters.
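The copy-then-override flow just described can be sketched as follows. The UPD layout, field names, and default values below are purely illustrative; real UPD structures are generated per SoC and shipped alongside the FSP binary:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative (not actual) FSP-M UPD layout. */
typedef struct {
    uint64_t signature;
    uint8_t  revision;
    uint8_t  reserved[7];
    uint32_t stack_base;
    uint32_t stack_size;
    uint8_t  dq_pins_interleaved;   /* example board-specific knob */
} fspm_upd_t;

/* Default UPD region as embedded inside the (read-only) FSP binary blob. */
static const fspm_upd_t fsp_default_upd = {
    .signature = 0x4D5F4450554C5054ULL,  /* placeholder signature */
    .revision  = 2,
    .stack_base = 0xFEF00000,
    .stack_size = 0x40000,
    .dq_pins_interleaved = 0,
};

/* Copy the defaults out of the FSP image into RAM, then let board code
 * override only what it needs before the pointer is handed to the FSP
 * API (e.g., FspMemoryInit()). */
static void prepare_fspm_upd(fspm_upd_t *upd)
{
    memcpy(upd, &fsp_default_upd, sizeof(*upd));
    upd->dq_pins_interleaved = 1;   /* board-specific override */
}
```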

Open Source Challenges with FSP Configuration Data
FSP configuration data structures are crucial for the hybrid system firmware development model: they are used to override the default built-in UPD configuration data, which might not be applicable to the current open source project, to ensure correct silicon initialization. Hence, while integrating the FSP blobs with the bootloader, it is recommended to ensure that the bootloader source code carries the same version of the UPD structures as the corresponding FSP binary. FSP is responsible for the entire silicon initialization process and feature enablement, and the only inputs from the bootloader are in the UPD data structure. Hence, while calling FSP APIs like TempRamInit(), FspMemoryInit(), and FspSiliconInit(), the bootloader needs to pass a pointer to the updated data structure. Figure 1-5 shows the bootloader code structure that ensures the initialization of the FSP configuration data for an open source firmware development project.

[Figure: flow for initializing the FSP configuration data in an open source firmware development project.]

Figure 1-5

coreboot code structure to override UPD data structures

OSF development expects the entire project source code to be available for review and configuration, but for business reasons such as innovation and/or competition, early open sourcing of the FSP configuration data structure is not feasible for pre-production-release-qualification (PRQ) products. This poses a risk when developing an open source project using the latest SoC prior to PRQ: the consequences of this restriction would be incomplete SoC and mainboard source code relative to the platform initialization requirements, and incomplete feature enablement.

To overcome this problem, a solution is being developed that is open source friendly even for open project development using non-PRQ SoCs, called the partial FSP configuration data structure (also known as a partial header). Here are the working principles of the partial FSP configuration data structure generation process:
  • This structure consists only of platform UPDs required for a specific bootloader to override for the current project.

  • The rest of the UPDs are renamed as reserved. For any project, reserved fields are not meant for bootloader overrides.

  • Embargoed UPD parameters' names and descriptions are abstracted.

Partial headers are generated using a Python-based tool, which produces partial headers for those bootloaders that do not need the full list of UPD data structures. It takes two arguments for the header generation process.
  • First argument: This is the path for the complete FSP-generated UPD data structure. The tool will run on this header itself to filter out only the required UPD parameters as per the second argument.

  • Second argument: This is a file that provides the lists of required UPD parameters for bootloader overrides.

This effort ensures complete source code development on the bootloader side, along with enabling new features, without being blocked by the state of the silicon release. Post SoC PRQ, after the embargo is lifted, the complete FSP UPD data structure is uploaded to FSP on GitHub, replacing all reserved fields of the partial header with their proper names.
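A partial header might look like the following sketch. The structure and parameter names here are hypothetical; the key property is that the reserved bytes keep the offsets and overall size identical to the full (embargoed) header:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical full UPD fragment (pre-PRQ, under NDA):
 *
 *   uint32_t SaGv;
 *   uint32_t SecretFeatureEnable;   <- embargoed parameter
 *   uint32_t HyperThreading;
 *
 * The partial header published in the open source tree keeps identical
 * offsets and overall size, but embargoed parameters appear only as
 * opaque reserved bytes that the bootloader must not override: */
typedef struct {
    uint32_t SaGv;
    uint8_t  Reserved0[4];   /* embargoed UPD, name/description abstracted */
    uint32_t HyperThreading;
} partial_fsp_upd_t;
```

Because the layout is binary-identical, a bootloader built against the partial header can safely pass the structure to the same FSP blob that was built against the full header.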

coreboot and FSP Communications Using APIs
Since the FSP 2.1 specification, FSP supports two possible boot flows based on the implementation of the bootloader and its selected FSP operational mode. The majority of open source bootloaders working with FSP use the API mode boot flow. Figure 1-6 shows the coreboot boot flow using FSP in API mode.

[Figure: two layers connected by arrows: the bootloader layer (bootblock, romstage, postcar, ramstage) calls into the FSP API layer (FSP-T, FSP-M, FSP-S), with reset entering at the bootblock stage.]

Figure 1-6

coreboot boot flow using FSP in API mode

Here is the detailed boot flow description:
  1. coreboot owns the reset vector.

  2. coreboot contains the real-mode reset vector handler code.

  3. Optionally, coreboot can call the FSP-T API (TempRamInit()) to set up temporary memory using cache-as-RAM (CAR) and create a stack.

  4. coreboot fills in the UPD parameters required for the FSP-M API, FspMemoryInit(), which is responsible for memory and early chipset initialization.

  5. On exit of the FSP-M API, coreboot either tears down CAR using the TempRamExit() API, if the temporary memory was initialized in step 3 using the FSP-T API, or uses its own native implementation.

  6. coreboot fills in the UPD parameters required for silicon programming as part of the FSP-S API, FspSiliconInit(). The bootloader locates FSP-S and calls into the API; afterward, FSP returns from the FspSiliconInit() API.

  7. If supported by the FSP, the bootloader enables multiphase silicon initialization by setting FSPS_ARCH_UPD.EnableMultiPhaseSiliconInit to a nonzero value.

  8. On exit of FSP-S, coreboot performs PCI enumeration and resource allocation.

  9. The bootloader calls the FspMultiPhaseSiInit() API with the EnumMultiPhaseGetNumberOfPhases parameter to discover the number of silicon initialization phases supported by the FSP.

  10. The bootloader must call the FspMultiPhaseSiInit() API with the EnumMultiPhaseExecutePhase parameter n times, where n is the number of phases returned previously. The bootloader may run board-specific code between phases as needed.

  11. The number of phases, what is done during each phase, and anything the bootloader may need to do between phases are described in the integration guide.

  12. coreboot continues the remaining device initialization. It calls NotifyPhase() at the proper stages, such as AfterPciEnumeration, ReadyToBoot, and EndOfFirmware, before handing control over to the payload.
     
Facts

If FSP returns a reset-required status from any API, the bootloader performs the reset as specified by the FSP return status type.
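The reset-handling convention can be sketched as follows. The status values mirror the FSP_STATUS_RESET_REQUIRED_* codes defined by the FSP 2.x specification (verify the exact values against the specification for the SoC in use); the wrapper and the stub API are illustrative, not coreboot's actual implementation:

```c
#include <stdint.h>
#include <stdlib.h>

/* FSP 2.x reset-request status codes (simplified subset). */
#define FSP_SUCCESS           0x00000000u
#define FSP_STATUS_RESET_COLD 0x40000001u
#define FSP_STATUS_RESET_WARM 0x40000002u

typedef uint32_t fsp_status_t;

/* Stand-in for a real FSP entry point such as FspMemoryInit(). */
static fsp_status_t fake_fsp_api(int request_reset)
{
    return request_reset ? FSP_STATUS_RESET_COLD : FSP_SUCCESS;
}

/* Returns 1 if the platform must reset, 0 on plain success. A real
 * bootloader would issue the requested cold/warm reset here instead
 * of returning to the caller. */
static int handle_fsp_status(fsp_status_t status)
{
    switch (status) {
    case FSP_SUCCESS:
        return 0;
    case FSP_STATUS_RESET_COLD:
    case FSP_STATUS_RESET_WARM:
        return 1;                /* perform the requested reset */
    default:
        abort();                 /* unrecoverable FSP error */
    }
}
```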

FSP Drivers
The bootloader implements a driver corresponding to each FSP specification version to support the calling convention of the FSP entry points. Ideally, the purposes of these drivers are as follows:
  • Find the FSP header to locate the dedicated entry point, and verify the UPD region prior to calling.

  • Copy the default values from the UPD area into memory to allow the required overrides of UPD parameters based on the target platform, using driver-provided callbacks into the SoC code, for example, before calling memory init or silicon init.

  • Fill out any FSP architecture-specific UPDs that are generic like NvsBufferPtr for MRC cache verification.

  • Finally, call the FSP API entry point with the updated UPD structure to perform silicon initialization.

  • On failure, handle any errors returned by the FSP API and take action; for example, hand the platform reset request to either generic libraries or SoC-specific code.

  • On success, retrieve the FSP outputs in the form of hand-off blocks (HOBs) that provide platform initialization information. For example, FSP notifies the bootloader about a portion of system memory that is reserved by FSP for its internal use, and coreboot parses the resource descriptor HOBs produced by FSP-M to create the system memory map. The bootloader FSP driver must be capable of consuming the information passed through the HOBs produced by FSP.

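Two of these driver duties, copying UPD defaults before board overrides and walking the returned HOB list, can be sketched as follows. The structure layouts are simplified stand-ins, not the PI-defined ones, and the UPD fields are illustrative:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct mem_upd {                 /* hypothetical UPD region */
    uint32_t nvs_buffer_ptr;     /* e.g., NvsBufferPtr for the MRC cache */
    uint8_t  spd_addr;
};

/* Copy the defaults shipped inside the FSP image, then let board code
 * override only what it needs via a callback. */
static void prepare_upd(const struct mem_upd *defaults, struct mem_upd *out,
                        void (*board_override)(struct mem_upd *))
{
    memcpy(out, defaults, sizeof(*out));
    if (board_override)
        board_override(out);
}

/* Minimal HOB walk: count resource-descriptor HOBs in a list terminated by
 * an end-of-list marker. Types and lengths are simplified stand-ins. */
enum { HOB_RESOURCE = 3, HOB_END = 0xffff };
struct hob_header { uint16_t type; uint16_t length; };

static int count_resource_hobs(const void *hob_list)
{
    const struct hob_header *hob = hob_list;
    int count = 0;
    while (hob->type != HOB_END) {
        if (hob->type == HOB_RESOURCE)
            count++;
        hob = (const struct hob_header *)((const uint8_t *)hob + hob->length);
    }
    return count;
}

/* Exercise both helpers on synthetic data. */
static int demo_fsp_driver(void)
{
    struct mem_upd defaults = { .nvs_buffer_ptr = 0, .spd_addr = 0xa0 }, upd;
    prepare_upd(&defaults, &upd, NULL);   /* no board override in this demo */

    struct hob_header hobs[3] = {
        { HOB_RESOURCE, sizeof(struct hob_header) },
        { HOB_RESOURCE, sizeof(struct hob_header) },
        { HOB_END,      sizeof(struct hob_header) },
    };
    return (upd.spd_addr == 0xa0) ? count_resource_hobs(hobs) : -1;
}
```

A real driver would walk HOBs typed per the PI specification; the copy-then-override pattern is the essential point.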
Current coreboot code has drivers for the FSP 1.1 and FSP 2.0 specifications. The FSP 2.0 driver is not backward compatible with FSP 1.1, but it has been updated to support later revisions of the specification, such as FSP 2.2.

Mitigate Open Source Challenges with FSP Driver
Typically, system firmware development using an open source model expects that all new silicon feature-related documentation is publicly available to allow development of the new feature. In reality, with the latest processors and chipsets, the feature programming lists are growing and are expected to grow even more in the future. With more capable and complex SoC solutions, there might be cases where certain feature programming is classified as restricted; hence, it is not feasible to implement it in an open source bootloader. For example, current coreboot is capable of handling multiprocessor (MP) initialization on the x86 platform using its native cpu and mp drivers. The Boot Strap Processor (BSP) performs the MP initialization, which typically involves two major operations.
  • Bring-up process: This brings the application processors (APs) out of reset. It loads the latest microcode on all cores and syncs the latest Memory Type Range Register (MTRR) snapshot between the BSP and APs.

  • CPU feature programming: This allows vendor-specific feature programming to run, such as ensuring higher power and performance efficiency, enabling overclocking, and supporting specific technologies like Trusted eXecution Technology (TXT), Software Guard Extensions, Processor Trace, etc.

Typically, the bring-up process for APs is part of the open source documentation and generic in nature. But the CPU feature programming lists noted previously are expected to grow in the future and to be considered proprietary implementations. If a system firmware implementation with an open source bootloader isn't able to perform this recommended CPU feature programming, the platform might not benefit from the latest hardware features. To overcome this limitation, the hybrid system firmware model needs an alternative proposal as part of the FSP driver.

Currently, coreboot performs CPU multiprocessor initialization for the IA platform before calling FSP-S, using its native driver implementation, and has all possible information about the processor, such as the maximum number of cores, APIC IDs, stack size, etc. The solution offered here is a possible extension of coreboot support: implementing additional sets of APIs, which are used by FSP to perform CPU feature programming.

FSP uses the Pre-EFI Initialization (PEI) environment defined in the PI specification and therefore relies on installing/locating PPIs (PEIM-to-PEIM interfaces) to perform certain API calls. The purpose of creating a PPI service inside the bootloader is to allow FSP to access the bootloader's resources while FSP is in operation. This feature was added from FSP specification 2.1 onward, where FSP is allowed to make use of external PPIs published by the boot firmware, which FSP can execute while being the context master.

In this case, coreboot publishes a multiprocessor (MP) service PPI, EFI_MP_SERVICES_PPI, as per PI Specification Volume 1, section 2.3.9. coreboot implements APIs for the EFI_MP_SERVICES_PPI structure with its native functions as follows:

  • PeiGetNumberOfProcessor: implemented with get_cpu_count() to get the processor count. Gets the number of CPUs.

  • PeiGetProcessorInfo: fills ProcessorInfoBuffer with the processor ID (apicid) and location (get_cpu_topology_from_apicid()). Gets information on a specific CPU.

  • PeiStartupAllAps: calls the mp_run_on_all_aps() function. Activates all the application processors.

  • PeiStartupThisAps: calls mp_run_on_aps() based on the logical_cpu_number argument. Activates a specific application processor.

  • PeiSwitchBSP: currently not implemented in coreboot due to scoping limitations. Switches the bootstrap processor.

  • PeiEnableDisableAP: enables or disables an application processor.

  • PeiWhoAmI: calls the cpu_index() function. Identifies the currently executing processor.

  • PeiStartupAllCpus (only available in EDKII_PEI_MP_SERVICES2_PPI): calls mp_run_on_aps() with MP_RUN_ON_ALL_CPUS. Runs the function on all CPU cores (BSP + APs).

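The mapping above amounts to filling a table of function pointers with native bootloader functions. The sketch below mocks that idea in host-runnable C; the real EFI_MP_SERVICES_PPI uses PI-defined typedefs and calling conventions, and the "native" functions here return fixed values instead of touching hardware:

```c
#include <assert.h>

struct mp_services {                       /* simplified PPI-like vtable */
    int (*get_number_of_processors)(void);
    int (*startup_all_aps)(void (*proc)(void *), void *arg);
    int (*who_am_i)(void);
};

/* Native bootloader implementations, mocked with fixed values. */
static int native_cpu_count(void) { return 4; }
static int native_cpu_index(void) { return 0; }   /* the BSP */
static int native_run_on_all_aps(void (*proc)(void *), void *arg)
{
    proc(arg);      /* a real implementation would run this on each AP */
    return 0;
}

/* The structure the bootloader would hand to FSP. */
static const struct mp_services mp_service_ppi = {
    .get_number_of_processors = native_cpu_count,
    .startup_all_aps          = native_run_on_all_aps,
    .who_am_i                 = native_cpu_index,
};

static void feature_program(void *counter) { (*(int *)counter)++; }

/* FSP-side view: query the count, then run a feature routine on the APs. */
static int demo_mp_ppi(void)
{
    int ran = 0;
    mp_service_ppi.startup_all_aps(feature_program, &ran);
    return mp_service_ppi.get_number_of_processors() + ran;
}
```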
Here is the code flow between coreboot and FSP while running the restricted CPU feature programming:
  1. coreboot selects either CONFIG_MP_SERVICES_PPI_V1 or CONFIG_MP_SERVICES_PPI_V2 from the SoC directory, as per the FSP recommendation, to implement the MP services PPI for FSP usage. coreboot does the multiprocessor initialization early in ramstage, before calling the FSP-S API. All possible APs are out of reset and ready to execute the restricted CPU feature programming.

  2. coreboot creates the MP (multiprocessor) services APIs as per PI Specification Vol. 1, section 2.3.9, and assigns them into the EFI_MP_SERVICES_PPI or EDKII_PEI_MP_SERVICES2_PPI structure according to the PPI specification revision.

  3. FSP-S installs EFI_MP_SERVICES_PPI or EDKII_PEI_MP_SERVICES2_PPI based on the structure provided by coreboot as part of the CpuMpPpi UPD. At a later stage of FSP-S execution, it locates the MP services PPI and runs the CPU feature programming on the APs.

  4. While FSP-S is executing multiprocessor initialization using the open source EDKII UefiCpuPkg, it invokes a coreboot-provided MP services API and runs the "restricted" feature programming on the APs.

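The hand-off in step 3 is small: before calling FSP-S, the bootloader points a UPD field at its PPI structure. The struct layout below is purely illustrative; only the CpuMpPpi field name comes from the text:

```c
#include <assert.h>
#include <stdint.h>

struct fsps_upd { uintptr_t cpu_mp_ppi; /* ... other silicon UPDs ... */ };

static int dummy_ppi;   /* stands in for the real PPI structure */

/* Record the PPI pointer where FSP-S expects to find it. */
static int install_mp_ppi(struct fsps_upd *upd, const void *ppi)
{
    upd->cpu_mp_ppi = (uintptr_t)ppi;
    return upd->cpu_mp_ppi != 0;
}

static int demo_install(void)
{
    struct fsps_upd upd = { 0 };
    return install_mp_ppi(&upd, &dummy_ppi);
}
```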
Figure 1-7 shows the pictorial representation of the boot flow.

Figure 1-7

coreboot-FSP multiprocessor init flow (flow diagram: coreboot's boot block, rom stage, ram stage, and APs interact with FSP-T, FSP-M, FSP-S, and FSP-Notify; callouts 1 to 4 mark the steps above)

This design would allow running the SoC vendor-recommended restricted CPU feature programming using the FSP module without any limitation while working on the latest SoC platform (even on non-PRQ SoCs) in the hybrid system firmware model. The CPU feature programming inside FSP will be more transparent than before, as it uses coreboot interfaces to execute those features. coreboot will also have more control over running them, as the API optimization is handled by coreboot.

This solution is future-proof because this design of the PEIM-to-PEIM interface (PPI) can be expanded beyond just running the restricted CPU feature programming in a coreboot context. Here is a list of other opportunities to scale this solution for future hybrid system firmware:
  • Today on the CrOS platform, the cbmem -c command is capable only of redirecting the coreboot serial log into the cbmem buffer using the bootloader driver. With this approach, the coreboot serial library may be used by FSP to populate serial debug logs.

  • The same can be used for post code-based debug methods as well.

  • Rather than implementing a dedicated timer library inside FSP, this method can be used by FSP to inject any programmable delay using the bootloader-implemented PPI, which natively uses the bootloader timer driver.

To summarize, the hybrid system firmware model provides ease of porting to new silicon. It allows bootloaders (coreboot, SBL, UEFI MinPlatform, etc.) to have an FSP interfacing infrastructure for finding and loading FSP binaries, configuring FSP UPDs per platform need, and finally calling FSP APIs.

EDKII Minimum Platform Firmware

Since the introduction of the Unified Extensible Firmware Interface (UEFI) firmware in 2004, all Intel architecture platforms have migrated from the legacy BIOS to UEFI firmware implementations. With blistering speed, UEFI firmware has taken over the entire PC ecosystem to become the de facto standard for system firmware. Historically, platforms that use UEFI firmware have been nourished and maintained by a closed group; hence, the source used in such platforms remains closed, although the specifications are open standards. Details about the UEFI architecture and specification are part of System Firmware: An Essential Guide to Open Source and Embedded Solutions.

Over the years, the platform enablement activity has evolved and demands more openness due to firmware security requirements, cloud workloads, business decisions favoring solutions built on open standards, etc.

Minimum Platform Architecture
The Minimum Platform Architecture (MPA) provides the design guidelines for implementing platform initialization using the open source EDKII standard to meet the industry expectations from the UEFI firmware. Figure 1-8 shows the high-level firmware stack used in the MPA.

Figure 1-8

MPA diagram (block diagram: the Minimum Platform Architecture layers FSP, Min Platform, and Core over Silicon, Platform, and Core APIs, with a Board package beneath; the stack feeds Server, Client, and IoT designs)

The MPA firmware stack demonstrates the hybrid firmware development model, combining several closed and open source components for platform initialization.

Core

Tianocore is the open source community implementation of UEFI. EDKII is the modern implementation of the UEFI and Platform Initialization (PI) specifications. Typically, the EDKII source code consists of standard drivers based on various industry specifications such as PCI, USB, TCG, etc.

Silicon

This is a closed source binary model developed and released by silicon vendors (for example, Intel, AMD, Qualcomm) with the intention of abstracting silicon initialization from the bootloader.

Prior to the MPA, the FSP API boot mode was the de facto standard for silicon initialization, where the bootloader needs to implement a 32-bit entry point for calling into the APIs per the specification. This limits the adoption of SoC vendor-released silicon binaries, aka FSP, by a bootloader that adheres to the UEFI PI firmware specification. Traditionally, the UEFI specification deals with firmware modules responsible for platform initialization and dispatched by a dispatcher (the Pre-EFI Initialization, aka PEI, and Driver eXecution Environment, aka DXE, cores). To solve this adoption problem in the UEFI firmware platform enabling model, a new FSP boot mode was designed in the FSP External Architecture Specification v2.1, known as dispatch mode.

Dispatch Mode
FSP API boot mode requires bootloaders to call into the FSP entry points like FSP-M (for memory init) and FSP-S (for silicon init) to initiate silicon initialization. Dispatch mode is more aligned with the UEFI specification, where FSP-M and FSP-S are containers that expose firmware volumes (FVs) that can be directly used by a UEFI PI-compliant bootloader. For example, the UEFI bootloader known as an FSP wrapper uses FSP the same way as any other firmware file system partition. The PEIMs in these FVs are executed as is in the PEI environment, with the bootloader being the context master. All the FSP entry points introduced as part of API boot mode (i.e., FspMemoryInit(), FspSiliconInit(), and NotifyPhase()) are not in use. Figure 1-9 shows the work relationship between a UEFI PI bootloader and FSP in dispatch mode.
  • The UEFI PI bootloader adhering to the MPA is equipped with a PCD database to pass configuration information between the bootloader and FSP. This includes hardware interface configuration (typically configured using UPDs in API mode) and boot stage selection. Refer to the "Min-Tree" section to understand the working principle and the MPA stage approach for incremental platform development.

  • The PEI core, part of the silicon reference code blob aka FSP, is used to execute the modules residing in the firmware volumes (FVs) directly.

  • The PEIMs belonging to these FVs communicate with each other using PPIs as per the PI specification.

  • Hand-off blocks (HOBs) are used to pass the information gathered in the silicon initialization phase to the UEFI PI bootloader.

  • The UEFI bootloader doesn't use the NotifyPhase() APIs; instead, FSP-S contains a native DXE driver that provides an equivalent implementation, invoked at the NotifyPhase() events.

Figure 1-9

FSP work model in dispatch mode with UEFI bootloader (block diagram: the UEFI PI bootloader holds the PCD database; the FSP binary holds the PEI core and a NotifyPhase implementation option in DXE; feature configuration flows via PCDs, and firmware volumes are dispatched directly)

UEFI Bootloader and FSP Communications Using Dispatch Mode

The communication interface designed between the UEFI bootloader and FSP in dispatch mode is intended to remain as close as possible to the standard UEFI boot flow. Unlike API mode, where the communication between the bootloader and FSP takes place by passing configuration parameters known as UPDs to the FSP entry points, in dispatch mode the firmware file systems (FFSs) belonging to the FVs consist of Pre-EFI Initialization Modules (PEIMs) that get executed directly in the context of the PEI environment provided by the bootloader. This can also be referred to as the firmware volume drop-in model. In dispatch mode, the PPI database and HOB lists prepared by FSP are shared between the bootloader and FSP.

Here is the detailed boot flow description:
  1. The bootloader owns the reset vector; SecMain, as part of the bootloader, gets executed when the platform starts executing from the reset vector.

  2. SecMain is responsible for setting up the initial code environment for the bootloader to continue execution. Unlike the coreboot workflow with FSP in API mode, where coreboot does the temporary memory initialization using its native implementation on the x86 platform instead of calling the FSP-T API, dispatch mode tries to maximize the usage of FSP and uses FSP-T for initializing temporary memory and setting up the stack.

  3. The bootloader provides the boot firmware volume (BFV) to the FSP. The PEI core belonging to FSP uses the BFV to dispatch the PEIMs and initialize the PCD database.

  4. In addition to the bootloader PEI modules, FSP dispatches the PEI modules that are part of FSP-M to complete the main memory initialization.

  5. The PEI core continues to execute the post-memory PEIMs provided by the bootloader. During the course of dispatch, the PEIMs included within the FSP-S FV are executed to complete the silicon-recommended chipset programming.

  6. At the end of the PEI phase, all silicon-recommended chipset programming is done using the closed source FSP, and DXE begins its execution.

  7. The DXE drivers belonging to the FSP-S firmware volume are dispatched. These drivers register events to be notified at different points in the boot flow. For example, the NotifyPhase-equivalent callbacks complete the remaining silicon-recommended security configuration, such as disabling certain hardware interfaces, locking the chipset registers, and dropping the platform privilege level prior to handing control off to the payload or operating system.

  8. The payload phase executes the OS bootloader and loads the OS kernel into memory.

  9. The OS loader signals the events that execute the callbacks registered by the DXE drivers, ensuring the pre-boot environment has secured the platform.

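The event-driven lockdown in steps 7 and 9 can be mocked in a few lines. Real firmware registers callbacks through UEFI boot services (e.g., CreateEventEx) against PI event group GUIDs; here that plumbing is simulated with a plain callback table, and the event names are illustrative:

```c
#include <assert.h>

enum boot_event { END_OF_DXE, READY_TO_BOOT, EXIT_BOOT_SERVICES, EVENT_MAX };

typedef void (*event_cb)(void);
static event_cb callbacks[EVENT_MAX];   /* mock of the UEFI event database */
static int lockdown_done;

/* DXE-driver side: register interest in a boot event. */
static void register_callback(enum boot_event ev, event_cb cb)
{
    callbacks[ev] = cb;
}

/* BDS / OS-loader side: signal the event, running any registered callback. */
static void signal_event(enum boot_event ev)
{
    if (callbacks[ev])
        callbacks[ev]();
}

/* What an FSP-S DXE driver might do at ReadyToBoot: lock chipset config. */
static void chipset_lockdown(void) { lockdown_done = 1; }

static int demo_notify(void)
{
    register_callback(READY_TO_BOOT, chipset_lockdown);
    signal_event(READY_TO_BOOT);        /* as the OS loader would in step 9 */
    return lockdown_done;
}
```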
Platform
In the past, UEFI firmware development used closed source for platform initialization, but with the MPA this limitation is diminished by creating a platform standard known as the EDKII Minimum Platform Specification. This approach allows platforms using UEFI firmware to be open sourced; it improves customer engagement, brings transparency to product development, establishes trust in the community, and finally establishes an ecosystem that encourages the community to contribute toward platform implementation. The key innovation in this architecture is the layered approach called stages, which are based on the development phase and the functionality for specific use cases. Each stage builds upon the previous stage with extensibility to meet silicon, platform, or board requirements. The MPA splits the platform implementation into two parts.
  • Generic: This part remains generic in nature, providing the required APIs to define the control flow. This generic control flow is implemented inside MinPlatformPkg (Edk2-Platforms/Platform/Intel/MinPlatformPkg), such that the tasks performed by MinPlatformPkg can be reused by all other platforms (belonging to the board package) without any additional source modification.

  • Board package: This part focuses on the actual hardware initialization source code, aka the board package. Typically, the contents of this package are limited to the scope of the platform requirements and the feature sets that board users would like to implement. As described in Figure 1-8, the board package code is also open source and represented as Edk2-Platforms/Platform/Intel/<xyz>OpenBoardPkg, where xyz represents the actual board package name. For example, TiogaPass, a board supported by the Open Compute Project (OCP) based on Intel's Purley chipset, uses the PurleyOpenBoardPkg board package.

Facts

A closed source representation of the OpenBoardPkg is just BoardPkg, which still directly uses the MinPlatformPkg from EDKII platforms.

The board package consists of a standard EDKII package and must implement the following guidelines:
  • A board package may consist of one or more supported boards. These boards share the common resources of the board package.

  • Board-specific source code must belong to a board directory named after the supported board. For the previous example, the board directory for TiogaPass is named BoardTiogaPass.

  • All the board-relevant information is made available to the MinPlatformPkg using board-defined APIs.

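Put together, these conventions imply a source layout roughly like the following; directory names other than those mentioned in the text are illustrative:

```
Edk2-Platforms/Platform/Intel/
├── MinPlatformPkg/          # generic control flow, reused by all boards
└── PurleyOpenBoardPkg/      # board package (example from the text)
    ├── BoardTiogaPass/      # board-specific code, named after the board
    └── ...                  # resources shared by boards in this package
```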
To summarize the MPA, it consists of a closed source FSP package for silicon initialization; the rest of the source code is potentially open source, where MinPlatformPkg and a board package are combined to form the platform.

Min-Tree

The MPA is built around the principle of a structural development model, referred to as a min-tree, where the source code tree starts with a minimalistic approach and is enriched as required functionality gets included over time and the platform matures. To make this model structural, the design divides the flow, interfaces, communication, etc., into a stage-based architecture (refer to the "Minimum Platform Stage Approach" section).

Figure 1-10 shows the min-tree development model over the product life cycle. Typically, the early phase of the product development cycle focuses on creating the bare-minimum source code. The target is to make sure the early silicon-based simulation or emulation platform is able to perform a basic boot to an operating system. To meet this goal, platform development starts by leveraging the source code of the previous-generation platform (typically referred to as n-1, where n is the current-generation platform) and existing feature sets. This often includes creating new sets of silicon and platform code on top of the prior platform after analyzing the basic differences between the new target platform and its prior generation. Hence, at the start of the product development cycle, the min-tree consists of just the silicon and platform-related changes applicable to the present platform, leveraging existing features from the prior-generation platform.

Figure 1-10

Min-tree evolution over product timeline (layer diagram: a minimum platform base, advanced features, and product differentiators build up from the early to the later development cycle)

The later product development stages are targeted more toward meeting the product milestone releases; hence, the focus is on code completion, which includes development of the full feature sets applicable to this platform. Next is platform development that focuses on enabling the product differentiator features, which are important for product scaling. Finally, the platform needs to be committed to sustaining, maintenance, and derivative activities. The staged platform approach is a more granular representation of the min-tree, where based on product requirements, timeline, security, feature sets, etc., one can decide the level of the tree at which to design the minimum platform. For example, product-distinguishing features are not part of the essential or minimum platform or the advanced feature list, and the board package is free to exclude such features using the boot stage PCD (gMinPlatformPkgTokenSpaceGuid.PcdBootStage). This may be used to meet a particular use case based on the platform requirement. For example, a board may disable all advanced features by setting the boot stage PCD value to 4 instead of 6 to improve boot time. Decrementing the stages might also be used for SPINOR size reduction, as the final bootloader executable binary size is expected to shrink.
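As a concrete, hedged illustration, a board description file might cap the boot stage through this PCD. The PCD name comes from the text; the DSC section placement shown is an assumption that varies by board package:

```
[PcdsFixedAtBuild]
  # Cap the build at Stage 4 (Boot to OS): content above stage 4, such as
  # security enabling (5) and advanced features (6), is left out of the image.
  gMinPlatformPkgTokenSpaceGuid.PcdBootStage|4
```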

Minimum Platform Stage Approach

The MPA staged approach describes the minimal code block and binary components required while creating system firmware. The flexible architecture allows modifying the FD image to make it applicable to the target platform. In this architecture, each stage has its own requirements and functionality based on specific uses. For example, Stage III, Boot to UI, focuses on interfacing with console I/O and other hardware controllers using the command-line interface. Additionally, decrementing a stage might also translate to reducing the platform feature set. For example, a Stage III bootloader won't need to publish ACPI tables, as this feature is not useful for the platform at that stage.

Figure 1-11 describes the stage architecture, including the expectations from the stage itself. Each stage is built upon the prior stage with extensibility to meet the silicon, platform, or board requirements.

Stage I: Minimal Debug
Stage I (Minimal Debug) is the base foundation block; the later stages supported by this architecture build on it, adding complexity by introducing advanced functionality.

Figure 1-11

Minimum platform stage architecture (flow chart: Stages I to VI form the minimum platform, and Stage VII completes the full platform: minimal debug, memory functional, boot to UI, boot to OS, security enabled, advanced feature selection, and optimization)

Stage I is contained within the SEC and PEI phases; hence, it should be packed uncompressed inside the firmware volume. The minimal expectation of this stage is to implement board-specific routines that enable platform debug capability, like serial output and/or postcodes, to see a sign of life.

The major responsibilities of Stage I are as follows:
  • Similar to all other bootloaders that come up in a memory-restricted environment like x86, perform initialization of temporary memory and set up the code environment.

  • Perform pre-memory board-specific initialization (if any).

  • Detect the platform by reading the board ID through the board-specific implementation.

  • Perform early GPIO configuration for the serial port and other hardware controllers that are supposed to be used early in the boot flow.

  • Enable the early debug interface, typically, serial port initialization over legacy I/O or modern PCH-based UARTs.

The functional exit criteria of Stage I are that temporary memory is available and the debug interface is initialized, with the platform having written a message to indicate that Stage I is now terminating.
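The "enable the early debug interface" responsibility can be sketched for the classic legacy-I/O case: a 16550-style UART brought up at 115200 8n1. Port I/O is mocked into an array so the sketch is host-runnable; real Stage I code would write to the actual I/O ports (0x3F8 onward for COM1), and the register values follow common 16550 programming practice:

```c
#include <assert.h>
#include <stdint.h>

#define UART_BASE 0x3F8
enum { DLL = 0, DLH = 1, FCR = 2, LCR = 3, MCR = 4 };

static uint8_t io_space[8];   /* mock of the UART register window */
static void outb(uint8_t val, uint16_t port)
{
    io_space[port - UART_BASE] = val;
}

static void early_uart_init(void)
{
    outb(0x80, UART_BASE + LCR);   /* DLAB=1: expose the divisor latch */
    outb(0x01, UART_BASE + DLL);   /* divisor 1 -> 115200 baud */
    outb(0x00, UART_BASE + DLH);
    outb(0x03, UART_BASE + LCR);   /* DLAB=0, 8 data bits, no parity, 1 stop */
    outb(0xC7, UART_BASE + FCR);   /* enable and clear FIFOs */
    outb(0x0B, UART_BASE + MCR);   /* assert DTR, RTS, OUT2 */
}
```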

Stage II: Memory Functional

Stage II (Memory Functional) is primarily responsible for executing the memory initialization code that enables the platform's permanent memory. This stage extends the operations of Stage I and performs the additional/mandatory silicon initialization required prior to memory initialization. Because of the memory-restricted nature of platform boot at this point, this stage is also packed uncompressed. Stage II relies on the FSP-M firmware volume for finding the PEI core and dispatching the PEIMs.

The following are the major responsibilities of Stage II:
  • Perform pre-memory recommended silicon policy initialization.

  • Execute memory initialization module and ensure the basic memory test.

  • Switch the program stack from temporary memory to permanent memory.

The functional exit criteria of Stage II are that early hardware devices like GPIOs are programmed, main memory is initialized, temporary memory is disabled, memory type range registers (MTRRs) are programmed with main memory ranges, and the resource descriptor HOB is built to pass that initialization information to the bootloader.

Stage III: Boot to UI

The primary objective of Stage III (Boot to UI) is to successfully boot to the UEFI Shell with a basic UI enabled. The success criterion of this stage is not to demonstrate that every minimum platform should be equipped with the UEFI Shell; rather, it focuses on generic DXE driver execution on top of the underlying stages, Stages I and II (mainly targeted at silicon and board). The bare-minimum UI capability required for Stage III is a serial console.

Stage III is contained within the Driver Execution Environment (DXE) and Boot Device Selection (BDS) phases for booting to the UEFI Shell. The major responsibilities of Stage III are as follows:
  • Bring up generic UEFI-specific interfaces like the DXE Initial Program Load (IPL) and DXE core, and dispatch DXE modules. This includes installing the DXE architectural protocols.

  • Perform post-memory silicon-recommended initialization.

  • Provide access to nonvolatile media such as SPINOR using UEFI variables. Additional capabilities that can be enabled as part of this phase (but not limited to this list) include various input and output device drivers such as USB, graphics, storage, etc.

The functional exit criteria of Stage III are that all generic device drivers are now operational and the platform has reached the BDS phase, meaning the bootloader is able to implement the minimal boot expectations for the platform.

Stage IV: Boot to OS

Leveraging the previous stage, Stage IV (Boot to OS) enables a minimal boot path to successfully boot to an operating system (OS). The minimal boot path is the delta requirement over Stage III that ensures booting to an OS.

The minimal boot path for Stage IV includes the following:
  • Add the minimum ACPI tables required for booting an ACPI-compliant operating system, namely, RSDT (XSDT), FACP, FACS, MADT, DSDT, HPET, etc.

  • Based on the operating system expectation, it might additionally publish DeviceTree to allow the operating system to be loaded.

  • Trigger the boot event that executes the callbacks registered by the FSP-S PEIMs, ensuring the chipset configuration registers are locked down and the platform privilege is dropped prior to launching applications outside trusted boundaries.

  • This phase also utilizes the runtime services implemented by the UEFI bootloader for communication from the OS layer, such as timer and nonvolatile region access.

After the platform is able to successfully boot to a UEFI-compliant OS with the minimal ACPI tables published, Stage IV qualifies as complete. Additionally, this stage implements SMM support for x86-based platforms, where runtime communication can be established based on software-triggered SMIs.

Stage V: Security Enable

The basic objective of Stage V (Security Enable) is to include security modules/foundations incrementally over Stage IV. Adhering to the basic/essential security features is the minimal requirement for modern computing systems. Chapter 5 highlights scenarios to understand security threat models and what it means for the platform to ensure security all around, even in the firmware.

The major responsibilities of Stage V are as follows:
  • Ensure that the lower-level chipset-specific security recommendation such as lockdown configuration is implemented.

  • A hardware-based root of trust is initialized and used to ensure that each boot phase is authenticated and verified prior to being loaded into memory and executed, forming a chain throughout the boot process.

  • Protect the platform from various memory-related attacks by implementing the relevant security advisories well.

  • At the end of this phase, it will allow running any trusted and authenticated application including the operating system.

Stage VI: Advanced Feature Selection

Advanced features are the nonessential block in this min-tree structural development approach. All the essential and mandatory features required for a platform to reach an operating system are developed in Stages I to V. Advanced feature selection focuses on developing firmware modules based on a few key principles, such as modularization and reducing interdependencies with other features. This helps these modules get integrated with the min-tree per user requirements and product use cases, even late in the product development cycle.

The design principles behind Stage VI are as follows:
  • The platform development model becomes incremental, where the more essential features are integrated and developed at an early phase. Meanwhile, the complex but generic advanced features can be developed without being bottlenecked on the current silicon and board, and can be readily shared across platforms.

  • Advanced feature modules should not contain functionality that is unrelated to the targeted feature.

  • Each feature module should be self-contained, meaning it minimizes its dependencies on other features.

  • The feature should expose a well-defined software interface that allows easy integration and configuration. For example, all modules should adhere to EDKII configuration options such as PCD to configure the feature.

Stage VII: Optimization

Within the scope of the current architecture, Stage VII (Optimization) is a proposed architectural stage reserved for future improvements. Its objective is to give the platform an option to apply optimizations that focus on the target platform. For example, on a scaled-down design without Thunderbolt ports, there should be a provision, using a PCD, to disable dispatching of the Thunderbolt drivers (host, bus, and device). This is known as a configurable setting.

Additionally, there can be compile-time configuration attached to a PCD that strips unused components from the defined FV; for example, the FSP modules used only for API boot mode. The intent is that such optimization/tuning can be introduced into the product even at a later stage without impacting product milestones, aka schedules.

These are just examples that demonstrate the architectural freedom to improve platform boot time and reduce SPINOR size at a later stage.

To summarize, a hybrid system firmware development using EDKII MPA is intended to improve the relationship between open source and closed source components. An MPA design brings transparency to platform development even with EDKII platform code. The min-tree design serves as a basic enablement vehicle for the hardware power-on and allows cross-functional teams to get started on feature enablement. The feature enablement benefits from its modular design that is simple to maintain.

Open Source System Firmware Model

The ideal philosophy of open source system firmware is to make sure that all pieces of the firmware are open source, specifically the ones required for the boot process post CPU reset. Achieving 100 percent open source system firmware has a significant dependency on the underlying platform hardware design. Typically, due to the unavailability of detailed hardware interface documents and programming sequences for boot-critical IPs like the memory controller, system firmware projects must choose the hybrid system firmware model over completely open source system firmware. RISC-V is a good example of an open standard hardware specification that allows pure open source system firmware development on RISC-V-based embedded systems, personal computers, etc. The word pure is used here intentionally to differentiate a firmware project that supports closed source blobs for the platform pre-reset flow from a transparent system reset flow (pre and post CPU reset) with all possible firmware open source.

There are several open source system firmware projects available, and this section gives a detailed overview of what to expect from future open source system firmware. Future system firmware will focus not only on getting rid of proprietary firmware blobs but also on adopting a modern programming language for developing system firmware. oreboot is an aspiring open source system firmware project that is slowly gaining momentum by migrating its support from evaluation boards to real hardware platforms. oreboot has a vision of pure open systems, meaning firmware without binary blobs. But to add the latest x86-based platforms, it has made an exception to include only boot-critical blobs (for example, manageability firmware, AMD AGESA, or the FSP for performing specific silicon initialization), where a feature implemented by blobs during boot is not possible to implement in oreboot.

This section provides an architectural overview of oreboot and its internals, which will be valuable for developers preparing themselves for the migration of system firmware to a more efficient and safe programming language. It is a recurrence of the transition that happened a few decades back, when system firmware development migrated from assembly to C.

oreboot = Coreboot - C + Much More

At a high level, it's easy to define oreboot as a downstream of the coreboot project that is developed without the C programming language. The oreboot system firmware project has zero C code: a very minimal amount of code is written in assembly to just set up the programming environment, and the remaining code is in the Rust language. The introduction of Rust code for system firmware development offers better security and reliability. The oreboot image is licensed under the GPL, version 2. Here are the design principles of oreboot, which make it different from the other boot firmware used on embedded systems:
  • oreboot is focused on reducing the firmware boundary to ensure an instant system boot. The goal for oreboot is a boot time of under one second on embedded devices.

  • It improves system firmware security, which typically goes unnoticed by platform security standards, by using a modern, safe programming language. Refer to Appendix A for details about the usefulness of Rust in system firmware programming, which deals with direct memory access and even operations that run in multithreaded environments.

  • It removes the dedicated ramstage usage from the boot flow and defines a stage named the payloader stage. This helps remove firmware drivers and utilities that would be redundant with LinuxBoot as the payload.

  • It jumps to the kernel as quickly as possible during boot. Firmware shouldn't contain high-level device drivers such as a network stack, disk drivers, etc.; instead it can leverage the most from LinuxBoot.

Currently, oreboot has support for all the latest CPU architectures, and adding support for newer SoCs and mainboards is a work in progress. The RISC-V porting done using oreboot is fully open source. In addition, oreboot is able to boot an ASPEED AST2500 ARM-based server management processor as well as the RISC-V OpenTitan "earlgrey" embedded hardware.

oreboot Code Structure

The source code organization of the oreboot project is similar to coreboot's, with a more simplified build infrastructure. The makefile parts of the oreboot directories are much simpler; unlike in coreboot, they don't contain the control flow. A .toml-based configuration file is used to define and configure the sets of tasks to run as part of the control flow. A task is Rust code that needs to be executed. Tasks can have dependencies, which are themselves tasks that will be executed before the current task. The following table describes the oreboot code structure:

  • src/arch: The supported CPU architectures, for example: armv7, armv8, risc-v, x86, etc.

  • src/drivers: The supported firmware drivers, written in Rust, that follow oreboot's unique driver model, for example: clock, uart, spi, timer, etc.

  • src/lib: Generic libraries like devicetree, util, etc.

  • src/mainboard: The supported mainboards that are part of the oreboot project. The list contains emulation environments like qemu, engineering boards such as the x86-based upsquared, the RISC-V-based HiFive development board, the ast2500 BMC platform, etc. Each mainboard directory contains a makefile and a Cargo.toml file to define the build dependencies, which allows building all boards in parallel. Example of Cargo.toml:

[dependencies]
cpu = { path = "../../../cpu/armltd/cortex-a9"}
arch = { path = "../../../arch/arm/armv7"}
payloads = { path = "../../../../payloads"}
device_tree = { path = "../../../lib/device_tree" }
soc = { path = "../../../soc/aspeed/ast2500" }

[dependencies.uart]
path = "../../../drivers/uart"
features = ["ns16550"]

The mainboard directory also includes source files written in Rust (.rs) and assembly (.S) as per the boot phase requirements. Two special files reside in the mainboard directory: fixed-dtfs.dts, which creates the flash layout, and mainboard.dts, which describes the system hardware configuration. mainboard.dtb is the binary encoding of the device tree structure.

  • src/soc: Source code for each SoC, including clock programming, early processor initialization, setting up the code environment, the DRAM initialization sequence, chipset register programming, etc. Each SoC directory also contains a Cargo.toml that defines the dependent drivers and libraries required for SoC-related operations.

  • payloads/: Library for payload-related operations, like loading a payload into memory and executing it.

  • tools/: Useful utilities, such as layoutflash, which creates an image from binary blobs as described in the layout specified using the device tree, and bin2vmem, which converts a binary to the Verilog VMEM format.

  • README.md: Describes the prerequisites for getting started with oreboot, cloning the source code, compilation, etc.; useful for the first-time developer.

  • Makefile.inc: This makefile is included by each mainboard directory's makefile.

oreboot Internals

This section guides developers through the key concepts of oreboot that are required to understand its architecture. Without understanding these architectural details, it would be difficult to contribute to the project. These are also the key features differentiating oreboot from the coreboot project.

Flash Layout

The flash layout specifies how the different binaries that are part of oreboot are stitched together to create the final firmware image (ROM) flashed into the SPI Flash. This file is named fixed-dtfs.dts and belongs to each mainboard directory.

oreboot has replaced the coreboot file system (CBFS) with the Device Tree File System (DTFS), which makes it easy to expose the layout of the flash chip without any extra OS interface. DTFS provides an easy method to describe the different binary blobs.

Here is sample code to describe the different regions belonging to the flash layout (see Figure 1-12):
area@x {
   description = "Boot Blob";
   offset = <0xff0000>;
   size = <0x20000>; // 128KiB
   file = "$(TARGET_DIR)/bootblob.bin";
};

The flash layout comprises six regions: Boot Blob, fixed DTFS, NVRAM, empty space, ROM payload, and RAM payload.

Figure 1-12

32MiB flash layout

The description field defines the type of binary, offset is the base address of the region, the size field specifies the region limit, and the file field gives the path of the binary. x is the region number inside the flash layout; regions include the boot blob, rampayload, NVRAM, etc. With its reduced boot phases, the oreboot architecture leaves ample headroom in flash.
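As a rough illustration of these fields, here is a minimal Rust sketch (not oreboot's actual parser; the struct and method names are assumptions) that models one flash region and checks that it fits within the flash part:

```rust
// Illustrative model of one flash-layout region, mirroring the
// description/offset/size/file fields described above.
#[derive(Debug, PartialEq)]
struct FlashRegion {
    description: &'static str, // type of binary in this region
    offset: u32,               // base address within the flash
    size: u32,                 // region limit in bytes
    file: &'static str,        // path of the binary placed here
}

impl FlashRegion {
    /// Exclusive end address of the region.
    fn end(&self) -> u32 {
        self.offset + self.size
    }
    /// True if this region fits inside a flash part of `flash_size` bytes.
    fn fits(&self, flash_size: u32) -> bool {
        self.end() <= flash_size
    }
}

fn main() {
    // The "Boot Blob" area from the sample layout: 128KiB at 0xff0000.
    let boot_blob = FlashRegion {
        description: "Boot Blob",
        offset: 0x00ff_0000,
        size: 0x2_0000,
        file: "bootblob.bin",
    };
    // 32MiB flash part, as in Figure 1-12.
    println!("fits in 32MiB flash: {}", boot_blob.fits(32 * 1024 * 1024));
}
```

A real layout would hold a list of such regions and additionally check that they do not overlap.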

Build Infrastructure
coreboot uses make menuconfig to allow configuration, but oreboot doesn't have such a provision; hence, it relies on conditional compilation. An oreboot build starts when the developer executes the make command from a specific mainboard directory. The code inside src/mainboard/*/*/src/main.rs starts with assembly instructions that perform the minimal amount of initialization required to call into the Rust program. Unlike other C-based firmware modules, which have a predefined entry point such as main(), here main.rs has a pub extern "C" fn 'entry_func_name' method that is called from the assembly to start the program. The code written in Rust does the platform initialization and prepares the system to load and run the payload. The mainboard code uses only the core library, which means no heap-allocated structures, and arrays must have statically allocated sizes. See Figure 1-13.

The oreboot build flow has four steps, cargo build, rust objcopy, layoutflash, and the device tree compiler, each taking inputs from the arch, mainboard, and payload directories, and ends with the final binary, oreboot.bin.

Figure 1-13

oreboot build flow

The binary generation process is a two-step approach.
  • Create an Executable and Linkable Format (ELF) binary from the source code using the cargo build command.

  • Convert the .elf file to binary format (.bin) with the rust-objcopy command.

The output binary (.bin) belongs to a region specified using the file field in the flash layout file. These binaries are then used to construct the image that will be flashed into the device. The layoutflash tool (whose source code belongs to the tools directory, as mentioned earlier) is used to construct the final image (.ROM). It takes as arguments an oreboot device tree specifying the image layout and the compiled binary files generated by the compilation process.

Device Tree

The DTS specification defines a construct called a device tree, which is typically used to describe system hardware. A device tree is a tree data structure whose nodes describe the devices present in the system. Each node has property/value pairs that describe the characteristics of the device. At build time, the boot firmware prepares, in the form of a device tree, the device information that can't necessarily be detected dynamically during boot; then, during boot, the firmware loads the device tree into system memory and passes a pointer to it to the OS so the OS can understand the system hardware layout. Unlike coreboot's, the device tree structure prepared by oreboot is more scalable and can be parsed by existing OSs without any modification.

In oreboot, the device tree is mainly used to serve two different purposes.
  • Hardware device tree: Part of the mainboard directory, this is used to describe the system hardware that the system firmware is currently running. This is typically named after the mainboard; for example, a device tree name for RISC-V processor–based development board HiFive is hifive.dts.

  • oreboot device tree: This is the device tree used to define the layout of the image that is flashed into the device.

The device_tree library inside src/lib is used to operate on the device tree data structure. Device Tree Source (DTS) is a human-friendly text representation of the device tree, which the Device Tree Compiler (DTC) converts into the Device Tree Blob (DTB), or Flattened Device Tree (FDT), format, a binary encoding of the device tree structure. Figure 1-14 shows an example of a simple hardware device tree that represents the HiFive board. Device nodes are shown with the properties and values inside each node.

The device tree for the oreboot HiFive mainboard has a root node with four child nodes, cpus, memory, refclk, and serial; the cpus node has two children, cpu@0 and cpu@1.

Figure 1-14

Device tree example from oreboot HiFive mainboard

In the previous example, cpus, memory, refclk, and serial are node names, and the root node is identified by a forward slash (/). @ is used to specify the unit address of the node on the bus on which it sits.
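To make the tree structure concrete, here is a minimal Rust sketch (illustrative only; this is not oreboot's device_tree library) that builds the node hierarchy from Figure 1-14:

```rust
// A toy device-tree node: a name, property name/value pairs, and children.
#[derive(Debug)]
struct DtNode {
    name: &'static str,                      // node name, e.g. "cpus" or "cpu@0"
    properties: Vec<(&'static str, String)>, // property/value pairs
    children: Vec<DtNode>,
}

impl DtNode {
    fn new(name: &'static str) -> Self {
        DtNode { name, properties: Vec::new(), children: Vec::new() }
    }
    /// Count all nodes in the subtree, including this one.
    fn count(&self) -> usize {
        1 + self.children.iter().map(|c| c.count()).sum::<usize>()
    }
}

fn main() {
    // Root node "/" with the four child nodes from Figure 1-14;
    // cpus has two children, cpu@0 and cpu@1 (unit address after '@').
    let mut root = DtNode::new("/");
    let mut cpus = DtNode::new("cpus");
    cpus.children.push(DtNode::new("cpu@0"));
    cpus.children.push(DtNode::new("cpu@1"));
    root.children.push(cpus);
    for n in ["memory", "refclk", "serial"] {
        root.children.push(DtNode::new(n));
    }
    println!("total nodes: {}", root.count()); // root + 4 children + 2 cpus = 7
}
```

The DTC performs essentially this construction when it parses a .dts file, then serializes the resulting tree into the flat DTB encoding.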

Driver Model
oreboot defines a unique driver model that creates a driver trait, an interface that implements four functions: init(), pread(), pwrite(), and shutdown(). The details of these functions are as follows:

  • init(): Initializes the device.

  • pread(): Positional read. It takes two arguments: a mutable buffer that is filled with data from the driver, and the position to read from. The function returns a result that is either a number giving the number of bytes read, or an error. If there are no more bytes to read, it returns an end-of-file (EOF) error.

  • pwrite(): Positional write. It takes two arguments: a buffer containing the data the driver writes to the hardware, and the position to write to. The function returns the number of bytes written.

  • shutdown(): Shuts down the device.
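The four functions above can be sketched as a Rust trait, together with a toy in-memory device that exercises it (a simplified illustration; oreboot's actual trait and error types may differ):

```rust
// Illustrative error type; oreboot's real driver model may define its own.
#[derive(Debug, PartialEq)]
enum DriverError { Eof }

// Sketch of the four-function driver interface described above.
trait Driver {
    fn init(&mut self);
    /// Positional read: fill `data` starting at `pos`; return bytes read.
    fn pread(&self, data: &mut [u8], pos: usize) -> Result<usize, DriverError>;
    /// Positional write: write `data` at `pos`; return bytes written.
    fn pwrite(&mut self, data: &[u8], pos: usize) -> Result<usize, DriverError>;
    fn shutdown(&mut self);
}

/// A toy in-memory device used to exercise the trait.
struct MemDevice { storage: [u8; 16] }

impl Driver for MemDevice {
    fn init(&mut self) { self.storage = [0; 16]; }
    fn pread(&self, data: &mut [u8], pos: usize) -> Result<usize, DriverError> {
        if pos >= self.storage.len() { return Err(DriverError::Eof); }
        let n = data.len().min(self.storage.len() - pos);
        data[..n].copy_from_slice(&self.storage[pos..pos + n]);
        Ok(n)
    }
    fn pwrite(&mut self, data: &[u8], pos: usize) -> Result<usize, DriverError> {
        if pos >= self.storage.len() { return Err(DriverError::Eof); }
        let n = data.len().min(self.storage.len() - pos);
        self.storage[pos..pos + n].copy_from_slice(&data[..n]);
        Ok(n)
    }
    fn shutdown(&mut self) {}
}

fn main() {
    let mut dev = MemDevice { storage: [0; 16] };
    dev.init();
    dev.pwrite(b"oreboot", 0).unwrap();
    let mut buf = [0u8; 7];
    dev.pread(&mut buf, 0).unwrap();
    println!("{}", core::str::from_utf8(&buf).unwrap());
}
```

Because every device is driven through the same four calls, higher-level code such as the union driver can hold a `&mut dyn Driver` array and treat UARTs, flash, and virtual devices uniformly.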

This model is useful for different types of devices, like block devices and character devices, since a driver can ignore the position (the offset) while operating on its hardware device. Here are some examples of different driver types that oreboot supports:
  • Physical device drivers: Drivers that operate on real hardware devices. For example, memory drivers perform reads/writes to physical memory addresses, serial drivers read/write serial devices, clock drivers initialize the clock controller present on the hardware, and DDR drivers perform DRAM device initialization.

  • Virtual drivers: Drivers that are not associated with any real hardware device but rather create an interface for accessing hardware devices. Examples are the union driver, which can stream input or output to multiple device drivers (see the following mainboard example, which implements the union driver for a serial device), and the section reader, which reads a section from another device through a window specified using an offset and size and returns EOF when the end of the window is reached.

The following is an example of a mainboard implementing more than one UART. The system firmware would like to use all of them and hence implements the union driver as shown. The oreboot mainboard code creates an array of these drivers, and the union driver uses this array. The console's init() call then initializes all of these UART controllers, and a string is written to all of them using the pwrite() function.
let mut uarts = [
    &mut PL011::new(0x1E72_3000, 115200) as &mut dyn Driver, // UART 1
    &mut PL011::new(0x1E72_D000, 115200) as &mut dyn Driver, // UART 2
    &mut PL011::new(0x1E72_E000, 115200) as &mut dyn Driver, // UART 3
    &mut PL011::new(0x1E72_F000, 115200) as &mut dyn Driver, // UART 4
];
let console = &mut Union::new(&mut uarts[..]);
console.init();
console.pwrite(b"Welcome to oreboot ", 0).unwrap();

oreboot Boot Flow

The boot flow defined by oreboot is similar to coreboot's, except that oreboot has accepted that the firmware boundary has to be reduced, so it makes sense to leverage more from a powerful payload offering such as LinuxBoot, with its more mature Linux kernel drivers. oreboot replaces the need for a dedicated stage like ramstage, whose operations can be covered by a powerful payload, with a stage that just loads the payload. The oreboot boot flow provides an option to load a Linux kernel included in the flash image as the payload from the payloader stage.

Facts

Some work done in the coreboot project separates the payload loading and running operations from a dedicated stage like ramstage and provides a flexible design where the bootloader is free to decide which stage loads the payload. This work, known as Rampayload or coreboot-Lite, influenced oreboot's design of having an independent stage for payload operations that is called from prior stages as per the platform requirements.

The following sections explain the oreboot boot flow in detail with a hardware porting guide. The oreboot boot process is divided into three stages.
  • Bootblob: This is the first stage post CPU reset and is executed from the boot device. It holds the first instruction executed by the CPU. This stage is similar to coreboot's first stage, called bootblock.

  • Romstage: This is functionally similar to the coreboot romstage boot phase, which is intended to perform the main memory initialization.

  • Payloader stage: This is only intended to load and run the payload. This is a feature differentiator from coreboot, where the ramstage boot state machine has tasks to load and run the payload at the end of hardware initialization.

Here is a more detailed description of each stage's operations, based on real hardware. The hardware used for this demonstration of the oreboot boot flow is the open source HiFive Unleashed board, based on the SiFive FU540 processor. Figure 1-15 shows the hardware block diagram.

The SiFive FU540 block diagram shows the SPI Flash, serial console, U54-MC Coreplex, PRCI, SPI controllers, UART controllers, DDR controller, and DRAM, connected by flow arrows.

Figure 1-15

Hardware block diagram of SiFive-HiFive Unleashed

In this example, the RISC-V SoC has four pins, called MSEL, to choose where the bootloader is; with MSEL set to 0001 (MSEL0 is 1 and MSEL1-3 are set to 0), the Zeroth Stage Boot Loader (ZSBL) stored in the ROM of the SoC is used. The ZSBL loads oreboot from the SPI Flash, and control reaches the bootblob.
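The MSEL mode value is simply the four pin levels packed into a nibble. A hedged sketch of that bit packing (illustrative only, not SiFive reference code):

```rust
// Compose the 4-bit MSEL mode value from the individual pin levels
// (MSEL0 = 1 and MSEL1-3 = 0 gives mode 0b0001, as in the text).
fn msel_mode(msel0: bool, msel1: bool, msel2: bool, msel3: bool) -> u8 {
    (msel0 as u8) | (msel1 as u8) << 1 | (msel2 as u8) << 2 | (msel3 as u8) << 3
}

fn main() {
    // The configuration from the text: only MSEL0 is high.
    println!("MSEL mode = {:#06b}", msel_mode(true, false, false, false));
}
```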

Bootblob
In the oreboot boot flow architecture, the bootblob is the first stage, which gets control when the CPU comes out of reset. In a multiprocessor boot environment, it is executed by the Boot Strap Processor (BSP) using temporary memory. Operations performed by the bootblob phase include the following:
  • The early part of the bootblob code is written in assembly and is executed by the CPU immediately after release from power-on reset. It performs processor-specific initialization as per the CPU architecture.

  • It sets up temporary RAM (Cache as RAM, aka CAR, or SRAM), as physical memory is not yet available.

  • It prepares the environment for running Rust code like setting up the stack and clearing memory for BSS.

  • It initializes the UART(s) to show a sign of life using the debug print message "Welcome to oreboot."

  • It finds the romstage from the oreboot device tree and jumps into the romstage.

Here is some sample bootblob code written in assembly belonging to the SoC directory:

soc/sifive/fu540/src/bootblock.S

/* Early initialization code for RISC-V */
.globl _boot
_boot:
       # The previous boot stage passes these variables:
       #   a0: hartid
       #   a1: ROM FDT
       # a0 is redundant with the mhartid register. a1 might not be valid on
       # some hardware configurations, but is always set in QEMU.
       csrr a0, mhartid
setup_nonboot_hart_stack:
       # sp <- 0x02021000 + (0x1000 * mhartid) - 2
       li sp, (0x02021000 - 2)
       slli t0, a0, 12
       add sp, sp, t0
       # 0xDEADBEEF is used to check stack underflow.
       li t0, 0xDEADBEEF
       sw t0, 0(sp)
       # Jump into Rust code
       call _start_nonboot_hart
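The stack setup above computes sp = 0x02021000 + (0x1000 * mhartid) - 2, giving each hart its own 4KiB stack region. The same arithmetic, rendered in Rust for clarity (the real code must stay in assembly, since it runs before any stack exists):

```rust
// Per-hart stack pointer, mirroring the li/slli/add sequence in the
// bootblob assembly: base 0x02021000, 0x1000 bytes per hart, minus 2.
fn nonboot_hart_sp(mhartid: u64) -> u64 {
    0x0202_1000 + 0x1000 * mhartid - 2
}

fn main() {
    for hart in 0..4 {
        println!("hart {}: sp = {:#x}", hart, nonboot_hart_sp(hart));
    }
}
```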

Figure 1-16 represents the operations performed by the bootblob stage pictorially.

The diagram shows the SPI flash, serial console, PRCI, SPI and UART controllers, U54-MC Coreplex, DDR controller, and DRAM, each marked as inactive/disabled, active/enabled, or initialization in progress during the bootblob stage.

Figure 1-16

Operational diagram of bootblob stage

Romstage
The romstage is invoked right after the bootblob in the boot flow. This stage is executed from the SPI Flash and performs DRAM initialization. The responsibilities of the romstage are as follows:
  • Perform early device initialization, for example, configuring the memory-mapped control and status registers for controlling component power states, resets, clock selection, and low-level interrupts.

  • Initiate DRAM initialization by configuring the memory controllers that are part of the SoC hardware block. This process involves running SoC vendor-specific routines that train the physical memory, or implementing the memory reference code in Rust (basically a direct port from C to Rust). For the HiFive Unleashed platform, oreboot has implemented the DDR initialization code in Rust in soc/sifive/fu540/src/ddr*.rs by referring to the open source FSBL implementation.

Here is some sample romstage code written in Rust that initializes clocks:
// Peripheral clocks get their dividers updated when the PLL initializes.
let mut clks = [spi0 as &mut dyn ClockNode, spi1 as &mut dyn ClockNode, spi2 as &mut dyn ClockNode, uart0 as &mut dyn ClockNode];
let mut clk = Clock::new(&mut clks);
clk.pwrite(b"on", 0).unwrap();
Figure 1-17 represents the operations performed by the romstage pictorially.

The diagram shows the SPI flash connected to the FU540-C000 and SPI controllers, and the serial console connected to the UART controllers; these, along with the PRCI, connect to the U54-MC Coreplex and the DDR controller, which connects to DRAM. Each component is marked as inactive/disabled, active/enabled, or initialization in progress during the romstage.

Figure 1-17

Operational diagram of romstage

Payloader Stage
The payloader stage is the first stage on the RISC-V platform that runs from DRAM, after physical memory is available. Unlike in coreboot, where the ramstage boot phase has many other tasks along with loading and running the payload at its end, in oreboot the payloader stage has only one job: find, load, and run a payload. The payloader stage doesn't have any high-level firmware device drivers for storage devices, audio devices, etc. This helps reduce complexity and saves SPI footprint compared to other system firmware. Here is sample payloader stage code, written in Rust, that loads a payload file from the path specified in the oreboot device tree and jumps into it:
use payloads::external::zimage::PAYLOAD;
let p = PAYLOAD;
writeln!(w, "Loading payload ").unwrap();
p.load();
writeln!(w, "Running payload entry 0x{:x} dtb 0x{:x} ", p.entry, p.dtb).unwrap();
p.run();
Figure 1-18 represents the operations performed by the payloader stage pictorially.

The diagram shows the same hardware components, now with the payloader stage running from DRAM on the FU540-C000; each component is marked as inactive/disabled, active/enabled, or initialization in progress.

Figure 1-18

Operational diagram of payloader stage

Payload

By default, an oreboot project uses LinuxBoot as the payload, which allows it to load a Linux kernel from the SPI Flash into DRAM. The Linux kernel is expected to initialize the remaining devices using kernel drivers, including block devices and/or network devices, and finally to locate and load the target operating system using kexec. LinuxBoot uses u-root as the initramfs, the root filesystem that the system has access to upon booting into the Linux kernel. systemboot, an OS loader that is part of u-root, iteratively attempts to boot from a network or local boot device.

Figure 1-19 represents the operations performed by the payload (LinuxBoot) pictorially.

The diagram shows the same hardware components, now with LinuxBoot running from DRAM on the FU540-C000; each component is marked as inactive/disabled, active/enabled, or initialization in progress.

Figure 1-19

Operational diagram of payload stage

The payload operation is expected to end when the Linux kernel that is part of LinuxBoot calls into the target kernel image from the block device or network and executes its first instruction. Figure 1-20 shows the final system hardware component initialization state when it reaches an operating system.

The diagram shows the same hardware components, now with the kernel running from DRAM on the FU540-C000; each component is marked as inactive/disabled, active/enabled, or initialization in progress.

Figure 1-20

System hardware state at the kernel

To summarize, the completely open source system firmware model using an oreboot-like bootloader is not only meant to provide freedom from running proprietary firmware blobs on hardware. It is also developed using a safe systems programming language, Rust, and its payload userland is written in Go, advocating the architectural migration of system firmware development to high-level languages in the future. Finally, the reduced boot phases leave ample free space in the flash layout, which provides an opportunity to reduce the hardware bill of materials (BoM) cost along with an instant boot experience.

Open Source Device Firmware Development

System firmware is the firmware that runs on the host CPU after it comes out of reset. In traditional computing systems, system firmware is owned by independent BIOS vendors (IBVs), and adopting the open source firmware model helps to get visibility into their code. This helps in designing a transparent system, by knowing what program is running on the underlying hardware, and it provides more control over the system. Earlier sections highlighted the path forward for future system firmware development using open source system firmware as much as possible. In a computing system, there are multiple devices attached to the motherboard, and each device has its own firmware. When a device is powered on, its firmware is the first piece of code that runs, and it provides the required instructions and guidance for the device to become ready to communicate with other devices or to perform the set of basic tasks intended for it. This type of firmware is called device firmware. Without its device firmware being operational, a device wouldn't be able to function. The type of device determines the complexity of the firmware. For example, a simple keyboard device has only a limited goal and no need to worry about regular updates, whereas a more complex device, like a graphics card, needs to define an interface that allows it to interact with the system firmware and/or an operating system to achieve a common goal, which is to enable the display.

The majority of device firmware present on consumer products is proprietary, which might lead to security risks. For example, at the 2014 Black Hat conference, security researchers first exposed a vulnerability in USB firmware that leads to the BadUSB attack, where a USB flash device is repurposed to spoof various other device types in order to take control of a computer, exfiltrate data, or spy on the user. A potential solution to this problem is to develop device firmware as open source so that the code can be reviewed and maintained by others rather than only the independent hardware vendors (IHVs).

This section describes the evolution of device firmware development for discrete devices that have firmware burned into their SPINOR.

Legacy Device Firmware/Option ROM

An option ROM (OpROM) is a piece of firmware that resides either in the system firmware image as a binary blob or on an expansion card; it is copied into system memory and executed by the system firmware, using legacy interrupts, during the platform initialization phase. It acts as an interface between the system firmware and the underlying specific hardware device. The BIOS Boot Specification (BBS) was developed to standardize the initialization sequence of OpROMs. Figure 1-21 shows a sample discrete graphics card where the VBIOS is located inside a dedicated chip.

The block diagram shows the graphics card's PCI-E x4/x8 interface; HDMI, DP, serial, and debug ports connecting to the display, audio, UART, and JTAG; DDR connecting to DDRx; and the option ROM connected over SPI.

Figure 1-21

Discrete graphics card hardware block diagram

A common example of an OpROM is the Video BIOS (VBIOS), which can be used to program either on-board graphics or discrete graphics cards and is specific to the device manufacturer. In this section, the VBIOS referred to is used to initialize a discrete graphics card after the device is powered on. It also implements the INT 10h interrupt (an interrupt vector in an x86-based system) and the VESA BIOS Extensions (VBE), which define a standardized software interface to display and audio devices, for both pre-boot applications and system software to use.

A video services BIOS interrupt sets up a real mode interrupt handler; meaning, to get this interrupt serviced, the system needs to enter real address mode. As real mode is limited to 20-bit addressing, it provides limited space for OpROMs. A total of 128KB (between 0xc0000 and 0xdffff; sometimes extended into 0xe0000–0xeffff, and so on) is shared by all option ROMs. An OpROM typically compacts itself by discarding some initialization code (leaving behind a smaller runtime portion). During the power-on self-test (POST), the BBS specifies that the BIOS detects and shadows the VBIOS at 0xc0000; it traverses the PCI configuration space to check the Expansion ROM Base Address Register (only PCI config space header type 0 devices carry an Expansion ROM Base Address Register to support an add-on ROM) and copies the discrete card's OpROM from MMIO space to the predefined OpROM region. The system firmware then scans the region and detects whether the OpROM has a PnP option ROM header. The following table describes the PnP OpROM header structure:

| Offset | Length | Value  | Description                              |
|--------|--------|--------|------------------------------------------|
| 0x00   | 0x02   | 0xAA55 | Signature                                |
| 0x02   | 0x01   | Varies | Option ROM length                        |
| 0x03   | 0x04   | Varies | Initialization vector                    |
| 0x07   | 0x11   | Varies | Reserved                                 |
| 0x18   | 0x02   | Varies | Offset to PCI data structure             |
| 0x1A   | 0x02   | Varies | Offset to PnP expansion header structure |

  • Signature: All ISA expansion ROMs are currently required to identify themselves with a signature word of AA55h at offset 0. This signature is used by the system firmware as well as other software to identify that an option ROM is present at a given address.

  • Length: The length of the option ROM in 512 byte increments.

  • Initialization vector: The system BIOS will execute a far call to this location to initialize the option ROM. The field is four bytes wide even though most implementations adhere to the custom of defining a simple three-byte NEAR JMP. The definition of the fourth byte may be OEM specific.

  • Reserved: This area is used by various vendors and contains OEM-specific data and copyright strings.

  • Offset to PCI data structure: This location contains a pointer to a PCI data structure, which holds the vendor-specific information.

  • Offset to PnP expansion header: This location contains a pointer to a linked list of option ROM expansion headers.

The system firmware performs a read operation to read the first two bytes of the PnP OpROM structure and verifies the signature as 0xAA55. If a valid option ROM header is present, then the system firmware reads the offset + 02h to get the length of the OpROM and then performs a far call to offset + 03h to initialize the device. After video OpROM has initialized the graphics controller, it provides lists of services like setting the video mode, character and string output, and other VBE functions to operate in graphics mode.
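The scan-and-parse sequence above can be sketched in C. This is a minimal model, not real BIOS code: the buffer stands in for the shadowed OpROM region at 0xc0000, and the struct and function names are invented for illustration.

```c
#include <stdint.h>

/* Minimal model of parsing a shadowed PnP option ROM header.
 * Offsets follow the header table above; names are invented. */
struct oprom_info {
    uint32_t image_size;   /* length field at offset 0x02, in 512-byte units */
    uint32_t init_offset;  /* far-call entry point at offset 0x03 */
};

/* Returns 0 on success, -1 if the 0xAA55 signature is missing. */
static int parse_oprom(const uint8_t *rom, struct oprom_info *out)
{
    uint16_t sig = (uint16_t)(rom[0] | (rom[1] << 8)); /* little-endian */
    if (sig != 0xAA55)
        return -1;
    out->image_size  = (uint32_t)rom[2] * 512;
    out->init_offset = 0x03;   /* BIOS performs a far call to base+0x03 */
    return 0;
}
```

In real firmware, a successful parse is followed by the far call to the initialization vector; here the caller simply learns the image size and entry offset.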

Here is a list of a few supported functions implemented by OpROM. The system BIOS needs to hook INT 10h to call these functions as per programming requirements.

General Video Service Functions (AH = 0x00 to 0xFF, except 0x4F)

| Operation                  | Function | Subfunction                                                                                                                                                            |
|----------------------------|----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Set Video Mode             | AH=0x00  | AL = video mode                                                                                                                                                          |
| Set Cursor Characteristics | AH=0x01  | CH bits 0–4 = start line for cursor in character cell; bits 5–6 = blink attribute (00=normal, 01=invisible, 10=slow, 11=fast); CL bits 0–4 = end line for cursor in character cell |
| Set Cursor Position        | AH=0x02  | DH,DL = row, column; BH = page number (0 in graphics modes; 0–3 in modes 2 and 3; 0–7 in modes 0 and 1)                                                                  |
| Write String (AT, VGA)     | AH=0x13  | AL = mode; BL = attribute if AL bit 1 clear; BH = display page number; DH,DL = row, column of starting cursor position; CX = length of string; ES:BP -> start of string    |

VBE Functions (AH = 0x4F, AL = 0x00 to 0x15)

| Operation                         | Function | Subfunction                                                                       |
|-----------------------------------|----------|-----------------------------------------------------------------------------------|
| Return VBE Controller Information | AH=0x4F  | AL = 0x00; ES:DI = pointer to buffer in which to place the VbeInfoBlock structure |
| Return VBE Mode Information       | AH=0x4F  | AL = 0x01; CX = mode number; ES:DI = pointer to ModeInfoBlock structure           |
| Set VBE Mode                      | AH=0x4F  | AL = 0x02; BX = desired mode to set; ES:DI = pointer to CRTCInfoBlock structure   |
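The dispatch model behind these tables can be illustrated with a toy C dispatcher: AH selects the video service, and for AH=0x4F the AL register selects the VBE subfunction. The register struct and service names here are hypothetical; a real BIOS routes these calls through the INT 10h real mode interrupt handler.

```c
#include <stdint.h>

/* Illustrative subset of the real-mode register file for an INT 10h call. */
struct regs { uint8_t ah, al; uint16_t bx, cx, dx; };

enum service {
    SVC_SET_MODE,
    SVC_SET_CURSOR_POS,
    SVC_VBE_SET_MODE,
    SVC_UNSUPPORTED
};

/* Toy model of INT 10h dispatch: AH selects the video service;
 * AH=0x4F routes to the VBE sub-dispatcher keyed on AL. */
static enum service int10_dispatch(const struct regs *r)
{
    switch (r->ah) {
    case 0x00: return SVC_SET_MODE;        /* AL = video mode      */
    case 0x02: return SVC_SET_CURSOR_POS;  /* DH,DL = row, column  */
    case 0x4F:                             /* VBE functions        */
        return (r->al == 0x02) ? SVC_VBE_SET_MODE : SVC_UNSUPPORTED;
    default:
        return SVC_UNSUPPORTED;
    }
}
```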

Figure 1-22 describes the communication between video OpROM and system firmware.

The flow diagram shows PCI enumeration up to the PCI device limit, loading the PCI OpROM, and initializing the card, which branches into preparing real mode, calling the OpROM entry point, setting the VESA mode, and displaying the boot splash.

Figure 1-22

Communication between video OpROM and system firmware

In this sample implementation, the system firmware calls video OpROM to initialize the graphics controller and uses video services to set the display to show the pre-OS display screen or OS splash screen during boot.

Figure 1-23 decodes the video OpROM from the system memory 0xc0000 location, and Figure 1-24 shows the OpROM initialization code in assembly.

A debugger screenshot displays the video BIOS option ROM contents at address 0xC0000, with ten lines of output on the screen.

Figure 1-23

Display video BIOS option ROM at address 0xC0000

Offset + 03h specifies the initialization vector, which transfers control into the video OpROM initialization code for display initialization, shown here as jmp 0xbd11.

A debugger screenshot displays the initialization vector at address 0xC0003.

Figure 1-24

Initialization vector address at address 0xC0003

This execution of OpROMs for device initialization has several limitations when working with modern firmware solutions. Option ROM attacks can serve as an initial infection or as a way to spread malicious firmware from one firmware component to another. A compromised OpROM can act as an initial method of infection that remains persistent even after the system firmware is reflashed. There are still legacy implementations where the system firmware and/or payload relies on the option ROM for device initialization and runtime services; therefore, modern devices like discrete graphics cards and network cards still need to support legacy OpROMs.

UEFI OpROM

The Graphics Output Protocol (GOP) replaces the legacy video BIOS and eliminates the VGA hardware functionality from the discrete graphics card or on-board graphics controller. It's a UEFI implementation: a generic GOP UEFI display driver image that can either be located on the device ROM or be present inside the system firmware. GOP has some unique advantages over legacy OpROM.
  • It has a modern and well-defined interface, which is implemented using an industry-standard specification.

  • All GPUs within a platform become "equal"; there's no longer a unique VGA-enabled primary GPU.

  • Code is written in C and doesn’t need a legacy interrupt handler to communicate between the platform and GPU.

  • Implementing UEFI graphics OpROM using EBC (EFI Byte Code) allows a single image to operate on multiple CPU architectures.

  • There are clearer and portable solutions that allow new features to be implemented.

The services implemented by the GOP driver are available only while EFI Boot Services are available (prior to ExitBootServices()). However, the framebuffer populated by the GOP driver persists, meaning the OS graphics driver and applications can continue to use the framebuffer for graphics output. The implementation of the UEFI-compliant video option ROM starts with an implementation of the UEFI GOP driver. The GOP driver follows the UEFI driver model and hence installs a driver binding protocol at the entry point of the UEFI driver. The GOP driver binding protocol implements functions such as Supported(), Start(), and Stop().
  • Supported(): The "Supported" method of the GOP driver binding protocol tests whether the given handle is a manageable adapter. It also checks that EFI_DEVICE_PATH_PROTOCOL and EFI_PCI_IO_PROTOCOL are present to ensure that the handle that is passed is a valid PCI device. The PCI I/O protocol gets the PCI configuration header from the device and verifies that the device is supported by the present GOP driver.

  • Start(): The “Start” method of the GOP driver binding protocol tells the graphics driver to start managing the controller. The GOP driver uses the device-specific knowledge to perform the following operations:
    • Initialize the graphics adapter.

    • Initialize platform parameters like LID Present, Dock Supported, etc.

    • Initialize the display manager module that enumerates all the supported displays and checks its live status and EDID to detect the enabled display device.

    • Create child handles for each detected and enabled physical output device and install the EFI_DEVICE_PATH_PROTOCOL.

    • Get EDID information from each enabled physical output device and install EFI_EDID_DISCOVERED_PROTOCOL on the child handle.

    • Create child handles for each valid combination of two or more video output devices and install EFI_DEVICE_PATH_PROTOCOL.

    • Set the initial mode, required to initialize the mode field of GOP.

    • Install the GRAPHICS_OUTPUT_PROTOCOL on the selected device.

  • Stop(): The Stop function performs the opposite operation of the Start function. In general, the Stop() function uninstalls all protocols, closes the protocol instances, releases all resources, and disables the graphics adapter.
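The driver binding flow above can be sketched with stand-in C types. A real GOP driver would use EFI_HANDLE, EFI_STATUS, and the protocol definitions from the EDKII headers; the vendor check and the struct fields here are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for an EFI handle; real code would query the protocol
 * database rather than read struct fields. */
struct handle {
    bool     has_device_path;  /* EFI_DEVICE_PATH_PROTOCOL present? */
    bool     has_pci_io;       /* EFI_PCI_IO_PROTOCOL present?      */
    uint16_t vendor_id;        /* from the PCI configuration header */
    bool     gop_installed;    /* GRAPHICS_OUTPUT_PROTOCOL state    */
};

#define SUPPORTED_VENDOR 0x8086   /* hypothetical supported vendor ID */

/* Supported(): verify required protocols and the PCI identity. */
static bool gop_supported(const struct handle *h)
{
    return h->has_device_path && h->has_pci_io &&
           h->vendor_id == SUPPORTED_VENDOR;
}

/* Start(): initialize the adapter, enumerate displays, and install
 * the Graphics Output Protocol on the child handle (modeled as a flag). */
static int gop_start(struct handle *h)
{
    if (!gop_supported(h))
        return -1;
    h->gop_installed = true;
    return 0;
}

/* Stop(): uninstall protocols and release resources. */
static void gop_stop(struct handle *h)
{
    h->gop_installed = false;
}
```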

Figure 1-25 shows an example of GOP driver stack implementation.

The block diagram shows the Graphics Output Protocol stack: the UEFI GOP driver, a graphics child handle exposing the Graphics Output Protocol, the PCI I/O protocol and other I/O abstraction layers, and the graphics adapter, connected by arrows.

Figure 1-25

GOP driver implementation

Apart from initializing the graphics adapter, the GOP protocol publishes three functions: QueryMode(), SetMode(), and Blt(). They allow the system firmware to communicate with the device hardware to configure the display capabilities. These functions replace the legacy OpROM VBE functionality.
  • The QueryMode() function is used to return extended information on one of the supported video modes. It’s important that QueryMode() only return modes that can actually be displayed on the attached display device.

  • The SetMode() function allows system firmware to select a specific mode based on the mode argument, between 0 and MaxMode - 1.

  • The Blt() function is used for transferring information to and from the video buffer. It allows graphics contents to be moved from one location of the video frame buffer to another location of the video frame buffer.

The GRAPHICS_OUTPUT_PROTOCOL.Mode pointer is populated when the graphics controller is initialized and gets updated with the SetMode() function call. The FrameBufferBase member of this object may be used by a UEFI OS loader or OS kernel to update the contents of the graphical display after ExitBootServices() is called and the Graphics Output Protocol services are no longer available. A UEFI OS may choose to use this method until a graphics driver can be installed and started.
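Under the assumption of a 32bpp linear framebuffer, the following sketch models how an OS loader might keep drawing through FrameBufferBase after ExitBootServices(), and what a video-to-video Blt() amounts to. The struct mirrors only a subset of the real GOP mode information; names are illustrative.

```c
#include <stdint.h>

/* Subset of the GOP mode information an OS loader would save before
 * ExitBootServices(); assumes a 32bpp linear framebuffer. */
struct fb_info {
    uint32_t *base;                /* FrameBufferBase              */
    uint32_t  width, height;       /* visible resolution           */
    uint32_t  pixels_per_scanline; /* pitch in pixels (>= width)   */
};

static void put_pixel(struct fb_info *fb, uint32_t x, uint32_t y,
                      uint32_t color)
{
    fb->base[y * fb->pixels_per_scanline + x] = color;
}

/* The operation Blt() performs for a video-to-video transfer: copy a
 * rectangle from one location in the framebuffer to another. */
static void blt_video_to_video(struct fb_info *fb,
                               uint32_t sx, uint32_t sy,
                               uint32_t dx, uint32_t dy,
                               uint32_t w, uint32_t h)
{
    for (uint32_t row = 0; row < h; row++)
        for (uint32_t col = 0; col < w; col++)
            fb->base[(dy + row) * fb->pixels_per_scanline + (dx + col)] =
                fb->base[(sy + row) * fb->pixels_per_scanline + (sx + col)];
}
```

Note that pixels_per_scanline, not width, supplies the pitch: the hardware scanline may be wider than the visible resolution.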

The EDKII build infrastructure tools allow one to convert one or more UEFI drivers in PE/COFF image format into a single PCI Option ROM image that can be included with a discrete add-in card. When a discrete add-in card, for example, a graphics card, is attached over a PCI slot on a target platform, the PCI bus driver detects the presence of the PCI OpROM contents, and the UEFI driver is loaded into memory and executed automatically. See Figure 1-26.

The diagram depicts a hybrid OpROM: option ROM header and PCIR header 1 precede the uncompressed, unsigned legacy VBIOS OpROM image; option ROM header and PCIR header 2 precede the compressed, signed generic GOP UEFI display driver image.

Figure 1-26

Hybrid ROM layout

EfiRom is the utility located inside the EDKII source code at BaseTools/Source/C/EfiRom and is used to build PCI OpROM images containing UEFI drivers, Legacy OpROM, or both. It also allows UEFI drivers to be compressed using the UEFI compression algorithm as per the UEFI specification. The following command shows the method to generate a single PCI OpROM image that combines one UEFI binary and one legacy OpROM:
EfiRom -o FinalOpRom.rom -f <vendor_id> -i <device_id> -ec File1.efi -b Legacy.bin

Figure 1-26 shows the layout of the hybrid OpROM image located on a graphics add-on card.

Here is a comparison of the interfaces implemented by the UEFI graphics driver part of the UEFI OpROM and by the legacy VGA BIOS:

| Interface       | Set a Display Mode                                    | Retrieve EDID from a Display Device            | Display Switch                                                                            |
|-----------------|-------------------------------------------------------|------------------------------------------------|-------------------------------------------------------------------------------------------|
| GOP Driver      | GRAPHICS_OUTPUT_PROTOCOL.SetMode()                    | Using EFI_EDID_DISCOVERED_PROTOCOL             | Reentrant with a different child handle in EFI_DRIVER_BINDING.Start() followed by a SetMode() |
| Legacy VGA BIOS | Set VBE mode using AX = 0x4F02 and other subfunctions | VBE DDC extension, AX = 0x4F15 and other subfunctions | Implement a vendor-specific VGA BIOS extension                                      |

Currently, the majority of GPU vendors have migrated graphics device firmware to a GOP driver–based solution so that add-on graphics cards and on-board GPUs can be legacy-free. The GOP driver images that are part of the add-on graphics card can be signed by the vendor and verified using Secure Boot. But for the hybrid image on an add-on graphics card, Secure Boot is unable to verify the legacy OpROM image, as the legacy VGA BIOS doesn't support authentication, and hence it is considered a security threat.

Why Is Open Source Device Firmware Needed?

Typically, IHVs develop firmware for a device that is flashed into the ROM, with the assumption that the device doesn't need periodic updates. But there might be cases where preflashed device firmware exposes some vulnerability while operating as part of the whole system and communicating with other devices and the host CPU. Also, these devices have dedicated firmware storage to keep the device firmware in, which is not accessible by the host CPU; hence, it's not possible to deliver a patch from the runtime kernel or from system firmware during boot. Here are several factors that highlight the need for an open source model when developing device firmware:
  • Performance: As most device firmware can't handle runtime updates, it becomes stale over time and may not work well with the latest processors and chipsets. Open source firmware development provides an opportunity to update the device firmware code with the latest algorithms and research, offering better performance compared to proprietary firmware.

  • Security: Open source device firmware doesn't allow any hidden backdoor for snooping into the system. As device firmware gets regular maintenance, common vulnerabilities are expected to be fixed and updated without delay.

  • Extensibility: While vendor device firmware comes with a fixed set of capabilities, open source firmware can extend its capability beyond that fixed scope.

  • Community support: The open source community provides more eyes and hands for maintaining the code.

  • Cost: The product source code is freely available under an open source license (e.g., GPL) and hence doesn't require any subscription or licensing fees.

Many wireless routers use open source device firmware. For example, firmware for TP-Link and Xiaomi routers is derived from OpenWrt, an open source firewall/router distribution based on the Linux kernel.

Open Source Manageability Firmware Development

In computing, the system owner typically has access to control and manage all the required hardware and software services for the target device. To satisfy the need for hardware management, the system administrator might set up an in-band management system through Virtual Network Computing (VNC) or Secure Shell (SSH), which provides remote access to the device over the network or using serial ports. This mechanism is typically cost effective because the software required for remote management is installed on the system itself, but it works only after the system has booted to an operating system. Hence, in-band management has limited scope: when the system is off, it cannot be managed in-band. It also can't meet remote IT infrastructure management requirements where an IT administrator would like to access system firmware settings, reinstall the operating system remotely, or provide a fix when the system is unable to boot. Figure 1-27 shows the in-band management block diagram.

The block diagram shows a remote system joined to the management server via in-band management. The remote system comprises management access, applications, the operating system, system firmware, and hardware.

Figure 1-27

High-level diagram of in-band remote management

This mode of managing remote systems doesn't have any dependency on the underlying firmware running on the remote system. When the network is down or the system is in an off state, one needs physical access to bring the system back onto the network; someone has to travel to the device, which might not be feasible for data centers and remote sites (the Natick project from Microsoft, for example, is building the world's first underwater data center). Out-of-band management therefore provides an alternative path for managing the remote system, even when the system is off the network, turned off, in sleep mode, hibernated, or inaccessible to any mode of in-band access. This mode of operation relies on remote management hardware that is completely independent of the main processor's power supply and network connection and can even perform remote operations such as reboot, shutdown, and monitoring the hardware sensors (i.e., fan speed, power voltages, hard disk health, chassis intrusion, etc.). Figure 1-28 shows the out-of-band remote management hardware block diagram.

The block diagram shows a remote system joined to the management server via a two-way out-of-band management link. The remote system comprises applications, the operating system, system firmware, and hardware, plus a management controller connected to the hardware by two-way arrows.

Figure 1-28

High-level diagram of out-of-band remote management

Modern server motherboards come with a built-in remote management controller by default. Out-of-band remote management can use either dedicated network interface controllers (NICs) or shared NICs for remote access. A shared NIC multiplexes the Ethernet connection between the host operating system and the remote management controller, so incoming traffic on the hardware is routed to the remote management controller before reaching the host system. The controller also has multiple interfaces, such as Low Pin Count/Enhanced Serial Peripheral Interface (LPC/eSPI), PCIe, SMBus, and USB, to communicate with the host system.

Here are the operations that a remote admin can perform with out-of-band enabled:
  • Keyboard-video-mouse (KVM): Out-of-band management allows access to host resources like the keyboard, video, and mouse. It broadcasts video output to remote terminals and receives input from the remote keyboard and mouse, which can be used to configure the system firmware settings even prior to booting the OS.

  • Perform remote recovery: An admin can also attach disk images to the remote system from local boot media or over a network and can therefore recover the system in case the OS crashes or the system drops into OS recovery.

  • Remote power on/off: The remote administrator can schedule wake and update-like features to ensure the system always receives critical security patches. It also ensures the system's availability over the network 24/7 while preserving resources by keeping the device in low-power mode after an update.

  • Remote sessions: It allows client-initiated remote sessions to monitor, manage, and troubleshoot any pre-OS and OS-related defects.

Out-of-band management requires seamless access to the system, and the remote management controller belongs to the platform hardware. The firmware running on the remote management controller is considered a highly privileged component (Ring -3 as described earlier). This firmware remains active during the entire life of the system and can even control the system when it is powered off. Thus, any vulnerability in the manageability firmware can easily remain hidden from traditional security measures and put the entire system at risk, where intruders can take over remote management of the system and mount data exfiltration attacks.

This section provides a brief overview of some manageability firmware that was developed as proprietary firmware, and it looks at the possibilities of migrating to open source firmware.

Baseboard Management Controller

A baseboard management controller (BMC) is a special processor that sits on the server motherboard and is responsible for providing server management. Other components like high-end switches and Just a Bunch of Disks (JBOD) and/or Just a Bunch of Flash (JBOF) platforms also include BMCs for out-of-band management. The BMC is responsible for monitoring and managing the physical state of a computer, network server, or other sensor-based hardware and passing that information to the system administrator through an independent connection. The key parameters that the BMC measures are temperatures, voltages, fan speeds, humidity, and inventory data such as serial numbers or product names; it also supports remote powering on/off of the main CPUs. It notifies the system administrator if any of these parameter values drifts outside its allowable limits, letting the admin take measures to avoid any anomalies in server stability and reliability. Figure 1-29 shows the hardware block diagram of a server platform using an ASPEED BMC chip (AST2500).


Figure 1-29

Server motherboard hardware block diagram with BMC
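The monitoring duty described above reduces to comparing each sensor reading against its allowable limits and raising an event when a value drifts outside them. The sketch below is a toy model; the sensor names, thresholds, and struct layout are hypothetical, and a real BMC would read these values through its sensor data records.

```c
#include <stdbool.h>

/* Hypothetical sensor record; a real BMC would obtain readings and
 * limits through its sensor data record (SDR) repository. */
struct sensor {
    const char *name;
    double reading;
    double lower_limit, upper_limit;
};

/* True when a reading has drifted outside its allowable limits:
 * the condition on which a BMC logs an event and notifies the admin. */
static bool sensor_out_of_range(const struct sensor *s)
{
    return s->reading < s->lower_limit || s->reading > s->upper_limit;
}
```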

Figure 1-30 shows the AST2500 BMC chip, which is the leading BMC chip used on the server platform (the latest is AST2600 with ARM Cortex A7) using an ARM11-based SoC with these features:
  • Ethernet: The Reduced Media-Independent Interface (RMII) and Reduced Gigabit Media-Independent interface (RGMII) are interfaces to connect an Ethernet MAC block to a PHY chip.

  • Flash memory: This is the Serial Peripheral Interface (SPI) flash memory that contains the BMC firmware for booting the SoC.

  • Memory: This is 800Mbps DDR3 or 1600Mbps DDR4 memory with 16-bit data bus width. Having more memory provides increased performance.

  • PCIe: The on-chip PCIe 2D VGA provides a local display capability with resolution up to 1920×1200 without adding an extra VGA add-on card.

  • USB: The USB 2.0 virtual hub controller allows up to five devices and a USB 1.1 HID device controller for keyboard and mouse support.

  • LPC/eSPI: This is a Low Pin Count (LPC) or Enhanced Serial Peripheral Interface (eSPI) bus for communicating with the host.

The block diagram shows the AST2500 SoC with a CPU core and six blocks: 36KB SRAM, SPI, and USB controllers on the left; UART and DDR3/4 controllers and a PHY block on the right. SPI flash connects to the SPI controller, with a serial console, DRAM, and Ethernet attached externally.

Figure 1-30

AST2500 hardware block diagram

BMC allows system administrators to make use of KVM functionality for remote redirection of the keyboard, video, and mouse to a remote network-attached management console. Hence, it allows the remote admin to perform the low-level tasks while the operating system is not yet available.

Intelligent Platform Management Interface

The Intelligent Platform Management Interface (IPMI) is an interface specification that allows manageability firmware to monitor or manage the host system independent of the host CPU, system firmware, and operating system. IPMI is a message-based, hardware-level interface specification that is used by the system admin for out-of-band management of computer systems. This specification, jointly developed by Intel, Hewlett Packard, Dell, and NEC, is intended to perform the following operations:
  • OS-independent scenarios: Regular monitoring of platform-critical parameters, like temperature and power supply voltage, via various built-in sensors in the computer system; remote access to the system even if the host system is powered off; and changing BIOS settings for a recovery boot and/or installing a new operating system onto the block device.

  • While the OS is running: It allows the admin to access the operating system login console remotely to manage services such as installing virtual drives, populating management data and structures to the system management software, etc.

IPMI supports the extension of platform management by connecting additional management controllers to the system. Figure 1-31 shows the IPMI subsystem block diagram, which consists of BMC as the main controller and other management controllers distributed among different system components that are referred to as satellite controllers. The BMC is the heart of the IPMI architecture that manages the interface between the system management software and the platform management hardware. It provides autonomous monitoring, event logging, and recovery control, and it works as the main channel between the system management software and the IPMB and ICMB. The Intelligent Platform Management Bus/Bridge (IPMB) is an I2C-based bus that provides a standardized interface between BMC and the satellite controllers within a chassis. It also serves as a standardized interface for auxiliary management add-in cards. The Intelligent Chassis Management Bus (ICMB) provides a standardized interface for connecting satellite controllers and/or the BMC in another chassis. By providing the standardized interface, a baseboard can be easily integrated into a variety of chassis that have different management features. The Field Replaceable Unit (FRU) information is used to provide the inventory information, such as vendor ID and manufacturer, etc., about the boards that the FRU information device is located on. A sensor data record (SDR) repository provides the properties of the individual sensors (i.e., temperature, fan speed, and voltage) present on the board. Physical interfaces to the BMC include SMBUSs, the RS-232 serial console, and IPMB, which enables the BMC to accept IPMI messages from other management controllers in the system.

The framework diagram shows the IPMI subsystem: intelligent platform management controllers, including the baseboard and chassis management controllers, connected to the system interface and its components via the system bus, IPMI messages, and system–BMC interfaces.

Figure 1-31

IPMI subsystem block diagram
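IPMB messages carried over this I2C-based bus are protected by simple two's-complement checksums: the covered bytes plus the checksum must sum to zero modulo 256. The sketch below computes the header checksum for a request; the slave address and network function values used in the test are illustrative, and a real stack frames the rest of the message (requester address, sequence number, command, data) the same way.

```c
#include <stddef.h>
#include <stdint.h>

/* IPMB checksum: two's complement of the modulo-256 byte sum, so
 * that sum(covered bytes) + checksum == 0 (mod 256). */
static uint8_t ipmb_checksum(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + data[i]);
    return (uint8_t)(-sum);
}

/* Sketch of the IPMB connection header: responder slave address,
 * then netFn (bits 7:2) with the responder LUN (bits 1:0), then
 * the header checksum covering those two bytes. */
static size_t build_ipmb_header(uint8_t *buf, uint8_t rs_addr,
                                uint8_t netfn, uint8_t lun)
{
    buf[0] = rs_addr;
    buf[1] = (uint8_t)((netfn << 2) | (lun & 0x3));
    buf[2] = ipmb_checksum(buf, 2);
    return 3;
}
```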

There is a significant concern that the BMC is a closed infrastructure that gives administrators direct access to the host systems. A direct serial connection to the BMC is not encrypted, and the connection over the LAN may or may not use encryption, raising platform security risks. The following sections describe some serious security concerns with the BMC and the reasons for open source adoption in BMC projects.

Figure 1-32 shows the typical server platform remote management setup that allows remote access, with the BMC implementing the IPMI specification.

The illustration shows remote server management: a remote console connects to BMC chips in three different servers, which connect through the Internet to various clients.

Figure 1-32

Remote server management using BMC-IPMI

The assumption was that admins would manage computer systems over a trusted and controlled network, so the IPMI stack doesn't pay great attention to security. Much BMC firmware doesn't implement Secure Boot. The BMC is the ultimate security liability due to its privileged operations; a compromised BMC gives attackers, over a remote network connection, access that had been in the realm of physical administrative access.

The BMC allows the remote admin to access the OS console and mount virtual media, typically used for recovering the system remotely or installing a custom OS image. A group of security researchers found a severe BMC vulnerability on a server platform where the virtual media service allows plaintext authentication, sends unencrypted data over the network, uses a weak encryption algorithm with a fixed key compiled into the BMC firmware, and possibly allows an authentication bypass once a client is authenticated to the virtual media service. These weaknesses would allow an attacker to gain unauthorized access to the virtual media. The BMC hardware allows the creation of virtual USB devices; hence, after authenticating with a well-known default username and password for the BMC, an attacker would be able to perform any USB-based attack against the server remotely, including data exfiltration, booting from untrusted OS images, or direct manipulation of the system using a virtual keyboard and mouse. A study also reveals that more than 47,000 BMCs in different countries are exposed to the Internet. Also, an attacker who compromises the host system could attempt to compromise the BMC as well, as the BMC always remains on even when the host is powered off; hence, it would be difficult to remove such malware from the BMC. A BMC rootkit could provide the attacker with backdoor access that remains hidden from IPMI access logs and survives host OS reinstallation or password changes. In 2019, a vulnerability was detected on a BMC chip where malware could be installed on the BMC from the local host via the PCIe or LPC interface. Because of the closed nature of IPMI implementations, once attackers gain control of the BMC, it's difficult to detect their presence and remove them from the system. Such requirements have led data centers to adopt BMC firmware developed as open source projects.

OpenBMC

The OpenBMC project is a Linux Foundation project intended to replace proprietary BMC firmware with a customizable, open source firmware implementation. OpenBMC is a Linux distribution for management controllers used in devices such as servers, rack switches, telecommunications equipment, etc. In 2014, four Facebook engineers created the first prototype of open source BMC firmware, called OpenBMC, at a Facebook hackathon event, for the BMC inside the Wedge (a rack switch developed by Facebook) platform. In 2018, OpenBMC became a Linux Foundation project. OpenBMC uses the Yocto Project as its build and distribution framework and D-Bus as its interprocess communication (IPC) method. OpenBMC includes a web application for interacting with the firmware stack and added Redfish support for hardware management.

Facts

Redfish is an open industry-standard specification used for hardware management. It is defined by the Distributed Management Task Force (DMTF). OpenBMC uses Redfish as a replacement for IPMI over LAN.

The features being implemented by the OpenBMC include the following:
  • Host management: power on/off, cooling, LEDs, inventory, and events

  • Compliant to the IPMI 2.0 specification

  • Code update support for multiple BMC/BIOS images

  • Provides web-based user interface

  • REST management: BMCWeb Redfish, host management using REST APIs

  • SSH-based SOL

  • Remote KVM

  • Virtual media, etc.

This section provides the implementation details of OpenBMC running on Wedge. As described earlier (see Figure 1-30), BMC hardware is a reduced-feature computer system; hence, OpenBMC is designed as a complete Linux distribution so that it can extend support to other BMC vendor SoCs and boards. The OpenBMC project includes a bootloader (u-boot), a Linux kernel with a minimal rootfs that contains all the tools and binaries needed to run OpenBMC, and board-specific packages:
  • Both the bootloader and the Linux kernel include various SoC-specific firmware and hardware drivers, such as I2C, USB, PWM, and SPI drivers.

  • The open source packages include common applications such as busybox, i2c tools, openssh, and Python.

  • The board-specific package includes initialization scripts and tools that are specific to a board. Examples are a tool that dumps inventory from the EEPROM and a fan-controller daemon to control the fan speed based on environment readings.

Figure 1-33 illustrates the OpenBMC package running on the BMC inside the Wedge platform.


Figure 1-33

OpenBMC Stack on the Wedge platform

All packages in OpenBMC are grouped into three layers, as shown here:

Common Layer

This layer includes packages that can be used across different SoCs and boards. For example:

common/recipes-rest

common/recipes-connectivity

common/recipes-utils

SoC Layer

The SoC layer includes packages that are specific to BMC SoCs. As Figure 1-29 shows, the SoC layer is part of both the bootloader and the Linux kernel, as it implements the specific drivers to communicate with the Aspeed BMC chipset. For example:

meta-aspeed/recipes-bsp/u-boot

meta-aspeed/recipes-core

meta-aspeed/recipes-kernel/linux

Board Layer

The packages included in this layer are specific to the target board. Figure 1-29 shows the configuration, initialization scripts, and tools that are specific to Wedge. For example:

meta-aspeed/recipes-core

meta-aspeed/recipes-kernel

meta-aspeed/recipes-wedge

Generating an OpenBMC image for a specific board requires these three layers: the common layer, a SoC layer for the BMC SoC used in the board, and a board-specific layer for the targeted BMC board.

u-bmc

u-bmc is a project that was developed at almost the same time as OpenBMC. u-bmc is an open source Linux distribution for BMCs. The goal of u-bmc is to ensure that critical and highly privileged code like the BMC firmware is easy to audit and adheres to modern security practices. u-bmc is written in Go and replaces the industry-standard IPMI with gRPC to reduce the attack surface and provide improved security.

u-bmc uses u-root as a framework to create a minimal Linux distribution that is loaded after minimal initialization by the BMC bootloader. Figure 1-34 shows the u-bmc boot flow, where the Linux kernel is loaded after the basic platform initialization performed by u-bmc.


Figure 1-34

u-bmc firmware boot flow

Facts

u-root incorporates four different projects:

  • Go versions of standard Linux tools, for example, ls, cp, etc.

  • A mechanism to compile many Go programs into a single binary

  • A Go-based userland that works as an initramfs for the Linux kernel

  • Go bootloaders that use kexec to boot Linux kernels

It's possible to build the entire server firmware stack using open source firmware, where the system firmware is developed using coreboot and u-bmc is used on the BMC to boot the Linux distribution.

RunBMC

The benefits of open source are not limited to firmware and software. Open source projects have a security advantage over closed source firmware: with more eyes and hands available for review, testing, and bug fixes, code quality improves. Within the Open Compute Project (OCP), the community has started designing open source hardware that provides more efficient, flexible, secure, and scalable designs. RunBMC is an open source hardware specification that defines the interface between the BMC subsystem and OCP hardware platforms, such as network or compute motherboards. The BMC is built around an SoC that provides common functionality such as RMII and RGMII interfaces for Ethernet access and a PCIe or LPC/eSPI interface to interact with the host system. Typically, the BMC chip is soldered onto the motherboard. The RunBMC design separates the BMC from the host motherboard by creating a RunBMC daughterboard card that interfaces with the host system through a 260-pin SODIMM DDR4 connector. Figure 1-35 shows an example of RunBMC daughterboard I/O connectivity. The RunBMC interface specification includes RGMII, RMII, LPC/eSPI, PCIe, USB, various serial interfaces, and GPIOs for communication.


Figure 1-35

RunBMC daughterboard card block diagram

This design is more stable and secure because it modularizes the BMC subsystem, shifting the entire security effort onto a single BMC card. It gives vendors an opportunity to harden the hardware security independently by adding security features like Titan, Cerberus, or TPM chips to the daughterboard to implement a hardware-based root of trust. Also, a swappable BMC card is easy to replace if it is found vulnerable or needs an update, without impacting the entire host system.

To summarize, out-of-band management for computing systems is an innovation that saves costs and minimizes computer downtime on failure without requiring physical visits to data centers. But the ability to monitor, access, and control the host system using the BMC may increase the platform attack surface, due to the closed source nature of BMC firmware and the highly privileged level at which it operates. In the past, security researchers have published ample studies highlighting BMC vulnerabilities. The OpenBMC project sets the stage for BMC firmware and hardware development using an open source model. Having an open source hardware interface and BMC firmware developed as open source provides visibility into the most privileged rings of platform security, which were otherwise always closed.

Zephyr OS: Open Source Embedded Controller Firmware Development

This section provides a brief overview of the embedded controller (EC) that is often found in low-power embedded systems and is responsible for managing various tasks that the system firmware and the operating system can't handle. It's important to understand the EC hardware control block, its communication with the host system, etc. This knowledge is essential to establishing trust in the system boot process, as the firmware running as part of the EC is an independent entity in the computing system that is capable of accessing platform components directly. The majority of EC projects are developed using proprietary code; hence, it's important to have visibility into all firmware that is part of the computing system.

Embedded Controller

The embedded controller can be considered the heart of client and IoT computing devices. The EC is the first microcontroller unit (MCU) on an embedded system that receives power when the user presses the power button or the system is powered on from any other possible source. The EC is responsible for orchestrating the platform power sequencing in the recommended order (as per the platform design guide) so that it can release the host CPU from reset. In addition, it performs many other tasks:
  • Battery charging: Tasks include managing the battery charger and the battery, detecting the presence of AC power, and reporting status changes.

  • Thermal management: Tasks include measuring the temperature of board components (CPU, GPU, several sensors on board) and taking action to control the fan speeds, CPU throttling, or force power off based on critical sensor data.

  • Keyboard: The EC is also referred to as the keyboard system controller (KSC), which takes care of receiving and processing signals from the keyboard.

  • Hardware buttons and switches: Tasks include receiving and processing signals from hardware buttons (typically laptop/tablet button array) and switches (laptop lid).

  • Backlight, LEDs: The EC implements the LED control indicators (RGB) for battery, power, AC, caps lock, num lock, scroll lock, sleep, etc. It is also able to control the display and keyboard backlights.

  • Peripheral control: The EC is able to turn on and off several platform components like WiFi, Bluetooth, USB, etc.

  • Debug interface: The EC controller provides a UART port for serial debug and a Port 80h BIOS debug port. These are primarily used for testing, debugging, and remote administration of the device.

The embedded controller is a separate chip soldered onto the motherboard, which includes a low-power processor, memory (SRAM and ROM), several I/Os, and an interface to the host system through one of the common interfaces such as Low Pin Count (LPC), eSPI, or I2C. It is designed as a stand-alone microcontroller that can operate in low-power mode. The EC can access any register in the EC address space or host address space. The LPC/eSPI host controller can directly access peripheral registers in the host address space. Figure 1-36 shows a generic embedded controller block diagram. The embedded controller always remains on while the system has power attached. With the release of the internal reset signal that resets the processor in the EC control block, the processor starts executing code from ROM. The boot code in ROM executes a secure bootloader, which downloads user code from an external SPI Flash and stores it in SRAM. After that, the boot code jumps into the user code and starts executing.


Figure 1-36

Generic embedded controller block diagram

Facts

There are two possible ways that define how the EC should access its user code from SPI Flash.

  • Master attached flash sharing (MAFS): In this mode, the EC doesn't have a dedicated SPI Flash; rather, it shares the SPI Flash with the host system PCH. The EC accesses the SPI Flash over the eSPI flash sharing channel.

  • Slave attached flash sharing (SAFS): The EC has access to a dedicated SPI Flash using the SPI interface, and the PCH accesses that SPI Flash over the eSPI flash sharing channel.

After the user code starts executing, it configures the GPIOs as per the platform needs and initializes the host interface. The EC and host system can communicate with each other using EC host commands, ACPI events to interrupt the host system, and memory-mapped regions shared between the EC and host CPU address space. The following embedded controller firmware architecture overview will help you understand the internals of EC operations that perform independent tasks in a timely manner.

EC Firmware Architecture

Embedded controller firmware is responsible for performing the platform power sequencing and remains active all the time, even after platform control reaches the operating system. It needs to perform several independent tasks such as thermal monitoring, battery management, keyboard control, etc. This section provides a high-level firmware architecture overview so you can understand the internal operations managed by the EC.

Tasks

Most of the operations performed by the embedded controller run in the context of a task. Since embedded controllers are not multicore, tasks are scheduled using a time slicing algorithm to achieve multitask execution. Each task has its own fixed stack size. There is no heap (malloc) to use, and all variable storage is explicitly declared at build time. Each task has a given priority, and it continues to run until it exits or a higher-priority task wakes up and takes control away from it. At a high level, the scheduler is a loop that initiates all tasks unless there are wait events or a defined sleep duration before resuming a task.
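The priority-driven, statically declared task model can be sketched as follows. This is an illustrative Python model only (real EC tasks are C code with fixed stacks); the task names and priorities are hypothetical:

```python
import heapq

# Illustrative EC task model: tasks are statically declared (no run-time
# heap allocation), each with a fixed priority, and the highest-priority
# ready task runs to completion before lower-priority ones.
TASKS = [  # (priority, name, work) -- lower number = higher priority
    (0, "power_sequencing", lambda: "rails stable"),
    (1, "thermal", lambda: "fan at 40%"),
    (2, "keyboard", lambda: "scan matrix"),
]

def run_ready_tasks(ready_names):
    """Run every ready task once, in priority order, like one scheduler pass."""
    ready = [t for t in TASKS if t[1] in ready_names]
    heapq.heapify(ready)          # order by priority (first tuple element)
    results = []
    while ready:
        _prio, name, work = heapq.heappop(ready)
        results.append((name, work()))
    return results

print(run_ready_tasks({"keyboard", "thermal"}))
```

Even though "keyboard" is requested first, "thermal" runs first because of its higher priority, which mirrors the preemption rule described above.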

Callbacks

Callbacks allow you to register a function to be executed at a later point in time when a particular event occurs, such as a callback to handle all button change events. Typically, these callbacks are registered by one module and invoked by other modules. If more than one callback needs to run at the same time, they are invoked in priority order.
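A minimal sketch of such a registry, with hypothetical event and handler names, shows the register-then-notify pattern and the priority ordering:

```python
# Illustrative callback registry: modules register handlers for an event,
# and when the event fires the callbacks run in priority order
# (lower number = higher priority). Event/handler names are made up.
_callbacks = {}

def register_callback(event, priority, fn):
    """Called by the module that wants to react to the event."""
    _callbacks.setdefault(event, []).append((priority, fn))

def notify(event, *args):
    """Called by the module that detects the event; runs handlers by priority."""
    results = []
    for _prio, fn in sorted(_callbacks.get(event, []), key=lambda e: e[0]):
        results.append(fn(*args))
    return results

register_callback("button_change", 1, lambda state: f"led: {state}")
register_callback("button_change", 0, lambda state: f"power: {state}")
print(notify("button_change", "pressed"))  # power handler runs first
```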

GPIOs

The board-specific code inside the EC source configures the GPIOs to allow SoC power transitions and system transitions. The GPIOs can be configured as inputs, outputs, and interrupts. Typically, they are configured as part of the board init() function or by callbacks such as power off() while the system is transitioning between states. The interrupt handler in each module reads the GPIO status prior to transferring control to the handler functions.

Modules

Operations being managed by the embedded controller are grouped into modules, and each module is self-contained after the GPIOs are configured. This includes initialization sequence, state machines, and interrupt handlers. Examples of such modules are peripheral management, power sequencing, SMC host, battery management, thermal management, KBC host, etc.

Debugging
EC firmware provides a set of debug services such as serial console, exception handler, and port 80 display.
  • Serial console: This is the traditional approach while developing or debugging EC firmware, where a serial console is handy for indicating the problem. Some EC implementations also use a static console buffer for ease of debugging; a host reset doesn't clear this buffer, so it persists across multiple reboots. An interactive EC console helps to run various key commands and perform system management independent of the host system.

  • Exception handler: If the EC firmware runs into an error, the easiest way to inform the user about the problem is by dumping the current operating stack. The exception handler contains some interesting information like the program counter (pc) and link register (lr), which indicates the code that the EC was running when the panic occurred.

  • Port 80 display: Initialize the port 80 display and use this to indicate any error in the following format: ECxx, where xx refers to the specific error code.

Host CPU to EC Communication
The embedded controller provides a unique feature that allows complex low-level functions to be performed through a simple interface to the host CPU. Most commonly used embedded controllers include different communication channels that connect the embedded controller to the host CPU, allowing bidirectional communication. This helps reduce the host processor's latency when communicating with the embedded controller. There are different methods by which the host CPU communicates with the embedded controller:
  • Host commands

  • Embedded controller interface

  • Shared memory map

Host Commands

The host CPU communicates with the EC by issuing host commands. These commands are identified by a command ID. When the host CPU intends to issue a command (and data) to the EC, other software components are involved, depending on the current operational phase (i.e., system firmware/BIOS or OS). System firmware can send host commands directly to the EC, but commands from the OS layer first go through the EC kernel driver, which receives the raw commands and sends them to the EC. The host packet is received by the EC board-specific code, which forwards the command to the common layer that runs the host command. While this is happening, the EC needs to indicate to the host that it is busy processing and not yet ready to give a response. If the host command expects a response, the EC responds with the result and data to the host CPU. An example of a host command is reading the board ID by sending the SMCHOST_GET_FAB_ID command (0x0D) to the EC.
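A hedged sketch of the EC-side dispatch follows. The SMCHOST_GET_FAB_ID command ID (0x0D) comes from the text; the board ID value, status strings, and handler table are hypothetical and only illustrate the command-ID-to-handler flow:

```python
# Hedged sketch of EC host-command dispatch. Only the 0x0D command ID is
# from the text; the board ID value and handler mechanism are made up.
SMCHOST_GET_FAB_ID = 0x0D
BOARD_ID = 0x3A  # hypothetical board ID strap value

HANDLERS = {
    SMCHOST_GET_FAB_ID: lambda _data: BOARD_ID,
}

def ec_process_host_command(cmd, data=None):
    """Board code receives the packet; the common layer runs the handler."""
    handler = HANDLERS.get(cmd)
    if handler is None:
        return {"status": "INVALID_COMMAND"}
    # While the handler runs, a real EC flags 'busy' to the host and only
    # signals completion once the response below is ready.
    return {"status": "SUCCESS", "response": handler(data)}

print(ec_process_host_command(SMCHOST_GET_FAB_ID))
```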

Embedded Controller Interface
The embedded controller interface connects embedded controllers to the host data bus, allowing bidirectional communications. The embedded controller is accessed at ports 0x62 and 0x66 in the host system I/O space. Port 0x62 is called the data register (EC_DATA) and allows bidirectional data transfers to and from the host and embedded controller. Port 0x66 is called the command/status register (EC_SC); it returns port status information upon a read and generates a command sequence to the embedded controller upon a write. Figure 1-37 shows that this interface is implemented using the ACPI specification. The figure defines more than one type of communication using the host interface.
  • The embedded controller command set allows operating system–directed configuration and Power Management (OSPM) to communicate with the EC. The ACPI defines the commands between 0x80 and 0x84 to tell the EC to perform an operation. For example, the Read Embedded Controller (RD_EC) command 0x80 allows OSPM to read a byte in the address space of the embedded controller.
    • The host waits for the Input Buffer Full (IBF) flag on the EC_SC to be 0.

    • The host writes the command byte to port 0x66 and the address byte to port 0x62.

    • The EC generates SCIs in response to this transaction from the SMC ACPI handler.

    • The SMC command handler passes control to the actual EC SoC code. To receive data from the EC, wait for Output Buffer Full (OBF) to be set, which indicates there is incoming data.


Figure 1-37

Embedded controller shared interface

  • When the embedded controller detects a system event that must be communicated to OSPM, it first sets the SCI_EVT flag in the EC_SC register, generates an SCI, and then waits for OSPM to send the Query Embedded Controller (QR_EC) command 0x84. The OSPM driver detects the EC SCI when the SCI_EVT (SCI event pending) flag in the EC_SC register is set and sends the QR_EC command. Upon receipt of the QR_EC command byte, the embedded controller places a notification byte value between 0x00 and 0xFF, indicating the cause of the notification. The OSPM driver then queries the ACPI control method of the form _Qxx, where xx is the number of the query acknowledged by the embedded controller. Here's an example to explain the scenario better:
    • A change in the LID switch status triggers a GPE from the GPE bit (LAN_WAKE_N) tied to the embedded controller.

    • The OSPM driver queries the EC to know the query number.

    • The host system firmware has implemented a control method (_Qxx) corresponding to the OSPM query.

    • The ACPI control method notifies OSPM that the LID switch status has changed using Notify (LID0, 0x80); LID0 is the ACPI device entry for the lid.

    • The OSPM driver further calls the LID ACPI device control method to read the LID switch status (LIDS).

    • LID switch status can be read using the EC ACPI command either through port 0x62/0x66 or through the LPC register.

    • Upon reading from the EC, the ACPI control method passes the value to the OSPM driver, and the OS takes the necessary action.

Figure 1-38 describes this communication graphically.


Figure 1-38

Software implementation of the EC control interface
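The RD_EC transaction described above can be modeled in a few lines. This is a software simulation of the 0x62/0x66 port protocol, not firmware code; only the OBF status handling from the text is modeled, and the offset/value used are hypothetical:

```python
# Software simulation (not firmware code) of the ACPI EC read transaction
# (RD_EC, command 0x80) over the 0x62 (data) / 0x66 (command/status) ports.
# The EC-internal address space is modeled as a dict.
RD_EC = 0x80
OBF = 0x01  # Output Buffer Full bit in the EC_SC status register

class EcModel:
    def __init__(self, ram):
        self.ram = ram       # EC-internal address space
        self.status = 0      # EC_SC, read by the host via port 0x66
        self._cmd = None
        self._out = None

    def write_command(self, byte):   # host write to port 0x66
        self._cmd = byte

    def write_data(self, byte):      # host write to port 0x62
        if self._cmd == RD_EC:       # byte is the EC address to read
            self._out = self.ram[byte]
            self.status |= OBF       # data ready for the host

    def read_data(self):             # host read from port 0x62
        self.status &= ~OBF
        return self._out

# Host-side RD_EC sequence: command byte, address byte, wait for OBF, read.
ec = EcModel({0x40: 0x55})           # hypothetical: lid status at offset 0x40
ec.write_command(RD_EC)
ec.write_data(0x40)
assert ec.status & OBF               # OBF set: incoming data is ready
print(hex(ec.read_data()))           # -> 0x55
```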

Shared Memory Map

Some systems have memory regions shared between the embedded controller and the host system address space. The size of this memory region is limited, and it is treated as read-only (RO) on the host system side. This memory is maintained by the EC to pass various interesting information such as battery status and information, thermal sensor data, fan speed, LID switch status, etc. A system that doesn't support this shared memory needs to send host commands to read that information.
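Because the shared region is just a fixed layout of bytes, host software can decode it with a struct definition. The layout below is entirely hypothetical (real layouts are board specific); it only illustrates how fields like battery level, temperature, and fan speed might be packed and read:

```python
import struct

# Hypothetical layout of an EC shared memory region (read-only to the host).
# Real layouts are board specific; this 6-byte example packs battery
# percentage, one thermal sensor reading, fan RPM, lid state, and padding.
SHMEM_FMT = "<BbHBx"   # battery %, temp (signed deg C), fan RPM, lid, pad

def decode_shmem(raw: bytes) -> dict:
    """Decode a snapshot of the shared region into named fields."""
    batt, temp, fan, lid = struct.unpack(SHMEM_FMT, raw)
    return {"battery_pct": batt, "temp_c": temp,
            "fan_rpm": fan, "lid_open": bool(lid)}

# Simulate a snapshot the EC might have written into the region.
raw = struct.pack(SHMEM_FMT, 87, 42, 3200, 1)
print(decode_shmem(raw))
```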

Challenges with Closed Source EC Firmware

The firmware running as part of the embedded controller works at a high privilege level, where unauthorized access won't be detected by the security controls running as part of the system firmware or the OS. Also, if proprietary EC firmware doesn't implement Secure Boot or verified boot, it can run even unsigned and untrusted images. Several vulnerabilities have been reported in the past where the EC could run unsigned firmware, and compromised EC firmware results in a denial-of-service (DoS) attack on the system. To bring more reliability and efficiency to computing systems and allow visibility into the most privileged code that runs prior to the host CPU coming out of reset, the industry is promoting open source and collaborative embedded controller development using the Zephyr OS.

The next section provides details about developing EC firmware using the open source advantage, but prior to that, let's understand the reference hardware design that enables independent EC development using a Modular Embedded Controller Card (MECC).

Modular Embedded Controller Card

Typically, the embedded controller chip is soldered onto the motherboard, and many system firmware updates also ensure the EC firmware is upgraded. If a bug is detected in the EC firmware boot code and the system firmware doesn't provide a provision to update the EC firmware, it's impossible to replace the defective EC firmware, as it's integrated into the motherboard design. To mitigate this problem and enable independent, modular EC firmware development that supports various EC SoC vendors, the MECC card was developed. A host system using open source EC firmware may take advantage of the MECC specification, where different EC SoC vendors can develop and validate their solutions through an add-on card rather than creating multiple hardware designs with a dedicated on-board EC. Figure 1-39 shows the MECC card design and its interfacing with the host system through the MECC connector. The MECC add-in card (AIC) is an independent design that combines the processor, memory, ROM, and different I/Os on the MECC board (for example, a serial port for debug, SPI Flash, and a keyboard connector) and interfaces with the host system through the MECC (AIC) connector using the LPC/eSPI interface. Hence, it's easy to replace the AIC if it's broken or to upgrade the EC firmware using an external SPI programmer.


Figure 1-39

MECC AIC interfacing with the host system

Open source embedded controller firmware development using Zephyr OS provides a scalable architecture that enables different MECC cards from different EC SoC vendors. Vendors add Hardware Abstraction Layer (HAL) and EC SoC board support package (BSP) support alongside the Zephyr RTOS.

Zephyr-Based EC Firmware

The Zephyr OS is a small-footprint kernel designed for use on resource-constrained and embedded systems. The kernel supports multiple architectures and is distributed under the Apache 2.0 license. Zephyr is an open source project managed by the Linux Foundation. Zephyr provides a rich set of features:

Threads: Typically, the operations performed by embedded controllers are task based; hence, migrating embedded controller firmware development to Zephyr is effective because of the scheduling algorithms Zephyr provides for creating a multithreaded environment. Thread creation in Zephyr uses a statically defined approach via K_THREAD_DEFINE. A thread may be scheduled for immediate execution or a delayed start. Zephyr provides a comprehensive set of thread scheduling choices that use a time slicing algorithm to achieve a multitask environment:
  • Cooperative time slicing: Each thread should intermittently release the CPU to permit other threads to execute. This can be achieved using predefined sleeps between tasks or by explicitly releasing the CPU.

  • Preemptive time slicing: Use the Zephyr scheduler, which allows other threads to get their chance to execute without causing starvation.

The tasks performed by the embedded controller are non-time-critical; hence, cooperative time slicing is more applicable, where each task sleeps for some predefined amount of time before becoming ready to perform its work again.
  • Memory: Zephyr allows static allocation of memory with a fixed stack size. It implements memory protection, preventing the EC stack from overflowing, and provides access permission tracking for kernel objects and device drivers. It provides user-space support using an MMU/MPU. A platform without MMU/MPU support combines BSP code with a custom kernel to create a monolithic image where both the BSP and the kernel code share the same address space.

  • Resource definition at compile time: Zephyr ensures all system resources are defined at compile time to reduce the code size and increase performance.

  • Security: A dedicated team is responsible for maintaining and improving security. Also, the visibility provided by open source development significantly increases security.

  • Device tree: The device tree is used to describe the hardware. Information from the device tree is used to perform various device-related operations inside the BSP and/or application layer.

Embedded controller firmware development using Zephyr OS is a new and growing area, where the introduction and support of new protocols and peripherals help go beyond the traditional I/Os available in stand-alone microcontroller units. Figure 1-40 shows the Zephyr-based EC firmware architecture diagram.


Figure 1-40

Zephyr-based embedded firmware architecture diagram

The modular and configurable architecture of Zephyr OS-based embedded controller firmware development is hardware and vendor agnostic. It allows switching the underlying hardware abstraction layer (HAL) and drivers for different EC SoCs, while the middleware and logic remain intact. To use a different MECC card with a different EC SoC vendor, the developer needs to add its HAL and board support package (BSP) to the Zephyr EC firmware repository.

The following table describes the components inside the EC firmware that are required for performing low-level operations:

Board: Board-specific code and configuration details, including the GPIO map, battery parameters, and routines that perform board-specific initialization.

SoC/vendor-specific HAL: MCU-specific code that deals with low-level hardware such as registers and hardware blocks.

Drivers: Low-level drivers for UART, GPIO, timers, I2C, etc. These drivers use the device tree and the vendor-specific HAL to access the underlying embedded controller hardware blocks.

The Zephyr OS layer provides upper-level code that manages threads, memory management, I/O management, etc. This includes high-level drivers that publish Zephyr APIs to allow the host interface to access the embedded controller. The host interface defines the hardware link between the embedded controller and the host system (PCH and CPU) in a computing system. This link can be LPC, eSPI, or I2C. The following sections cover the basic applications of EC firmware on an eSPI-enabled platform.

Power Sequencing
This section describes the embedded controller firmware's handling of platform power sequencing; at the end of this flow, the host processor is able to come out of reset. The embedded controller firmware follows the host system platform design guide, which specifies the power sequence and its timings. Any platform state transition has to pass through this module. The events that trigger platform state transitions can be power signals that either come from the host system (like ACPI system transitions) or are generated by board circuitry (like the power button, AC supply, etc.). The following steps demonstrate the system transitioning from G3 to S0:

System Transition from G3 to S0

1. The power good (PWRGD) signal to the embedded controller hardware indicates when the main power rail voltage is on and stable. The processor inside the embedded controller starts executing code from ROM. The boot code downloads the user code from an external flash via the shared flash interface. The downloaded code must configure the device's pins according to the platform's needs. Once the device is configured for operation, the user code must de-assert the system's Resume Reset signal (RSMRST#). Any GPIO may be selected for the RSMRST# function. The board designer needs to attach an external pull-down on the GPIO pin used for the RSMRST# function; this ensures that the RSMRST# pin is asserted low by default.

2. Perform the deep sleep exit handshake, where the PCH sends the SUS_WARN signal to the EC and the EC acknowledges by sending the SUS_ACK signal.

3. The EC indicates to the PCH, using the BATLOW signal, that there is a valid power source or enough battery capacity to power up the system.

4. SLP_Sx signals (where x is based on the supported sleep states of the host system) from the PCH to the EC indicate that the host system is transitioning out of the sleep state as per the platform guide.

5. Wait for ALL_SYS_PWRGD: the all-system power good (ALL_SYS_PWRGD) input, generated by the board circuitry, indicates to the EC that the SoC power rails are stable.

6. Based on this, the embedded controller generates the PWROK signal.

7. The PCH de-asserts PLTRST# after PWROK is stable. PLTRST# is the main platform reset for other components.

8. The processor begins fetching code from the SPI Flash via the SPI interface.
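The ordering constraint in these steps can be captured as a simple trace checker. The signal names follow the steps above, while the exact event spelling and the checker itself are illustrative:

```python
# Hedged model of the G3->S0 sequence as an ordered list of signal events.
# A simple checker verifies that an observed trace contains the required
# events in the order the platform design guide expects (timings omitted).
EXPECTED_ORDER = [
    "RSMRST#_deassert", "SUS_WARN", "SUS_ACK", "BATLOW_ok",
    "SLP_Sx_deassert", "ALL_SYS_PWRGD", "PWROK", "PLTRST#_deassert",
]

def sequence_ok(events):
    """True if the observed events contain the expected ones in order."""
    it = iter(events)
    # 'expected in it' advances the iterator, enforcing subsequence order.
    return all(expected in it for expected in EXPECTED_ORDER)

good = EXPECTED_ORDER[:]  # e.g., a trace captured from a healthy board
bad = ["SUS_WARN", "RSMRST#_deassert"] + EXPECTED_ORDER[2:]  # swapped steps
print(sequence_ok(good), sequence_ok(bad))  # -> True False
```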

Figure 1-41 shows the graphical representation of the system state transition between the embedded controller and host system.


Figure 1-41

Power sequencing with Zephyr-based EC firmware

Peripheral Management

This section describes the embedded controller managing the human interface device, which mainly handles buttons and switches attached to the motherboard.

Button Array Devices

All buttons present on the motherboard that need human interaction are managed by the embedded controller. For example, the power button is an input to the EC that is driven high by default with a pull-up. This signal goes low upon pressing the power button and triggers an interrupt. Apart from that, the volume up/down buttons, home button, etc., are also managed by this module.

Switches

This module is also responsible for tracking the state of laptop lid switches and other inputs like the screen rotation lock.

The main job of this module is to deal with the undesirable effect that any mechanical button/switch exhibits when its contacts strike together (either pressed and released or opened and closed), causing electrical bouncing before the signal settles after the transient time. Mechanical switch debouncing is implemented using cooperative threading to track the short and long presses of all buttons registered within the system. It also sends notifications to the host. Callbacks per button/switch are registered within the GPIO driver to track state transitions for buttons and switches. See Figure 1-42.

A set of three block diagrams. 1: EC firmware application, thread for peripheral management, and power button registration for debouncing. 2: EC vendor HAL, GPIO pins, and GPIO init callback registration. 3: Zephyr OS, GPIO API, and GPIO driver.

Figure 1-42

Peripheral management with Zephyr-based EC firmware
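The debounce logic described above can be modeled as requiring a number of consecutive identical samples before a level change is reported. The following is a host-testable sketch under stated assumptions: a real implementation would sample the GPIO from a cooperative thread, and the sample count here (three stable reads) is an arbitrary illustrative value.

```c
#include <stdbool.h>

#define DEBOUNCE_SAMPLES 3 /* assumed: consecutive stable reads required */

struct debounce {
    bool stable_level; /* last reported (debounced) level */
    bool last_sample;  /* most recent raw sample */
    int  count;        /* consecutive samples at last_sample */
};

/* Feed one raw GPIO sample; returns true when the debounced level
 * changes (i.e., the electrical transient has settled). */
bool debounce_sample(struct debounce *d, bool raw)
{
    if (raw == d->last_sample) {
        if (d->count < DEBOUNCE_SAMPLES)
            d->count++;
    } else {
        /* Contact bounce: restart the stability counter. */
        d->last_sample = raw;
        d->count = 1;
    }
    if (d->count >= DEBOUNCE_SAMPLES && raw != d->stable_level) {
        d->stable_level = raw; /* settled: report the transition */
        return true;
    }
    return false;
}
```

A power button that bounces high/low a few times while being pressed produces no notification until the line has stayed low for three consecutive samples, at which point a single press event is reported to the host.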

Facts

The embedded controller can be used to detect the docking status of the platform, whether it is in a docking station or not. Based on this status, the EC can perform additional tasks such as switching the system power source to the dock, routing signals from onboard interfaces to the dock, and reporting the docking status to the operating system.

System Management Controller

This section describes the role of the embedded controller as a system management controller, which manages the following items.

Thermal Management

The EC uses the I2C/SMBus interface to read platform sensor data, and based on the criticality of the platform thermal state, it may use its PWM interfaces to control the system fans.
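A common way to turn sensor readings into fan control is a linear fan curve mapping temperature to PWM duty. The thresholds below are illustrative assumptions, not values from the text; a real EC would take them from the platform's thermal tables and drive the fan through the Zephyr PWM API.

```c
/* Assumed fan policy: off at or below 40 C, full speed at or above
 * 80 C, linear in between. Returns PWM duty in percent. */
int fan_duty_from_temp(int temp_c)
{
    const int t_min = 40, t_max = 80;

    if (temp_c <= t_min)
        return 0;
    if (temp_c >= t_max)
        return 100;
    return (temp_c - t_min) * 100 / (t_max - t_min);
}
```

For example, a sensor reading of 60 C lands exactly halfway up the assumed curve, so the fan runs at 50% duty.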

Power Monitoring

The embedded controller's ADC inputs can be used to monitor voltage and, with a sense resistor, the current consumption of specific power rails. This information can be used to monitor battery charging and to inform the user or system administrator about a problematic power supply condition or a bad charger.
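The sense-resistor technique is just Ohm's law: the current is the measured voltage drop divided by the known resistance, I = V / R. The conversion below is a sketch with assumed board parameters (a 12-bit ADC with a 3300 mV reference and a milliohm-range sense resistor); the real values are board-specific.

```c
/* Convert a raw ADC reading to millivolts. Assumes a 12-bit ADC
 * (0..4095 counts) with a 3300 mV reference. */
int adc_to_mv(int raw)
{
    return raw * 3300 / 4095;
}

/* Current through a sense resistor from the voltage drop across it:
 * I = V / R. Takes millivolts and milliohms, returns milliamps. */
int sense_current_ma(int vdrop_mv, int rsense_mohm)
{
    return vdrop_mv * 1000 / rsense_mohm;
}
```

For example, a 10 mV drop across an assumed 5 mΩ sense resistor corresponds to 2 A flowing through the rail.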

Battery Management

The EC can control battery charging, switch between the battery and the AC adapter as the active power source, and monitor battery status metrics such as temperature, charge level, and battery health.

ACPI Host Interface

The earlier section "Embedded Controller Interface" provided the details required to understand host CPU and EC communication over the ACPI host interface. The EC can provide an ACPI-compliant operating system with status and notifications regarding power management events, and it can generate wake events to bring the system out of low-power states.

The SMC host module is implemented as a cooperative thread that registers multiple callbacks within different modules to track events in the system. See Figure 1-43.

A set of four block diagrams. 1: operating system and system firmware. 2: EC firmware application, thread for SMC host, and registration for peripheral events. 3: EC vendor HAL and EC ACPI OpRegion. 4: Zephyr OS, eSPI API, and eSPI driver.

Figure 1-43

SMC with Zephyr-based EC firmware

Keyboard Controller
Typically, the EC is also referred to as a keyboard system controller (KSC), as it provides AT-compatible and PS/2-compatible support for keyboards and mice via reads/writes to I/O ports 0x60 and 0x64. The main responsibility of this module is to inform the CPU when a key is pressed or released; it also supports auxiliary devices such as a mouse. On modern computing systems, embedded controller chips implement support for 8042 commands, which means the EC can receive 8042 commands from either the system firmware or the operating system's PS/2 driver. Figure 1-44 shows the implementation of an embedded controller kbchost application, where the EC firmware application can pass a command received from the operating system driver to the Zephyr PS/2 driver, which performs PS/2 communication with a mouse and/or keyboard. Alternatively, the EC firmware can receive the command and process it before sending a response to the OS driver. For example, kbchost receives the command 0xD4 (Send to Mouse) at the KBC command/status register 0x64, which indicates the destination device; it then forwards the next byte written to the KBC input/output buffer register 0x60, such as the command 0xF4 (Enable), to the mouse. When the host expects a response, the data arrives through port 0x60.

A block diagram depicting the operating system PS/2 driver, the EC firmware application, and Zephyr OS, showing the two pass-through approaches, 1 and 2.

Figure 1-44

KBC with Zephyr-based EC firmware
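The 0xD4/0xF4 routing described above can be sketched as a tiny command dispatcher. This is a simplified, host-testable model of the port 0x60/0x64 behavior, not the actual kbchost code; the structure and helper names are assumptions, and the "forward to mouse" step stands in for the Zephyr PS/2 driver call.

```c
#include <stdint.h>

#define KBC_CMD_SEND_TO_MOUSE 0xD4 /* next data byte goes to the mouse */

enum dest { DEST_KBC, DEST_MOUSE };

struct kbc {
    enum dest next_data_dest;
    uint8_t   last_mouse_byte; /* stands in for the PS/2 driver call */
};

/* Host write to the command/status register (port 0x64). */
void kbc_write_cmd(struct kbc *k, uint8_t cmd)
{
    if (cmd == KBC_CMD_SEND_TO_MOUSE)
        k->next_data_dest = DEST_MOUSE;
}

/* Host write to the data register (port 0x60). */
void kbc_write_data(struct kbc *k, uint8_t data)
{
    if (k->next_data_dest == DEST_MOUSE) {
        k->last_mouse_byte = data;    /* forward to the PS/2 mouse */
        k->next_data_dest = DEST_KBC; /* 0xD4 routing is one-shot */
    }
}
```

Writing 0xD4 to port 0x64 followed by 0xF4 to port 0x60 thus delivers the Enable command to the mouse; any mouse response would travel back to the host through port 0x60.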

Keyboard Scan Matrix
All keyboards have their keys arranged into a matrix of rows and columns, known as the keyboard matrix. Because of the number of signals required to represent those rows and columns, the keyboard relies on the keyboard system controller (KSC), which continuously scans the state of the whole grid; a circuit in the grid is closed when a key is pressed, and this is eventually sensed by the firmware running on the EC. Once the row and column have been determined, the EC maps the grid coordinates to a scan code, which is passed to the EC firmware kbchost application and then sent to the OS PS/2 driver as an eSPI message. Figure 1-45 shows the implementation of the keyboard scan matrix with Zephyr-based EC firmware.

A block diagram depicting the operating system PS/2 driver, the EC firmware application, the BSP, and Zephyr OS in four columns with descriptions.

Figure 1-45

Managing a KeyScan event with Zephyr-based EC firmware
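The matrix scan itself can be sketched as a loop that selects one column at a time and reads back the row lines. The following is a host-testable simplification: the simulated grid stands in for driving the column outputs (KSO) low and reading the row inputs (KSI) through the EC's keyscan block, the 8x8 size is an assumption, and real keyboards map grid coordinates through per-layout scan-code tables rather than the simple row/column formula used here.

```c
#include <stdint.h>

#define NUM_COLS 8
#define NUM_ROWS 8

/* Simulated key grid: bit r of grid[c] set = key at (row r, col c)
 * is pressed (circuit closed). */
static uint8_t grid[NUM_COLS];

/* Read the row bitmap with the given column selected. */
static uint8_t read_rows(int col)
{
    return grid[col];
}

/* Scan the whole matrix once; returns an illustrative scan code
 * (row * NUM_COLS + col) for the first pressed key, or -1 if none. */
int keyscan_once(void)
{
    for (int col = 0; col < NUM_COLS; col++) {
        uint8_t rows = read_rows(col);
        for (int row = 0; row < NUM_ROWS; row++) {
            if (rows & (1u << row))
                return row * NUM_COLS + col;
        }
    }
    return -1;
}
```

In real firmware the resulting code is what kbchost forwards to the OS PS/2 driver over eSPI, as described above.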

This implementation allows ODMs/OEMs to include laptop-specific hotkeys in the keyboard layout that are not covered by international keyboard standards, for example, keys to change the screen brightness, enable/disable wireless networking, and control the audio volume.

To summarize, the embedded controller is a special microcontroller that is part of the majority of mobile computing systems. The firmware running on the EC chip operates at a much higher privilege level and performs operations that even the operating system cannot; it starts running at platform power-on and remains active even when the system is powered off. Hence, it's important to ensure visibility into the operations running as part of the EC firmware. The EC firmware project described here is developed with the open source Zephyr OS, which provides that visibility. The introduction of Zephyr OS eases EC development for both EC vendors and OEMs, while supporting different EC SoC chips with the same host system using the MECC card.

Summary

This chapter provided an opportunity to understand the different types of firmware that exist and execute on a computing system. Typically, they are categorized as system firmware, device firmware, and manageability firmware. All of these types of firmware run at a higher privilege level than the kernel; in this chapter, those privilege levels are referred to as "minus" rings, since a vulnerability in these layers is difficult for any higher-level security control to detect. Hence, this chapter highlighted the need for an open source approach when developing this firmware in the future. This would help restore trust in the platform and provide visibility into the most privileged firmware, which was not available before.

This chapter proposed two working principles for developing system firmware for embedded systems in the future: hybrid system firmware, where a portion of the silicon vendor code remains binary and communicates with the open source firmware using a standard, specification-defined interface, and true open source system firmware development on an open hardware specification using a modern system programming language such as Rust. Appendix A provides the reasoning behind migrating system firmware to Rust, an open source system programming language.

Additionally, this chapter described the well-known mechanism for designing and developing device firmware using closed source models such as the option ROM (OpROM). The legacy OpROM implementation may increase the platform attack surface. Developing modern OpROMs using the open source EDKII source code and toolchain provides initial visibility into the device firmware space, but the ideal goal is to use the open source firmware model even for device-specific firmware.

Remote management for server platforms and enterprise systems demands out-of-band (OOB) access to the system through a special manageability controller that performs certain tasks when the host system is unavailable or requires maintenance. The firmware running on these MCUs is at the highest privilege level in the ring model and hence always poses a security risk if an intruder gains access to the remote system. This chapter provided a brief overview of the system architecture of the two most widely used such microcontrollers on computing systems, the BMC and the EC. This will help developers migrate to open source firmware development for these manageability controllers in the future.

The goal of this chapter was to explain why firmware architecture is expected to evolve in the future, and it's quite possible that the majority of firmware development will migrate to open source. Hence, it's important that developers understand the industry's needs and prepare themselves for the future.

Chapter 6 presents some innovations in system firmware design and development using an open source firmware approach that address the ongoing concerns with extensible firmware architectures, which expand the firmware boundary and its responsibilities while booting the platform.
