CHAPTER 9

PCI Express and Thunderbolt

PCI (Peripheral Component Interconnect) is a high-speed bus developed by Intel in the early nineties to replace various older and slower bus technologies such as EISA, ISA, MCA, and VESA. The term PCI is often used to describe the family of technologies based on the original PCI specification. Throughout this chapter, when we refer to PCI, we refer to commonalities found in the PCI-based technologies; namely, PCI Express, Thunderbolt, and to a lesser extent ExpressCard. Most people associate PCI with expansion boards plugged into a computer, but it is worth noting that PCI is fundamental to many computer systems. Even those without PCI slots, such as iMacs, have internal PCI buses that connect the CPU to USB, FireWire, and SATA controllers. Recent PCI-based advancements (like Thunderbolt) allow the PCI bus to be extended outside of the computer, much in the same way as USB and FireWire.

PCI enjoyed widespread adoption and solved many of the problems found in older bus technologies; for example, it eliminated the need to configure jumpers on expansion cards, as resources such as memory regions and interrupts were configured automatically by the system BIOS and/or the OS itself.

PCI was extended by the PCI-X and PCI-X 2.0 standards, which allowed for a 64-bit bus width as opposed to legacy PCI's 32-bit bus width. The PCI-X standards have since been succeeded by the PCI Express (PCIe) standard and are now obsolete. Unlike PCI and PCI-X, PCIe uses a packet-based serial protocol rather than the parallel interface characteristic of its predecessors. PCIe also gives each device dedicated bandwidth instead of making it share bus bandwidth with other devices on the same bus. While PCIe and PCI are substantially different from an electrical and physical standpoint, they are backwards compatible from a software point of view; consequently, drivers require only minor (or no) changes to support newer standards.

As previously mentioned, there are myriad PCI-related standards. We will discuss only technologies currently sold by Apple, which include PCIe, Thunderbolt, and ExpressCard. Thunderbolt is found in most 2011 or newer Macs. Thunderbolt and ExpressCard are based on PCIe technology and connect to the PCI host bridge. However, ExpressCard is being phased out in favor of Thunderbolt on all Macs, and is now found only in the 17-inch MacBook Pro. Since the Xserve was discontinued, the Mac Pro is the only Mac with physically accessible PCI Express slots.

This chapter begins with a discussion of the various PCI technologies that apply to the current generation of Macs. We will focus on the parts that are important to understand from a software point of view and necessary to build a functional driver for a PCI-based device. For example, we as programmers need not be concerned with how PCI functions at the electrical level. The second part of this chapter focuses on how we can interface with PCI-based devices in I/O Kit, how to match and configure them, read registers, and deal with the removal of devices. We will also address how to handle interrupts and perform DMA (Direct Memory Access), which are two typical tasks performed by a PCI-based driver.

PCI Express

PCIe was designed to replace PCI and PCI-X, as well as AGP (Accelerated Graphics Port), a stopgap employed by graphics cards allowing for higher bandwidths not possible with PCI-X. PCIe uses point-to-point connections known as “lanes,” each made up of a pair of uni-directional links, one for each direction. This approach avoids the shared bus problem of PCI and PCI-X, although system designers could somewhat alleviate that problem by putting each physical PCI slot on its own dedicated bus. Even so, PCIe is substantially faster than its predecessors.

So far, three revisions of the PCIe standard have been released. The second revision doubled the possible bandwidth for a single PCIe lane from 250 MB/s to 500 MB/s. The third revision doubled it again, to 1 GB/s per lane. PCIe typically uses lane configurations of 1x, 4x, 8x, 16x, and 32x, although the last is less common, especially for physical slots. Slots for graphics cards/GPUs are typically 16 lanes wide, as they require massive amounts of bandwidth. The latest revision of the Mac Pro (5,1) conforms to the PCIe 2.0 standard. It has four slots that are physically 16 lanes wide, but only slots 1 and 2 are able to operate at 16x, while slots 3 and 4 operate at 4x.
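The per-lane figures above scale linearly with the lane count, so the approximate bandwidth of a link can be worked out with a lookup and a multiplication. The following is a minimal sketch of that arithmetic, not part of any Apple API; the function name pcie_link_bandwidth_mb_per_s and the table kLaneMBPerSec are hypothetical, and the rates are the rounded per-lane, per-direction values quoted in the text.

```c
#include <assert.h>
#include <stdint.h>

/* Rounded per-lane, per-direction throughput in MB/s for PCIe revisions
   1 to 3, as quoted in the text. Hypothetical table name. */
static const uint32_t kLaneMBPerSec[] = { 250, 500, 1000 };

/* Returns the approximate one-direction bandwidth of a link in MB/s,
   or 0 if the revision is not one of the three covered here.
   Hypothetical helper, for illustration only. */
uint32_t pcie_link_bandwidth_mb_per_s(unsigned revision, unsigned lanes)
{
    assert(lanes >= 1);            /* a link has at least one lane */
    if (revision < 1 || revision > 3) {
        return 0;
    }
    return kLaneMBPerSec[revision - 1] * lanes;
}
```

For the Mac Pro described above, pcie_link_bandwidth_mb_per_s(2, 16) gives 8000 MB/s for slots 1 and 2, while pcie_link_bandwidth_mb_per_s(2, 4) gives 2000 MB/s for slots 3 and 4.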

Thunderbolt

Thunderbolt, a relatively new technology, was initially developed by Intel and later adopted by Apple; the latter is currently the only vendor shipping Thunderbolt-enabled computers. Although the availability of devices is limited, several companies, including Blackmagic Design, Promise Technology, and Western Digital, have announced their support for the technology. Thunderbolt is an external expansion interface that allows PCIe and DisplayPort 1.1 to be tunneled over the same cable. A cable carries two bi-directional channels, each able to move up to 10 Gbps of data in each direction, which amounts to a total bandwidth of 40 Gbps per cable (2 channels × 2 directions × 10 Gbps). The channels are independent of each other, and it is not possible to aggregate the bandwidth between them. Thunderbolt is also able to provide up to 10 Watts of power to devices connected to the bus. The cable uses the Mini DisplayPort connector, which is identical at both ends.

The current specification of Thunderbolt allows up to six devices to be daisy chained. Later revisions are expected to support a tree-like topology similar to that of USB. However, unlike USB, Thunderbolt allows host-to-host connections like FireWire. Apple has also enabled a target disk mode using Thunderbolt, as well as the ability to boot the operating system from Thunderbolt-attached storage. Because Thunderbolt devices communicate directly with the PCIe host system, existing devices can be updated to support Thunderbolt with relatively few modifications to the hardware (ignoring the fact that an external case and possibly an external power source are needed). On the software side, very few changes are needed (devices are still managed by the IOPCIFamily); however, one requirement is that the driver must support being dynamically unloaded.

Thunderbolt makes it possible for the Mac mini, iMac, or MacBook series computers to access high-speed storage and storage area networks, as well as high-bandwidth uncompressed video capture, which was previously reserved for the high-end Mac Pro and Xserve.

ExpressCard

ExpressCard is an older expansion interface found in the MacBook Pro series. ExpressCard is being phased out in favor of Thunderbolt; however, laptops with both ExpressCard and Thunderbolt ports are available (at the time of writing). ExpressCard is the modern version of PCMCIA and is based on PCIe. The latest standard supports transfer speeds of up to 5 Gbps.

Configuration Space Registers

All PCI devices (including bridges) have a set of registers known as the configuration space. This space is a minimum of 256 bytes for conventional PCI devices, but on technologies based on PCI-X 2.0 and PCI Express, the configuration space is up to 4096 bytes long and is referred to as the extended configuration space. The first 48 bytes of the configuration space registers are shown in Figure 9-1.

Figure 9-1. Standard PCI configuration space registers

The required registers are shown in gray; other registers are optional. The first 48 bytes are standardized and you will find the same layout regardless of whether the device is PCI, PCI-X, or PCIe-based. Many of the registers are no longer applicable because PCI Express is point-to-point based and doesn't use a shared bus.
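One way to picture the first 48 bytes is as a C structure whose fields sit at fixed offsets. The following is a sketch only: the type name pci_config_header48_t and its field names are hypothetical and do not come from any Apple header, and the offsets follow the standard (type 0) PCI configuration header as commonly documented, which we assume matches Figure 9-1. A real driver would read these registers through the bus family's accessors rather than overlaying a structure on hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical overlay of the first 48 bytes of PCI configuration space.
   Every field is naturally aligned, so no packing attribute is needed. */
typedef struct {
    uint16_t vendor_id;            /* 0x00 */
    uint16_t device_id;            /* 0x02 */
    uint16_t command;              /* 0x04 */
    uint16_t status;               /* 0x06 */
    uint8_t  revision_id;          /* 0x08 */
    uint8_t  prog_if;              /* 0x09: class code, program interface */
    uint8_t  subclass;             /* 0x0A: class code, subclass */
    uint8_t  base_class;           /* 0x0B: class code, base class */
    uint8_t  cache_line_size;      /* 0x0C */
    uint8_t  latency_timer;        /* 0x0D */
    uint8_t  header_type;          /* 0x0E */
    uint8_t  bist;                 /* 0x0F */
    uint32_t base_address[6];      /* 0x10 to 0x27: Base Address 0-5 */
    uint32_t cardbus_cis_pointer;  /* 0x28 */
    uint16_t subsystem_vendor_id;  /* 0x2C */
    uint16_t subsystem_id;         /* 0x2E */
} pci_config_header48_t;

/* Compile-time checks that the layout matches the offsets noted above. */
static_assert(sizeof(pci_config_header48_t) == 48, "header must be 48 bytes");
static_assert(offsetof(pci_config_header48_t, base_address) == 0x10,
              "Base Address 0 must start at offset 0x10");
static_assert(offsetof(pci_config_header48_t, subsystem_vendor_id) == 0x2C,
              "subsystem vendor ID must be at offset 0x2C");
```

The compile-time assertions make the build fail if a compiler ever inserted padding, which is a cheap safeguard whenever a structure is meant to mirror a fixed register layout.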

Let's look at the mandatory registers from Figure 9-1 in more detail.

  • Vendor ID: Contains a 16-bit identifier unique to each hardware manufacturer. Vendor IDs are assigned to each hardware manufacturer by the PCI-SIG (PCI Special Interest Group). Apple, for example, is assigned the vendor ID 0x106b. The combination of vendor ID and device ID is often used by operating systems to determine which driver to load for a device. 0xffff is not a valid vendor ID; a read of this register that returns 0xffff indicates that no device is present.
  • Device ID: Also 16 bits wide. Unlike the vendor ID, the device ID can be assigned by the manufacturer and is not maintained in a central registry.
  • Class Code: A 24-bit register that holds the type classification for the device. The first 8 bits hold the base class. Examples of base classes include Unclassified (0x0), Mass Storage Controller (0x1), Network Controller (0x2), Display Controller (0x3), etc. The next 8 bits hold the subclass. If the base class is a display controller, for instance, the subclasses might be VGA (0x0), XGA (0x1), or other (0x80). The remaining 8 bits are used to specify the program interface (register-level interface) of the device if more than one is possible. USB controllers, for example, use it to indicate whether they comply with the UHCI, OHCI, EHCI, or xHCI interfaces, which are register-level specifications that determine how a driver should interact with a device.
  • Subsystem Vendor/Device ID: Follows the same rules and assignments as the vendor and device IDs. The subsystem IDs identify the specific board or product when many different manufacturers sell products built around the same chip (OEM). Prime examples of this are Nvidia and ATI. They manufacture GPU chips that are subsequently used by third-party manufacturers to make the final product. The PCI configuration space of such a device contains Nvidia's or ATI's vendor ID and device ID, while the subsystem vendor ID and subsystem device ID identify the third party and its board. This allows ATI's and Nvidia's drivers to be used, even if they didn't manufacture the board directly.
  • Base Address 0-5: Six registers that describe up to six I/O regions, each of which can be either an I/O port range or a memory region. The latter is much more common. We will discuss I/O regions in more detail shortly. A base address register is often abbreviated BAR (Base Address Register).
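The register descriptions above can be sketched as a few small decoding helpers. This is an illustrative sketch under stated assumptions rather than I/O Kit code: the names pci_class_t, pci_decode_class, pci_vendor_id_is_valid, and pci_bar_is_io_port are hypothetical, and the rule that bit 0 of a base address register distinguishes I/O ports (1) from memory regions (0) comes from the PCI specification rather than from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The three fields of the 24-bit class code. Hypothetical type name. */
typedef struct {
    uint8_t base_class;   /* bits 23..16, e.g. 0x03 = display controller */
    uint8_t subclass;     /* bits 15..8,  e.g. 0x00 = VGA */
    uint8_t prog_if;      /* bits 7..0,   program interface */
} pci_class_t;

/* Splits a 24-bit class code into its three 8-bit fields. */
pci_class_t pci_decode_class(uint32_t class_code)
{
    assert((class_code >> 24) == 0);   /* only 24 bits are meaningful */
    pci_class_t c;
    c.base_class = (uint8_t)(class_code >> 16);
    c.subclass   = (uint8_t)(class_code >> 8);
    c.prog_if    = (uint8_t)class_code;
    return c;
}

/* 0xffff is not a valid vendor ID, as noted in the list above. */
bool pci_vendor_id_is_valid(uint16_t vendor_id)
{
    return vendor_id != 0xFFFF;
}

/* Assumption from the PCI specification: bit 0 of a BAR is set for an
   I/O port range and clear for a memory region. */
bool pci_bar_is_io_port(uint32_t bar)
{
    return (bar & 0x1u) != 0;
}
```

As a usage example, pci_decode_class(0x030000) yields base class 0x03 and subclass 0x00, which the list above describes as a VGA display controller, and pci_vendor_id_is_valid(0x106b) returns true for Apple's vendor ID.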