So far in this book, you’ve learned how bootkits penetrate and persist on the victim’s computer by using sophisticated techniques to avoid detection. One common characteristic of these advanced threats is the use of a custom hidden storage system for storing modules and configuration information on the compromised machine.
Many of the hidden filesystems in malware are custom or altered versions of standard filesystems, meaning that performing forensic analysis on a computer compromised with a rootkit or bootkit often requires a custom toolset. In order to develop these tools, researchers must learn the layout of the hidden filesystem and the algorithms used to encrypt data by performing in-depth analyses and reverse engineering.
In this chapter, we’ll look more closely at hidden filesystems and methods to analyze them. We’ll share our experiences of performing long-term forensic analyses of the rootkits and bootkits described in this book. We’ll also discuss approaches to retrieving data from hidden storage and share solutions to common problems that arise through this kind of analysis. Finally, we’ll introduce the custom HiddenFsReader tool we developed, whose purpose is to dump the contents of the hidden filesystems in specific malware.
Figure 18-1 illustrates an overview of the typical hidden filesystem. We can see the malicious payload that communicates with the hidden storage injected into the user-mode address space of a victim process. The payload often uses the hidden storage to read and update its configuration information or to store data like stolen credentials.
Figure 18-1: Typical malicious hidden filesystem implementation
The hidden storage service is provided through the kernel-mode module, and the interface exposed by the malware is visible only to the payload module. This interface usually isn’t available to other software on the system and cannot be accessed via standard methods such as Windows File Explorer.
Data stored by the malware on the hidden filesystem persists in an area of the hard drive that isn’t being used by the OS in order not to conflict with it. In most cases, this area is at the end of the hard drive, because there is usually some unallocated space. However, in some cases, such as the Rovnix bootkit discussed in Chapter 11, malware can store its hidden filesystem in unallocated space at the beginning of the hard drive.
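To make the idea of "unallocated space at the end of the hard drive" concrete, the following sketch computes how many sectors lie beyond the last partition. It is a simplified illustration, not a parser for any real partition table format: the `partition_t` record and the `trailing_unallocated` function are hypothetical names introduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified partition record: first sector and length. */
typedef struct {
    uint64_t start_lba;
    uint64_t sector_count;
} partition_t;

/* Returns the number of sectors between the end of the highest partition
 * and the end of the disk. If first_free is not NULL, it receives the LBA
 * at which that trailing region starts. */
uint64_t trailing_unallocated(const partition_t *parts, size_t n,
                              uint64_t total_sectors, uint64_t *first_free)
{
    uint64_t highest_end = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t end = parts[i].start_lba + parts[i].sector_count;
        if (end > highest_end)
            highest_end = end;
    }
    if (highest_end > total_sectors)
        highest_end = total_sectors;   /* clamp a malformed table */
    if (first_free)
        *first_free = highest_end;
    return total_sectors - highest_end;
}
```

A nonzero result only marks a candidate region for inspection; it does not by itself indicate that anything is stored there.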
The main goal of any researcher performing forensic analysis is to retrieve this hidden stored data, so next we’ll discuss a few approaches for doing so.
We can obtain forensic information from a bootkit-infected computer by retrieving the data when the infected system is offline or by reading the malicious data from a live infected system.
Each approach has its pros and cons, which we’ll consider as we discuss the two methods.
Let’s start with getting data from the hard drive when the system is offline (that is, the malware is inactive). We can achieve this through an offline analysis of the hard drive, but another option is to boot a noninfected instance of the operating system using a live CD. This ensures the computer uses the noncompromised bootloader installed on the live CD, so the bootkit won’t be executed. This approach assumes that the bootkit has not been able to execute before the legitimate bootloader, and so cannot detect an attempt to boot from an external device and wipe the sensitive data beforehand.
The significant advantage of this method over an online analysis is that you don’t need to defeat the malware’s self-defense mechanisms that protect the hidden storage contents. As we’ll see in later sections, bypassing the malware’s protection isn’t a trivial task and requires certain expertise.
NOTE
Once you get access to the data stored on the hard drive, you can proceed with dumping the image of the malicious hidden filesystem and decrypting and parsing it. Different types of malware require different approaches for decrypting and parsing the hidden filesystems, as we’ll discuss in the section “Parsing the Hidden Filesystem Image” on page 360.
However, the downside of this method is that it requires both physical access to the compromised computer and the technical know-how to boot the computer from a live CD and dump the hidden filesystem. Meeting both of these requirements might be problematic.
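Once a raw image of the drive is available offline, dumping a region of it comes down to reading sectors at a known offset. The sketch below assumes a raw image file and 512-byte logical sectors; `read_sectors` and `SECTOR_SIZE` are names chosen for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SECTOR_SIZE 512u   /* assumed logical sector size */

/* Reads `count` sectors starting at `lba` from a raw disk image into `out`.
 * Returns the number of whole sectors actually read. */
size_t read_sectors(const char *image_path, uint64_t lba, size_t count,
                    unsigned char *out)
{
    FILE *f = fopen(image_path, "rb");
    size_t got = 0;

    if (!f)
        return 0;
    /* fseek takes a long, which is enough for a sketch; images larger than
     * LONG_MAX bytes need a 64-bit seek such as fseeko or _fseeki64. */
    if (fseek(f, (long)(lba * SECTOR_SIZE), SEEK_SET) == 0)
        got = fread(out, SECTOR_SIZE, count, f);
    fclose(f);
    return got;
}
```

The caller is expected to supply a buffer of at least `count * SECTOR_SIZE` bytes and to know, from reverse engineering the specific threat, where the hidden storage starts.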
If analyzing an inactive machine isn’t possible, we have to use the active approach.
On a live system with an active instance of the bootkit, we need to dump the contents of the malicious hidden filesystem.
Reading the malicious hidden storage on a system actively running malware, however, has one major difficulty: the malware may attempt to counteract the read attempts and forge the data being read from the hard drive to impede forensic analysis. Most of the rootkits we’ve discussed in this book—TDL3, TDL4, Rovnix, Olmasco, and so on—monitor access to the hard drive and block access to the regions with the malicious data.
To be able to read malicious data from the hard drive, you have to overcome the malware’s self-defense mechanisms. We’ll look at some approaches to this in a moment, but first we’ll examine the storage device driver stack in Windows, and how the malware hooks into it, to better understand how the malware protects the malicious data. This information is also useful for understanding certain approaches to removing malicious hooks.
We touched upon the architecture of the storage device driver stack in Microsoft Windows and how malware hooks into it in Chapter 1. This method outlived TDL3 and was adopted by later malware, including bootkits we’ve studied in this book. Here we’ll go into more detail.
TDL3 hooked the miniport storage driver located at the very bottom of the storage device driver stack, as indicated in Figure 18-2.
Figure 18-2: Device storage driver stack
Hooking into the driver stack at this level allows the malware to monitor and modify I/O requests going to and from the hard drive, giving it access to its hidden storage.
Hooking at the very bottom of the driver stack and directly communicating with the hardware also allows the malware to bypass the security software that operates at the level of the filesystem or disk class driver. As we touched upon in Chapter 1, when an I/O operation is performed on the hard drive, the OS generates an input/output request packet (IRP), a special data structure in the operating system kernel that describes the I/O operation, which is passed through the whole device stack from top to bottom.
Security software modules responsible for monitoring hard drive I/O operations can inspect and modify IRPs, but because the malicious hooks are installed at a level below the security software, they’re invisible to these security tools.
There are several other levels a bootkit might hook, such as the user-mode API, filesystem driver, and disk class driver, but none of them allow the malware to be as stealthy and powerful as the miniport storage level.
We won’t cover all possible miniport storage hooking methods in this section. Instead, we’ll focus on the most common approaches that we’ve come across in the course of our malware analyses.
First, we’ll take a closer look at the storage device, shown in Figure 18-3.
Figure 18-3: Miniport storage device organization
The IRP goes from the top of the stack to the bottom. Each device in the stack can either process and complete the I/O request or forward it to the device one level below.
The DEVICE_OBJECT ➊ is a system data structure used by the operating system to describe a device in the stack, and it contains a pointer ➋ to the corresponding DRIVER_OBJECT, another system data structure that describes a loaded driver in the system. In this case, the DEVICE_OBJECT contains a pointer to the miniport storage driver.
The layout of the DRIVER_OBJECT structure is shown in Listing 18-1.
typedef struct _DRIVER_OBJECT {
     SHORT Type;
     SHORT Size;
   ➊ PDEVICE_OBJECT DeviceObject;
     ULONG Flags;
   ➋ PVOID DriverStart;
   ➌ ULONG DriverSize;
     PVOID DriverSection;
     PDRIVER_EXTENSION DriverExtension;
   ➍ UNICODE_STRING DriverName;
     PUNICODE_STRING HardwareDatabase;
     PFAST_IO_DISPATCH FastIoDispatch;
   ➎ LONG * DriverInit;
     PVOID DriverStartIo;
     PVOID DriverUnload;
   ➏ LONG * MajorFunction[28];
} DRIVER_OBJECT, *PDRIVER_OBJECT;
Listing 18-1: The layout of the DRIVER_OBJECT structure
The DriverName field ➍ contains the name of the driver described by the structure; DriverStart ➋ and DriverSize ➌, respectively, contain the starting address and size of the driver image in memory; DriverInit ➎ contains a pointer to the driver’s initialization routine; and DeviceObject ➊ contains a pointer to the list of DEVICE_OBJECT structures related to the driver. From the malware’s point of view, the most important field is MajorFunction ➏, which is located at the end of the structure and contains the addresses of the handlers implemented in the driver for various I/O operations.
When an I/O packet arrives at a device object, the operating system checks the DriverObject field in the corresponding DEVICE_OBJECT structure to get the address of DRIVER_OBJECT in memory. Once the kernel has the DRIVER_OBJECT structure, it fetches the address of a corresponding I/O handler from the MajorFunction array relevant to the type of I/O operation. With this information, we can identify parts of the storage device stack that can be hooked by the malware. Let’s look at a couple of different methods.
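The lookup just described (device object, then driver object, then handler) can be modeled in a few lines of user-mode C. This is a simplified model with mock structures, not kernel code: the `mock_` types, `mock_call_driver`, and `mock_sample_handler` are hypothetical names, and the real structures carry many more fields.

```c
#include <stddef.h>

#define MOCK_IRP_MJ_MAXIMUM 28   /* matches MajorFunction[28] in Listing 18-1 */

struct mock_device;
typedef int (*mock_dispatch_fn)(struct mock_device *dev, void *irp);

/* Mock of DRIVER_OBJECT reduced to its dispatch table. */
typedef struct mock_driver {
    mock_dispatch_fn MajorFunction[MOCK_IRP_MJ_MAXIMUM];
} mock_driver;

/* Mock of DEVICE_OBJECT reduced to its DriverObject pointer. */
typedef struct mock_device {
    mock_driver *DriverObject;
} mock_device;

/* Stand-in for a genuine miniport handler; the return value is arbitrary. */
int mock_sample_handler(mock_device *dev, void *irp)
{
    (void)dev;
    (void)irp;
    return 7;
}

/* Resolves the handler for the given major function code through
 * dev->DriverObject->MajorFunction and calls it. Returns -1 if the
 * request cannot be dispatched. */
int mock_call_driver(mock_device *dev, unsigned major, void *irp)
{
    mock_dispatch_fn handler;

    if (!dev || !dev->DriverObject || major >= MOCK_IRP_MJ_MAXIMUM)
        return -1;
    handler = dev->DriverObject->MajorFunction[major];
    return handler ? handler(dev, irp) : -1;
}
```

Every pointer followed in `mock_call_driver` is a place where control flow can be redirected: the handler's code, the entry in the array, and the `DriverObject` pointer itself.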
One way to hook the miniport storage driver is to directly modify the driver’s image in memory. Once the malware obtains the address of the hard disk miniport device object, it looks at the DriverObject to locate the corresponding DRIVER_OBJECT structure. The malware then fetches the address of the hard disk I/O handler from the MajorFunction array and patches the code at that address, as shown in Figure 18-4 (the sections in gray are those modified by the malware).
Figure 18-4: Hooking the storage driver stack by patching the miniport driver
When the device object receives an I/O request, the malware is executed. The malicious hook can now reject I/O operations to block access to the protected area of the hard drive, or it can modify I/O requests to return forged data and fool the security software.
For example, this type of hook is used by the Gapz bootkit discussed in Chapter 12. In the case of Gapz, the malware hooks the two routines in the hard disk miniport driver that handle the IRP_MJ_INTERNAL_DEVICE_CONTROL and IRP_MJ_DEVICE_CONTROL I/O requests, which lets it protect its data on the hard drive from being read or overwritten.
However, this approach is not particularly stealthy. Security software can detect and remove the hooks by locating an image of the hooked driver on a filesystem and mapping it into memory. It then compares the code sections of the driver loaded into the kernel to a version of the driver manually loaded from the file, and it notes any differences in the code sections that could indicate the presence of malicious hooks in the driver.
The security software can then remove the malicious hooks and restore the original code by overwriting the modified code with the code taken from the file. This method assumes that the driver on the filesystem is genuine and not modified by the malware.
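The comparison step can be sketched as a scan for differing byte runs between the two copies of a code section. The sketch assumes both buffers hold the same section with relocations already applied to the same base address; without that, relocated addresses would show up as false differences. `diff_range` and `find_patches` are hypothetical names.

```c
#include <stddef.h>

/* One contiguous run of bytes that differs between the two copies. */
typedef struct {
    size_t offset;
    size_t length;
} diff_range;

/* Compares the code section loaded in the kernel with the copy mapped from
 * the file and records up to max_ranges differing runs in `ranges`.
 * Returns the total number of runs found, which may exceed max_ranges. */
size_t find_patches(const unsigned char *in_memory,
                    const unsigned char *from_file, size_t size,
                    diff_range *ranges, size_t max_ranges)
{
    size_t found = 0;
    size_t i = 0;

    while (i < size) {
        size_t start;

        if (in_memory[i] == from_file[i]) {
            i++;
            continue;
        }
        start = i;
        while (i < size && in_memory[i] != from_file[i])
            i++;
        if (found < max_ranges) {
            ranges[found].offset = start;
            ranges[found].length = i - start;
        }
        found++;
    }
    return found;
}
```

Each reported range is both a hook candidate and the span that would have to be overwritten with the bytes from the file to restore the original code.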
The hard drive miniport driver can also be hooked through the modification of the DRIVER_OBJECT structure. As mentioned, this data structure contains the location of the driver image in memory and the address of the driver’s dispatch routines in the MajorFunction array.
Therefore, modifying the MajorFunction array allows the malware to install its hooks without touching the driver image in memory. For instance, instead of patching the code directly in the image as in the previous method, the malware could replace entries in the MajorFunction array related to IRP_MJ_INTERNAL_DEVICE_CONTROL and IRP_MJ_DEVICE_CONTROL I/O requests with the addresses of the malicious hooks. As a result, the operating system kernel would be redirected to the malicious code whenever it tried to resolve the addresses of handlers in the DRIVER_OBJECT structure. This approach is demonstrated in Figure 18-5.
Because the driver’s image in memory remains unmodified, this approach is stealthier than the previous method, but it isn’t invulnerable to discovery. Security software can still detect the presence of the hooks by locating the driver image in memory and checking the addresses of the IRP_MJ_INTERNAL_DEVICE_CONTROL and IRP_MJ_DEVICE_CONTROL I/O requests handlers: if these addresses don’t belong to the address range of the miniport driver image in memory, it indicates that there are hooks in the device stack.
Figure 18-5: Hooking the storage driver stack by patching the miniport DRIVER_OBJECT
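The address-range check described above is straightforward to express. The sketch below uses a mock structure holding only the three fields involved; `mock_driver_object`, `handler_in_image`, and `looks_hooked` are hypothetical names. The constants 0x0e and 0x0f are the values of IRP_MJ_DEVICE_CONTROL and IRP_MJ_INTERNAL_DEVICE_CONTROL in the Windows driver headers.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_IRP_MJ_MAXIMUM                 28
#define MOCK_IRP_MJ_DEVICE_CONTROL          0x0e
#define MOCK_IRP_MJ_INTERNAL_DEVICE_CONTROL 0x0f

/* Mock of the DRIVER_OBJECT fields the check needs, with addresses
 * stored as integers. */
typedef struct {
    uintptr_t DriverStart;
    uint32_t  DriverSize;
    uintptr_t MajorFunction[MOCK_IRP_MJ_MAXIMUM];
} mock_driver_object;

/* Nonzero if the handler lies in [DriverStart, DriverStart + DriverSize). */
int handler_in_image(const mock_driver_object *drv, unsigned major)
{
    uintptr_t h = drv->MajorFunction[major];

    return h >= drv->DriverStart && h - drv->DriverStart < drv->DriverSize;
}

/* Nonzero if either of the two handlers points outside the driver image. */
int looks_hooked(const mock_driver_object *drv)
{
    return !handler_in_image(drv, MOCK_IRP_MJ_DEVICE_CONTROL) ||
           !handler_in_image(drv, MOCK_IRP_MJ_INTERNAL_DEVICE_CONTROL);
}
```

This sketch treats any out-of-image handler as suspicious. A practical check would likely also need a list of modules in which a handler may legitimately live, which is an assumption beyond what the text describes.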
On the other hand, removing these hooks and restoring the original values of the MajorFunction array is much more difficult than with the previous method. With this approach, the MajorFunction array is initialized by the driver itself during execution of its initialization routine, which receives a pointer to the partially initialized corresponding DRIVER_OBJECT structure as an input parameter and completes the initialization by filling the MajorFunction array with pointers to the dispatch handlers.
Only the miniport driver is aware of the handler addresses. The security software has no knowledge of them, making it much more difficult to restore the original addresses in the DRIVER_OBJECT structure.
One approach that the security software may use to restore the original data is to load the miniport driver image in an emulated environment, create a DRIVER_OBJECT structure, and execute the driver’s entry point (the initialization routine) with the DRIVER_OBJECT structure passed as a parameter. Upon exiting the initialization routine, the DRIVER_OBJECT should contain the valid MajorFunction handlers, and the security software can use this information to calculate the addresses of the I/O dispatch routines in the driver’s image and restore the modified DRIVER_OBJECT structure.
Emulation of the driver can be tricky, however. If a driver’s initialization routine implements simple functionality (for example, initializing the DRIVER_OBJECT structure with the valid handler addresses), this approach would work, but if it implements complex functionality (such as calling system services or a system API, which are harder to emulate), emulation may fail and terminate before the driver initializes the data structure. In such cases, the security software won’t be able to recover the addresses of the original handlers and remove the malicious hooks.
Another approach to this problem is to generate a database of the original handler addresses and use it to recover them. However, this solution lacks generality. It may work well for the most frequently used miniport drivers but fail for rare or custom drivers that were not included in the database.
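A database of this kind could be as simple as a table keyed by driver name and build, storing each handler's offset from the image base. The record layout, the checksum used to identify a build, and all names below are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical database record for one known build of a miniport driver. */
typedef struct {
    const char *driver_name;
    uint32_t    image_checksum;               /* identifies the exact build */
    uint32_t    device_control_rva;           /* IRP_MJ_DEVICE_CONTROL */
    uint32_t    internal_device_control_rva;  /* IRP_MJ_INTERNAL_DEVICE_CONTROL */
} known_driver;

/* Returns the matching record, or NULL if this driver build is unknown. */
const known_driver *lookup_known_driver(const known_driver *db, size_t count,
                                        const char *name, uint32_t checksum)
{
    for (size_t i = 0; i < count; i++) {
        if (db[i].image_checksum == checksum &&
            strcmp(db[i].driver_name, name) == 0)
            return &db[i];
    }
    return NULL;
}

/* Converts a stored offset into an address for the loaded image. */
uintptr_t original_handler(uintptr_t image_base, uint32_t rva)
{
    return image_base + rva;
}
```

The `NULL` return path is the lack of generality mentioned above: a rare or custom driver has no record, so nothing can be restored for it.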
The last approach for hooking the miniport driver that we’ll consider in this chapter is a logical continuation of the previous method. We know that to execute the I/O request handler in the miniport driver, the OS kernel must fetch the address of the DRIVER_OBJECT structure from the miniport DEVICE_OBJECT, then fetch the handler address from the MajorFunction array, and finally execute the handler.
So, another way of installing the hook is to modify the DriverObject field in the related DEVICE_OBJECT. The malware needs to create a rogue DRIVER_OBJECT structure and initialize its MajorFunction array with the address of the malicious hooks, after which the operating system kernel will use the malicious DRIVER_OBJECT structure to get the address of the I/O request handler and execute the malicious hook (Figure 18-6).
Figure 18-6: Hooking the storage driver stack by hijacking miniport DRIVER_OBJECT
This approach is used by TDL3/TDL4, Rovnix, and Olmasco, and it has advantages and drawbacks similar to those of the previous approach. However, its hooks are even harder to remove because the whole DRIVER_OBJECT is different, meaning security software would need to make extra efforts to locate the original DRIVER_OBJECT structure.
This concludes our discussion of device driver stack hooking techniques. As we’ve seen, there’s no simple generic solution for removing the malicious hooks in order to read the malicious data from the protected areas of an infected machine’s hard drive. Another reason for the difficulty is that there are many different implementations of miniport storage drivers. Because these drivers communicate directly with the hardware, each storage device vendor provides custom drivers for its hardware, so approaches that work for a certain class of miniport drivers will fail for others.
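One possible heuristic for locating the original DRIVER_OBJECT is to use the device list that each driver object keeps (the DeviceObject field in Listing 18-1, chained through each device's NextDevice pointer). If a device appears in one driver's list but its DriverObject field points elsewhere, the pointer has probably been hijacked. The sketch assumes the malware left the genuine driver's device list intact and that a trusted enumeration of driver objects is available; neither is guaranteed, and all `mock_` names are hypothetical.

```c
#include <stddef.h>

struct mock_driver;

/* Mock of DEVICE_OBJECT: owning driver and link to the next device. */
typedef struct mock_device {
    struct mock_driver *DriverObject;
    struct mock_device *NextDevice;
} mock_device;

/* Mock of DRIVER_OBJECT: head of the list of devices it created. */
typedef struct mock_driver {
    mock_device *DeviceObject;
} mock_driver;

/* Returns the driver whose device list contains dev, or NULL. */
mock_driver *find_owner(mock_driver *const *drivers, size_t count,
                        const mock_device *dev)
{
    for (size_t i = 0; i < count; i++) {
        const mock_device *d;

        for (d = drivers[i]->DeviceObject; d != NULL; d = d->NextDevice) {
            if (d == dev)
                return drivers[i];
        }
    }
    return NULL;
}

/* Nonzero if dev is listed under one driver but points at another.
 * On a hit, *original receives the driver that lists the device. */
int is_hijacked(mock_driver *const *drivers, size_t count,
                const mock_device *dev, mock_driver **original)
{
    mock_driver *owner = find_owner(drivers, count, dev);

    if (owner == NULL || owner == dev->DriverObject)
        return 0;
    if (original)
        *original = owner;
    return 1;
}
```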
Once the rootkit’s self-defense protection is deactivated, we can read data from the malicious hidden storage, which yields the image of the malicious filesystem. The next logical step in forensic analysis is to parse the hidden filesystem and extract meaningful information.
To be able to parse a dumped filesystem, we need to know which type of malware it corresponds to. Each threat has its own implementation of the hidden storage, and the only way to reconstruct its layout is to reverse engineer the malware to understand the code responsible for maintaining it. In some cases, the layout of the hidden storage can change from one version to another within the same malware family.
The malware may also encrypt or obfuscate its hidden storage to make it harder to perform forensic analysis, in which case we’d need to find the encryption keys.
Table 18-1 provides a summary of hidden filesystems related to the malware families we’ve discussed in previous chapters. In this table, we consider only the basic characteristics of the hidden filesystem, such as layout type, encryption used, and whether it implements compression.
Table 18-1: Comparison of Hidden Filesystem Implementations
Functionality/malware | TDL4 | Rovnix | Olmasco | Gapz
Filesystem type | Custom | FAT16 modification | Custom | Custom
Encryption | XOR/RC4 | Custom (XOR+ROL) | RC6 modification | RC4
Compression | No | Yes | No | Yes
As we can see, each implementation is different, creating difficulties for forensic analysts and investigators.
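Of the ciphers in Table 18-1, RC4 is the only unmodified standard algorithm, so it makes a convenient example of the decryption step. The routine below is textbook RC4 applied in place to a buffer. It says nothing about how a given family derives or stores its key, and a particular threat may deviate from the standard algorithm; both must be established by reverse engineering. The name `rc4_crypt` is chosen for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard RC4. Encryption and decryption are the same operation, so
 * calling this twice with the same key restores the original data. */
void rc4_crypt(const uint8_t *key, size_t key_len,
               uint8_t *data, size_t data_len)
{
    uint8_t s[256];
    uint8_t a = 0, b = 0, j = 0, t;

    if (key_len == 0)
        return;

    /* Key-scheduling: permute the state array under control of the key. */
    for (int i = 0; i < 256; i++)
        s[i] = (uint8_t)i;
    for (int i = 0; i < 256; i++) {
        j = (uint8_t)(j + s[i] + key[i % key_len]);
        t = s[i]; s[i] = s[j]; s[j] = t;
    }

    /* Keystream generation, XORed into the buffer in place. */
    for (size_t n = 0; n < data_len; n++) {
        a = (uint8_t)(a + 1);
        b = (uint8_t)(b + s[a]);
        t = s[a]; s[a] = s[b]; s[b] = t;
        data[n] ^= s[(uint8_t)(s[a] + s[b])];
    }
}
```

In practice the buffer would be a block read from the dumped hidden storage and the key a value recovered from the malware.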
In the course of our research on advanced malware threats, we’ve reverse engineered many different malware families and have managed to gather extensive information on various implementations of hidden filesystems that may be very useful to the security research community. For this reason, we’ve implemented a tool named HiddenFsReader (http://download.eset.com/special/ESETHfsReader.exe/) that automatically looks for hidden malicious containers on a computer and extracts the information contained within.
Figure 18-7 depicts the high-level architecture of the HiddenFsReader.
Figure 18-7: High-level architecture of HiddenFsReader
The HiddenFsReader consists of two components: a user-mode application and a kernel-mode driver. The kernel-mode driver essentially implements the functionality for disabling rootkit/bootkit self-defense mechanisms, and the user-mode application provides the user with an interface to gain low-level access to the hard drive. The application uses this interface to read actual data from the hard drive, even if the system is infected with an active instance of the malware.
The user-mode application itself is responsible for identifying hidden filesystems read from the hard drive, and it also implements decryption functionality to obtain the plaintext data from the encrypted hidden storage.
The following threats and their corresponding hidden filesystems are supported in the latest release of the HiddenFsReader at the time of writing:
These threats employ custom hidden filesystems to store the payload and configuration data, which better protects that data from security software and makes forensic analysis harder. We haven’t discussed all of these threats in this book, but you can find information on them in the list of references available at https://nostarch.com/rootkits/.
The implementation of a custom hidden filesystem is common for advanced threats like rootkits and bootkits. Hidden storage is used to keep configuration information and payloads secret, rendering traditional approaches to forensic analysis ineffective.
Forensic analysts must disable the threat’s self-defense mechanisms and reverse engineer the malware. In this way, they can reconstruct the hidden filesystem’s layout and identify the encryption scheme and key used to protect the malicious data. This requires extra time and effort on a per-threat basis, but this chapter has explored some of the possible approaches to tackling these problems. In Chapter 19, we will continue to explore forensic analysis of malware, focusing specifically on UEFI rootkits. We will provide information on UEFI firmware acquisition and analysis with respect to malware targeting UEFI firmware.