As we discussed in previous chapters, a digital forensic investigator must be able to control the environment in which they operate. The diversity of computer hardware, operating systems, and filesystems requires the digital forensic investigator to have a firm understanding of all the different and potential configurations they may encounter. This requires the digital forensic investigator to have procedures or controls in place to protect the integrity of the digital evidence and the processes used to examine it. If you do not understand the boot process and how the system reacts when it starts or which filesystem is in use on the storage devices, you could make a fatal mistake. You have to understand how they work together. Failure to understand these basic components could lead you to alter the digital evidence. You will also find that you will be less effective when you testify in judicial or administrative proceedings.
In this chapter, we will cover the following topics:
To control the environment as we start our investigation, we must first understand the environment in which digital evidence is stored, created, and accessed. In most cases, this will be a computer system. By "computer system," I mean the operating system, the filesystem, and the hardware bundled together to create a computer. To be effective, you must understand the physical media the data is stored on, the filesystem used on the storage device, and how that data is tracked and accessed while on the storage device. Once you understand the process, you can then implement controls to protect the integrity of the digital evidence.
So, what is the boot process? Well, when you push the power button and electricity energizes the system, a series of commands is issued. As it executes the commands, the system is taking steps (just like on a ladder) to achieve the goal of a running operating system. If something breaks any of those steps, then the system will not load.
The first step is the Power-On Self-Test (POST); the CPU will access the Read-Only Memory (ROM) and the Basic Input/Output System (BIOS) and test essential motherboard functions. This is where you hear the beep sound when you turn the power on to the computer system. If there is an error, then the system will notify you of the error through the use of beep codes. If you do not have the motherboard manual, do a search to determine the meaning of the specific beep code.
Once the POST test has successfully completed, the BIOS is activated and executed. Note that the system has not accessed the storage media. All the program executions are taking place at the motherboard level and not in the storage devices. The user can access the BIOS by using the correct key combination as it is displayed on the screen.
Note
The time allowed for you to hit the correct key can sometimes be quite short. If you are unsuccessful, the system will continue booting and will access the storage device. If you are trying to access the suspect's computer system, disengage the storage devices if they are accessible before starting the process. This will ensure that you are not booting to the suspect's storage device and destroying evidence.
The BIOS will have the basic information of the system: the amount of RAM, the type of CPU, information about the attached drives, and the system date and time. The easiest way to document this information is to take a photograph of it as it is displayed on the screen. This is also where you can change the boot sequence. Typically, the system checks the CD/DVD first and then the designated hard drive. This is where you would be able to change the setting of the boot device when we create the boot media later on in the chapter. Changing the boot device tells the BIOS to access the device we are providing, and not the suspect's.
In 2010, the BIOS function was replaced by the Unified Extensible Firmware Interface (UEFI). It provides the same service as the BIOS, but has been enhanced, as follows:
The Secure Boot feature allows us to use authenticated operating systems when booting the computer system. This can be an issue if you are attempting to use an alternative booting device.
As you can see in the following diagram, once the power is turned on and it has completed the POST test, depending on the system, it may boot with the BIOS, or it may boot with the UEFI scheme:
The BIOS will look for the Master Boot Record (MBR) of the boot device. The MBR is located at sector zero and holds information about the partitions, filesystems, and the boot loader code for the installed operating system. Once the boot loader is found in the MBR and has been activated, control is then passed over to the operating system to complete the booting process.
The UEFI will look for the GPT; the GPT will have a protective MBR to ensure legacy systems will not mistakenly read this as being unpartitioned and overwrite the data. It will also contain the partition entries and backup partition table header. A GPT disk can contain up to 128 partitions for a Windows operating system. Just like in the BIOS scheme, once the active partition and boot loader have been found, the operating system will take over the booting process.
Since you now understand the boot process, we still want to control the boot environment with the creation of forensic boot media, which we will discuss next.
It is a widespread practice to remove the hard drive from the system to create a forensic image. However, sometimes, the storage device cannot be removed from the system, and you have to create a forensic image. To accomplish this task, you need to use a bootable CD/DVD or USB device to create a forensic environment in order to create a forensic image.
Using boot media, you will want to ensure that it will create that sound forensic environment and not cause any changes to the source device. As we discussed during the boot process, we want to intercept any potential changes to that source device, and we want to have the system boot inside an environment we control. While it is still possible to boot using a CD/DVD, it is becoming more common to find systems without an optical drive. Without an optical drive, we must use a boot USB device to create a sound forensic environment to access the storage device.
Linux is a standard operating system that has been used to create a USB-based (live) operating system to create the forensic environment needed to examine these devices. As discussed in Chapter 3, Acquisition of Evidence, Paladin is one such tool. It is freely available to download and to purchase if you wish to have it preinstalled on a USB device. Sumuri, the developer of Paladin, also provides some limited technical support in the operation of Paladin.
There is also a Windows-based bootable environment known as WinFE (Windows Forensic Environment). WinFE was developed by Troy Larson in 2008 and has spawned other tools such as Mini-WinFE, which was developed by Brett Shavers and Misty (http://reboot.pro/files/file/375-mini-winfe/). The benefit of using the Windows bootable environment is that you now have access to Windows-based forensic tools. It is possible to run X-Ways or FTK Imager from this secure environment. I would not recommend using a tool that is resource-heavy. What I mean by this is that some forensic suites such as EnCase Forensic or FTK require significant resources to run effectively. X-Ways can be run from a USB device, as can some artifact-specific tools such as RegRipper.
As with any tool or procedure, you must validate it to ensure you are getting the expected results. This means that before you go out into the field and boot a suspect's computer utilizing a forensic USB device, you must test it in the laboratory environment to ensure no changes are made. Some of the challenges that you, as the examiner, need to be concerned with when using a bootable USB device include the following:
As mentioned earlier, secure boot is a security feature of the UEFI process that allows trusted software to boot the system. If we want to use a bootable forensic operating system, the secure boot feature must be disabled.
You must enter the UEFI environment by pressing the catch key such as F2 or F12 (this will vary depending on the computer manufacturer). Once you have entered the setup utility, navigate to the Security menu (this might also vary depending on the computer manufacturer) and disable the secure boot option. Some Linux distributions and WinFE have received signed status and will boot a system that has secure boot enabled.
You must document your steps as you go through this process. If you miss hitting the catch key and start the boot process in the host operating system, then you must document that it occurred. Even starting a partial boot will change the timestamps and make entries in various logs in the operating system.
Now that you understand what a bootable forensic device is, let's go ahead and create one in the next section.
To create a bootable forensic device, you will need a USB device (I recommend using an 8-GB, or larger, device) and an ISO file for the operating system you wish to install. I will demonstrate using an ISO for Paladin and free software called Rufus (https://rufus.ie/). Rufus is a utility used to create bootable USB devices.
Once you download Rufus, execute the executable and the program will run:
Something similar to the preceding screenshot (Rufus) will appear, and you will have to select the appropriate choice from the drop-down menus:
Under Format Options, accept the default values and then click on the START button. Once the program completes, you will have a fully functioning, bootable forensic environment.
We have created a forensic boot environment; let's discuss the storage media you will encounter. We will now discuss hard drives.
The term "physical drive storage device" refers to the hard disk drive itself. That is, a physical device that contains platters or solid state storage that holds data. The term "logical device/volume/partition" refers to the formatting of the physical device. A physical device can contain one or more logical devices/volumes/partitions. It is a common misconception that the term "C drive" refers to the physical device, when, in actuality, it refers to a logical partition on the physical device.
Several components make up the interior of the hard drive (as shown in the following figure). If you were to open the case, you would find that the hard drive comprises one or more platters stacked together with a spindle in the center. The platters, which are made of a metal alloy or glass, are coated with a magnetic substance on which the heads magnetically encode information. The heads can write data on both sides of the platter. The spindle of the hard disk causes the platters to rotate at thousands of revolutions per minute; the faster the platters spin, the faster the data encoded on them can be accessed. To read or write data to the platters, the heads are positioned less than 0.1 microns from the surface of the platter. Additionally, the actuator controls the heads; it swings across the platter, placing the head in the correct position to read/write the data. The storage devices are manufactured with tight tolerances and can be damaged by sudden sharp movement or a mechanical shock:
A hard drive can have different interfaces, for example, you may run into some of the following:
Solid state drives (SSDs) are storage devices that contain no moving parts. Instead, they are made up of memory chips. As we discussed earlier, a traditional hard drive has several moving parts with which to read/write data to the spinning platters. With an SSD storage device, all of the data is stored in memory chips, allowing for the following:
For an SSD to function reliably, there are several operations controlled by the firmware of the device. We know these functions as follows:
The real-world effect on forensics is that we can no longer recover data that is, or was, in unallocated space. Since these operations are conducted at the firmware layer, as soon as we give power to the device, these operations start automatically. Currently, there is no way to stop the firmware from doing the functions mentioned previously.
The drive geometry of a platter drive details how data is stored on the device; the drive geometry defines the number of heads, the number of tracks, the cylinders, and the sectors per track. The manufacturer performs what it refers to as a low-level format, which creates the basic structure of the disk by defining the sectors and tracks. A track is a circular path on the surface of the platter, as indicated in the following diagram. The red circle (A) is a single track and each side of the platter will have its own set of tracks. They then subdivide the track into sectors. A sector (B) is the smallest storage unit on the device. Originally, a sector used to be 512 bytes in size; however, newer disks are being formatted with a sector size of 4,096 bytes:
The platters have an addressing scheme so that they can locate the data; originally, Cylinder, Head, Sector (CHS) was used. Here, Cylinder refers to the vertical stack of the same track on all the platters. Head refers to the read/write heads; each platter has two heads. And, in this case, Sector refers to the numbered sector within a track (CHS sector numbers start at one). This addressing scheme worked for small-capacity hard drives; however, as the storage capacity increased, the CHS scheme could not scale because of the limited number of sectors it could address, so Logical Block Addressing (LBA) was created. With the LBA scheme, you can address the sectors with a sector number starting from zero.
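The relationship between the two addressing schemes can be shown with the conventional translation formula, LBA = (C x heads per cylinder + H) x sectors per track + (S - 1). The following is a minimal Python sketch of that formula; the function names and the example geometry of 16 heads and 63 sectors per track are assumptions chosen for illustration and do not describe any particular drive.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Translate a CHS address into a zero-based LBA sector number."""
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("CHS sector numbers run from 1 to sectors_per_track")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head number is outside the drive geometry")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Translate a zero-based LBA sector number back into (C, H, S)."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1  # CHS sectors are one-based


# Hypothetical geometry: 16 heads, 63 sectors per track.
print(chs_to_lba(0, 0, 1, 16, 63))   # first sector on the disk
print(lba_to_chs(1008, 16, 63))      # first sector of cylinder 1
```

Note that the first CHS sector (0, 0, 1) maps to LBA 0, which is why sector zero of the disk is where we look for the MBR.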
So, we have discussed the physical components of the device. We will now dive deeper and examine some of the internal aspects.
Three steps are required before the computer system can use the storage device. We have discussed the low-level format conducted by the manufacturer, but now we will discuss partitioning.
Partitioning occurs when we divide the physical device into logical segments called "volumes." With the MBR partitioning scheme, we are restricted to four primary partitions. With one physical device, you can have a primary partition used to host the Windows operating system, and you can have a second primary partition that hosts a Linux operating system. Note that you must have a primary partition to boot into an operating system. The partition that holds the operating system selected to boot is known as the active partition.
To get around the partition limit, developers created the extended partition. One of the four partition records is designated as an extended partition, which can then be divided into logical volumes.
As we discussed previously, we can find the MBR at sector zero. The MBR contains the information needed by the system to boot. The MBR is contained entirely in sector zero, so it will be no longer than 512 bytes. The partition table will show us which partition is the active partition. Once the starting sector of the active partition is located, the boot process will continue:
The preceding MBR map depicts sector zero of a hard disk. This is the MBR for the physical disk. The first 440 bytes are highlighted; this is the boot code. The next 4 bytes are the disk signature and identify the disk to the operating system. After two further bytes that are normally zero, the 64 bytes starting at offset 446 comprise the partition table. Each 16-byte entry refers to a specific partition. Remember, the MBR partitioning scheme restricts us to 4 primary partitions. The final 2 bytes (x55 xAA) are the signature for the MBR. They identify the ending of the MBR and will be the last 2 bytes of the sector.
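The MBR layout can be carved into its parts with a few byte slices. The following Python sketch assumes a 512-byte sector already read from a forensic image (never from the live evidence drive); the function name `split_mbr` and the sample bytes are hypothetical. It places the partition table at offset 446, since two bytes sit between the 4-byte disk signature and the first partition entry.

```python
import struct

SECTOR_SIZE = 512


def split_mbr(sector):
    """Split sector zero into boot code, disk signature, partition entries, and signature."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly one 512-byte sector")
    return {
        "boot_code": sector[0:440],
        # The disk signature is a little-endian 32-bit value at offset 440.
        "disk_signature": struct.unpack_from("<I", sector, 440)[0],
        # Four 16-byte partition entries start at offset 446.
        "partition_entries": [sector[446 + 16 * i: 446 + 16 * (i + 1)] for i in range(4)],
        # The last two bytes of a valid MBR are 55 AA.
        "signature_ok": sector[510:512] == b"\x55\xaa",
    }


# Build a synthetic sector for demonstration purposes only.
sample = bytearray(SECTOR_SIZE)
sample[440:444] = bytes.fromhex("78563412")
sample[446] = 0x80
sample[510:512] = b"\x55\xaa"
mbr = split_mbr(bytes(sample))
print(hex(mbr["disk_signature"]), mbr["signature_ok"])
```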
In the following table, I have extracted the four partition table entries and reformatted the hex values for easier reading. The first byte designates which partition is the active partition. A value of x80 identifies the active (bootable) partition. A value of x00 shows a non-active (non-bootable) partition:
Typically, you would see the first partition marked as the active partition; in this case, it is the second partition that is bootable. The next three bytes represent the starting sector for the CHS calculation. When we examine the partition table, we can see that the physical device has a partition 0 and a partition 1, with the entries for partitions 2 and 3 zeroed out. This tells us that there are only two partitions on this physical device.
The fifth byte represents the filesystem on the partition. For partition 0, we can see the hex value of DE, which tells us that it is part of the Dell PowerEdge Server utilities. Partition 1 has a hex value of 07, which shows the NTFS filesystem.
If I found the hexadecimal values of x05 or x0F, then that would show an extended partition. We would then have to look into the extended boot records of the extended partitions.
Note
You can find a full list of partition identifiers at https://www.win.tue.nl/~aeb/partitions/partition_types-1.html.
The next three bytes are the values for the ending sector of the CHS calculation. The next four bytes show the starting sector of the partition, and the last four bytes show the size of the partition.
The sector values used in the CHS calculation are legacy values for older storage devices. The values showing the start sector and the total number of sectors (partition size) are being used for the current drives using LBA.
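The 16-byte partition entry described above can be decoded in one `struct.unpack` call. This is a minimal sketch, not a replacement for a validated forensic tool; the `PartitionEntry` type, the function name, and the sample entry bytes are hypothetical and were made up for illustration.

```python
import struct
from typing import NamedTuple


class PartitionEntry(NamedTuple):
    active: bool        # byte 0: x80 = active, x00 = not active
    chs_start: bytes    # bytes 1-3: legacy CHS starting sector
    type_id: int        # byte 4: partition type (for example, x07 = NTFS)
    chs_end: bytes      # bytes 5-7: legacy CHS ending sector
    lba_start: int      # bytes 8-11: starting sector (LBA), little endian
    sector_count: int   # bytes 12-15: partition size in sectors, little endian


def parse_partition_entry(entry):
    """Decode one 16-byte MBR partition table entry."""
    if len(entry) != 16:
        raise ValueError("a partition entry is exactly 16 bytes")
    status, chs_start, type_id, chs_end, lba_start, count = struct.unpack("<B3sB3sII", entry)
    return PartitionEntry(status == 0x80, chs_start, type_id, chs_end, lba_start, count)


# Hypothetical entry: active, type x07, starting at sector 2048.
sample_entry = bytes.fromhex("80 20 21 00 07 fe ff ff 00 08 00 00 00 00 40 06")
parsed = parse_partition_entry(sample_entry)
print(parsed.active, hex(parsed.type_id), parsed.lba_start, parsed.sector_count)
```

An entry of sixteen zero bytes decodes to a type of x00 and a size of zero sectors, which is how unused slots in the table appear.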
Each partition will have a Volume Boot Record (VBR) at sector zero of the partition. The system uses the VBR to boot the operating system in that volume. It is an operating system-specific artifact and is created when the partition is formatted. It will also appear on unpartitioned devices, such as removable media, for example, a USB flash drive or floppy disk.
Primary partitions are not the only partitions that you may encounter; you can also encounter an extended partition, which is the subject of the next section.
The MBR's limit of four primary partitions resulted in the creation of the extended partition. It takes the place of one (and only one) primary partition and enables the user to create additional logical partitions beyond the four-partition limit.
The following partition map illustrates the replacement of a primary partition with an extended partition:
The following diagram shows the extended partition. Here, the user has created multiple logical partitions within the extended partition boundary:
The extended partition will not have a VBR. It will have an extended boot record (EBR), which will point to the first extended logical partition. The first extended logical partition will contain information about itself and a pointer to the next extended logical partition. In effect, this will create a daisy chain of pointers from one extended logical partition to the next.
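The daisy chain of pointers can be followed programmatically. The sketch below assumes the conventional EBR layout: each EBR uses the same partition table offsets as the MBR, its first entry describes the logical partition (with a start relative to that EBR), and its second entry points to the next EBR (with a start relative to the beginning of the extended partition). The function name and the `read_sector` callback are hypothetical; `read_sector` is expected to return 512 bytes for a given sector number of a forensic image.

```python
import struct


def walk_ebr_chain(read_sector, extended_start):
    """Yield (absolute_start_lba, sector_count, type_id) for each logical partition."""
    ebr_lba = extended_start
    visited = set()
    while ebr_lba not in visited:          # guard against a looping chain
        visited.add(ebr_lba)
        ebr = read_sector(ebr_lba)
        if ebr[510:512] != b"\x55\xaa":    # not a valid boot record
            break
        # First entry: the logical partition, relative to this EBR.
        type_id = ebr[446 + 4]
        rel_start, count = struct.unpack_from("<II", ebr, 446 + 8)
        if type_id != 0:
            yield ebr_lba + rel_start, count, type_id
        # Second entry: pointer to the next EBR, relative to the extended partition.
        next_type = ebr[462 + 4]
        next_rel = struct.unpack_from("<I", ebr, 462 + 8)[0]
        if next_type == 0 or next_rel == 0:
            break                          # end of the chain
        ebr_lba = extended_start + next_rel
```

With a raw image file, `read_sector` could be a small function that seeks to `lba * 512` and reads 512 bytes; that helper is left out here because it depends on how the image is stored.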
We have now covered the aspects relating to the MBR; let's now go over the GPT formatted aspects.
A GUID is a globally unique identifier and uses a 128-bit hexadecimal value to identify different aspects of the computer system uniquely. A GUID comprises five groups and is formatted as 00112233-4455-6677-8899-aabbccddeeff, and, while there is no central authority to ensure uniqueness, it is doubtful that you would get a repeating GUID.
RFC 4122 defines five different GUID versions as follows:
The GPT is a partitioning scheme that is used for newer storage devices and is part of the new UEFI standard. The UEFI standard replaces the BIOS, while the GPT replaces the MBR partitioning scheme.
The GPT partitioning scheme uses LBA and a protective MBR that can be found in physical sector zero. The protective MBR allows for some backward compatibility and helps to remove any issues when dealing with legacy utilities that do not recognize the GPT partitioning scheme. There is no boot code available in the protective MBR. As you can see in the following diagram, this is the first partition entry of the partition table of the protective MBR. The partition is identified by hex value EE, which shows it is a GPT-partitioned disk, as shown in the following GPT hex:
While the MBR contains the partition table within physical sector 0, GPT houses the partition table header at physical sector 1. The GPT header can be identified by the EFI signature of hexadecimal values 45 46 49 20 50 41 52 54, as shown in the following diagram:
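The two GPT markers described so far, the xEE partition type in the protective MBR and the EFI signature at the start of physical sector 1, are easy to test for. The following sketch assumes 512-byte sectors and an in-memory copy of the first two sectors of a forensic image; the function names are hypothetical.

```python
# 45 46 49 20 50 41 52 54 is the ASCII text "EFI PART".
GPT_SIGNATURE = bytes.fromhex("4546492050415254")


def has_protective_mbr(image):
    """Check the type byte of the first partition entry in sector zero for xEE."""
    return len(image) >= 512 and image[446 + 4] == 0xEE


def has_gpt_header(image, sector_size=512):
    """Check for the EFI signature at the start of physical sector 1."""
    return image[sector_size:sector_size + 8] == GPT_SIGNATURE


# Synthetic two-sector image for demonstration purposes only.
sample_image = bytearray(1024)
sample_image[450] = 0xEE
sample_image[512:520] = b"EFI PART"
print(has_protective_mbr(sample_image), has_gpt_header(sample_image))
```

Drives formatted with 4,096-byte sectors would need a different `sector_size` value, which is why it is a parameter rather than a constant.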
The following table shows the layout of the GPT header, which you can use to identify the layout of the disk:
The GPT partition entries are typically found in physical sector 2. The following diagram shows the GPT partition table entries:
Each partition entry is 128 bytes and provides information about the partitions. The following table shows the contents of the partition entries, which include the partition type GUID, the GUID that is unique to that specific partition, the starting and ending sectors, and the partition name in Unicode:
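A 128-byte entry can be decoded with the standard `uuid` and `struct` modules. The sketch below assumes the usual field order of the UEFI specification: partition type GUID (16 bytes), unique partition GUID (16 bytes), starting LBA (8 bytes), ending LBA (8 bytes), attribute flags (8 bytes), and a 72-byte name in UTF-16LE. The function name and the sample entry are hypothetical; the sample reuses the example GUID shown earlier as the unique GUID.

```python
import struct
import uuid


def parse_gpt_entry(entry):
    """Decode one 128-byte GPT partition entry."""
    if len(entry) != 128:
        raise ValueError("this sketch expects a 128-byte entry")
    first_lba, last_lba, attributes = struct.unpack_from("<QQQ", entry, 32)
    return {
        # GUIDs are stored in mixed-endian form, which bytes_le handles.
        "type_guid": uuid.UUID(bytes_le=entry[0:16]),
        "unique_guid": uuid.UUID(bytes_le=entry[16:32]),
        "first_lba": first_lba,
        "last_lba": last_lba,
        "attributes": attributes,
        # The name is null-padded UTF-16LE; keep the text before the first null.
        "name": entry[56:128].decode("utf-16-le").split("\x00", 1)[0],
    }


# Build a synthetic entry for demonstration purposes only.
sample_entry = (
    uuid.UUID("ebd0a0a2-b9e5-4433-87c0-68b6b72699c7").bytes_le   # Microsoft basic data
    + uuid.UUID("00112233-4455-6677-8899-aabbccddeeff").bytes_le
    + struct.pack("<QQQ", 2048, 4095, 0)
    + "Basic data partition".encode("utf-16-le").ljust(72, b"\x00")
)
gpt_entry = parse_gpt_entry(sample_entry)
print(gpt_entry["type_guid"], gpt_entry["name"])
```

Because the entry stores both the starting and ending sectors, the partition size in sectors is `last_lba - first_lba + 1`.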
A partition should hold all of the data on the disk within the partition's boundaries; however, there are spaces on the disk outside of the normal partition boundaries where a technical user may hide data. We will discuss those areas next.
The Host Protected Area (HPA) and the Device Configuration Overlay (DCO) are hidden areas on the hard drive created by the manufacturers. The HPA is used by the manufacturer to store recovery and diagnostics tools and is not normally accessible to the user. The DCO is an overlay that allows the manufacturer to use standard parts to build different products. It allows the creation of a standard set of sectors on a component to achieve uniformity. For example, the manufacturer might use one set of parts to create a 500-GB hard drive, and while using the same components, can also create a 600-GB hard drive. Once again, the user would not usually have access to this location. However, freely available utilities exist that a user could use to access these locations and store data there.
The following screenshot shows you how an HPA may appear in X-Ways:
The following screenshot shows you how an HPA may appear in FTK Imager:
Let's move on and discuss some potential filesystems that you may encounter.
A hard drive can have multiple partitions on it, and, in each partition, there will be (in most cases) a filesystem. There might be hundreds of thousands to millions of files contained within a partition. The filesystem tracks where every file is and how much space is available within the partition boundaries.
We discussed sectors earlier in the Hard drives section, and they are the smallest units that are available to store data. The filesystem stores data based on clusters. Clusters are one or more sectors. A cluster is the smallest allocation unit the filesystem can write to. Now, there are many filesystems available, and some are restricted to specific operating systems unless the user enables drivers that will allow the operating system to read the filesystem.
We will now look at some of the common filesystems you may encounter.
The File Allocation Table (FAT) filesystem has been around since the early days of home computing, and it is one of the few filesystems that nearly all operating systems can read. It is the de facto standard filesystem for removable devices.
As time has gone by, the FAT filesystem has gone through numerous changes:
We will discuss the FAT32 filesystem for the remainder of this section on the FAT filesystem.
The FAT filesystem is laid out in two areas (as shown in the following diagram, Figure 4.16 – FAT areas):
Next, we will discuss what falls under System Area.
In the system area, we have the Volume Boot Record (VBR). We can find it in logical sector 0 (LS 0), which is the first sector within the partition boundaries. The formatting process creates the VBR when the partition is formatted; the VBR contains information about the volume and the boot code to continue the boot process for the operating system. If it is a primary partition, the VBR will consist of several sectors, typically, sectors 0, 1, and 2 with a backup in sectors 6, 7, and 8. The VBR and backups are stored in a "reserved area," which is typically 32 sectors before the first file allocation table begins:
In the preceding diagram, we can see a volume boot sector, which helps to decipher the following information:
Next, we will take a look at the file allocation table.
The next component of the FAT filesystem is the file allocation table, which immediately follows the VBR. By default, there are two file allocation tables (FAT1 and FAT2). FAT2 is a duplicate of FAT1.
The purpose of the file allocation table is to track the clusters and to track which files occupy which clusters. Each cluster is represented within the file allocation table starting with cluster 0. The file allocation table uses 4 bytes (32 bits) per cluster entry. The file allocation table will use the following entries to represent the cluster's current status:
A cluster is the smallest allocation unit the filesystem can address. A sector is the smallest storage unit on the disk. A cluster is made up of one or more sectors. It is very easy to get confused if you commingle those terms. Consider the following cluster example:
As users add files to the data area, the system will update the file allocation table. A file may occupy one or more clusters. Additionally, the clusters may not be sequential, so you could have the data of a file spread in different physical locations on the disk; we typically refer to this as fragmentation.
In the following diagram, we can see a representation of the file allocation table; in this scenario, we have a single file occupying three clusters: Cluster 4, Cluster 5, and Cluster 6. You can see that Cluster 4 is pointing to Cluster 5 and Cluster 5 is pointing to Cluster 6. Cluster 6 has the hexadecimal value for end of file (EOF):
In the following diagram, we can see a similar representation of the file allocation table with some changes. We now have two files, with file number 1 occupying clusters 4 and 6. We can see that Cluster 4 is pointing to the next cluster containing the file data, which is Cluster 6. This is an example of file fragmentation. File number 2 is wholly contained within the cluster boundaries of Cluster 5. Cluster 5 will not point to a subsequent cluster; instead, it has the EOF hexadecimal value:
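The two scenarios above can be reproduced by walking the pointers in the table. The following sketch assumes a FAT32 table held in memory as raw bytes, with 4 bytes per cluster entry, and treats values from x0FFFFFF8 through x0FFFFFFF as end of file markers; the function name and the sample tables are hypothetical.

```python
import struct

EOC_MIN = 0x0FFFFFF8  # FAT32 end-of-chain markers are x0FFFFFF8 through x0FFFFFFF


def cluster_chain(fat, start_cluster):
    """Follow the FAT32 pointers from start_cluster and return the clusters in order."""
    chain = []
    cluster = start_cluster
    while True:
        if cluster in chain:
            raise ValueError("the cluster chain loops back on itself")
        chain.append(cluster)
        (entry,) = struct.unpack_from("<I", fat, cluster * 4)
        next_cluster = entry & 0x0FFFFFFF   # the upper four bits are reserved
        if next_cluster >= EOC_MIN:
            return chain
        if next_cluster < 2:
            raise ValueError("the chain runs into a free or reserved entry")
        cluster = next_cluster


EOF = 0x0FFFFFFF
# One file in clusters 4, 5, and 6.
contiguous = struct.pack("<8I", 0x0FFFFFF8, EOF, EOF, 0, 5, 6, EOF, 0)
# File 1 in clusters 4 and 6 (fragmented), file 2 in cluster 5.
fragmented = struct.pack("<8I", 0x0FFFFFF8, EOF, EOF, 0, 6, EOF, EOF, 0)
print(cluster_chain(contiguous, 4))
print(cluster_chain(fragmented, 4), cluster_chain(fragmented, 5))
```

The error raised on a zeroed entry matters for deleted files: once the table entries have been reset, the chain can no longer be followed and must be reconstructed from the starting cluster and file size.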
We have covered the system area of the FAT; we will now discuss the data area of the FAT filesystem.
The root directory is housed in the data area because, when it was stored in the system area, it was unable to grow enough to work with larger capacity devices. The critical component of the root directory is the directory entry. If there is a file, directory, or subdirectory, then there will be a corresponding directory entry.
Each directory entry is 32 bytes in length and helps to track the name of the file, starting cluster, and file size in bytes.
In the following diagram, we can see a FAT32 directory with multiple file entries. The filesystem will stop looking for file entries when it runs into a hexadecimal 00, and all values following the hexadecimal 00 will be ignored:
In the following FAT directory map, we can see the layout of the directory entry and a short filename (SFN) directory entry with the specific offsets highlighted:
If the first byte is xE5, then the filesystem will consider that entry as deleted. The remaining bytes of the file or directory name will remain, as will the other metadata.
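A 32-byte short filename entry can be decoded with the following sketch. It assumes the standard FAT32 offsets: the name in bytes 0 to 7, the extension in bytes 8 to 10, the attribute byte at x0B, the high and low words of the starting cluster at x14 and x1A, and the file size at x1C. The function name and the sample entry are hypothetical.

```python
import struct


def parse_sfn_entry(entry):
    """Decode a 32-byte FAT32 short filename directory entry."""
    if len(entry) != 32:
        raise ValueError("a directory entry is exactly 32 bytes")
    first_byte = entry[0]
    deleted = first_byte == 0xE5
    raw_name = entry[0:8]
    if deleted:
        raw_name = b"?" + raw_name[1:]   # the first character was overwritten
    (cluster_high,) = struct.unpack_from("<H", entry, 0x14)
    (cluster_low,) = struct.unpack_from("<H", entry, 0x1A)
    (size,) = struct.unpack_from("<I", entry, 0x1C)
    return {
        "deleted": deleted,
        "end_of_directory": first_byte == 0x00,
        "name": raw_name.decode("ascii", errors="replace").rstrip(),
        "extension": entry[8:11].decode("ascii", errors="replace").rstrip(),
        "attributes": entry[0x0B],
        "start_cluster": (cluster_high << 16) | cluster_low,
        "size": size,
    }


# Synthetic deleted entry: cluster 8, 39 bytes, archive attribute.
sample_entry = (
    b"\xe5ILE1   TXT" + bytes([0x20]) + bytes(8)
    + struct.pack("<H", 0) + bytes(4) + struct.pack("<H", 8) + struct.pack("<I", 39)
)
info = parse_sfn_entry(sample_entry)
print(info["deleted"], info["name"], info["extension"], info["start_cluster"], info["size"])
```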
The short filename must conform to the specifications as follows:
The directory entry will always be stored in uppercase. The attribute byte (offset x0B) is considered a packed byte, which means the different values have different meanings.
The following diagram shows that bit values in the Attribute flag can be combined, and the resulting hex value will reflect the combinations. If a file had the READ ONLY flag and the HIDDEN flag, then that would give us a value of 0000 0011, and, when converted to hexadecimal, we get the value of x03:
When we look at the example at the bottom of the preceding FAT directory map, we find the hexadecimal value of 20 at the offset x0B; when we convert the hexadecimal into binary, we get 0010 0000. This tells us that the file is an archive.
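Unpacking the attribute byte is a matter of testing each bit. The sketch below uses the standard FAT attribute bit assignments; the dictionary and function names are hypothetical.

```python
# Standard FAT attribute bits, from least to most significant.
ATTRIBUTE_FLAGS = {
    0x01: "READ ONLY",
    0x02: "HIDDEN",
    0x04: "SYSTEM",
    0x08: "VOLUME LABEL",
    0x10: "DIRECTORY",
    0x20: "ARCHIVE",
}


def decode_attributes(value):
    """Return the names of every attribute flag set in the packed byte."""
    return [name for bit, name in ATTRIBUTE_FLAGS.items() if value & bit]


print(decode_attributes(0x03))  # READ ONLY and HIDDEN combined
print(decode_attributes(0x20))  # the archive example above
```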
We can also encounter a Long Filename (LFN); the technique for handling the LFN is a little bit more complicated. We will discuss the LFN in the next section.
When a user creates an LFN, the system will generate an alias that conforms to the SFN standard. It will format the alias so that the first three characters after the file extension dot will become the extension. The first six characters will be converted to uppercase and will be used for the alias. The alias will then add a ~ character with a following number. It will start with the number 1 and increase incrementally if there are additional files with the same alias name.
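The alias rule just described can be approximated in a few lines. This is a deliberately simplified sketch that covers only the steps named above (strip spaces, take six characters, uppercase, append the tilde and number, keep three extension characters); the operating system applies further rules that are not modeled here, and the function name is hypothetical.

```python
def sfn_alias(long_name, index=1):
    """Build a simplified short filename alias for a long filename."""
    stem, dot, extension = long_name.rpartition(".")
    if not dot:                      # no extension present
        stem, extension = long_name, ""

    def clean(text):
        # Spaces and extra dots are dropped before the name is shortened.
        return "".join(ch for ch in text if ch not in " .").upper()

    alias = f"{clean(stem)[:6]}~{index}"
    extension = clean(extension)[:3]
    return f"{alias}.{extension}" if extension else alias


print(sfn_alias("long filename.txt"))
print(sfn_alias("long filename.txt", index=2))
```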
The following diagram shows a directory entry for a file with an LFN; the filename is long filename.txt:
Since this is an LFN, the filesystem will create additional directory entries. In this specific case, there will be two additional directory entries to facilitate the use of the LFN. The first byte of each additional directory entry is the sequence byte. The right nibble is the sequence number. As we look at the directory entry depicted in the preceding diagram, the directory entry above the SFN entry has a hexadecimal value of x01. Here, the value of 1 tells us that this is the first value in the sequence. When we move up to the second directory entry, we can see that it has a hexadecimal value of x42; the right nibble informs us this is the second directory entry for this LFN file. The left nibble of the value, 4, tells us this is the last directory entry for the file. In each of the LFN directory entries, you will find that the attribute byte is x0F.
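The sequence byte can be split with two bit masks. In this sketch, the low five bits are taken as the sequence number and bit x40 as the "last entry" flag, which matches the nibble reading above for short chains; the function name is hypothetical.

```python
def decode_lfn_sequence(sequence_byte):
    """Return (sequence_number, is_last_entry) for an LFN sequence byte."""
    sequence_number = sequence_byte & 0x1F   # position of this entry in the chain
    is_last_entry = bool(sequence_byte & 0x40)
    return sequence_number, is_last_entry


print(decode_lfn_sequence(0x01))  # first entry, more follow
print(decode_lfn_sequence(0x42))  # second entry, last in the chain
```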
But what happens when a file is deleted? Well, you may be able to recover the file and its associated metadata. In the next section, we will discuss recovering deleted files.
When a file is deleted in the FAT filesystem, the data itself does not get changed. The first character of the directory entry will change to xE5 and the file allocation table entries are reset to x00. When the filesystem reads the directory entries and encounters xE5, it will skip that entry and start reading from the subsequent entries.
To recover deleted files, we need to reverse the process that the filesystem used to delete the files. Remember, it has not changed the file contents, and they still physically reside in their assigned clusters. We now need to reverse engineer the deletion and recreate the file entry and the entries in the file allocation table. To do this, we need to find the first cluster of the file, the size of the file, and the size of the clusters in the volume.
In the following diagram, we have a directory entry showing us that a file has been deleted. We can see xE5 at the start of the directory entry. (Note that this will require the use of a hex editor to make the changes.)
Then, we have to determine the starting cluster, which is x00 x08 (but is shown as x08 x00 in the diagram). This value is referring to cluster number 8. To determine the file size, take a look at the last four bytes, x27 x00 x00 x00 (remember that the FAT filesystem stores data in little endian, which means the least significant byte is on the left, so we would read that value as x00 x00 x00 x27, and when we convert it into a decimal, we have a value of 39 bytes for the file size):
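The byte reordering can be checked with `int.from_bytes`, which takes the stored bytes in disk order and applies the little endian rule for us. The variable names below are hypothetical; the byte values are the ones read from the directory entry above.

```python
# Bytes exactly as they appear on disk, left to right.
start_cluster = int.from_bytes(bytes([0x08, 0x00]), "little")
file_size = int.from_bytes(bytes([0x27, 0x00, 0x00, 0x00]), "little")

print(start_cluster)  # cluster number
print(file_size)      # file size in bytes
```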
Now we have to determine how many sectors make up a cluster and what the sector size is. You will need to go to the boot record to get that information. The boot record shows us that there are 512 bytes per sector, and there are 8 sectors per cluster, which gives us a cluster size of 4,096 bytes (as shown in the following diagram):
This means that our file will only occupy a single cluster. We then go to the file allocation table and look at the entry for cluster 8 and see that it is zeroed out:
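The arithmetic behind that conclusion can be sketched as follows. The code assumes the standard FAT boot record offsets of x0B for bytes per sector (2 bytes) and x0D for sectors per cluster (1 byte); the function names and the synthetic boot record are hypothetical.

```python
import struct


def cluster_geometry(vbr):
    """Return (bytes_per_sector, sectors_per_cluster, cluster_size) from a FAT boot record."""
    (bytes_per_sector,) = struct.unpack_from("<H", vbr, 0x0B)
    sectors_per_cluster = vbr[0x0D]
    return bytes_per_sector, sectors_per_cluster, bytes_per_sector * sectors_per_cluster


def clusters_needed(file_size, cluster_size):
    """Number of whole clusters a file of file_size bytes occupies."""
    return -(-file_size // cluster_size)   # ceiling division


# Synthetic boot record with the values read above.
sample_vbr = bytearray(512)
sample_vbr[0x0B:0x0D] = struct.pack("<H", 512)
sample_vbr[0x0D] = 8
_, _, cluster_size = cluster_geometry(bytes(sample_vbr))
print(cluster_size, clusters_needed(39, cluster_size))
```

For a file larger than one cluster, the same calculation tells you how many table entries have to be rebuilt during recovery.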
To recover the deleted file, perform the following steps:
When recovering a file with an LFN, it is important to relink the LFN to the SFN. This is because when the additional directory entries are created to accommodate the LFN, the system creates a checksum based on the data of the SFN. When you change the xE5 value on the SFN entry, you also want to use the same replacement character for the subsequent xE5 entries for the LFN directory entries. The reason you link the LFN to the SFN is that the SFN directory entry contains information such as the date and time, the starting cluster, and the file size.
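To illustrate how the checksum ties the LFN entries to the SFN, here is a minimal sketch of the checksum routine as it is commonly described in Microsoft's FAT specification: each of the 11 bytes of the short name is added to a running 8-bit sum that is rotated right by one bit before each addition. The function name and the sample filename are hypothetical.

```python
def lfn_checksum(short_name: bytes) -> int:
    """Compute the 8-bit checksum of an 11-byte SFN (8-character name plus 3-character extension)."""
    if len(short_name) != 11:
        raise ValueError("the SFN field is exactly 11 bytes")
    checksum = 0
    for byte in short_name:
        # Rotate the running sum right by one bit, then add the next name byte.
        checksum = (((checksum & 1) << 7) + (checksum >> 1) + byte) & 0xFF
    return checksum

# Hypothetical short name, before and after deletion.
original = lfn_checksum(b"DELETED TXT")
deleted = lfn_checksum(b"\xE5ELETED TXT")
print(hex(original), hex(deleted), original != deleted)
```

Because the first byte of the SFN feeds into the checksum, the value computed over the deleted entry differs from the one computed over the original name. Comparing the stored checksum against candidate first characters is one way to check which SFN a set of LFN entries belongs to.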
It is still possible to recover scraps of data that previously existed on the disk but no longer have any artifacts in the filesystem. This information may be found in slack space, which is discussed in the next section.
Now is the time to bring up slack space. Remember that the smallest unit the filesystem can write to is a cluster and that clusters are made up of one or more sectors. I keep repeating this because I have seen people who are new to the field confuse the two. The distinction matters because files come in a variety of sizes, and almost no file will fit exactly within the cluster boundaries. So, you will have files that spill over into the next cluster without filling it. The space between the end of the logical file and the cluster boundary is called "file slack." This slack space can contain data from the previous file. Until it is overwritten, that data will remain for you to examine.
You might find evidence of document files, digital images, chat history, or emails. For any data that has been stored on the device, you may find remnants in slack space after the user has deleted the file.
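As a rough illustration of the definition above, the following sketch computes the amount of file slack for a given file size and carves the slack bytes out of the last cluster of a file. The function names and the simulated cluster are hypothetical, and the example assumes that the file's final cluster has already been read into memory.

```python
def file_slack(file_size: int, cluster_size: int) -> int:
    """Bytes between the end of the logical file and the end of its last cluster."""
    return (-file_size) % cluster_size

def extract_slack(last_cluster: bytes, file_size: int) -> bytes:
    """Return the slack bytes from the final cluster allocated to a file."""
    slack = file_slack(file_size, len(last_cluster))
    # The slack occupies the tail of the cluster, after the last byte of file data.
    return last_cluster[len(last_cluster) - slack:] if slack else b""

# Simulate a 4,096-byte cluster that held an older file and was then reused
# by a new 39-byte file, which only overwrites the first 39 bytes.
cluster = bytearray(b"OLD!" * 1024)
cluster[:39] = b"N" * 39

slack = extract_slack(bytes(cluster), 39)
print(file_slack(39, 4096), len(slack), slack[:8])
```

For the 39-byte file from the earlier FAT example, 4,057 bytes of the cluster lie beyond the end of the file.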
This concludes the FAT filesystems section; next up is NTFS.
The New Technology File System (NTFS) is the default filesystem for Microsoft Windows operating systems. FAT32 had some significant shortcomings, which required a filesystem that was more reliable and efficient, along with additional administrative improvements to help Microsoft remain viable in the corporate environment. Microsoft initially designed NTFS for the server environment; however, as hard drive capacities have increased, it has become the default filesystem in the commercial and consumer market for the Windows operating system.
NTFS is far more complicated than the FAT filesystem; however, the overall purpose remains the same:
The NTFS filesystem comprises the following system files:
To identify a partition with NTFS, we need to look at the MBR or the GPT, depending on which formatting scheme was used. In the following diagram, we can see the MBR for the hard drive and the partition table highlighted after the boot code:
Looking at the partition table, we can see that there is a single partition, and, at offset decimal 4 from the start of the partition entry, we can see the hexadecimal value of 07. As we discussed earlier in this chapter, this is the filesystem identification for NTFS.
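The lookup described above can be sketched in Python. This is a minimal example, assuming the standard MBR layout: the partition table starts at byte offset 446, holds four 16-byte entries, and each entry stores its partition type at offset 4 within the entry. The function name and the synthetic sector are hypothetical.

```python
import struct

PARTITION_TABLE_OFFSET = 446   # the partition table follows the boot code
ENTRY_SIZE = 16

def parse_mbr_partitions(sector: bytes) -> list:
    """Return the non-empty partition entries from a 512-byte MBR sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xAA":
        raise ValueError("not a valid MBR sector (missing 55 AA signature)")
    partitions = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        # Boot flag, CHS start, partition type, CHS end, starting LBA, sector count.
        boot_flag, _chs_first, part_type, _chs_last, lba_start, sector_count = \
            struct.unpack_from("<B3sB3sII", sector, start)
        if part_type != 0x00:  # type 00 marks an unused entry
            partitions.append({
                "index": index,
                "bootable": boot_flag == 0x80,
                "type": part_type,
                "lba_start": lba_start,
                "sector_count": sector_count,
            })
    return partitions

# Synthetic MBR with a single partition of type 07.
mbr = bytearray(512)
struct.pack_into("<B3sB3sII", mbr, 446, 0x80, b"\x20\x21\x00", 0x07,
                 b"\xFE\xFF\xFF", 2048, 1000000)
mbr[510:512] = b"\x55\xAA"

partitions = parse_mbr_partitions(bytes(mbr))
print(partitions)
```

Note that type 07 is shared with other filesystems (exFAT, for example), so the identification should be confirmed by examining the VBR of the partition.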
With an NTFS-formatted partition, there is no system or data area like we saw with a FAT-formatted partition. Everything in NTFS is considered a file, including the system data. When we look at the VBR, we can see that it contains information for the system to continue the boot process:
In NTFS, the VBR is itself part of a file: the $Boot file contains all of the information that we would expect to find in the VBR. The following $Boot diagram shows the data structure for the $Boot file:
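As a companion to the $Boot data structure, the following sketch decodes a few of its fields. It assumes the commonly documented layout: the OEM ID "NTFS" (padded with spaces) at offset 0x03, bytes per sector at 0x0B, sectors per cluster at 0x0D, total sectors at 0x28, the starting cluster of the $MFT at 0x30, the starting cluster of the $MFTMirr at 0x38, and the file record size indicator at 0x40. The function name and the sample values are hypothetical.

```python
import struct

def parse_ntfs_boot(sector: bytes) -> dict:
    """Decode selected fields from the first sector of the NTFS $Boot file."""
    if sector[3:11] != b"NTFS".ljust(8):
        raise ValueError("OEM ID is not NTFS")
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 0x0B)
    total_sectors, mft_cluster, mftmirr_cluster = struct.unpack_from("<QQQ", sector, 0x28)
    (raw_record_size,) = struct.unpack_from("<b", sector, 0x40)
    cluster_size = bytes_per_sector * sectors_per_cluster
    # A negative value n means the record is 2**|n| bytes; a positive value is a cluster count.
    if raw_record_size < 0:
        record_size = 1 << -raw_record_size
    else:
        record_size = raw_record_size * cluster_size
    return {
        "cluster_size": cluster_size,
        "total_sectors": total_sectors,
        "mft_byte_offset": mft_cluster * cluster_size,
        "mftmirr_byte_offset": mftmirr_cluster * cluster_size,
        "file_record_size": record_size,
    }

# Synthetic boot sector with hypothetical values.
boot = bytearray(512)
boot[3:11] = b"NTFS".ljust(8)
struct.pack_into("<HB", boot, 0x0B, 512, 8)
struct.pack_into("<QQQ", boot, 0x28, 1000000, 786432, 2)
struct.pack_into("<b", boot, 0x40, -10)   # stored as xF6, meaning 2**10 bytes

volume = parse_ntfs_boot(bytes(boot))
print(volume)
```

The byte offset of the $MFT is relative to the start of the partition, so the partition's starting sector has to be added when working with a full disk image.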
Arguably, the most essential system file in the NTFS filesystem is the $MFT (master file table). The MFT tracks all of the files in the volume, including itself. It tracks each file through entries called file records. Each file record is uniquely numbered and is 1,024 bytes in size. Each file record starts with a header containing the ASCII text "FILE", and its attributes end with a marker of hexadecimal FF FF FF FF. As files are added to the volume, new file records are created. If a file has been deleted, its file record is flagged as no longer in use, which makes it available for reuse. The MFT will look for an available file record and use it before creating a new record. It is possible for the file record to be reused rather quickly, which would overwrite the previous data in the file record.
As shown in the following NTFS file record example, we can see a file record and file header starting with the ASCII values of FILE. If the record were corrupted or had an error, you would see the ASCII value of BAAD. The file header is 56 bytes:
In the following NTFS file record map, we can see the data structure of a file record header:
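A few of the header fields can be decoded with a short sketch. It assumes the commonly documented header layout: the signature at offset 0x00, the offset to the first attribute at 0x14, the flags at 0x16 (bit 0x01 for a record in use, bit 0x02 for a directory), and the used and allocated record sizes at 0x18 and 0x1C. The function name and the sample record are hypothetical.

```python
import struct

def parse_file_record_header(record: bytes) -> dict:
    """Decode selected fields from the header of an MFT file record."""
    signature = record[0:4]
    if signature not in (b"FILE", b"BAAD"):
        raise ValueError("not an MFT file record")
    first_attr_offset, flags = struct.unpack_from("<HH", record, 0x14)
    used_size, allocated_size = struct.unpack_from("<II", record, 0x18)
    return {
        "signature": signature.decode("ascii"),
        "first_attribute_offset": first_attr_offset,
        "in_use": bool(flags & 0x0001),
        "is_directory": bool(flags & 0x0002),
        "used_size": used_size,
        "allocated_size": allocated_size,
    }

# Synthetic 1,024-byte record with a 56-byte header (first attribute at x38).
record = bytearray(1024)
record[0:4] = b"FILE"
struct.pack_into("<HH", record, 0x14, 0x38, 0x0001)
struct.pack_into("<II", record, 0x18, 0x160, 1024)

header = parse_file_record_header(bytes(record))
print(header)
```

The in-use flag is worth checking during an examination: a record that still carries the FILE signature but is not flagged as in use may describe a deleted file.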
The file record also contains defined data blocks called file attributes. These store specific types of information about the file. The following file attributes table shows several common file attributes that you are likely to see in almost every record:
Let's take a look at each of these attributes in detail.
$Standard_Information Attribute (0x10): The file attributes follow the file header and contain information about the file and, sometimes, the actual file itself. The following diagram depicts a file attribute. The first four bytes show the attribute type; in this case, it is the $10 Standard Information Attribute, which contains general information, such as flags; the accessed, written, and created times; and the owner and security ID. It is identified by the hexadecimal header x/10 00 00 00. The file attribute map contains the decoded values:
Here is a map of the values you will find in the attribute:
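The time values in this attribute are stored as 64-bit Windows FILETIME values, which count 100-nanosecond intervals since January 1, 1601 (UTC). The following is a minimal sketch, assuming the commonly documented layout in which the attribute content begins with four timestamps in the order created, modified (written), MFT record changed, and accessed. The function names and sample values are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a FILETIME value (100-nanosecond ticks since 1601) to a UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)

def parse_si_timestamps(content: bytes) -> dict:
    """Decode the four timestamps at the start of the $Standard_Information content."""
    created, modified, mft_changed, accessed = struct.unpack_from("<4Q", content, 0)
    return {
        "created": filetime_to_datetime(created),
        "modified": filetime_to_datetime(modified),
        "mft_changed": filetime_to_datetime(mft_changed),
        "accessed": filetime_to_datetime(accessed),
    }

# Hypothetical content: the base value corresponds to January 1, 1970 (UTC).
base = 116444736000000000
content = struct.pack("<4Q", base, base + 10_000_000, base + 20_000_000, base + 30_000_000)

times = parse_si_timestamps(content)
for name, value in times.items():
    print(name, value.isoformat())
```

The values are stored in UTC, so any conversion to local time should be done explicitly and documented in the examiner's notes.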
$File_Name Attribute (0x30): The next attribute is the $30 File Name Attribute. This attribute stores the name of the file and is always resident. The maximum filename length is 255 Unicode characters. It is identified by the hexadecimal header of x/30 00 00 00:
The following is a map of the values you will find in the attribute:
$Data Attribute (0x80): The next attribute for this entry is the $80 Data Attribute. The data attribute contains the contents of the file or points to where the contents are located in the volume. This attribute is the file data itself.
If the data attribute content is resident, we only use the attribute header and the resident content header. The resident content of the attribute is the file's data. Only tiny files have a resident data attribute. We will discuss resident versus non-resident data later on in this chapter.
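To show how the attribute header and the resident content header fit together, here is a minimal sketch that walks the attributes of a file record and returns the content of a resident $Data attribute. It assumes the commonly documented layout: the attribute type at offset 0x00, the attribute length at 0x04, the non-resident flag at 0x08, and, for resident attributes, the content size at 0x10 and the content offset at 0x14. All function names, the helper that builds the sample attributes, and the sample content are hypothetical.

```python
import struct

END_MARKER = 0xFFFFFFFF

def iter_attributes(record: bytes, first_attr_offset: int):
    """Yield (type, raw attribute bytes) for each attribute in a file record."""
    offset = first_attr_offset
    while offset + 4 <= len(record):
        (attr_type,) = struct.unpack_from("<I", record, offset)
        if attr_type == END_MARKER or offset + 8 > len(record):
            break
        (attr_len,) = struct.unpack_from("<I", record, offset + 4)
        if attr_len == 0:          # guard against corrupt records
            break
        yield attr_type, record[offset:offset + attr_len]
        offset += attr_len

def resident_content(attr: bytes):
    """Return the content of a resident attribute, or None if it is non-resident."""
    if attr[8] != 0:               # non-resident flag
        return None
    content_size, content_offset = struct.unpack_from("<IH", attr, 0x10)
    return attr[content_offset:content_offset + content_size]

def make_resident_attr(attr_type: int, content: bytes) -> bytes:
    """Build a synthetic, unnamed resident attribute for the example."""
    length = (0x18 + len(content) + 7) & ~7      # pad to an 8-byte boundary
    attr = bytearray(length)
    struct.pack_into("<IIBBHHH", attr, 0, attr_type, length, 0, 0, 0x18, 0, 0)
    struct.pack_into("<IH", attr, 0x10, len(content), 0x18)
    attr[0x18:0x18 + len(content)] = content
    return bytes(attr)

# Synthetic record: a $10 attribute, a resident $80 attribute, and the end marker.
body = (make_resident_attr(0x10, bytes(48))
        + make_resident_attr(0x80, b"This file is resident.\n")
        + struct.pack("<I", END_MARKER))
record = bytearray(1024)
record[0x38:0x38 + len(body)] = body

attrs = list(iter_attributes(bytes(record), 0x38))
data = [resident_content(raw) for attr_type, raw in attrs if attr_type == 0x80][0]
print([hex(t) for t, _ in attrs], data)
```

For a non-resident attribute, `resident_content` returns `None`, because the attribute then holds pointers to the data rather than the data itself.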
You may find multiple data attributes per file. In this record, there is a second $80 Data attribute, in which Dropbox has added some information to the file:
The following is a map of the values you will find in the attribute:
When examining the $Data Attribute 0x80, the contents of the file may be stored within the MFT file record itself. Since the file record is 1,024 bytes long, it would have to be a tiny file. When the data content of the file fits within the file record, it is called "resident data":
In the current example, we have a file named resident.txt that is 23 bytes in size. This is smaller than the 1,024 bytes of the file record. To look at the data of the file, we need to look at the $Data Attribute 0x80 of the file record, as follows:
On examining the attribute, we can see the ASCII and hex representation of the file content we observed in the preceding resident data example. When dealing with a non-resident file, such as the one depicted in the following diagram, we can see that the nonresident.txt file, which is 145 KB in size, is larger than the 1,024-byte file record:
When we look at the $Data Attribute 0x80 of the file, as shown in the preceding diagram, we do not see the contents of the file; instead, we see pointers to the location of the file within the volume boundaries. We consider this to be non-resident content. Once the content of the attribute becomes non-resident, it can never become resident again. We commonly refer to the pointers in the file record of the attribute as a "run list" for the data runs of the non-resident data:
You can have a single data run, or multiple data runs, within the $Data Attribute 0x80. Deciphering the run list for the data runs can be tricky. In the following example, the $Data Attribute 0x80 contains two run lists:
If the file is not fragmented, then you will have one run list pointing to the data run in the volume. If the file is fragmented (which is very common), then you will have multiple run lists providing information about the starting cluster for each fragment. I have taken the two run lists highlighted in the preceding list and created the following chart:
The first run list comprises the hexadecimal values of 31 07 E8 E3 48. Take the first byte, which is the header (x/31), and add its left and right nibbles (3+1=4). 4 is the number of bytes that follow in the run list entry (here, x/07 E8 E3 48). The right nibble (x/1) tells us that 1 byte represents the number of clusters being used for this fragment. We find a value of x/07 in the length field, which represents 7 clusters for this fragment. The left nibble (x/3) informs us that 3 bytes (x/E8 E3 48) represent the starting logical cluster of the fragment. These bytes are stored in little endian, so we read them as x/48 E3 E8, which is cluster 4,776,936.
At the end of the first run, we have a second run list of x/31 14 44 47 17. Like the prior run list, we take the header byte (x/31) and add the left and right nibbles (3+1=4). 4 is the number of bytes that follow in the run list entry (here, x/14 44 47 17). The right nibble (x/1) tells us that 1 byte represents the number of clusters being used for this fragment. We find a value of x/14 in the length field, which represents 20 clusters for this fragment. The left nibble (x/3) informs us that 3 bytes (x/44 47 17) represent the offset from the starting cluster of the previous run list. Read in little endian as x/17 47 44, this is an offset of 1,525,572 clusters, which places the second fragment at cluster 6,302,508. This process keeps going until the system hits a header byte of x/00 (in this example, the run lists are followed by x/00 00 00 00), which marks the end of the run lists.
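The decoding steps above can be sketched in Python. This is a minimal illustration, assuming that the offset field is a signed little-endian value relative to the start of the previous run (so a later fragment can sit earlier on the volume) and that a run with no offset bytes is sparse. The function name is hypothetical; the input bytes are the two run lists from the example.

```python
def decode_run_list(run_list: bytes) -> list:
    """Decode NTFS data runs into (starting cluster, cluster count) pairs."""
    runs = []
    pos = 0
    current_cluster = 0
    while pos < len(run_list):
        header = run_list[pos]
        if header == 0x00:                 # a zero header byte ends the run list
            break
        length_size = header & 0x0F        # right nibble: size of the length field
        offset_size = header >> 4          # left nibble: size of the offset field
        pos += 1
        run_length = int.from_bytes(run_list[pos:pos + length_size], "little")
        pos += length_size
        if offset_size == 0:
            runs.append((None, run_length))    # sparse run: nothing stored on disk
            continue
        # The offset is signed and relative to the start of the previous run.
        delta = int.from_bytes(run_list[pos:pos + offset_size], "little", signed=True)
        pos += offset_size
        current_cluster += delta
        runs.append((current_cluster, run_length))
    return runs

runs = decode_run_list(bytes.fromhex("31 07 E8 E3 48 31 14 44 47 17 00"))
for start, count in runs:
    print(f"start cluster {start}, {count} clusters")
```

Multiplying each starting cluster by the cluster size gives the byte offset of the fragment within the volume, which is where the file content can be carved from.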
That concludes our adventure into the world of NTFS. If you find yourself with a headache, you are not alone! These are just the basics of the filesystem. Entire books have been written about NTFS if you want to go into much greater detail.
In this chapter, we looked at how physical disks are constructed and prepared in order to store data. We discussed different partition schemes and how they address the creation of logical partitions. We also learned how filesystems differ and how data is organized.
In the next chapter, we will learn about the computer investigative process and how to analyze timelines, analyze media, and perform string searching for data.
a. True
b. False
a. MBR
b. VBR
c. GPT
d. LSD
a. True
b. False
a. True
b. False
a. Disk
b. Doughnut
c. Data
d. Designer
a. True
b. False
a. Standard information
b. Filename
c. Data
d. Security descriptor
The answers can be found in the rear of the book under Assessment.
Carrier, B. File System Forensic Analysis. Addison-Wesley, Reading, PA., Mar. 2005 (available at https://www.kobo.com/us/en/ebook/file-system-forensic-analysis-1).