10 EVIDENCE ACQUISITION BASICS

Disks, file systems and stored data are the building blocks for the majority of digital forensics investigations. In this chapter we’re going to look closely at how these mainstay sources of potential evidence are acquired, processed and analysed. A deep understanding of both file systems and disk geometry is crucial for a forensic investigator in analysing the evidence presented to them, so we’ll cover both, and talk through performing basic digital forensics acquisitions.

If you’re primarily in an incident response role, you should also become familiar with the contents of this chapter. You’re likely to find yourself best placed to handle evidence acquisition as a first responder, even if you don’t ultimately complete the entire investigation. The reality is that the opportunity to perform some of the tasks we’re going to talk about can often be missed in the midst of an incident, but by being switched on and recognising when the opportunity to acquire evidence presents itself you can jump in and competently do the job. Remember, your work will be held to the same standard as the full-time investigator, so it is vital that acquisition is completed in accordance with published best practices.

THE HARD DISK DRIVE

If you pop the cover off a modern laptop or desktop computer you’ll most likely find one of two types of disk drive: the traditional magnetic disk type that was first introduced in the mid-1950s, which remains in widespread use, or the increasingly popular and more modern solid-state drive. Though the term ‘hard disk drive’ technically refers only to the magnetic kind, you might hear it used to describe both interchangeably. The term ‘solid-state drive’, or SSD for short, is used to refer solely to drives using the newer solid-state technology. There are also hybrid drives that feature a mixture of the two technologies; these include a larger-capacity magnetic disk along with a smaller-capacity SSD cache, used to improve access times for the most commonly accessed files.

Magnetic disks

Traditional magnetic hard disks are remarkable pieces of engineering. They store data by creating extremely tiny magnetic fields on a thin magnetic coating applied to a spinning circular disk known as a platter. Modern disks contain multiple platters. The direction in which the magnetic field is applied is used to differentiate between the binary numbers that ultimately make up all stored data, 0 and 1. The surface of each platter is magnetised using a write head, which is a very thin but highly magnetic piece of wire that floats just above the platter. When data is overwritten, the write head simply moves across the surface of the platter and writes directly over the top of the existing data. To keep data in order, platters are divided into tracks, which are concentric circles that start at the centre of the platter and radiate out to the edge. Tracks are further divided into sectors, which are segments of a track, or ‘pie slices’, to think of it another way.

For optimum performance a magnetic disk will start recording data on the first available sector, and then continue recording on the next closest free sector. This is to ensure that the read head doesn’t have to jump around all over the place to access an entire file. However, through normal use it is common for chunks of files to become physically displaced across the disk. A cure for this is defragmenting the disk. This process reduces the time required to access a file by moving the fragmented ‘blocks’ of files closer together. Understanding this concept is important when analysing raw disk images.

Solid-state drives

Unlike their magnetic forefathers, solid-state drives feature no moving parts, which improves their reliability, reduces power consumption and makes them weigh less. They use the same type of storage that has been prevalent in USB or flash drives for many years: microscopic transistors that trap a small electrical charge. The presence, or lack, of an electrical charge is used to determine the presence of a binary 0 or 1. A fully charged transistor will not allow any more electricity to flow through it; the drive recognises this and returns a 0. An uncharged transistor allows current to pass through, which is interpreted as a 1. A brand new, completely unused drive features all transistors uncharged, so every bit reads as a 1. Charge can remain in the transistor for years, meaning that the data the charge represents will remain on the device for just as long. The main benefit of this approach is that the time required to write data is reduced significantly when compared to magnetic drives. Transistors in solid-state drives can be charged in microseconds, whereas magnetic drive write heads take milliseconds to apply their magnetic fields.

Whereas a magnetic disk can theoretically be written to an infinite number of times, an SSD transistor has a comparatively short life expectancy. Depending on the type of flash memory, each can only be written to somewhere between a few thousand and about 100,000 times before it is likely to fail. So, unlike the magnetic hard disk, which tries to keep blocks of a file as close to each other as possible, an SSD spreads the load across all the unused transistors in the drive randomly. This technique, known as wear levelling, avoids consistently storing charge in the same group of transistors, which would make them wear out faster. The computer’s operating system is not aware of this process thanks to the SSD’s on-board controller card. The controller presents the operating system with an abstracted list of hard drive sectors. To the host computer, and the forensic examiner’s write blocker for that matter, the controller card will present the same abstracted list of sectors.
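To make that abstraction a little more concrete, here is a minimal Python sketch of a toy controller. Everything in it is an assumption made for illustration: the ToyController name, the four-block drive, and in particular the ‘pick the least-worn free block’ policy, which stands in for whatever proprietary (and far more sophisticated) logic a real controller uses.

```python
class ToyController:
    """Toy model of an SSD controller hiding physical blocks from the host."""

    def __init__(self, physical_blocks):
        self.blocks = [None] * physical_blocks      # physical storage
        self.write_counts = [0] * physical_blocks   # wear per physical block
        self.mapping = {}                           # logical sector -> physical block

    def write(self, logical_sector, data):
        in_use = set(self.mapping.values())
        free = [b for b in range(len(self.blocks)) if b not in in_use]
        if not free:
            raise RuntimeError("no free physical blocks")
        # Assumed policy: use the least-worn free block, never the old one.
        target = min(free, key=lambda b: self.write_counts[b])
        self.blocks[target] = data
        self.write_counts[target] += 1
        self.mapping[logical_sector] = target

    def read(self, logical_sector):
        # The host only ever asks for logical sectors.
        return self.blocks[self.mapping[logical_sector]]


controller = ToyController(physical_blocks=4)
for version in range(6):
    controller.write(0, f"version {version}")   # host rewrites "sector 0" six times

print(controller.read(0))          # version 5
print(controller.write_counts)     # [2, 2, 1, 1]
```

The host wrote to the same logical sector six times, yet no physical block was written more than twice. The host, like a write blocker, only ever sees the logical view.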

Both magnetic disk drives and solid-state drives can be acquired using our principal digital forensics tool, the trusty write blocker. Once images are acquired, the same software suites can be used to examine the drive contents regardless of its physical form factor.

Disk geometry

Understanding the fundamentals of how data is laid out on a hard disk is a crucial component of an investigator’s overall understanding when analysing that acquired data later.

Sectors

Sectors are the smallest physical unit of storage on the hard disk. Traditionally a sector is used to store 512 bytes of data; however, in recent years a new standard of 4,096 bytes per sector has emerged. This new standard is known as the ‘advanced format’.

Clusters or allocation units

A cluster is the smallest logical unit of storage on a disk, and is made up of multiple sectors. For example, a 4 kB cluster, the default size in many configurations, could be made up of eight 512-byte sectors or a single 4,096-byte advanced format sector. The sectors that make up a cluster are logically consecutive, although they need not be physically contiguous on the disk.

Slack space

Only one file can be assigned to a given cluster on a disk. Clusters are, of course, fixed in size, whereas file sizes can vary greatly. This means that, more often than not, there is a difference between the number of clusters assigned to a file and the amount of storage that the file actually needs. For instance, on a disk with 4 kB clusters, a 3 kB file would be assigned a single 4 kB cluster. The term ‘slack space’, or ‘file slack’, refers to the unused portion of that cluster. In this example, that would mean 1 kB of slack space in the cluster.
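The arithmetic is simple enough to sketch in a few lines of Python. The file_slack function below is a hypothetical helper, not part of any forensic tool, and it assumes that 1 kB means 1,024 bytes and that every file occupies a whole number of clusters.

```python
def file_slack(file_size, cluster_size=4096):
    """Return (clusters_allocated, slack_bytes) for a file of file_size bytes."""
    clusters = -(-file_size // cluster_size)      # ceiling division
    return clusters, clusters * cluster_size - file_size


print(file_slack(3 * 1024))   # (1, 1024): a 3 kB file leaves 1 kB of slack
print(file_slack(8 * 1024))   # (2, 0): an 8 kB file fills two clusters exactly
```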

Slack space can have significant value to a digital forensic investigator, which is why it is a highly important concept to understand. Consider the following: a user saves a document that is 8 kB in size; the document is assigned two 4 kB clusters. There is therefore no file slack in this case, as the file size aligns perfectly with the combined size of the two clusters. The user subsequently deletes that file, which causes the operating system to mark those two clusters as unused, but crucially the operating system doesn’t delete the actual contents of the clusters. A new document is saved. This time the file is 6 kB in size, and it is assigned to the same two 4 kB clusters as the old document. Those two clusters now contain the 6 kB of the new document, and 2 kB of the old one as file slack.
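The same sequence of events can be mimicked with a byte array standing in for the two clusters. This is an idealised sketch: the file contents are made-up filler bytes, and a real operating system may treat the tail of the last written sector differently.

```python
CLUSTER = 4096
disk = bytearray(2 * CLUSTER)      # the two 4 kB clusters assigned to the file

old_doc = b"O" * (8 * 1024)        # 8 kB document fills both clusters
disk[:len(old_doc)] = old_doc

# Deleting the file only marks the clusters as unused; the bytes stay put.

new_doc = b"N" * (6 * 1024)        # 6 kB document reuses the same clusters
disk[:len(new_doc)] = new_doc

slack = bytes(disk[len(new_doc):])
print(len(slack))                  # 2048
print(slack[:8])                   # b'OOOOOOOO'
```

The final 2 kB of the second cluster still holds the tail end of the old document, which is exactly what an examiner hopes to find.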

The file fragments found in slack space can hold many secrets thought long since deleted, and can therefore be a valuable source of forensic evidence. When you acquire a forensically sound image of a hard disk drive, you will acquire the contents of the slack space alongside those files that are fully intact. Forensics suites allow you to filter slack space and explore the fragments found there, if you know you’re looking for a file that has been deleted by a suspect.

Slack space became famous around the world in July 2016, when then FBI Director James Comey gave a televised update on the status of the FBI investigation into former US Secretary of State Hillary Clinton’s usage of a personal email server. The investigation was focused on the fact that Secretary Clinton was accused of storing classified information on a non-government-approved server.

During his remarks, Comey noted how one email server that had formed part of the investigation had been forensically examined. This particular server had been decommissioned three years prior, a process that resulted in the email server software being removed. This meant that emails couldn’t be viewed in their ‘natural’ state, but could be reconstructed from slack space.

On removing the email server software, Comey stated, ‘Doing that didn’t remove the e-mail content, but it was like removing the frame from a huge finished jigsaw puzzle and dumping the pieces on the floor. The effect was that millions of e-mail fragments end up unsorted in the server’s unused, or slack space. We searched through all of it to see what was there, and what parts of the puzzle could be put back together.’51

Hard disk interfaces

In order to connect a hard disk to a write blocker to acquire it, you must first identify the type of hard disk interface present on the target disk. The interface is used to pass data and control signals to the disk. The majority of hard drives that are encountered in modern-day investigations use the Serial ATA interface, known as SATA. Occasionally, older drives that use Parallel ATA (PATA), also known as Integrated Drive Electronics (IDE), may make an appearance. In the case of servers, the Serial Attached SCSI (Small Computer System Interface) interface, known as SAS, is commonly used.

SATA

The most commonly used disk interface these days, the SATA interface, was first introduced at the start of the 21st century and has been through multiple revisions ever since. The most notable change in each major revision is the data transfer speed. The very first version of SATA supported data transfer rates of 1.5 Gb per second. This transfer rate increased to 16 Gb per second as of version 3.2, introduced in 2013. SATA interfaces are found on both magnetic and solid-state drives.

SATA disks feature both power and data connectors that are typically positioned next to each other. The data connector has 7 pins, and the power connector has 15. This is a significant reduction when compared with the older PATA connector. One of the most annoying aspects of working with PATA drives was the ease with which the pins could be accidentally bent or, worse, snapped. Both the desktop-sized 3.5-inch SATA drive and the laptop-sized 2.5-inch SATA drive use the same connector, which means only one type of SATA write blocker is needed to acquire both form factors.

SAS

Disks with the Serial Attached SCSI interface are commonly found in rackmount servers. SAS was introduced in 2004 and superseded the classic parallel SCSI (pronounced ‘scuzzy’) interface. SAS drives are used in servers for a few reasons. First, they’re more reliable than SATA drives, and secondly, they allow for faster read and write times. In addition, SAS allows more disks to be connected to a single device when compared to the classic SCSI, or even SATA, interfaces, and with longer cables – perfect for building high-availability RAID arrays. Several manufacturers produce SAS-specific write blockers.

PATA/IDE

Older, but still very much out there, Parallel ATA or IDE was the hard disk interface technology used throughout the 1990s. Easily spotted by the wide ribbon cables that connect the disks to a PC’s motherboard, PATA drives have extremely limited capabilities by today’s standards. Only two devices could be connected to a single PATA controller, in a master and slave configuration, and data transfer was limited to 133 MB/s in the most recent version of the interface.

PATA disks typically feature 40 pins on their connectors, which, as mentioned previously, are easily bent or damaged. Therefore, caution should be used when connecting a PATA drive to a write blocker – repairing these is not a pleasant task, by any means.

You should always have sufficient confidence in your abilities when it comes to using write-blocking equipment, acquiring disks and performing any type of investigative work using forensic tools. The first time you use a new piece of equipment, or tool, ideally won’t be in the midst of a real investigation. As with anything in life, practice makes perfect. Fortunately, the internet has you covered. It is very easy to get your hands on several resources to help you practise acquiring and analysing hard disk drives.

Many sites and organisations offer pre-made forensic disk images for forensic training purposes; these image files usually contain evidence that has been planted to encourage you to solve a given fictional case.

Here are just a few:

The Computer Forensic Reference Data Set Project (https://www.cfreds.nist.gov/)

Linux LEO – The Law Enforcement and Forensic Examiner’s Introduction to Linux (http://linuxleo.com/)

Digital Corpora scenarios (http://digitalcorpora.org/corpora/scenarios)

DFRWS Forensic Challenge (http://www.dfrws.org/dfrws-forensic-challenge)

Images and challenges are a great way to hone your skills and test your equipment in a realistic scenario. Of course, once you’re at the stage of having a disk image to work with you’re already past the point of actually creating that image, which is just as important to practise.

If you’re anything like me, you’ll probably have piles of old hard disks in some cupboard that you’ve collected over the years as computers have come and gone from your life. Using these old disks is a great way to practise acquiring disk images, as they’re often smaller in capacity and may feature different file systems, operating system versions and hardware interfaces.

If you don’t have old disks to spare, a great tip is to head to any online auction site and look for used hard drives. They’re cheap, easy to buy in a job lot of five or ten, and are rarely erased properly. I have performed research on many second-hand hard disks from various sources, and I can assure you they provide very interesting practice subjects. Unlike your own disks, having no clue what you’re about to come across on the disk also adds to the realism of your practice activity.

REMOVABLE MEDIA

A digital forensic investigator would be remiss not to consider that potential evidence may be located on removable storage media, especially when such media is located in or around the crime scene. Removable media is, of course, physically easier to hide, and is frequently used to transfer data between computers.

USB

The Universal Serial Bus (USB) was developed in the mid-1990s to solve a problem. Lots of PCs were being built, and lots of external devices for those PCs were being built, but there wasn’t really a common standard defining how they should connect. USB was the solution, and it paved the way for a wide variety of devices that use USB to hit the market. Of particular interest to us as forensics examiners are USB mass storage devices such as flash drives that use solid-state technology, or magnetic hard disks that are packaged to reside outside the computer.

Specially designed USB write blockers allow for the forensic acquisition of USB mass storage devices. These are particularly useful when dealing with external hard disks, since even though these disks usually feature a SATA interface ‘under the hood’, getting at that interface can often involve damaging the plastic chassis of the external disk, which is something to be avoided.

Optical disks

Though they are gradually being dropped in favour of flash-based storage, it is not uncommon to see optical storage disks still in use throughout offices and in residential settings, particularly to facilitate the sharing of digital media files such as photographs, music and video. Data is recorded to an optical disk by way of a laser. The laser etches microscopic marks, known as pits, which represent the binary data being recorded, into a reflective material on the underside of the disk in a spiral pattern. This etching process is commonly referred to as ‘burning’. A laser is also used to read the disk. In simplified terms, the etched pits do not reflect the laser light; this is detected and a binary 0 registered. Areas of the reflective disk surface without pits are known as lands; they do return a reflection, which is detected to register a binary 1. Strictly speaking, it is the transitions between pits and lands that encode the 1s, but the principle is the same.

The three primary types of optical storage disk in use today are:

Compact disc (CD), which typically features a 700 MB capacity;

Digital versatile disc (DVD), which typically features a 4.7 GB capacity but can store up to 17.08 GB in certain configurations;

Blu-ray disc (BD), which can store up to 50 GB.

The majority of optical drives used to access optical disks are read only, so they inherit write-blocking characteristics out of the box. However, all three of the formats above can be purchased as write once, or can be fully rewritable. To complement this, there are disk drives widely known as burners that can be used to record to a given format of disk. Therefore, the forensic investigator should be aware of both the hardware used to read a given disk and the writable characteristics of the disk if it is being acquired as evidence.

Memory cards

There are multiple flavours of memory cards with varying storage capacities based on flash storage technology. Some are standards based, such as the popular Secure Digital (SD) card, whereas others are proprietary, such as the Sony Memory Stick. The majority of digital cameras and camcorders record to some form of memory card. Therefore, images and video relevant to an investigation, including those previously deleted, may be located on them. Of course, memory cards can also be used to transfer any other type of data.

Write blockers specifically for memory cards are available, and should be used when acquiring a forensically sound image of any memory card, regardless of type. Such write blockers typically support multiple media types. For example, the Forensic Card Reader manufactured by UltraBlock supports the following commonly used memory card formats:

Smart Media;

xD;

Compact Flash;

SD;

MMC (MultiMediaCard);

MicroSD;

Memory Stick.

PROCESSING DISK IMAGES

Once an investigator has acquired a disk image and has taken it back to the lab, it’s time to load the image up for processing in a dedicated forensic suite or other tool. Processing involves taking the raw disk data and extracting from it the various artefacts contained within. A typical disk image will contain a basic file system, operating system components, applications and many user- and system-generated files, all of which may contain valuable evidence. There is a lot of information to be unlocked.

Forensics software suites are designed to sift through as much of this information as quickly as possible, to make the investigator’s job of finding information and evidence relevant to their case go as smoothly as possible.

During the processing phase the case file is built, which includes metadata regarding the contents of the disk image being processed. For example, an index is built that allows for faster keyword searches against the contents of the disk. Without this index, each keyword search would have to be run across the contents of the entire disk, which if you have many terabytes of data is going to take a non-trivial amount of time.
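To see why an index pays off, consider this minimal Python sketch of an inverted index, which maps each word to the files that contain it. The build_index function and the sample file contents are invented for illustration; commercial suites use their own, far more capable indexing engines.

```python
import re
from collections import defaultdict


def build_index(files):
    """Map each lowercase word to the set of file names that contain it."""
    index = defaultdict(set)
    for name, text in files.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


# Hypothetical text recovered from three files on a disk image.
files = {
    "notes.txt": "Transfer the funds on Friday",
    "todo.txt": "Buy milk and call the bank",
    "draft.txt": "Funds received, thanks",
}

index = build_index(files)          # slow step, done once during processing
hits = sorted(index["funds"])       # fast step, repeated for every search
print(hits)                         # ['draft.txt', 'notes.txt']
```

The expensive pass over the data happens once, during processing. Each keyword search afterwards is a single dictionary lookup rather than a fresh read of the whole disk.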

Forensics suites generally allow you to choose which activities are to be performed during processing, for example including file carving (a topic discussed later in this chapter) in the processing job, or creating thumbnails of discovered images. As a general rule, the more tasks you want to complete during processing, the longer it will take.

Once the processing phase is complete, the investigator will be able to interact with a graphical overview of all the discovered artefacts.

FILE SYSTEMS

The system for organising and retrieving data stored on any type of storage media is known as a file system. For digital forensic investigators, a grasp of the general characteristics of any file system, together with specialised knowledge of the more common types in use, is a core competency.

File system functions

In today’s world there are many flavours of file system in use, each with their own variances in how they go about doing the job of organising files. Some of the most important functions in any file system are listed below.

Mapping files to a physical disk location

The file system is responsible for keeping track of where a file is physically located on a given disk, so the user can both access and update the data in that file. Conversely, the file system must store data about unused disk locations, so it knows where to place new files.

Supporting user-facing file and folder structures

We’re all familiar with filenames, and storing files in folders (or directories). This is another important function of the file system.

Storing file information

In addition to actually storing the file itself, the file system stores information about the file. In other words, it is creating data about data, which is known as metadata. Examples of file system metadata include file creation time, file access time and file modification time, which are all of extreme importance during forensics investigations.
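As a quick illustration, the Python standard library can report these timestamps for any file. The sketch below creates a throwaway temporary file to inspect; the file_times helper is hypothetical. Bear in mind that the meaning of st_ctime differs between platforms, and that you would only ever do this against a working copy, never original evidence, because reading files on a live system can itself update access times.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_times(path):
    """Return a file's timestamps as timezone-aware UTC datetimes."""
    info = os.stat(path)

    def to_utc(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    return {
        "modified": to_utc(info.st_mtime),
        "accessed": to_utc(info.st_atime),
        # Creation time on Windows, but metadata-change time on Unix.
        "changed_or_created": to_utc(info.st_ctime),
    }


# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"example")
    sample_path = handle.name

times = file_times(sample_path)
os.remove(sample_path)

for label, value in times.items():
    print(label, value.isoformat())
```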

Protecting information

Using file system access permissions, control can be afforded over a user’s ability to access or modify a given file. The file system can also be a layer where file- or folder-level encryption is applied.

Commonly used file systems

While there is no shortage of file systems that could be in use on a given system, as a forensic investigator in the field you are most likely to encounter one of the following file systems. Therefore, time should be taken to fully understand the properties of each.

NTFS

The New Technology File System was developed by Microsoft for use in its Windows NT family of operating systems. It remains the most commonly used file system on Windows servers and desktop machines.

NTFS uses a single table, known as the master file table (MFT), to keep track of all file and directory locations on a given volume. The MFT is also used to store file metadata, such as timestamps and permissions settings. On any NTFS volume there is also a partial backup copy of the MFT, covering its first few records, to be used in the event that the primary MFT becomes corrupted. The MFT is considered the most important aspect of NTFS for forensic investigators to understand, since it plays a key role in how forensic investigation suites display acquired evidence.

Linux machines can also use NTFS by way of a driver. Apple macOS machines can read NTFS devices, but do not support writing to them by default.

FAT

Before NTFS, the File Allocation Table or FAT family of file systems reigned supreme as the default file system of Microsoft Windows. The name comes from a statically allocated index table used to keep track of the clusters assigned to a file. FAT went through three major revisions, mostly to accommodate ever increasing disk sizes. While it is not the default in Windows any more, the FAT file system lives on and is frequently used on removable USB drives and memory cards. Therefore, all modern operating systems support it, and you are still very likely to come across it during an investigation.

APFS

The Apple File System is the new default file system on Apple’s range of computing products, from watchOS to macOS. It debuted in March 2017 with the release of iOS 10.3 for iPhone, and hit Apple laptops and desktops in September 2017 with the release of macOS 10.13, also known as High Sierra.

APFS is designed to better support two technologies increasingly prevalent in personal computers: solid-state drives and encryption. As a result, there is native support for full-disk encryption, and support for the SSD TRIM command. The TRIM command is used to proactively inform an SSD when blocks of data are no longer in use, and therefore can be erased in advance, which speeds up future writes.

APFS also aims to make more efficient use of storage space on a disk by using techniques like cloning during file copies. For example, if a file is copied in an APFS file system, no actual data duplication occurs. Instead, the file system uses metadata to make a note of the copy, but still points to the original file. In the event that either version of the file (the copy or the original) is changed, a new version of the file is created and new storage space is allocated; this technique is known as copy-on-write.
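The idea can be modelled in a few lines of Python. This is purely a toy: ToyCowVolume, its ‘extents’ and the file names are assumptions for illustration and bear no relation to the real on-disk structures of APFS.

```python
class ToyCowVolume:
    """Toy model of file cloning with copy-on-write semantics."""

    def __init__(self):
        self.extents = {}      # extent id -> bytes actually stored
        self.files = {}        # file name -> extent id (the metadata)
        self._next_id = 0

    def _allocate(self, data):
        extent_id = self._next_id
        self._next_id += 1
        self.extents[extent_id] = data
        return extent_id

    def create(self, name, data):
        self.files[name] = self._allocate(data)

    def clone(self, source, target):
        # A copy only adds metadata pointing at the existing data.
        self.files[target] = self.files[source]

    def write(self, name, data):
        # Changing a file allocates new storage; the other file is untouched.
        self.files[name] = self._allocate(data)

    def read(self, name):
        return self.extents[self.files[name]]


volume = ToyCowVolume()
volume.create("report.doc", b"original")
volume.clone("report.doc", "report copy.doc")
extents_after_clone = len(volume.extents)      # still 1: nothing was duplicated

volume.write("report copy.doc", b"edited")
print(extents_after_clone, len(volume.extents))   # 1 2
print(volume.read("report.doc"))                  # b'original'
```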

HFS+

Between 1998 and 2017, HFS+ (also known as Mac OS Extended) was the default file system in Apple products. Therefore, it is still highly prevalent, and the most likely type of Apple file system an investigator will encounter. The file system uses a catalogue file to store file and folder metadata in a B-tree storage system.

Of particular interest to the forensic investigator working on HFS+ is the fact that the file system supports journaling, and this has been enabled by default since 2003. Journaling is a mechanism in which changes to a disk are first committed to a journal file, which acts as a buffer to ensure that all disk update transactions are fully completed. In an event such as the rapid removal of a USB storage device the transactions may not be fully completed, and the file system may become corrupt. The journal file will keep track of all uncommitted transactions, which can include chunks of files that were not fully saved. Imagine a suspect quickly trying to hide a removable storage device, for example.

XFS

Linux distributions come in all shapes and sizes, and several of the most commonly used, such as Red Hat Enterprise Linux, have adopted XFS as their default file system in recent years. XFS has been around since 1993, when it was first created by Silicon Graphics, Inc. In 2001 it made its way to the Linux platform, but it wasn’t until a few years ago that its use became widespread.

XFS features include metadata-based journaling, which helps the file system to remain consistent in the event of a system crash. It also makes use of a classic Unix file system data structure, the inode. Inodes can be found in most Unix file systems, and store the attributes of a file along with where on the disk the file is stored.

File systems and acquisition tools

Regardless of the form factor of the media, or the file system in use, a forensically sound disk image will be an identical copy of the raw contents of the drive. The file system and the individual files will be preserved for analysis. In some cases the investigative suite an investigator uses might not fully support processing the file system of the acquired disk, which makes for a more challenging investigation, but not one that we should give up on by any means. We’ll discuss strategies for dealing with such a scenario shortly.
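A common way to demonstrate that an image really is identical to its source is to compare cryptographic hashes of the two. The Python sketch below shows the idea using two small temporary files as stand-ins for a source device and its image; the file names, the sample contents and the choice of SHA-256 are assumptions for illustration, and in practice your acquisition tool will normally calculate and record hashes for you.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that very large images fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    # Stand-ins for the source device and the acquired image.
    source_path = os.path.join(workdir, "source.bin")
    image_path = os.path.join(workdir, "image.dd")
    for path in (source_path, image_path):
        with open(path, "wb") as out:
            out.write(b"\x00" * 4096 + b"evidence")

    hashes_match = sha256_of(source_path) == sha256_of(image_path)

print(hashes_match)   # True
```

If even a single bit differed between the two, the hashes would no longer match.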

OPERATING SYSTEMS

When a suspect engages with a computer, they do so in the same manner as any other user, via an operating system. The operating system is responsible for managing the hardware and software resources available to the computer. The core functions of an operating system, such as executing programs, managing memory, providing networking functions and presenting a graphical user interface (GUI), should be well understood by a digital forensic investigator. Determining the operating system in use on a suspect’s machine, either by observation prior to imaging or by reviewing the contents of an acquired image, helps to point the investigator towards operating-system-specific artefacts that contain evidence.

Microsoft Windows

Since it is the dominant desktop and laptop operating system, investigating evidence generated by Microsoft Windows is a familiar concept to both digital forensic investigators and incident responders alike. Remember, the default file system in use by Windows is NTFS, which is well supported by all the major forensics suites. Once evidence is processed by a forensics suite, areas of interest or specific types of file will be presented for enhanced review. Some examples of Windows-specific evidence locations are shown below.

The file system

Microsoft Windows uses a letter-based system to label connected storage volumes. Most people who have used a Windows system will be very familiar with the ‘C:’ drive, which typically represents the system volume. In each volume you’ll find various types of file that are present on the computer. Of course, depending on the nature of the investigation, you’ll be able to select the types of file that are of most interest.

The page file

Also known as the swap file, the page file is used by Windows as a form of virtual memory. Used to supplement random access memory (RAM), the page file contains chunks of memory that have been swapped to the hard drive so that the memory items currently in use can reside in the faster physical RAM. This makes the page file an interesting prospect for forensic investigators. For example, some applications may store passwords in memory in cleartext; if those passwords are swapped to the page file and the computer is shut down, they may very well still be there. There are a variety of page file parsing tools out there.
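One of the simplest techniques such tools rely on is pulling runs of printable text out of otherwise unstructured binary data. Here is a minimal Python sketch of that idea. The printable_strings helper and the sample bytes are made up for illustration, and it only handles plain ASCII, whereas Windows frequently stores text as UTF-16, so treat it as a starting point rather than a replacement for a proper tool.

```python
import re


def printable_strings(data, min_length=6):
    """Return runs of printable ASCII at least min_length bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]


# Hypothetical fragment of a page file: text surrounded by binary noise.
sample = b"\x00\x01\xffpassword=Hunter2!\x00\x00\x9c\x10abc\x00user=alice\x00"

found = printable_strings(sample)
print(found)   # ['password=Hunter2!', 'user=alice']
```

The short run ‘abc’ is ignored because it falls below the minimum length, which helps to keep random noise out of the results.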

Event logs

The Windows platform features a standardised file format for recording different types of event, to overcome the problems associated with multiple applications having proprietary logging formats. Since the launch of Windows Vista, that format, known as .evtx, has used an XML-based structure to record multiple details of any given event. There are a multitude of different applications designed to parse .evtx files and home in on specific event types, including open-source tools and native Microsoft Windows tools, and this functionality is built in as a feature of the majority of forensic investigation suites.

For a forensic investigator a common usage of Windows event logs is to look for specific security-related events of interest, such as a user logging on to, or logging off, a computer. In such a scenario the investigator would likely use a tool to parse the raw .evtx file for a relevant log event. Event IDs are used to indicate the type of recorded event; Windows event ID 4624 represents a successful log on, and 4634 represents a user logging off.
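As a rough sketch of what that filtering looks like, the Python below picks log on and log off events out of event data that has already been exported to XML. Reading a raw .evtx file directly requires a dedicated parser, which is not shown here. The sample events, their timestamps and the wrapping Events element are invented for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}
LOGON_EVENTS = {4624: "logon", 4634: "logoff"}

# Hypothetical, heavily trimmed export of three events.
SAMPLE = """<Events>
  <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
    <System><EventID>4624</EventID><TimeCreated SystemTime="2019-03-01T09:00:00Z"/></System>
  </Event>
  <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
    <System><EventID>4688</EventID><TimeCreated SystemTime="2019-03-01T09:05:00Z"/></System>
  </Event>
  <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
    <System><EventID>4634</EventID><TimeCreated SystemTime="2019-03-01T17:30:00Z"/></System>
  </Event>
</Events>"""


def logon_activity(xml_text):
    """Return (timestamp, description) pairs for log on and log off events."""
    root = ET.fromstring(xml_text)
    results = []
    for event in root.iterfind("e:Event", NS):
        event_id = int(event.findtext("e:System/e:EventID", namespaces=NS))
        if event_id in LOGON_EVENTS:
            when = event.find("e:System/e:TimeCreated", NS).get("SystemTime")
            results.append((when, LOGON_EVENTS[event_id]))
    return results


activity = logon_activity(SAMPLE)
for when, what in activity:
    print(when, what)
```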

Registry

The Windows registry is a database of various operating-system- and application-specific settings that can provide tremendous insight for a forensic investigator. The registry also stores user-specific settings, primarily for the purpose of improving the user’s experience with the operating system, but in doing so it reveals how the user is using the operating system to an investigator.

One example of the value of the Windows registry would be the way in which it keeps records of all devices connected to the computer. This includes USB storage devices, the usage of which is recorded in great detail. The HKLM\SYSTEM\CurrentControlSet\Enum\USBSTOR registry key contains a record of the serial number of each USB device, along with timestamps displaying the first and last time that a device was attached: absolutely wonderful information for determining if a given device should be included in the scope of evidence.
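Subkeys beneath USBSTOR are typically named after the device’s vendor, product and revision, with a further subkey named after its serial number. The Python sketch below pulls those details out of such a path. The parse_usbstor_path helper and the example device are hypothetical, and the layout it assumes should be checked against what you actually see in your own evidence.

```python
def parse_usbstor_path(key_path):
    """Split a USBSTOR key path into vendor, product, revision and serial."""
    parts = key_path.split("\\")
    start = parts.index("USBSTOR")
    device_id, instance_id = parts[start + 1], parts[start + 2]

    fields = {}
    for chunk in device_id.split("&"):
        name, separator, value = chunk.partition("_")
        if separator:                      # skip chunks such as "Disk"
            fields[name] = value

    return {
        "vendor": fields.get("Ven"),
        "product": fields.get("Prod"),
        "revision": fields.get("Rev"),
        # Assumed layout: anything after the first "&" is not the serial.
        "serial": instance_id.split("&")[0],
    }


# Made-up example path; the device details are not from a real case.
example = (
    r"HKLM\SYSTEM\CurrentControlSet\Enum\USBSTOR"
    r"\Disk&Ven_SanDisk&Prod_Cruzer_Blade&Rev_1.00"
    r"\4C530001234567890123&0"
)

device = parse_usbstor_path(example)
print(device)
```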

A user doesn’t typically interact directly with the Windows registry; instead, the applications they’re using, or the operating system, will make changes on their behalf. It is entirely possible, however, for a user to manually modify registry keys. A built-in Windows utility, regedit.exe, makes this possible. It is also possible to delete registry keys entirely, so a savvy suspect aiming to cover their tracks might attempt to do this. In such cases it might still be possible to recover the deleted registry key; the registry database is just another binary file subject to the same rules of slack space as anything else.

Prefetch files

Each time an application is run on a Windows system, a so-called prefetch file is created to facilitate faster load times for that application. It does this by storing chunks of the various files an application needs to load into a single file, which means the operating system only has to look in one place. The prefetch file also contains metadata that is relevant for a forensic investigator. A prefetch file can tell you how and when an application was first run, when it was last run, how many times it has been run and from which volume it was run. Prefetch files are typically located in ‘C:\Windows\Prefetch’, and have a .pf file extension.
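Even before a prefetch file is opened, its name carries information. The sketch below assumes the commonly seen naming pattern of an executable name, a hyphen and an eight-character hexadecimal hash derived from the location the program ran from; the function name `split_prefetch_name` and the sample filename are illustrative. The run counts and timestamps live inside the file and need a dedicated parser, which this sketch does not attempt.

```python
import re

# Assumed naming pattern: <EXECUTABLE>-<8 hex digits>.pf
PF_NAME = re.compile(r"^(?P<exe>.+)-(?P<hash>[0-9A-F]{8})\.pf$", re.IGNORECASE)


def split_prefetch_name(filename):
    """Return (executable, path_hash) or None if the name does not fit."""
    match = PF_NAME.match(filename)
    if match is None:
        return None
    return match.group("exe"), match.group("hash").upper()


print(split_prefetch_name("NOTEPAD.EXE-D8414F97.pf"))
print(split_prefetch_name("readme.txt"))
```

Two prefetch files for the same executable with different hashes suggest the program was run from two different locations, which can be worth a closer look.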

Apple macOS

Once considered by many to be a platform reserved for the creative industries, Apple’s macOS (formerly known as Mac OS X) platform has seen a significant increase in popularity in recent years. These days you’re just as likely to find a Mac on the desk of a car dealer as you are a digital designer.

File system structure

The modern-day macOS is Unix based, and as a result the file system is laid out using Unix standards. All connected volumes fall under a root directory, represented as a single slash, ‘/’. In the top level of the volume the operating system places a selection of directories that form the core data layout.

Of particular interest to us would be the /Users directory, which contains home directories for all users on the computer. Evidence in user-generated files will typically be found here. Because it is Unix based, macOS also treats connected disks as if they were files: the raw devices appear under the /dev directory, while the file system of an attached USB drive would be mounted under the /Volumes directory.

Plists

Throughout a macOS system you’ll find lots of files with the extension ‘.plist’. These are property list files, and are raw XML or binary-encoded files used by macOS (or iOS) to store various strings related to a given application. Sometimes these can be user specific, and therefore have relevance to an investigation. Utilities for converting binary plist files back to XML, such as ‘plutil’, are frequently used in the hunt for evidence.
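The conversion from binary to XML can also be sketched with Python’s standard `plistlib` module, used here as a stand-in for a utility such as plutil. The property names and values are invented; a real plist would be read from a copy of the evidence file rather than built in memory.

```python
import plistlib

# Hypothetical application settings, as they might appear in a user plist.
settings = {
    "LastOpenedDocument": "/Users/alice/Documents/invoice.pdf",
    "LaunchCount": 12,
}

# Simulate the binary-encoded file found on disk.
binary_blob = plistlib.dumps(settings, fmt=plistlib.FMT_BINARY)

# loads() detects the binary or XML encoding automatically.
decoded = plistlib.loads(binary_blob)

# Re-serialise as human-readable XML for review.
xml_text = plistlib.dumps(decoded, fmt=plistlib.FMT_XML).decode("utf-8")
print(xml_text)
```

Binary plists begin with the bytes `bplist00`, which is a quick way to tell the two encodings apart in a hex viewer.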

Swap

macOS generates swap files for the same purpose that Microsoft Windows generates the page file. Rather than a single file, macOS can generate up to 10 different swap files, depending on need. These swap files can be found in the ‘/private/var/vm’ directory. This directory also contains a ‘sleepimage’ file, which is used to dump a copy of the RAM contents if the computer is put to sleep: something to be aware of, since this could provide a source of otherwise volatile evidence.

System logs

Again, thanks to that Unix foundation on which it was built, macOS produces a variety of log files in the Unix format. These log files are found in ‘/private/var/log’ and include a system log for generic system messages, and a secure log for keeping track of authentication events on the machine.

Linux

Primarily, but by no means exclusively, used on servers, Linux-based operating systems can be found in a variety of different scenarios. There are various Linux distributions, of course, each with their own different system utilities and nuances, but all of them have a few things in common.

Filesystem Hierarchy Standard (FHS)

This standard defines the conventions used by Linux distributions when laying out a file system. Linux uses a hierarchical file system, in which everything falls under the root, or ‘/’, directory, even if the computer uses more than one physical disk drive. For the first level under the root directory, the FHS defines a series of directory names and provides a description of what types of file should be stored in them. The result is that even if you’ve never worked with a particular Linux distribution before you’ll still be able to navigate around the file system and know where to look for particular evidence items. The first-level directories include:

/bin, used to store essential Linux command binaries;

/boot, used for boot loader files;

/dev, where raw device files are stored (remember, Linux treats storage devices as files);

/etc, used to store system-wide configuration files;

/home, which contains user home directories, and is the most likely place you’ll find user-generated evidence;

/lib, which contains shared libraries;

/media, designed to be used for mount points for removable storage media;

/mnt, typically used to temporarily mount file systems;

/opt, for optionally installed software;

/proc, a virtual file system that is used to store process and kernel information (we’ll discuss this more as we look at live acquisitions);

/root, the home directory of the root account (the superuser on a Linux system);

/run, used to store real-time variable information such as which users are currently logged in;

/sbin, used for system binaries;

/sys, used to store information about device drivers;

/tmp, a temporary file space that is not preserved between reboots;

/usr, used for multi-user utilities and applications;

/var, used to store variable files, in other words things that change during normal operation. For the forensic investigator, /var/log is a favoured location since it is used to store system and application log files.

/etc/shadow

The shadow file is used to store hashed passwords (often loosely described as encrypted) for users of the Linux system; it also lists the username for each account on the system. These two pieces of information may prove useful to an investigator wishing to uncover passwords used by a suspect. A well-regarded open-source password cracking tool, John the Ripper, is frequently used against shadow files.
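The layout of a shadow entry is simple enough to sketch. The example below assumes the usual colon-separated format, with the password field beginning `$<id>$` to identify the scheme; the sample lines contain placeholder values, not real hashes, and the scheme table covers only a few commonly seen identifiers.

```python
# A few commonly seen scheme identifiers (not an exhaustive list).
SCHEMES = {"1": "MD5", "5": "SHA-256", "6": "SHA-512", "y": "yescrypt"}


def parse_shadow_line(line):
    """Return the username and what kind of password material is stored."""
    fields = line.strip().split(":")
    username, secret = fields[0], fields[1]
    locked = secret.startswith("!")  # a leading '!' marks a locked account
    secret = secret.lstrip("!")
    if not secret.startswith("$"):
        # '*', empty or similar: there is nothing here to attack
        return {"username": username, "scheme": None, "locked": locked}
    scheme_id = secret.split("$")[1]
    return {
        "username": username,
        "scheme": SCHEMES.get(scheme_id, "unknown"),
        "locked": locked,
    }


# Placeholder entries for illustration only.
SAMPLE = [
    "alice:$6$examplesalt$placeholderhashvalue:17652:0:99999:7:::",
    "daemon:*:17652:0:99999:7:::",
]
accounts = [parse_shadow_line(line) for line in SAMPLE]
print(accounts)
```

Knowing which accounts actually hold a password value, and under which scheme, tells you where a cracking tool is worth pointing.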

Bash history

A list of shell commands previously executed by a user will normally be present in a file called ‘.bash_history’, which is typically found in the user’s home directory. This is very useful for determining the type of activity a user was conducting, if they were indeed using the shell. In investigations that focus on servers, without an installed GUI, this is usually a valuable source of information.
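By default the history file holds nothing but commands, one per line. If the user’s shell was configured to record timestamps, each command is preceded by a line containing `#` and a Unix epoch value. The sketch below handles both cases; the sample history and the function name `parse_bash_history` are invented.

```python
from datetime import datetime, timezone


def parse_bash_history(text):
    """Return (timestamp_or_None, command) pairs in the order recorded."""
    entries = []
    pending = None
    for line in text.splitlines():
        if line.startswith("#") and line[1:].isdigit():
            # Timestamp marker: applies to the next command line.
            pending = datetime.fromtimestamp(int(line[1:]), tz=timezone.utc)
            continue
        if line.strip():
            entries.append((pending, line))
            pending = None
    return entries


# Hypothetical history contents.
SAMPLE = (
    "#1525165200\n"
    "wget http://example.com/tool.tgz\n"
    "tar xzf tool.tgz\n"
    "#1525165260\n"
    "rm -rf /tmp/tool\n"
)
history = parse_bash_history(SAMPLE)
for when, command in history:
    print(when, command)
```

Commands without a timestamp can still be placed in sequence, since the file is written in the order the commands were run.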

Logs

Unlike Windows, log files on Linux systems tend to be stored in a standard text format, which means they can be parsed without special tools. Most logs are located in the /var/log/ directory, and include a mixture of system-level and application-level event data. Some examples of logs that may be of interest to an investigator include:

/var/log/secure (or /var/log/auth.log on Debian-based distributions), which includes system-wide authentication events;

/var/log/apache2/access.log, which includes information regarding access events on the Apache web server platform. For instance, each time a client requests a web page, information such as the source IP address, user agent string and the page requested is recorded in this log file.
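Because these logs are plain text, a few lines of Python are enough to pull out the fields of interest. The sketch below assumes the server is writing the widely used ‘combined’ log format; the sample line, IP address and function name `parse_access_line` are made up, and a server with a customised format would need a different pattern.

```python
import re

# Pattern for the 'combined' access log format (an assumption about the server's configuration).
LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE.match(line)
    return match.groupdict() if match else None


SAMPLE = (
    '203.0.113.7 - - [01/May/2018:09:15:02 +0000] '
    '"GET /admin/login.php HTTP/1.1" 200 1534 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64)"'
)
entry = parse_access_line(SAMPLE)
print(entry["ip"], entry["path"], entry["agent"])
```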

FILES

Ultimately, the majority of evidence discovered during any investigation will come from user-generated files, or artefacts recorded by the system regarding user activity. The file systems and operating systems in use may be the same between thousands of computers, but the user-generated content on them will of course be very different. An understanding of both operating system and file system lets us know where we should start to look, but once we’ve arrived there it is up to our own ingenuity, technical skills and investigative brain to figure out the rest. This is what makes digital forensics the exciting and rewarding field that it is.

Cryptographic hashing of files

When a file is processed for forensic analysis a cryptographic hash of the file content is recorded. A hash is a mathematically derived representation of the data contained within the file, returned to a fixed length. Typically this is done using the MD5, SHA-1 or SHA-256 hashing algorithms in most modern forensics suites. The purpose of hashing a file is to prove that the content of the file hasn’t changed from original source to forensic image. Changing a filename will not alter the hash value of a file, but changing the content will.
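The behaviour described above is easy to demonstrate. The following sketch uses Python’s standard `hashlib` module with SHA-256; the filenames and contents are throwaway examples created in a temporary directory, and `sha256_of_file` is simply a helper name chosen here.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large evidence files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "report.txt")
    with open(original, "wb") as handle:
        handle.write(b"quarterly figures")
    before = sha256_of_file(original)

    # Renaming the file leaves the content, and so the hash, untouched.
    renamed = os.path.join(workdir, "holiday.txt")
    os.rename(original, renamed)
    after_rename = sha256_of_file(renamed)

    # Appending a single byte changes the content, and so the hash.
    with open(renamed, "ab") as handle:
        handle.write(b"!")
    after_edit = sha256_of_file(renamed)

print(before == after_rename)
print(before == after_edit)
```

Whatever the size of the input, the SHA-256 result is always 64 hexadecimal characters, which is what ‘returned to a fixed length’ means in practice.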

Known file filters

To help us cut through the noise when working with many thousands of files on a computer, a tool called a known file filter can help us get a head start. Known file filters compare the cryptographic hashes of files collected from the suspect’s machine with a large database of known benign operating system files. This allows us to very quickly discard files that are not going to be of any interest to the investigation.

Known file filters can also be used the other way around, to look for files that are of significant interest to us. An example of this would be a service called PhotoDNA, which was developed by Microsoft to provide hashing of known pornographic images containing children. Many cloud service providers such as Dropbox, Google and Facebook use PhotoDNA in their products to flag such images and take action.
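Both uses of a known file filter come down to a set lookup, as the sketch below shows. It uses plain SHA-256 hashes throughout, which is a simplification; the file contents, the two hash sets and the function name `triage` are invented for illustration and do not reflect how any particular product or service is implemented.

```python
import hashlib


def triage(files, known_benign, known_notable):
    """Sort files into 'ignore', 'flag' and 'review' by hash lookup.

    files maps a filename to its content as bytes.
    """
    result = {"ignore": [], "flag": [], "review": []}
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in known_notable:
            result["flag"].append(name)      # matches a hash of interest
        elif digest in known_benign:
            result["ignore"].append(name)    # known operating system file
        else:
            result["review"].append(name)    # unknown: needs a human
    return result


# Hypothetical evidence and hash sets.
evidence = {
    "kernel32.dll": b"stock system library",
    "IMG_0042.jpg": b"image already known to investigators",
    "notes.txt": b"something the user wrote",
}
benign = {hashlib.sha256(b"stock system library").hexdigest()}
notable = {hashlib.sha256(b"image already known to investigators").hexdigest()}

sorted_files = triage(evidence, benign, notable)
print(sorted_files)
```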

Carving

File systems keep files nice and organised, but what happens when either the file system has become corrupted or a file has been deleted and is no longer referenced in the file system? The answer is file carving, a core function of the digital forensics profession.

File carving involves using tools to scour through the raw data on a disk and carve out either full files or fragments of files. The process works by looking for file header values, known as magic numbers, that match a known value for a given type of file. For instance, the hexadecimal value FFD8 is present in the file header for a JPEG image file. Therefore, if we find an FFD8 in the raw data of a forensic image, it’s likely that at least some of the data that follows will form a JPEG image.
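A deliberately naive carver can be sketched in a few lines. It assumes each JPEG is stored contiguously, searches for the three-byte signature FFD8FF (a slightly stricter test than the two bytes mentioned above) and stops at the first FFD9 end marker it finds. The raw data is fabricated, and `carve_jpegs` is a name chosen for this example; real tools handle fragmentation and embedded thumbnails, which this does not.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # start-of-image marker plus the next marker byte
JPEG_FOOTER = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(raw):
    """Return (offset, data) for each header-to-footer run found in raw bytes."""
    carved = []
    start = raw.find(JPEG_HEADER)
    while start != -1:
        end = raw.find(JPEG_FOOTER, start + len(JPEG_HEADER))
        if end == -1:
            break  # header with no footer: an incomplete fragment
        stop = end + len(JPEG_FOOTER)
        carved.append((start, raw[start:stop]))
        start = raw.find(JPEG_HEADER, stop)
    return carved


# Fabricated 'disk' contents: padding, one fake JPEG, more padding.
fake_image = JPEG_HEADER + b"\xe0" + b"fake image data" + JPEG_FOOTER
raw_disk = b"\x00" * 16 + fake_image + b"\x00" * 8

found = carve_jpegs(raw_disk)
for offset, data in found:
    print(offset, len(data))
```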

File carving tools use a variety of different techniques in the quest to accurately determine the start and end of a discovered file. Files can, of course, be fragmented across various physical locations on a disk, and without the file system information to tie them together the task of reassembly can be complex. Both free open-source and highly expensive commercial carving tools exist. Both can work, but typically with a paid-for offering you are paying for the quality and intelligence of the carving model and algorithms. File carving functions are also built into all commercial forensic investigation software suites.

Internet browsers

Some of the most sought-after user-generated artefacts are those created by internet browsing activity. With so many digital crimes involving the internet, it should be no surprise that a suspect’s browsing habits might be of interest to an investigator. All operating systems come bundled with a browser, but the user is free to download and use whichever browser they’d like.

Knowing where a particular type of browser stores artefacts such as user histories, cookies and bookmarks will assist the investigator greatly when it comes to building a timeline of internet activity.

Microsoft Internet Explorer uses a database file called ‘index.dat’ to store web history information in a format known as MS IE Cache File Format. These database files can be examined with specialised tools.

Mozilla Firefox and Google Chrome use a series of SQLite databases to record things like form submission history, web browser activity, cookies and downloaded files. SQLite is a format that can be explored with relative ease.
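To show how approachable SQLite is, the sketch below builds a tiny in-memory database and queries it with Python’s standard `sqlite3` module. The table and column names are modelled on the layout commonly seen in a Chrome history database, and the timestamps are assumed to be microseconds counted from 1 January 1601; treat both as assumptions to verify against the browser version in front of you. The URLs and the function name `recent_history` are invented.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumed epoch for the stored timestamps: 1 January 1601 UTC.
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def webkit_to_datetime(microseconds):
    """Convert microseconds since 1601 into a timezone-aware datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=microseconds)


# Stand-in for a working copy of the browser's history database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE urls (url TEXT, title TEXT, visit_count INTEGER, last_visit_time INTEGER)"
)
conn.executemany(
    "INSERT INTO urls VALUES (?, ?, ?, ?)",
    [
        ("https://example.org/forum", "Forum", 2, 13169635200000000),
        ("https://example.com/webmail", "Webmail", 14, 13169638800000000),
    ],
)


def recent_history(connection, limit=10):
    """Return (last_visit, url, visit_count) rows, most recent first."""
    rows = connection.execute(
        "SELECT last_visit_time, url, visit_count FROM urls "
        "ORDER BY last_visit_time DESC LIMIT ?",
        (limit,),
    )
    return [(webkit_to_datetime(ts), url, count) for ts, url, count in rows]


history_rows = recent_history(conn)
for row in history_rows:
    print(row)
```

As with any evidence file, the query should be run against a working copy of the database, never the original.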

Apple Safari uses a macOS .plist file to store history under a user’s home directory.

ANALYSIS OF ARTEFACTS

While acquiring disk images, and waiting for them to process, are critical parts of the investigative process, the majority of the enjoyment of digital forensics is contained within the analysis phase of the investigation. Here, leveraging our tools, creativity and skill, we can piece together the various evidence items that will allow us to build a solid case. While the differences between the tools an investigator might use for a given case mean it’s hard to describe a single analysis workflow, I will attempt to describe a typical one below.

Knowing where to start

Earlier, we discussed the importance of only investigating in response to a very specific allegation and avoiding the generic ‘we think this guy is bad, please find something that we can fire him for’ scenario. The reasons behind this become very clear when we arrive at the analysis phase of the investigation. Without a defined starting point it is very difficult to make meaningful progress. Analysis of digital evidence usually starts with a specific timeframe, or file or topic that can be explored via a keyword search. Merely looking through hundreds of files for evidence of something generically defined as ‘wrong’ or ‘bad’ isn’t an enjoyable way to spend your time, and isn’t an effective use of digital forensics tools.

Leveraging the user interface

Forensics suites present us with a number of screens for interacting with evidence at different levels. They’re smart enough to know that we’ll want to be able to sort emails by sender, or by date received. They’ll present files in native formats, or allow us to explore them using a hexadecimal viewer so we can find hidden details that might not otherwise be apparent.

Focusing on relevant items

Normally, artefacts are filtered in and out using the forensics tool, so that only files created, accessed or modified during the timeframe of interest remain. Then, features like known file filters are used to further narrow down the selection. It’s all about minimising noise and distractions and finding relevant data. Once an artefact has been found that could be of interest, the investigator will bookmark it. Bookmarks, as you’d probably expect, make it a simple task to go back to those artefacts of interest, so you don’t have to go through the entire filtering process once again.

Using timestamps, forensics suites can build visual timelines of file usage, which make understanding a suspect’s actions on a machine a much simpler task.
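Underneath the visual presentation, a timeline is a filtered and sorted list of timestamped events. The sketch below assumes each artefact carries created, modified and accessed timestamps; the filenames, times and the function name `build_timeline` are invented, and a real suite draws on many more timestamp sources than this.

```python
from datetime import datetime


def build_timeline(artefacts, start, end):
    """Return (when, action, name) events inside the window, oldest first."""
    events = []
    for name, stamps in artefacts.items():
        for action, when in stamps.items():
            if start <= when <= end:
                events.append((when, action, name))
    return sorted(events)


# Hypothetical artefacts with their timestamps.
artefacts = {
    "customers.xlsx": {
        "created": datetime(2018, 3, 2, 10, 0),
        "accessed": datetime(2018, 5, 1, 9, 12),
    },
    "customers.zip": {
        "created": datetime(2018, 5, 1, 9, 14),
        "modified": datetime(2018, 5, 1, 9, 15),
    },
}

timeline = build_timeline(
    artefacts, datetime(2018, 5, 1, 0, 0), datetime(2018, 5, 1, 23, 59)
)
for when, action, name in timeline:
    print(when, action, name)
```

Read in order, the three events in the window tell a small story: a spreadsheet is opened and, two minutes later, an archive appears beside it.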

In addition to keyword searches, regular-expression-based searches can be used to home in on data that matches a particular pattern – for instance, finding any documents containing credit card numbers. Similarly, features such as explicit image detection look for files that contain pixels matching flesh tones, which would be of interest in an investigation involving pornography.
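The credit card example can be sketched as a two-stage search: a regular expression finds runs of 13 to 19 digits, and a Luhn checksum then discards runs that cannot be valid card numbers. The pattern, the sample text and the function names are illustrative choices rather than how any particular suite does it; the first number in the sample is a widely published test number, not a real card.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text):
    """Return digit strings that look like card numbers and pass Luhn."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


sample = "Paid with 4111 1111 1111 1111; order ref 1234 5678 9012 3456."
cards = find_card_numbers(sample)
print(cards)
```

The checksum step matters: without it, order references, phone numbers and other long digit strings would flood the results with false positives.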

Overcoming challenges

During the analysis phase the investigator might also come across files that are encrypted. Many forensics suites have built-in decryption features, such as rainbow tables and wordlist generators, that are used to attempt discovery of encryption keys for those files. Truth be told, if someone has taken the time to encrypt a file, and it falls within the scope of an investigation, it is likely to be worthy of our time to investigate.

From time to time, a forensics suite alone might not be enough to fully analyse the file in question. Therefore, all suites allow the option to export the file for analysis in third-party tools. For example, if you have a proprietary file format that requires a special viewer, it may need to be exported and examined. Here, the cryptographic hashing that occurs during the processing phase is used to validate that the file exported from the forensics suite is exactly the same as the source file.

Preparing to report

At the end of the analysis phase, the investigator will typically export the bookmarks they’ve made using reporting features. The reports generated by a forensics tool are then typically included in the final report produced at the end of the case.

SUMMARY

In this chapter we’ve studied the bread and butter evidence sources that we’ll work with during an investigation. Evidence collected from hard disks, removable media and file systems often forms the foundation of an investigator’s case. Therefore, it is vital that any investigator understands each of them deeply.

Secondly, we looked at the user- and machine-generated artefacts found on those disks, such as files, internet histories and event logs, that can lead us to vital clues in determining what has occurred.

It is true that offline or powered-down acquisition of these sources is ideal; however, this might not always be possible. Therefore, in the next chapter we’ll take a look at factors that may influence the decision to perform a live, or powered-on, acquisition.

51 Federal Bureau of Investigation (2016) Statement by FBI Director James B. Comey on the Investigation of Secretary Hillary Clinton’s Use of a Personal E-Mail System. FBI National Press Office. Available from https://www.fbi.gov/news/pressrel/press-releases/statement-by-fbi-director-james-b-comey-on-the-investigation-of-secretary-hillary-clinton2019s-use-of-a-personal-e-mail-system [1 May 2018].
