CHAPTER THIRTEEN

A Cyber Forensic Process Summary

THE NEED FOR COMMUNICATION and information sharing continues to be a driving force of technology. This endeavor began with the dawn of man and continues today. From early cave paintings to stone tablets, to the printing press, to electronically stored data and the Internet, our need and desire to share information continues to grow geometrically.

The ability to communicate electronically has accounted for many of the advancements we have in society today, along with many conveniences. However, whenever there is good, bad is not far behind; as with everything in life, there is a fine balance between good and bad. Electronically stored data is no exception. Crimes will continue to occur regardless of the technological advancements used by those who perpetrate crimes and those sworn to uphold the law. The difference is the process by which the crimes occur and how they must be investigated.

This book has addressed the process by which data originates, is stored, moved, manipulated, and analyzed to assess its relevance as evidential matter. As with all subjects, there must be a logical beginning and, similarly, a logical conclusion. Our journey began in Chapter 1 with the root of all electronically communicated information: binary data representation.

BINARY

In order to properly investigate electronic data or computational communications it is first necessary to understand how we, as a species, attempt to codify our ability to communicate electronically in a world with only two possible states, a world of binary existence.

Binary is a Base 2 encoding scheme which functions well with a two-state paradigm such as electronics. With electronics there are two possible states, on and off. This is the basis for all electronic communications. When stringing these on/offs together complex communications can be achieved, not only for representing the most basic patterns of human communication but also complex alphabetic and numeric patterns, ultimately enabling people to represent entire languages.

The representation of complex language patterns for digital communications began with the primary building block, the bit, represented by either a one (1) or a zero (0). Simply arranging and grouping ones and zeros together allowed for all electronic data representation, from a simple text document to a high definition movie.

Establishing a method of pairing alphabetic characters with the character’s binary equivalent produced character codes, which have since evolved into more complex character sets, further allowing us to not only expand our ability to represent a greater range of characters but to also control how computers store, manipulate, and transmit data.

Binary representation of numbers and characters is required when working in a world restricted to only two states of description or existence (e.g., electrical or magnetic). Fortunately for us, our human world is more robust, more colorful, and exists in many states, well beyond that of a binary life. It would also be more difficult and time consuming if, as humans, we were required to perform all of our figuring, communicating, and so on, with numbers or letters represented by groups and pairings of 1s and 0s (e.g., 01001000 01100101 01101100 01101100 01101111 instead of “Hello”).

BINARY—DECIMAL—ASCII

Converting a binary number into its decimal equivalent is essential for gaining a greater depth of understanding of how data is stored, moved, manipulated, and processed and how this treatment of data is critical to a better understanding of cyber forensics.

A computer processes only bits stored in binary; it cannot recognize or process the character “&” in its native form. We humans, on the other hand, do not process binary easily. The middle ground is decimal values: we can more readily understand the same information when it is presented in decimal form.

A decimal value (10 unique decimal characters: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9) is a mathematical computation of binary, not a visual representation of binary.

The key to converting the binary value to its decimal equivalent is the existence (or lack thereof) of a “current” represented by the binary value of a “0” or a “1” switch or binary character.

If a binary value is present in the placeholder, the value is turned on, represented by the value one. If no binary value occupies the placeholder, then the value is turned off, which is represented by the value zero.

If the binary switch (or value) is ON (a “1”) then the decimal value is ON, meaning it is added or counted when determining the total decimal value. If the binary switch is OFF (a “0”), then the decimal value is not counted or added when determining the total decimal equivalent.

Let’s take for example the binary value 01011000, using the information in Figure 13.1.

FIGURE 13.1 Binary to Decimal Conversion

Power of two:    128   64   32   16    8    4    2    1
Binary value:      0    1    0    1    1    0    0    0
Decimal value:     0   64    0   16    8    0    0    0

Go through the binary numbers and if the binary number is 1, bring down the power of two and write it in the corresponding box on the decimal value line. If the binary number is 0, put a 0 in the box. Convert the binary number to a decimal by adding up the decimal value you entered into each box. The sum of the numbers is the decimal equivalent of the binary number.

Binary 01011000 equals a decimal value of 88.
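The box-by-box procedure described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names binary_to_decimal and place_values are our own and do not come from any forensic tool.

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a string of 0s and 1s to decimal by summing place values."""
    total = 0
    # The rightmost bit has place value 2**0, the next 2**1, and so on.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":            # switch is ON: count this power of two
            total += 2 ** position
        elif bit != "0":          # anything else is not a binary digit
            raise ValueError(f"not a binary digit: {bit!r}")
    return total


def place_values(bits: str) -> list[int]:
    """Return the 'decimal value line': the power of two for each 1, else 0."""
    width = len(bits)
    return [2 ** (width - 1 - i) if bit == "1" else 0
            for i, bit in enumerate(bits)]


print(place_values("01011000"))       # [0, 64, 0, 16, 8, 0, 0, 0]
print(binary_to_decimal("01011000"))  # 88
```

Python's built-in int("01011000", 2) performs the same conversion; the longer form is shown only because it mirrors the manual method.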

Why bother with converting binary to decimal? Computer processors work with mathematical computations, not letters, symbols, and words, yet humans communicate via letters and words. Thus, a computer needs a way in which to mathematically represent human symbols.

A binary value can be mathematically computed into a decimal value, and a decimal value can be assigned to an ASCII value (human symbols). (See Table 13.1.)

TABLE 13.1 ASCII Table Snapshot

image

The decimal value is referenced to the corresponding value in the character chart (ASCII or Unicode) by the Operating System (OS) and/or software being used.
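As a brief illustration of this lookup, the following Python sketch maps decimal values to ASCII characters and back. The helper name bits_to_text is hypothetical, and the sketch assumes plain 7-bit ASCII input.

```python
def bits_to_text(bit_groups: str) -> str:
    """Decode space-separated 8-bit groups into ASCII text."""
    codes = [int(group, 2) for group in bit_groups.split()]
    # bytes(...).decode("ascii") raises an error for values above 127,
    # which is a useful reminder that not every byte is ASCII text.
    return bytes(codes).decode("ascii")


print(chr(88))    # X   (decimal 88 maps to the character "X")
print(ord("&"))   # 38  (the character "&" is stored as decimal 38)
print(bits_to_text("01001000 01100101 01101100 01101100 01101111"))  # Hello
```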

Converting binary to decimal is easy when the binary value to be converted is small, but as the binary value increases in size, the numbers can get rather large and tedious.

For example, assume a binary value of 010110000101100101011010. This value may appear daunting, but it is only equivalent to 3 bytes or 24 bits.

If we were to convert this binary string to its decimal value equivalent by turning “on” position values represented by 1s and leaving “off” those position values represented by 0s, our string of numbers would look like this:

4,194,304 + 1,048,576 + 524,288 + 16,384 + 4,096 + 2,048 + 256 + 64 + 16 + 8 + 2

When finally totaled, this string of binary values would yield a result of 5,790,042.
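The same total can be checked programmatically. The sketch below, a minimal illustration rather than a forensic procedure, sums the place values of the 24-bit string and then reads the same bits one byte at a time. Read byte by byte, the three values are 88, 89, and 90, the ASCII codes for “XYZ.”

```python
bits = "010110000101100101011010"

# Place values of the positions that are switched on (bit == "1").
terms = [2 ** (len(bits) - 1 - i) for i, bit in enumerate(bits) if bit == "1"]
total = sum(terms)

# Split the same string into 8-bit bytes and read each byte on its own.
byte_values = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]

print(total)                                # 5790042
print(byte_values)                          # [88, 89, 90]
print(bytes(byte_values).decode("ascii"))   # XYZ
```

Treating the 24 bits as one large number and treating them as three separate bytes are both valid readings of the same data; which one applies depends on how the software interprets it.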

The process of deciphering binary values into their decimal equivalent can get very tedious, time consuming, and very expensive, especially if the string of binary values is more than three bytes. Imagine converting a high definition video!

DATA VERSUS CODE

A document or other file has what is sometimes referred to as a header or “code” which is supplemental data placed at the beginning of a block of data being stored or transmitted. In data transmission, the data following the header are called the body. The header, in effect, binds the block of data that follows the header (the body) to the software needed to open it or otherwise access it.

For example, if you create a document using Microsoft (MS) Word, the document cannot be opened using the Adobe Acrobat reader/application. This is because there is code embedded within any document created using MS Word, which indicates to the operating system and to applications that MS Word (or other compatible software) is needed in order to open the document.

If the code which binds the document to its native software is somehow overwritten or “erased,” the software will not be able to reassemble the document into its native format or into a format readable by the user, thereby causing the document to be inaccessible and unreadable by the user.

Some of these data, such as incriminating text (the occurrence of “XYZ” in the case referenced throughout the book, for example), may however still reside in a document, on a disk, or within the hard drive. For a cyber forensic investigator to properly search for a keyword contained within data seized from an entire hard drive (or even from data narrowed down to a specific folder or specific image within a user’s hard drive), it is best to use HEX to accomplish this herculean task.

HEX

Hexadecimal, or HEX for short, is strictly a human-friendly representation of binary values.

Viewing data as a HEX representation (or value) allows a cyber forensic investigator to go beyond the application or file. It allows for the viewing of all the data contained within a file including remnants of old or even deleted files.
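To make the idea of a HEX view concrete, here is a minimal sketch of what a HEX editor displays: an offset, the byte values in HEX, and the printable ASCII characters alongside. The function name hex_dump and the sample bytes are assumptions chosen for illustration.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, HEX values, and printable ASCII."""
    pad = width * 3 - 1   # width of the HEX column when a row is full
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Printable ASCII (0x20 to 0x7E) is shown as-is; other bytes as "."
        text_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


sample = b"XYZ\x00\x01Hello"   # text mixed with non-text bytes
print(hex_dump(sample))
```

Each pair of HEX digits stands for exactly one byte (eight bits), which is why HEX is so much more compact to read than the equivalent binary.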

It is important to understand that not all binary values are convertible into readable ASCII. ASCII is a code, based on the ordering of the English alphabet, and not all data contained within a computer is necessarily text (ASCII) based. There are many programs or software applications that are written in programming code that is not ASCII-based.

This programming code is not meant to be viewed in ASCII; it is meant to perform a function. Recall from our earlier discussions that a computer’s functions are all based on math, not the English (nor French, Chinese, Slavic, Greek, Arabic, or any other such) language; code therefore needs to be based on mathematical principles, not grammatical ones.

FROM RAW DATA TO FILES

There are hundreds of different formats for data (databases, word processing, spreadsheets, images, video, etc.). There are also formats for executable programs (.exe, .bat, .dll) on different platforms (Windows, Mac, Linux, Unix, etc.). Each format defines how the sequence of bits and bytes is laid out, with ASCII being one of the easiest for humans to decipher or read. A text file is simply a file that stores any text, in a format such as ASCII or UTF-8, with few if any control characters.

There are a wide variety of digital file types containing specific formatting information that allows for file access, storage or “manipulation.” This “manipulation” may occur via the operating system itself, or it may occur via a “parent” program installed on the operating system.

A parent program is the program that is used to create, execute, or otherwise access the file. In most cases a file will contain data and its file signature, from which its parent software (or the operating system) will be able to identify and handle its operation. The file signature information is contained in what is sometimes referred to as a “file header.” The data contained within a file header is not seen by the casual user, yet is very important for the file to function as designed. It is this data contained within the file header that is used to identify the format of the file.
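A simple way to picture file signature identification is a lookup of the first few bytes of a file against a list of known values. The sketch below uses a handful of widely documented signatures; the table is deliberately tiny, the function name identify is our own, and real tools rely on far larger signature databases.

```python
# A few well-known file signatures ("magic numbers") and what they indicate.
SIGNATURES = {
    b"%PDF": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"\xff\xd8\xff": "JPEG image",
    b"PK\x03\x04": "ZIP container (also used by .docx and .xlsx)",
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "OLE2 compound file (legacy .doc, .xls)",
}


def identify(header: bytes) -> str:
    """Return a description of the format whose signature starts the header."""
    for magic, description in SIGNATURES.items():
        if header.startswith(magic):
            return description
    return "unknown"


print(identify(b"%PDF-1.7\n"))            # PDF document
print(identify(b"PK\x03\x04\x14\x00"))    # ZIP container (also used by .docx and .xlsx)
print(identify(b"\x00\x00\x00\x00"))      # unknown
```

Because the check looks at content rather than the file name, a renamed extension does not change the result, while an overwritten header does.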

The value of HEX is apparent when a method to extract the readable data from a file may no longer be feasible, occurring for example when the header information is missing or in some way corrupt. Even though the file is unidentifiable and unable to be opened by native or compatible software, the cyber forensic investigator can search for the binary equivalent of some ASCII representation across the entire hard drive.

The investigator would find this value regardless of modified or missing file signatures. As was discussed previously, many times in the course of normal day-to-day operations and file processing a deleted file and its associated metadata will be partially overwritten, perhaps missing the entire file signature or other important formatting information and even some text. However, if the binary values representing a piece of evidence (e.g., “XYZ”) remain within the file’s remnants, then they can be found.
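The following sketch illustrates such a search: it scans a stream of raw bytes for a keyword and reports every offset at which it occurs, ignoring file boundaries and signatures entirely. The function name find_keyword and the in-memory sample are assumptions for illustration; in practice the stream would be a forensic image opened read-only, and the sketch covers only single-byte ASCII encoding of the keyword.

```python
import io


def find_keyword(stream, keyword: bytes, chunk_size: int = 1 << 20):
    """Yield the absolute byte offset of every occurrence of keyword."""
    if not keyword:
        raise ValueError("keyword must not be empty")
    overlap = len(keyword) - 1   # bytes carried over so hits spanning chunks are found
    tail = b""
    base = 0                     # absolute offset of the first byte of the window
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        start = 0
        while True:
            hit = window.find(keyword, start)
            if hit == -1:
                break
            yield base + hit
            start = hit + 1
        tail = window[-overlap:] if overlap else b""
        base += len(window) - len(tail)


# Synthetic "disk" contents: the keyword survives among unrelated bytes.
raw = b"\x00" * 10 + b"XYZ" + b"\xff" * 5 + b"XYZ"
print(list(find_keyword(io.BytesIO(raw), b"XYZ", chunk_size=4)))  # [10, 18]
```

Reading in chunks keeps memory use constant regardless of the size of the image, and the small overlap ensures a keyword split across two chunks is not missed.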

ACCESSING FILES

Most files need to be mounted by an operating system (or some software) to be accessed in normal day-to-day use. In order for this to occur an operating system needs to boot up so that it can identify the file structure and location of the file in order to present the data in a readable manner.

The boot process is important, as it is the process of mounting the evidence that the investigator will examine. When accessing information on a system, the mounting of the file system is imperative. The Master Boot Record (MBR) and its contents, such as the partition table (PT), are relevant pieces of information that can have a crucial bearing on the investigation.

A firm understanding of the boot process is necessary, if for nothing else, to know when evidence is altered and thereby to avoid contaminating evidence during imaging. A cyber forensic investigator, as with any investigator, will at times be responsible for collecting and capturing evidence.
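As an illustration of how the partition table can be read from raw data, the sketch below parses the four 16-byte entries of a classic MBR, which begin at byte offset 446 of the first 512-byte sector and are followed by the 55 AA boot signature. The function name parse_mbr and the synthetic sector are assumptions; a real examination would read the sector from a forensic image opened read-only, never from the live original.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446    # 0x1BE, start of the four table entries
ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"    # stored in bytes 510 and 511


def parse_mbr(sector: bytes) -> list[dict]:
    """Return the in-use partition table entries of a classic MBR sector."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("an MBR occupies one full 512-byte sector")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("boot signature 55 AA not found")
    partitions = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        # Fields: status, CHS start, type, CHS end, first sector (LBA),
        # sector count. The two 32-bit integers are stored little endian.
        status, _chs_first, ptype, _chs_last, lba_start, count = struct.unpack(
            "<B3sB3sII", entry)
        if ptype == 0x00:        # type 0 marks an unused entry
            continue
        partitions.append({
            "index": index,
            "bootable": status == 0x80,
            "type": ptype,
            "first_sector": lba_start,
            "sector_count": count,
        })
    return partitions


# Build a synthetic sector with one bootable entry of type 0x07.
sector = bytearray(SECTOR_SIZE)
sector[446:462] = struct.pack("<B3sB3sII", 0x80, b"\x00" * 3, 0x07,
                              b"\x00" * 3, 2048, 204800)
sector[510:512] = BOOT_SIGNATURE
print(parse_mbr(bytes(sector)))
```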

Data can be written to a hard drive (e.g., potential evidence) during the boot process, altering the evidence. Knowing when and how data is altered on a piece of evidence (hard drive or otherwise) is not only important when investigating evidence, but also important when acquiring evidence.

During the boot process of the primary file system (or partition) data is, in most cases, written to the hard drive, such that dates are changed and files are written and altered. It is critical to a sound investigation not to alter evidence that you have been entrusted to image in a forensically sound manner.

Booting up a computer, in an uncontrolled manner, could very well contaminate the integrity of the data contained within the evidence (hard drive). It would be analogous to a homicide detective stomping through blood spatter at a crime scene. Even if the detective could explain away his or her footprints, at the very least, the quality of his or her work and competency would be called into question.

ENDIANNESS

In cyber forensics, how data is stored on a drive is crucial information, as the cyber forensic investigator will often have to look at raw data (via a HEX editor) for possible evidence; thus, knowing how the information is written to disk, and how data are represented and presented physically and logically, is very important.

Understanding the concept of endianness is necessary in order to fully understand how a mathematically based system handles or interprets data, such as whether integers are represented from left to right or right to left.

Not all binary data are treated equally. The way in which binary (HEX, in our view) is handled all depends upon the system architecture, the code. As a system boots it will encounter code that will tell it to execute an instruction set.

Generally, in computing, endianness comes in two flavors: big endian and little endian.

In big endian, the most significant unit (or byte) of a data field is ordered first or left justified. With little endian, however, the least significant unit (or byte) of a data field is ordered first with the most significant byte on the right (i.e., right justified).

Endianness describes how multi-byte data is represented by a computer system and is dictated by the CPU architecture of the system. Unfortunately, not all computer systems are designed with the same Endian-architecture. The difference in Endian-architecture is an issue when software or data is shared between computer systems. An analysis of the computer system and its interfaces will determine the requirements of the Endian implementation of the software.
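The practical effect is easy to demonstrate. In the sketch below, the same four bytes are interpreted as a 32-bit integer under each byte order; the sample bytes are arbitrary and chosen only for illustration.

```python
import struct
import sys

raw = bytes([0x58, 0x59, 0x5A, 0x00])   # four bytes as they sit on disk

# Big endian: the first byte is the most significant.
big = int.from_bytes(raw, byteorder="big")
# Little endian: the first byte is the least significant.
little = int.from_bytes(raw, byteorder="little")

print(hex(big))      # 0x58595a00
print(hex(little))   # 0x5a5958

# The struct module expresses the same choice with ">" and "<" prefixes.
assert struct.unpack(">I", raw)[0] == big
assert struct.unpack("<I", raw)[0] == little

print(sys.byteorder)  # byte order of the machine running this script
```

An investigator reading a multi-byte field in a HEX editor must therefore know which byte order the on-disk structure uses before converting it to a number.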

PARTITIONS

There are subtle differences between volumes and partitions, and sometimes the lines between the two can get fuzzy. Volumes exist at the logical OS level, and partitions exist at the physical, media specific level. Sometimes there is a one-to-one correspondence, but not always.

A partition is a collection of (physically) consecutive sectors and a volume is a collection of (logically) addressable sectors. Herein lies the difference—the data contained within a volume may appear consecutive, but only logically.

A partition is an area of the hard drive that is defined by an entry in the partition table of the MBR, and is recognized system wide. The partition is interpreted by code contained within that same sector, the MBR, and a partition is usually a subdivision. As the name implies it is the process of breaking something larger into smaller pieces.

A volume is an area defined or interpreted by an operating system. A volume is recognized by the operating system and will have a drive letter associated with it. It is often used synonymously with the term drive or disk.

The physical versus logical nature of the partition and the volume, however, is not always mutually exclusive. The differences or similarities sometimes get fuzzy, as they were not created with the idea of the other in mind. In fact, many times they are the same thing.

Perhaps most importantly, a volume contains the file system, which is typically specific to, and natively understood by, a particular operating system.

FILE SYSTEMS

A file system is a tool used for storing and retrieving data on a computer. It is the tool that tracks the allocation of the clusters, and it allows for a hierarchy of directories, folders, and files. A file system addresses and manages all the clusters contained within a volume.

A file system is usually defined during the creation of a partition; it is at this point the partition “becomes” a volume. File systems determine how and where files are placed on a hard drive, with the goal of trying to optimize data retrieval speeds. We may know where a document resides logically within a folder structure, but we are oblivious (and justifiably so), as to which specific bits on the hard drive are allocated to this individual document. This is not something the end users need to concern themselves with; however, it is imperative that the file system of the computer knows, otherwise when we click on a Word document icon nothing will happen.

Various file systems and their components may have different names, and their physical placement on the drive may vary, but functionally all file systems require similar pieces: those which identify the file system, those which identify its data, and those which contain the data itself.

TIME

In cyber forensic investigations knowing the correct time is of great importance. Understanding the timeline of events is imperative in understanding when events occurred with respect to all other events.

Timing inaccuracies are broad and vast. Inaccuracies can be system wide due to NTP server inaccuracies, or system specific due to clock skew. Inaccuracies can be specific to a certain geographical location due to confusing time zones, or to a specific operating system’s file system that rounds odd seconds to the nearest even second.

Time discrepancies can be as long as time permits or as short as a second (or less), as with the MS-DOS 32-bit timestamp timing inaccuracy. It is highly unlikely that a timing inaccuracy of a second (as seen in MS-DOS 32-bit timestamp) would be so pivotal to corroborating someone’s innocence; however, one thing time inaccuracies have in common is that they can discredit an expert, especially a cyber forensic expert/investigator!
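The source of the MS-DOS inaccuracy is visible in the layout of the timestamp itself: the 16-bit time field reserves only five bits for seconds, so seconds are stored in two-second units. The sketch below decodes and encodes such a timestamp. The function names are our own, and the encoder simply truncates an odd second downward; how a particular operating system rounds is an implementation detail this sketch does not attempt to reproduce.

```python
from datetime import datetime


def decode_dos_datetime(dos_date: int, dos_time: int) -> datetime:
    """Decode the 16-bit date and 16-bit time of an MS-DOS timestamp."""
    year = ((dos_date >> 9) & 0x7F) + 1980   # 7 bits, counted from 1980
    month = (dos_date >> 5) & 0x0F           # 4 bits
    day = dos_date & 0x1F                    # 5 bits
    hour = (dos_time >> 11) & 0x1F           # 5 bits
    minute = (dos_time >> 5) & 0x3F          # 6 bits
    second = (dos_time & 0x1F) * 2           # 5 bits, in 2-second units
    return datetime(year, month, day, hour, minute, second)


def encode_dos_datetime(moment: datetime) -> tuple[int, int]:
    """Encode a datetime; odd seconds are truncated in this sketch."""
    dos_date = ((moment.year - 1980) << 9) | (moment.month << 5) | moment.day
    dos_time = (moment.hour << 11) | (moment.minute << 5) | (moment.second // 2)
    return dos_date, dos_time


actual = datetime(2011, 3, 15, 14, 30, 41)
stored = decode_dos_datetime(*encode_dos_datetime(actual))
print(actual)   # 2011-03-15 14:30:41
print(stored)   # 2011-03-15 14:30:40
```

The one-second difference between the two printed values is exactly the kind of discrepancy an investigator should be prepared to explain.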

THE INVESTIGATION PROCESS

An exact line-by-line instruction set for running a complete cyber forensic investigation is logically impossible to present, as each organization performing a forensic investigation will have their own approaches, procedures, policies, and methods—some dictated by law, others by internal preferences and protocols. However, there are general Investigative Smart Practices, which may fit into most types of forensic organizations and most types of cases.

A child pornography investigation run by law enforcement versus an intellectual property theft investigation run by a corporate forensic department may both eventually find the evidence necessary to prosecute the guilty; however, the approaches, steps taken, and processes to that end may be entirely different and be supported by completely different documentation.

As peculiar as the differences are between varying organization types so too are the differences between cases. The Investigative Smart Practices presented in this book are meant to be broad in scope and used as guidelines.

Step 1: Initial Contact/Request

The validity and scope of the investigative request is established. This function may be performed by someone outside the cyber forensics field. For example, this can be determined by a judge via a court order or perhaps via the HR department within a large organization.

Step 2: Evidence Handling

The integrity of the evidence must be preserved throughout the entirety of the investigation. This process occurs each and every time the evidence is handled. Preserving the integrity of the evidence is vital, but equally essential is also being able to prove the integrity of the evidence in a court of law.
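One common way to demonstrate integrity is to compare cryptographic hash values of the original evidence and its image. The sketch below assumes SHA-256 and uses in-memory byte streams as stand-ins for the original media and the acquired image; the function name sha256_of_stream is hypothetical, and the choice of algorithm in practice is dictated by organizational policy.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks so large images fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Stand-ins for the original media, a faithful image, and an altered copy.
original = io.BytesIO(b"evidence bytes")
image_copy = io.BytesIO(b"evidence bytes")
altered = io.BytesIO(b"evidence bytez")   # a single byte differs

original_hash = sha256_of_stream(original)
print(original_hash == sha256_of_stream(image_copy))  # True
print(original_hash == sha256_of_stream(altered))     # False
```

Recording the hash at acquisition and recomputing it each time the evidence is handled gives the investigator a repeatable check that can be documented and presented.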

Step 3: Acquisition of Evidence

This step involves obtaining a forensically sound image of the original evidence. Acquisition of evidence can certainly fall under Step 2, Evidence Handling; however, this step focuses more precisely on the acquisition of the evidence versus the handling of the evidence during acquisition.

Step 4: Data Preparation

This step involves preparing and identifying data for analysis and investigation. It focuses on “analyzing” data to ensure a valid and complete search. This includes mounting complex files, verifying file types, recovering deleted items, and anything else which would prepare the data for final investigation (Step 5).

Step 5: Investigation

This step focuses on finding those data that match specified search criteria. It tends to be a little more subjective than the others, in that the investigator may need to examine the search results, discard false positives, and identify the critical piece(s) of evidence, which typically is not conveniently named “incriminating evidence.doc.”

Step 6: Reporting

Reporting is the means by which the investigator details the findings of the investigation and communicates them to the client, management, law enforcement, and/or the requestor. Communicating the findings is highly dependent upon organizational structure. A corporate environment may have reporting requirements or templates dissimilar to those in law enforcement, for example.

Some of these steps will occur concurrently while some may occur out of “order.” As is the case with evidence hash values, each case is unique. It is this uniqueness that makes cyber forensics such a challenging field.

Step 7: Retention and Curation of Evidence

Evidence retention and curation may fall under the all-encompassing “evidence handling” step, yet due to unique issues and complexities it is discussed in its own right. This step involves the post handling of evidence, implying any storage, archiving, destruction, or returning of all evidence. Evidence retention will likely include the handling of all evidence associated with an investigation, be it physical hard drives, digital files, or perhaps the investigator’s handwritten notes.

As with all steps, requirements are highly dependent upon the type of forensic practice (e.g., law enforcement, corporate, government, private) and may vary from case to case. There will be varying legal, contractual, procedural, financial, and business requirements, which will ultimately set the boundaries for this final evidence handling step.

Step 8: Investigation Wrap-Up and Conclusion

Investigation wrap-up is broad in its meaning and covers those post-investigation activities loosely structured around defending the cyber forensic investigator’s work.

This can occur in the form of being an expert witness in a criminal case, an interview by HR or Internal Audit for a corporate investigation, or perhaps a peer review. This may also include an internal self examination such as “lessons learned” or a quality control assessment.

SUMMARY

Traditional forensics professionals use fingerprints, DNA typing, and ballistics analysis to make their cases. Cyber forensic investigators rely upon various technologies for collecting, examining, and evaluating data in an effort to establish intent, culpability, motive, means, methods, and loss, resulting from crimes conducted by, with, or through any device that is capable of accessing, retrieving, processing, and storing electronic data. The cyber forensic investigator’s role is in the discovery, collection, and analysis of data, leading to the identification of digital evidence.

It is our hope that this book has provided you with a solid basis for establishing or expanding an understanding of how raw data are unraveled through a cyber forensic process, resulting in digital evidence.
