Chapter 13: Stealing Information

CHAPTER 13

Stealing Information

We’ll Cover

How to look for patterns and identify artifacts

How to determine where the data went

How to detect which data has been taken on external devices

The theft of information can involve a lot of diverse activities, but in this chapter, we will cover the most common example: the theft of corporate information by a (soon-to-be) ex-employee. We’ll look at how to determine where the stolen data has gone and how to recover evidence after information has been stolen.

What Are We Looking For?

We are looking for evidence of an employee stealing correspondence, customer contacts, drawings, contracts, spreadsheets, e-mails, source code, and other company-owned information—whatever they consider of value to them or a future employer. In a nutshell, you’re interested in determining whether an employee transferred files to some external media or otherwise transferred the data out of the control of the original owner or employer to make the information available to use in the future.

These cases typically involve certain patterns that you should keep in mind as an investigator. For example, you’ll usually see an increase in user access to files, which occurs in large blocks. A large number of files may be accessed on a specific day; then, perhaps another set will be accessed the day after. These files may be copied onto external storage devices (USB, FireWire, eSATA, and so on) or uploaded to file-hosting or webmail sites. Typically, if the employee is transferring data via e-mail, he or she rarely uses the corporate e-mail system. Most people understand that the corporate e-mail system is, or can be, monitored and backed up. The user obviously doesn’t want their activities intercepted by corporate IT personnel or otherwise preserved for posterity. Most employees will instead use an alternative such as Yahoo! Mail, Gmail, or Hotmail to send the data they consider valuable out of the corporate system.
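The "large blocks" pattern described above can be checked mechanically once you have exported file access timestamps from your forensic tool. What follows is a minimal sketch, not a feature of any particular product: the function name and both thresholds (factor and min_count) are assumptions chosen for illustration, and you would tune them to the user's normal activity.

```python
from collections import Counter
from statistics import median


def flag_access_spikes(access_times, factor=5.0, min_count=50):
    """Return {date: count} for days with an unusually high number of file accesses.

    access_times is an iterable of datetime objects (one per accessed file).
    A day is flagged when its count is at least min_count and at least
    factor times the median daily count. Both thresholds are illustrative.
    """
    per_day = Counter(ts.date() for ts in access_times)
    if not per_day:
        return {}
    baseline = median(per_day.values())
    return {
        day: count
        for day, count in sorted(per_day.items())
        if count >= min_count and count >= factor * baseline
    }
```

A flagged day is only a lead. Antivirus scans, backups, and indexing can also touch many files at once, so the result should be read alongside the other artifacts covered in this chapter.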

In the vast majority of these cases, the suspect believes that their personal web-based e-mail leaves no trace on the actual computer. Luckily for us, that’s not the case. Most commercial forensic tools have webmail analysis and carving features, and they work in the same way: they search the drive image for the unique headers used by the various webmail systems, both within active HTML files and, for deleted files, by looking for valid header/footer combinations across the drive. This has changed somewhat with the wide usage of Asynchronous JavaScript and XML (AJAX), which does not require a new HTML page to be created for each request. We can still recover the e-mails the suspect has viewed, but the artifacts that contain them are likely to be overwritten more quickly, as they typically exist only in the pagefile/swap. These artifacts are encoded in JavaScript Object Notation (JSON) and are well documented.
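To make the header/footer idea concrete, here is a rough sketch of that style of carving over raw bytes. It is not how any specific commercial tool is implemented. The function name and the max_len cap are assumptions, and you would supply header and footer byte strings taken from a known sample page of the webmail system in question.

```python
def carve_between(data: bytes, header: bytes, footer: bytes, max_len: int = 1_000_000):
    """Yield (offset, fragment) for each header...footer span found in raw bytes.

    max_len caps how far past a header we look for the footer, so that a header
    whose page was partly overwritten does not swallow unrelated data.
    """
    pos = 0
    while True:
        start = data.find(header, pos)
        if start == -1:
            return
        end = data.find(footer, start + len(header), start + max_len)
        if end != -1:
            stop = end + len(footer)
            yield start, data[start:stop]
            pos = stop
        else:
            # No footer within range: skip this header and keep scanning.
            pos = start + 1
```

Recording the offset with each fragment matters as much as the fragment itself, because the offset lets you map the hit back to a location on the image in your report.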

LINGO

Asynchronous JavaScript and XML (AJAX) is the magic behind the ability to update part of a page without reloading the entire page. AJAX is used in almost all Web 2.0 applications. It works when a JavaScript function, triggered at a set interval or by a user action, makes a request to a web site, which replies to the request in XML. The XML is then parsed and the updated data is shown to the user. The parsed XML objects returned from an AJAX call are normally stored in JavaScript Object Notation.

Data that can help you work a case involving stolen information can be varied and can come from diverse types and locations. Consider all the forensic artifacts available to you when trying to determine whether a suspect has taken data with them. Table 13-1 shows some places you should check (most of these artifact sources are discussed in this chapter).

Table 13-1 Artifacts and What They Tell You

Combining these artifacts tells a compelling story about what was taken, how it was taken, and when it was taken. Putting these stories into a report (as discussed in Chapter 16) may be enough to convince an ex-employee to return stolen information or to get a judge to grant an order forcing him to do so.

Tip

Two popular products can help you find the most common kinds of webmail for review: Internet Evidence Finder from JADsoftware and Evidence Center from Belkasoft will search a forensic image and find all of the known webmail fragments on the disk for you. You can do this by hand as well, but using these products can save you lots of time.

I won’t go into webmail analysis in depth in this section, because in a best-case scenario, one of the recovered HTML pages will be the user’s Inbox, which will have columns identifying the sender, date, subject, and whether an attachment is included. Inbox views are typically static and written to the disk in their entirety, which makes them easier for us to recover. The individual e-mails will be either static pages (for older webmail systems) or JSON objects. A static page will be written to the disk in the cache folder and will be recoverable in its entirety. If parts of the page are dynamic (that is, delivered as JSON), they will typically exist only in the pagefile and memory dumps. Attachments are handled separately and typically have their own static page or JSON object created, indicating their successful upload and possible virus scan.
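For the dynamic, JSON-based fragments just described, one way to pull candidates out of an exported pagefile or memory dump is to try decoding a JSON object at every opening brace. This is a sketch under stated assumptions: the function name and the min_keys filter are mine, and real webmail objects will have whatever keys that provider uses, which you need to confirm from documentation or a test account.

```python
import json


def extract_json_objects(raw: bytes, min_keys: int = 2):
    """Yield (offset, obj) for each decodable JSON object found in raw bytes.

    Objects with fewer than min_keys keys are skipped to cut down on noise
    such as empty {} pairs that occur by chance in binary data.
    """
    # latin-1 maps every byte to exactly one character, so string indexes
    # equal byte offsets. Non-ASCII text will not display correctly this way.
    text = raw.decode("latin-1")
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            return
        try:
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1
            continue
        if isinstance(obj, dict) and len(obj) >= min_keys:
            yield start, obj
        pos = end
```

Only complete objects decode, so a fragment that was partly overwritten will be missed by this approach and would need a plain keyword search instead. Trying every brace is also slow on a multi-gigabyte dump, so in practice you would run it over regions already identified by a keyword hit.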

LINGO

The Inbox view refers to the page the suspect saw when viewing the webmail inbox. The page is typically static and can be recovered through standard data carving.

In Actual Practice

Remember that you can recover only what the suspect received from the web server. Text typed into a form, such as sending or replying to an e-mail, is not usually sent back to the sender but is instead directly submitted to the web server. You can’t forensically recover what is not viewed and stored on the suspect’s system.

When it comes to webmail, it’s either there or it’s not; to determine whether you should look for it, use the Internet history records (see Chapter 11). If you can find Internet history records showing access to a webmail site, you should begin trying to find the artifacts it created. Using tools like NetAnalysis to rebuild cached pages helps with this, and using the suspect’s e-mail address as a search term is a great way to find all the Inbox pages, because it is normally displayed in the header of the page.
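Searching an image for the suspect’s e-mail address is something your forensic tool will do for you, but the idea is simple enough to sketch. The version here assumes a raw (dd-style) image and looks for the address both as single-byte text and as UTF-16LE, since Windows stores a lot of text in the latter form. The function names and the chunk size are assumptions for illustration.

```python
def read_chunks(path, chunk_size=16 * 1024 * 1024):
    """Yield successive chunks of a raw image file."""
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            yield chunk


def find_keyword_offsets(chunks, keyword):
    """Return sorted byte offsets where keyword occurs as UTF-8 or UTF-16LE text."""
    if not keyword:
        raise ValueError("keyword must not be empty")
    patterns = {keyword.encode("utf-8"), keyword.encode("utf-16-le")}
    # Carry the end of each buffer forward so a hit that straddles two
    # chunks is still found.
    overlap = max(len(p) for p in patterns) - 1
    hits = set()
    base = 0      # offset within the image of the first byte of buf
    tail = b""
    for chunk in chunks:
        buf = tail + chunk
        for pat in patterns:
            idx = buf.find(pat)
            while idx != -1:
                hits.add(base + idx)
                idx = buf.find(pat, idx + 1)
        tail = buf[-overlap:] if overlap > 0 else b""
        base += len(buf) - len(tail)
    return sorted(hits)
```

A call such as find_keyword_offsets(read_chunks("image.dd"), "suspect@example.com") returns offsets you can then examine in a hex viewer; the file name and address here are placeholders. The match is case sensitive, so it is worth searching for the address in the exact form shown in the Internet history as well as in lowercase.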

If by luck the corporate e-mail system is used, the e-mails usually look like routine correspondence. Typically, the user either includes a home account as a blind carbon copy, for the purpose of “archiving” important information in case something happens to the corporate e-mail system, or begins forwarding e-mails to a personal account. This is simple to detect: search for messages the suspect forwarded or BCCed (blind carbon copied) to a personal account. If you can’t locate their personal accounts, a quick review of the suspect’s Internet history should help you identify them.
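If the suspect’s sent messages can be exported in standard Internet message format (for example, as .eml files), the forwarding and BCC check can be scripted. The sketch here uses Python’s standard email package; the function name is mine, and the list of corporate domains is something you would supply for the case at hand.

```python
from email import message_from_string
from email.utils import getaddresses


def external_recipients(raw_message: str, corporate_domains):
    """Return {header: [addresses]} for recipients outside the corporate domains."""
    msg = message_from_string(raw_message)
    allowed = {d.lower() for d in corporate_domains}
    found = {}
    for header in ("To", "Cc", "Bcc"):
        values = msg.get_all(header, [])
        outside = [
            addr
            for _name, addr in getaddresses(values)
            if "@" in addr and addr.rsplit("@", 1)[1].lower() not in allowed
        ]
        if outside:
            found[header] = outside
    return found
```

Keep in mind that the Bcc header normally survives only in the sender’s own copy of the message, so this needs to be run against the suspect’s Sent Items rather than against a recipient’s mailbox.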

Determining Where the Data Went

Another common way that suspects take data is via USB flash drives or USB hard drives. The USB flash drive has made it easy to take large amounts of data. Luckily, a well-established set of forensic artifacts can help us determine whether this has occurred.

LNK Files

LNK (or .LNK) files are created whenever files or folders are opened on a Windows 95 or later system through Windows Explorer. The following information is contained in a LNK file:

The full path to the file, which can be the local drive, network share, removable media, and so on

The type of drive the file is being accessed from: removable media, CD-ROM, local drive, or network share

The size of the file in bytes

The volume name and serial number of the drive from which the file is being accessed

The MAC address of the system where the file is stored if it’s being accessed over the network

Date information related to when the associated file was created, modified, and accessed: it will also have its own date information, identifying when the .LNK itself was created, modified, and accessed
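Your forensic tool will decode all of the fields listed above, but it helps to see how little is involved in reading the first of them. The sketch here parses only the fixed 76-byte header that begins every shell link file, following Microsoft’s published shell link format, and returns the target file’s dates and size. The path, drive type, and volume serial number live in later, variable-length structures that this example does not attempt to parse. The function and key names are my own.

```python
import struct
from datetime import datetime, timedelta, timezone

# Class identifier that follows the 4-byte header size in every shell link file.
LNK_CLSID = bytes.fromhex("0114020000000000c000000000000046")
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(value):
    """Convert a Windows FILETIME (100 ns ticks since 1601-01-01 UTC); 0 means unset."""
    if value == 0:
        return None
    return FILETIME_EPOCH + timedelta(microseconds=value // 10)


def parse_lnk_header(data: bytes):
    """Parse the fixed header at the start of a shell link (.LNK) file."""
    if len(data) < 76:
        raise ValueError("too short to be a LNK file")
    (header_size, clsid, flags, attributes,
     created, accessed, written, size) = struct.unpack_from("<I16sIIQQQI", data, 0)
    if header_size != 0x4C or clsid != LNK_CLSID:
        raise ValueError("not a shell link header")
    return {
        "target_created": filetime_to_datetime(created),
        "target_accessed": filetime_to_datetime(accessed),
        "target_modified": filetime_to_datetime(written),
        "target_size": size,
        "link_flags": flags,
        "file_attributes": attributes,
    }
```

The times returned are in UTC. These are the dates of the file the LNK points to as recorded inside the LNK, which are distinct from the file system dates of the LNK file itself.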

LINGO

Windows Explorer is the interface you work through on your desktop. If you are viewing files and folders through My Documents, My Computer, or any other graphical interface to your files, you are using Windows Explorer. Don’t confuse Windows Explorer with Internet Explorer, which is a web browsing application.

LNK files are found in both the active file system and the free space on the drive. Most commercial forensic tools have the built-in ability to recover LNK data from the free space. LNK files have the same names as the files they are accessing, so if an MS Word document named Document1.doc is accessed, for example, the corresponding LNK file would be named Document1.LNK.

Note

If you recover, or data carve, a LNK file from free space, you will not have the name of the LNK file or the metadata of the LNK file. You will still have the contents of the LNK file, however, so you’ll know what file it was pointing to and the relevant metadata. Although some investigators will downplay the importance of LNK files, they are some of the most important pieces of evidence provided on any Windows system. They offer detailed examples of specific file access, especially when you can use them to determine what files existed on an external storage device you no longer have access to.
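Carving LNK data from free space relies on the fact that every shell link begins with the same 20 bytes: a 4-byte header size of 0x4C followed by a fixed 16-byte class identifier. The sketch here simply locates those signatures in a block of exported free space. It is an illustration of the principle rather than a replacement for your tool’s carver, and the function name is my own.

```python
# Header size (0x0000004C, little-endian) followed by the shell link class identifier.
LNK_SIGNATURE = bytes.fromhex("4c000000" "0114020000000000c000000000000046")


def find_lnk_offsets(data: bytes):
    """Return every offset in data where a shell link header signature begins."""
    offsets = []
    idx = data.find(LNK_SIGNATURE)
    while idx != -1:
        offsets.append(idx)
        idx = data.find(LNK_SIGNATURE, idx + 1)
    return offsets
```

Each offset marks the start of a candidate LNK. Because the format has no footer, how much data to take after the signature is a judgment call, and some of what follows a hit may already have been overwritten.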

In Actual Practice

LNK files don’t just matter for external devices. For example, I worked on a case where a single LNK to a SQL Server application led to further investigation of a server not originally in my investigation’s scope. On review of the transaction logs, I saw that after logging on, the user ran a query of all the company client information. The resulting data was then saved to a CSV file, which I found by looking at the files that were created after the time in the transaction log. Back on the suspect’s system, I found a LNK to the CSV, which identified the file as residing on a USB flash drive. These separate LNK files put the user activity in context, so that I could re-create what happened first, second, third, and so on.

In most forensic tools, the easiest way to get to the LNK files is to filter the files for the LNK extension. The LNK files associated with a user’s recent folder will be found at the following locations:

Windows 95/XP/Vista <path to User Directory>\Recent

Windows 7 <path to User Directory>\AppData\Roaming\Microsoft\Windows\Recent

You can also find LNK files in the following places:

The Microsoft Office directories in the user’s profile directory for files opened within Microsoft Office

The Start menu directory tree for each user for every executable

The Desktop folder found in the user profile for any desktop shortcuts and for any other application that decides to make use of LNK files

As shown in Figure 13-1, FTK parses LNK files for you and presents the most forensically relevant portions of its contents. In this example, you can see that an e-mail .msg file was accessed from a removable disk.

Figure 13-1 LNK file showing access to a .msg file on a removable disk

Before moving on to the next section, you need to understand the significance of the dates found on a LNK file and within it. Table 13-2 explains them.

Table 13-2 LNK File Dates

If you clearly explain the meaning of a LNK file timestamp in your report, you can much better explain when and where a file or a series of files was copied to an external storage device.

Shellbags

Shellbags can reveal information about external devices that few other artifacts can. Shellbags are found in the Windows Registry and store user preferences for folder display in Windows Explorer, such as the size of the window or how items were listed. For a folder to exist in the shellbags, it must have been opened in Windows Explorer at least one time by the user. The shellbag subkey information is specific to each user and can be found in their NTUSER.DAT registry file for Windows XP and later, and also in the USRCLASS.DAT registry file for Windows 7, both located within the user’s profile directory.

Like LNK files, shellbags contain specific information regarding when a folder was first accessed and last updated, plus the folder name, full path, and so on. Why, then, do we need to locate shellbags for review? Although LNK files show us only which files and possibly directories have been opened, shellbags show us every directory a user accessed, whether they opened a file in it or not. When you are attempting to understand the scope and breadth of what files may have been copied onto external storage devices or what other contents an external storage device contained, this is incredibly useful.

LINGO

Shellbags are a series of registry keys that keep track of a user’s preferences on how each directory he or she has opened should be displayed. They allow us as investigators to determine which directories a user has accessed using Explorer and their creation, modification, and access dates.

The registry files are located in the following folders:

Windows XP/Vista <User Profile Directory>\NTUSER.DAT

Windows 7 <User Profile Directory>\AppData\Local\Microsoft\Windows\USRCLASS.DAT

You can use several tools to analyze shellbags, including Paraben’s Registry Analyzer and TZWorks’ ShellBag Parser (sbag.exe). For this example, we will use sbag.exe, a command-line tool available for download from www.tzworks.net/prototype_page.php?proto_id=14.

Parsing the contents with sbag.exe involves three steps:

1. Export the user’s NTUSER.DAT (and, for Windows 7, USRCLASS.DAT) from the forensic image into the directory where sbag.exe is located.

2. Execute sbag.exe, as shown in Figure 13-2. You can see we redirected the output of the program to a file named ntuser.csv instead of to the screen.

Figure 13-2 Running sbag.exe

3. Open the CSV file in a spreadsheet program such as Excel to review the contents.

In this case, the NTUSER.DAT file is located at D:\SBAG\NTUSER.DAT, and the output file, NTUSER.CSV, will be created in the same folder, as illustrated in Figure 13-3. We will have to repeat the process for USRCLASS.DAT, which is located in the same directory, if this came from a Windows 7 image.

Figure 13-3 The resulting files stored in the directory

After the shellbag data has been parsed and exported by sbag.exe, it can be opened by any application that can interpret delimited text. Note that for sbag.exe, the field separator is actually the pipe symbol (|) and not a comma. In Figure 13-4, the CSV is being opened in Microsoft Excel. You can see the entries for the D: drive, the removable disk, in lines 388–395. In addition to the directory path, you can also see the creation, modification, and access dates of the directory as last captured by the system. Equally useful is the registry modification time; this shows the last time the directory was accessed by this user.
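If you would rather filter the parsed output in a script than scroll through Excel, the pipe separator is the only detail that needs care. The sketch here makes no assumption about which columns sbag.exe writes or what they are called, since that can differ between versions; it simply keeps any row in which some field mentions the drive letter of interest. The function name is my own.

```python
import csv
import io


def rows_mentioning_drive(csv_text: str, drive: str = "D:"):
    """Return rows from pipe-delimited text in which any field mentions the drive."""
    reader = csv.reader(io.StringIO(csv_text), delimiter="|", quoting=csv.QUOTE_NONE)
    needle = drive.lower() + "\\"
    return [
        [field.strip() for field in row]
        for row in reader
        if any(needle in field.lower() for field in row)
    ]
```

To use it, read the exported file as text and pass the contents in, for example rows_mentioning_drive(open("ntuser.csv", newline="").read()). A drive letter alone does not identify a device, because the same letter can be assigned to different devices over time, so the rows still need to be tied to a specific device through the registry artifacts.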

Figure 13-4 Reviewing the contents of the parsed shellbags in Excel

To put the last two sections together, make sure to read Chapter 14 and learn about the USBSTOR registry key and the SetupAPI logs to be able to tie your external devices to their makes, models, and serial numbers.

Scenario: Recovering Log Files to Catch a Thief

Sometimes, the only evidence that data has been copied comes from text fragments in partially overwritten logs. A great example is a case I worked a few years ago: a soon-to-be-former employee saw the writing on the wall and decided to pull the ejection handle before his boss had a chance to fire him. In this case, a web development firm had one big customer that accounted for more than 70 percent of its business, but that customer was going to replace the firm with another web development company in 90 days.

One of the developers knew his position would soon be eliminated, so he applied for a job at the company that was taking over the contract. However, the new company would deal with him only as a contractor on a probationary basis.

Eager to impress his new bosses with his productivity, and to get a head start on all the other developers, the young man decided he needed a leg up in the development cycle. Using his inside knowledge of how the web site was configured, he logged onto the production web site and used an FTP program to download files he needed for a major upgrade to the site that was being planned. In doing so, he also selected the entire site contents in the download, which happened to include more than 100,000 customer records that were kept in individual archival files on the web server. The archival files were generated automatically at midnight, as a result of a subroutine that executed daily.

The next morning, the hosting provider reviewed their firewall and web server logs and noticed the large amount of data that had been accessed and notified the web site owner. They provided lists of all the files that were downloaded and the IP address from which they were downloaded.

At this point, the company contacted a law firm to obtain a court order and get the customer information associated with the IP address on the date in question from the ISP. After being contacted by the legal team, the developer denied accessing the web site and downloading the files. He then agreed to allow examiners to take a forensic image of his laptop.

All of the customer records that were on the web server had a standardized naming convention that always started with the same set of characters, so those were used as keyword search criteria. During the search, thousands of hits on the standardized naming convention were found in a partially overwritten log from the FTP program he used, WSFTP. This log recorded that the files had been downloaded directly from the company web server to a USB hard drive that was attached to the laptop that evening.

After presenting the findings to the client, the legal team feared that the 100,000-plus customer files and web site code might have been copied onto other media. They presented their arguments to the court, and a judge authorized a seizure of all media at the developer’s residence for analysis. The U.S. Marshals Service served the order on the developer, and all media in his home was taken into custody for review.

When confronted with the log, the developer admitted to the download.

During our analysis of the media seized from his home, we discovered that the USB hard drive onto which the files had originally been copied had since been reformatted. We were able to recover more than 70,000 of the customer files, and we reported to the legal team that there was no additional evidence of the files having been copied onto any other media.

In Actual Practice

Recovering log files can be as easy as finding them deleted in the directory they were created in or as complicated as recovering them from the free space. However, to do either, you have to know what kind of logs you are looking for. For example, if you are looking for FTP server logs, you would find a current log on the system, pick a unique term from it, and search for that term to locate other deleted or partial logs in the free space. Tools such as the scalpel utility can be helpful in log file recovery, because you can define a pattern that matches your log file entry and have it export all the entries that match.
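The same pattern-matching idea can be tried in a few lines of Python before you commit to a full carving run. This is a stand-in for illustration and does not use scalpel’s configuration syntax. The entry format in EXAMPLE_PATTERN is hypothetical; in a real case you would build the pattern from a sample entry in a current log written by the same program.

```python
import re

# Hypothetical entry format: a date, a time, the FTP RETR command, and a path
# made of printable ASCII. Replace it with one modeled on a real sample entry.
EXAMPLE_PATTERN = rb"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} RETR [ -~]{1,200}"


def carve_log_entries(data: bytes, pattern: bytes = EXAMPLE_PATTERN):
    """Return (offset, entry) for every match of a log-entry pattern in raw bytes."""
    regex = re.compile(pattern)
    return [
        (match.start(), match.group(0).decode("ascii", "replace"))
        for match in regex.finditer(data)
    ]
```

Because each entry is matched on its own, this recovers individual lines from a log that has been partly overwritten, which is the situation described in the scenario above. Entries cut off mid-line will not match and are better found with a plain keyword search.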

You might notice that I’ve been specific about the log type. This is because log formats vary widely, depending on what made the log. Even saying “an FTP server log” is not very specific, because there are Microsoft FTP servers, FileZilla FTP servers, Serv-U FTP servers, and so on, each with its own log format. The important thing to remember is that you need to find out what types of logs you could be interested in, find samples of the log entries, and then search for them so you can review their contents and reconstruct the suspect’s activities.

We’ve Covered

In this chapter, we’ve gone over the most common ways that employees take information with them when they leave an employer. These cases make up the bulk of our case load, and most companies are surprised to discover who the thief ends up being. The best-case scenarios involve a former employee taking contacts and knowledge to try to stay in contact with customers they are forbidden to solicit based on signed agreements. The worst-case scenarios involve employees stealing decades of intellectual property and trade secrets to compete unfairly and steal business away.

How to look for patterns and identify artifacts

Identify the most likely sources of data exfiltration.

Track down webmail to recover files being e-mailed out or sent to the cloud.

How to determine where the data went

Review Internet histories to find webmail and cloud storage sites.

Review user activity to determine data transfer and access.

How to detect which data has been taken on external devices

Use shellbag data to identify which external storage devices have been connected and when they were connected.

Determine what existed on external storage devices.
