© Abhijit Mohanta, Anoop Saldanha 2020
A. Mohanta, A. Saldanha, Malware Analysis and Detection Engineering, https://doi.org/10.1007/978-1-4842-6193-4_5

5. Windows Internals

Abhijit Mohanta (1) and Anoop Saldanha (2)
(1) Independent Cybersecurity Consultant, Bhubaneswar, Odisha, India
(2) Independent Cybersecurity Consultant, Mangalore, Karnataka, India

Malware misuses and manipulates OS functionality and features, so a malware analyst needs to understand the parts of the OS that malware touches. Operating systems and Windows internals are vast subjects, and you need not digest all of them. This chapter focuses on the Windows operating system fundamentals that a malware analyst needs. We cover system directories, objects, handles, mutexes, and important system processes that are (mis)used by malware. We also look at Win32 APIs and system DLLs, which are commonly used by malware to perform malicious activities.

Win32 API

In the previous chapter, you learned about DLLs, which are libraries that provide APIs. The Windows operating system provides a vast set of APIs called Windows APIs, popularly known as the Win32 API. These APIs are available on both 32-bit and 64-bit Windows OS. Software developers extensively use these APIs to create Windows software that we all use. But they are also used by malware authors to create malicious software.

As a malware analyst, you encounter a lot of APIs used by malware during all phases of analysis. Not every use of an API indicates maliciousness, because clean samples also use these very same APIs. It is important to figure out the use case and context of an API before concluding that its usage is malicious and the sample is malware. Similarly, an API that is harmless on its own might indicate maliciousness when it is used in combination with other APIs (i.e., when you see a certain sequence of API calls).

So as an analyst, don’t look at just the use of an API call, but rather at its usage and context. You also need to look at the arguments passed to the API, the sequence of API calls, and any other related context to reach a strong conclusion. But how and where do you obtain the API calls made by malware during analysis?

Obtaining API Logs

You encounter API names while performing static analysis on an executable PE file. For example, you can inspect the import table to see the APIs used by the PE file. You can also disassemble the sample to view the APIs it uses. But statically looking at these APIs won’t give you an idea about the usage and context of the API calls described in the earlier section. This is where you need dynamic analysis to execute the sample and observe its behavior, or debug and reverse engineer the sample to look at its full context.

For dynamic analysis, we use tools like APIMiner in this book to obtain API logs of a piece of malware as it runs. In Part 5, where we talk about reverse engineering samples, we use tools like OllyDbg and IDA Pro to obtain the APIs used by malware. We cover this in detail in the next set of chapters, which deal with both static and dynamic analysis of samples, and in Part 5, which talks about reverse engineering.

Now there are hundreds of Win32 APIs provided by Windows OS and its SDKs (software development kits). In the next sections, we look at how and where Windows provides these Win32 APIs and how we can obtain detailed information about these APIs, including their usage and parameters used.

Win32 DLLs

Most of the Win32 APIs are provided as part of DLLs shipped with Windows and its SDKs. These DLLs are present under the C:\Windows\System32 folder. As an exercise, open the folder and search for a DLL called kernel32.dll, which provides important APIs used by a lot of programs, as illustrated in Figure 5-1. Go ahead and open this DLL using CFF Explorer as we did in Chapter 4, and look at its export directory and other PE properties. It is a regular DLL, just one that is provided natively by Windows OS.
Figure 5-1

DLLs like kernel32.dll that contain Win32 APIs are located under System32

There are various other important DLLs provided by Windows OS natively. The following lists some of the important DLLs used by both software and malware.
  • NTDLL.DLL

  • KERNEL32.DLL

  • KERNELBASE.DLL

  • GDI32.DLL

  • USER32.DLL

  • COMCTL32.DLL

  • ADVAPI32.DLL

  • OLE32.DLL

  • NETAPI32.DLL

  • COMDLG32.DLL

  • WS2_32.DLL

  • WININET.DLL

The following lists some of the DLLs provided by the Visual Studio (VS) SDK runtime environment. An xx indicates different versions based on the various versions of VS SDK installed.
  • MSVCRT.DLL

  • MSVCPxx.dll

  • MSVBVM60.DLL

  • VCRUNTIMExx.DLL

The .NET Framework used by programs written in languages like C# and VB.NET provides its own set of DLLs. All the DLLs mentioned provide several APIs that we encounter when analyzing malware samples. Documenting all of them in this book is not feasible. In the next section, we teach you how to fish for information on a DLL and all Win32 APIs using MSDN (Microsoft Developer Network), the official developer community and portal from Microsoft that holds information on all developer resources and Win32 APIs as well.

Studying Win32 API and MSDN Docs

Given an API name, the best location to find information about it is by using MSDN, Microsoft’s portal/website for its developer community, which includes documentation for all its APIs. The easiest way to reach the MSDN docs for an API is by using Google or any other search engine with the name of the API, as seen by Figure 5-2.
Figure 5-2

Using Google search engine to reach MSDN docs for a Win32 API

As seen in the figure, it should usually take you straight to the MSDN docs for the API in its results. Clicking the first link takes you to detailed information on the CreateFile() API, as seen in Figure 5-3.
Figure 5-3

MSDN doc for CreateFile() Win32 API

The CamelCase naming style used with Win32 APIs is very descriptive of the functionality of the API. For example, the CreateFileA() API has the words create and file, which indicates that the API, when used/invoked/called, creates a file. But sometimes, the name of the API might not fully describe all its functionality. For example, the API can also open an existing file on the system for other operations like reading and writing, which you can’t figure out from the name of the API. So, names need not always be fully descriptive. But common sense and the name of the API usually are a good first step in understanding what the API intends to do.

From Figure 5-3, you have the docs for the API from the MSDN website itself, which describes the full functionality of the API. As an exercise, we recommend going through the full docs using your browser to see how the docs look, including the various information it holds and so forth.

Parameters

The parameters accepted by an API have a data type that defines the kind of data it accepts for that parameter. For example, in Figure 5-3, some of the parameters accepted by CreateFile() belong to one of these types: DWORD, LPCSTR, HANDLE. These are basic data types available in Win32. You can refer to the list of basic data types at https://docs.microsoft.com/en-us/Windows/win32/winprog/Windows-data-types.

At the same time, parameters can also accept more complex data types like structures, unions, and so forth. In the CreateFile() API, you can see that the fourth parameter, lpSecurityAttributes, accepts data of type LPSECURITY_ATTRIBUTES. If you refer back to the MSDN page for this type, you see that this is a pointer to type SECURITY_ATTRIBUTES. The structure definition for SECURITY_ATTRIBUTES is seen in Listing 5-1.
typedef struct _SECURITY_ATTRIBUTES {
  DWORD  nLength;
  LPVOID lpSecurityDescriptor;
  BOOL   bInheritHandle;
} SECURITY_ATTRIBUTES, *PSECURITY_ATTRIBUTES,  *LPSECURITY_ATTRIBUTES;
Listing 5-1

The Structure Definition for Complex Data Type SECURITY_ATTRIBUTES

As you can see, SECURITY_ATTRIBUTES is a complex data type that is made up of smaller fields that themselves are of basic data types.
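To make the layout of such a complex type concrete, the structure in Listing 5-1 can be mirrored from a scripting language. The following is a minimal sketch using Python's ctypes module. The mapping of DWORD, LPVOID, and BOOL to the ctypes types shown here is our own choice for illustration, and the field values are arbitrary.

```python
import ctypes


class SECURITY_ATTRIBUTES(ctypes.Structure):
    """Python mirror of the Win32 SECURITY_ATTRIBUTES structure."""
    _fields_ = [
        ("nLength", ctypes.c_uint32),               # DWORD: size of the structure in bytes
        ("lpSecurityDescriptor", ctypes.c_void_p),  # LPVOID: pointer to a security descriptor
        ("bInheritHandle", ctypes.c_int32),         # BOOL: nonzero if the handle is inheritable
    ]


sa = SECURITY_ATTRIBUTES()
sa.nLength = ctypes.sizeof(SECURITY_ATTRIBUTES)  # callers set this before passing the structure
sa.lpSecurityDescriptor = None                   # NULL pointer
sa.bInheritHandle = 0                            # FALSE

print("size of SECURITY_ATTRIBUTES in bytes:", sa.nLength)
```

The printed size depends on the pointer width of the Python interpreter, because the LPVOID field is 4 bytes on 32-bit builds and 8 bytes on 64-bit builds.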

It’s important to understand the parameters and their data types, because while analyzing and reversing samples both statically and dynamically, these parameters define why the API is used and if it is used for a benign or a malicious reason.

API Parameters Govern Functionality

APIs accept arguments from their caller. We use the word parameters here; strictly speaking, a parameter is the name in the API definition and an argument is the value passed in the call, but the two terms are commonly used interchangeably. For example, the CreateFileA() API takes seven parameters (as seen in Figure 5-3): lpFileName, dwDesiredAccess, dwShareMode, lpSecurityAttributes, dwCreationDisposition, dwFlagsAndAttributes, and hTemplateFile.

CreateFileA() can create a new file, but it can also open an existing file. This change in functionality from creating a file to opening a file is brought about by passing different values to the dwCreationDisposition parameter. Passing CREATE_ALWAYS as the value for this parameter makes CreateFileA() create a file. Passing OPEN_EXISTING instead makes it open an existing file and not create one.
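When you read API logs, dwCreationDisposition often shows up as a raw number rather than a name. The following is a small sketch, in Python, of how an analyst might decode it. The numeric values are the ones defined in the Windows SDK headers; the function name describe_create_file and the wording of its output are hypothetical.

```python
# Values of dwCreationDisposition as defined in the Windows SDK headers.
CREATION_DISPOSITION = {
    1: "CREATE_NEW",
    2: "CREATE_ALWAYS",
    3: "OPEN_EXISTING",
    4: "OPEN_ALWAYS",
    5: "TRUNCATE_EXISTING",
}

# Dispositions with which the call can end up creating a file.
MAY_CREATE = {"CREATE_NEW", "CREATE_ALWAYS", "OPEN_ALWAYS"}


def describe_create_file(disposition: int) -> str:
    """Summarize what a CreateFile call does for a given disposition value."""
    name = CREATION_DISPOSITION.get(disposition)
    if name is None:
        return f"unknown disposition {disposition}"
    action = "may create a file" if name in MAY_CREATE else "only opens an existing file"
    return f"{name}: {action}"


print(describe_create_file(2))
print(describe_create_file(3))
```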

ASCII and Unicode Versions of API

In Figure 5-2 and Figure 5-3, searching for CreateFile() instead gave you CreateFileA(). If you search Google and MSDN for CreateFileW(), you find the docs for that API as well. Basically, it is the same API, with the suffix character A or W as the only difference in the name. Why is this the case?

Win32 provides two versions of an API if any of the parameters of the API accepts a string. These are the ASCII and the Unicode variants of the API, which come with the suffixes A and W, respectively. The ASCII version of the API accepts an ASCII string, and the Unicode version of the API accepts a Unicode wide-character string. This can be seen in the API definitions for CreateFileA() and CreateFileW() in Listing 5-2, which differ only in the data type of the lpFileName parameter. As you can see, the ASCII variant of the API uses the type LPCSTR, which accepts ASCII strings, and the Unicode variant uses the type LPCWSTR, which accepts Unicode wide-character strings.
HANDLE CreateFileA(
  LPCSTR                lpFileName,
  DWORD                 dwDesiredAccess,
  DWORD                 dwShareMode,
  LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  DWORD                 dwCreationDisposition,
  DWORD                 dwFlagsAndAttributes,
  HANDLE                hTemplateFile
);
HANDLE CreateFileW(
  LPCWSTR               lpFileName,
  DWORD                 dwDesiredAccess,
  DWORD                 dwShareMode,
  LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  DWORD                 dwCreationDisposition,
  DWORD                 dwFlagsAndAttributes,
  HANDLE                hTemplateFile
);
Listing 5-2

The ASCII and Unicode Variants of CreateFile() API

While analyzing malware samples, you might see either the ASCII or the Unicode variant of an API being used. From a functionality and use-case perspective, it doesn’t change anything. The API still functions the same way.
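Since the two variants behave the same, it can help to treat them as one API when reviewing logs. The sketch below, in Python, shows one possible heuristic for stripping the A/W suffix, and also shows how the same file name looks in memory as an ASCII string versus a wide-character string. The helper name strip_string_suffix is hypothetical, and the suffix rule is only a heuristic, not an official naming rule.

```python
import re


def strip_string_suffix(api_name: str) -> str:
    """Drop a trailing A or W that follows a lowercase letter or digit.

    CreateFileA and CreateFileW both become CreateFile. This is a heuristic
    and could misfire on an API whose real name happens to end that way.
    """
    return re.sub(r"(?<=[a-z0-9])[AW]$", "", api_name)


path = "C:\\test1.txt"
ansi_bytes = path.encode("ascii")      # character data behind an LPCSTR argument
wide_bytes = path.encode("utf-16-le")  # character data behind an LPCWSTR argument

print(strip_string_suffix("CreateFileA"), strip_string_suffix("CreateFileW"))
print(len(ansi_bytes), len(wide_bytes))
```

The wide-character encoding uses two bytes per character here, which is why Unicode strings in memory dumps often appear with a zero byte after every ASCII letter.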

Native (NT) Version of the APIs

CreateFileA() and CreateFileW() are APIs provided by the DLL kernel32.dll. But there is another version of this API called NtCreateFile() in the DLL ntdll.dll. The APIs provided by ntdll.dll are called NT APIs and are considered low-level APIs because they are much closer to the kernel. When you call CreateFileA() or CreateFileW(), it internally ends up calling NtCreateFile() from ntdll.dll, which then calls into the kernel using SYSCALLS (covered later in the chapter).

From a malware analysis perspective, while you are analyzing and debugging samples, either while reverse engineering or via API logs in a sandbox (covered in dynamic analysis), you might see either the higher-level APIs or the lower-level NT APIs, but they serve the same purpose.
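Because a log may contain either form, an analyst can map the higher-level names onto their NT counterparts before comparing logs. The following Python sketch shows the idea with a small hand-written table. The table covers only a handful of APIs, and the function name to_native_name is hypothetical.

```python
# A small, hand-maintained table of Win32 APIs and the NT API each one wraps.
WIN32_TO_NATIVE = {
    "CreateFileA": "NtCreateFile",
    "CreateFileW": "NtCreateFile",
    "ReadFile": "NtReadFile",
    "WriteFile": "NtWriteFile",
    "VirtualAlloc": "NtAllocateVirtualMemory",
    "VirtualAllocEx": "NtAllocateVirtualMemory",
    "WriteProcessMemory": "NtWriteVirtualMemory",
}


def to_native_name(api_name: str) -> str:
    """Return the NT API behind a Win32 API, or the name unchanged if unknown."""
    return WIN32_TO_NATIVE.get(api_name, api_name)


log = ["CreateFileW", "NtCreateFile", "WriteFile"]
print([to_native_name(name) for name in log])
```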

Extended Version of an API

Some of the Win32 APIs have an extended version. The extended version of an API has an Ex suffix in its name. The difference between the non-extended and extended version of an API is that the extended version might accept more parameters/arguments, and it might also offer additional functionality. As an example, you can check MSDN for the API VirtualAlloc() and its extended counterpart VirtualAllocEx(). Both of these allocate virtual memory in a process, but VirtualAlloc() can only allocate memory in the current process. In contrast, the extra functionality of VirtualAllocEx() allows you to allocate memory in other processes as well, making it a malware favorite for code injection (covered in Chapter 10).
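The extra parameter of VirtualAllocEx() is the handle of the target process, so the value of that handle tells you whether the allocation is local or remote. The Python sketch below assumes a log that records the handle as a number, and relies on the fact that GetCurrentProcess() returns the pseudo-handle -1, which logs may show as an unsigned 32-bit or 64-bit value. The function name and the sample handle 0x1F4 are hypothetical.

```python
# The pseudo-handle for the current process is (HANDLE)-1. Depending on the
# tool and the bitness of the sample, it may be logged in any of these forms.
CURRENT_PROCESS_PSEUDO_HANDLES = {-1, 0xFFFFFFFF, 0xFFFFFFFFFFFFFFFF}


def allocates_in_remote_process(api_name: str, process_handle=None) -> bool:
    """Return True if the logged allocation targets a process other than the caller."""
    if api_name == "VirtualAlloc":
        return False  # no process handle parameter: always the current process
    if api_name == "VirtualAllocEx":
        return process_handle not in CURRENT_PROCESS_PSEUDO_HANDLES
    raise ValueError(f"not an allocation API covered by this sketch: {api_name}")


print(allocates_in_remote_process("VirtualAlloc"))
print(allocates_in_remote_process("VirtualAllocEx", 0xFFFFFFFF))
print(allocates_in_remote_process("VirtualAllocEx", 0x1F4))
```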

The Undocumented APIs

We said that all the Win32 APIs are well documented by Microsoft in MSDN, but this is not entirely true. There are many undocumented APIs in many DLLs on Windows, the most notorious being the NT APIs in ntdll.dll.

But though these APIs are not documented by MSDN and Microsoft, hackers and researchers have reverse engineered these DLLs and APIs and documented their functionality, including the NT APIs. Whenever you get an API like this, the first good place to check for it is a search engine like Google, which should direct you to some blog post by a hacker/researcher if the API is an undocumented one.

At http://undocumented.ntinternals.net, there is material that documents the functionality of all the NT APIs in ntdll.dll. Figure 5-4 shows an excerpt for the NtCreateSection() API, which is commonly used by malware for a technique called process hollowing (see Chapter 10).
Figure 5-4

Documentation for Undocumented API NtCreateSection()

Do note this is an old site, and the documentation style is quite similar to an older version of MSDN. The API parameters are annotated with IN, OUT, or both IN and OUT. IN indicates that the parameter is an input to the API, and OUT indicates that the parameter holds output used by the caller after the API executes.

Important APIs to Remember

There are a multitude of Win32 APIs available, and you encounter a lot of them as you analyze samples. We provide lists of APIs that you need to remember. For each of the APIs that appear in the lists in this section, carry out the following tasks as an exercise: if available, find the corresponding NT API, the extended Ex API, and the ASCII and Unicode variants, and then explore the parameters and the data types for each API that you find.

The following are well-known Win32 APIs that perform operations on files.

  • CreateFile

  • WriteFile

  • ReadFile

  • SetFilePointer

  • DeleteFile

  • CloseHandle

The following are well-known Win32 APIs that perform operations on the Windows registry.
  • RegCreateKey

  • RegDeleteKey

  • RegSetValue

The following are well-known Win32 APIs that perform operations on a process’s virtual memory.
  • VirtualAlloc

  • VirtualProtect

  • NtCreateSection

  • WriteProcessMemory

  • NtMapViewOfSection

The following are well-known Win32 APIs that perform operations related to processes and threads.
  • CreateProcess

  • ExitProcess

  • CreateRemoteThread

  • CreateThread

  • GetThreadContext

  • SetThreadContext

  • TerminateProcess

  • CreateProcessInternalW

The following are well-known Win32 APIs that perform operations related to DLLs.

  • LoadLibrary

  • GetProcAddress

The following are well-known Win32 APIs that perform operations related to Windows services. They are also commonly used by malware to register a service (as discussed later in the chapter).
  • OpenSCManager

  • CreateService

  • OpenService

  • ChangeServiceConfig2W

  • StartService

The following are well-known Win32 APIs that perform operations related to mutexes.
  • CreateMutex

  • OpenMutex

Behavior Identification with APIs

Both clean files and malware exhibit behavior that is the outcome of several tasks performed with the help of APIs. As a malware analyst, you encounter hundreds of APIs in logs while performing dynamic analysis and reverse engineering. But knowing the functionality of an API is not sufficient. You need to understand the context of the API, the parameters supplied to it, and the sequence of API calls it belongs to, all of which lead to an easier, faster, and stronger conclusion about whether the sample is malware.

Let’s look at an example. Process hollowing is one of the most popular techniques used by malware. As part of this technique, the malware creates a brand-new process in suspended mode. The API that creates a process is CreateProcess(). To create the process in suspended mode, the malware passes the value CREATE_SUSPENDED in the dwCreationFlags argument, which tells the API to create the process and suspend it. A clean program rarely creates a process in suspended mode. The mere use of CreateProcess() doesn’t indicate anything malicious, but the context/parameter (i.e., the CREATE_SUSPENDED argument in this API) is suspicious and warrants further investigation.
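dwCreationFlags is a bit field, so CREATE_SUSPENDED is usually combined with other flags, and the logged value is rarely exactly equal to it. The following Python sketch shows how to test for the flag. The flag values come from the Windows SDK headers, while the function name is hypothetical.

```python
# Process creation flags as defined in the Windows SDK headers.
CREATE_SUSPENDED = 0x00000004
CREATE_NEW_CONSOLE = 0x00000010
CREATE_NO_WINDOW = 0x08000000


def is_created_suspended(dw_creation_flags: int) -> bool:
    """Return True if the CREATE_SUSPENDED bit is set in dwCreationFlags."""
    return bool(dw_creation_flags & CREATE_SUSPENDED)


print(is_created_suspended(CREATE_SUSPENDED | CREATE_NO_WINDOW))
print(is_created_suspended(CREATE_NEW_CONSOLE))
```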

Similarly, consider the API WriteProcessMemory(), which allows a process to write into the memory of another remote process. If this API is used stand-alone, it doesn’t indicate maliciousness because clean programs like debuggers also make use of this API to make modifications to the memory of another process. But if you see other APIs also used along with this API like VirtualAllocEx() and CreateRemoteThread(), you now have a sequence of APIs that are rarely used by clean programs. But this sequence of APIs is commonly used by malware for code injection, and thus indicates maliciousness.
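Checking an API log for such a sequence amounts to checking whether the APIs appear in the expected order, possibly with unrelated calls in between. The following is a minimal Python sketch of that check. It assumes the log has already been reduced to a list of API names; the function name and the sample logs are hypothetical.

```python
INJECTION_SEQUENCE = ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"]


def contains_in_order(calls, pattern) -> bool:
    """Return True if every API in pattern occurs in calls, in the same order.

    Other calls may appear in between. A single iterator is shared across the
    pattern, so each search resumes where the previous match was found.
    """
    remaining = iter(calls)
    return all(any(call == wanted for call in remaining) for wanted in pattern)


suspicious_log = ["CreateFileW", "VirtualAllocEx", "ReadFile",
                  "WriteProcessMemory", "CreateRemoteThread"]
debugger_like_log = ["ReadProcessMemory", "WriteProcessMemory"]

print(contains_in_order(suspicious_log, INJECTION_SEQUENCE))
print(contains_in_order(debugger_like_log, INJECTION_SEQUENCE))
```

A match is only a lead for further investigation; a real check would also confirm that the calls share the same target process handle.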

Using Handle to Identify Sequences

Every resource on Windows is represented as an object, which can include files, processes, the registry, memory, and so forth. If a process wants to perform certain operations on an instance of any of these objects, it needs to get a reference to this object, otherwise known as a Handle to the object. These handles are used as parameters to APIs, allowing the API to use the handle to know what object it is using or manipulating.

From an API behavior correlation perspective, especially when it comes to malware analysis, the usage of handles can help us identify APIs that are part of a sequence. API calls that are part of a sequence most often end up using/sharing common handles that point to the same instances of various Windows objects.

For example, take the case of the four API calls shown in Listing 5-3. There are two calls to CreateFile(), each of which returns a handle to the file it creates. There are also two calls to WriteFile(), each of which takes as an argument the handle to the file it wants to write to, obtained from one of the earlier calls to CreateFile(). API calls (1) and (4) are part of one sequence, and API calls (2) and (3) are part of another. We identified these two sequences by looking for the common handle shared by the API calls.
1) hFile1 = CreateFile("C:\\test1.txt", GENERIC_WRITE, 0, NULL,
                    CREATE_NEW, FILE_ATTRIBUTE_NORMAL, NULL);
2) hFile2 = CreateFile("C:\\test2.txt", GENERIC_WRITE, 0, NULL,
                    CREATE_NEW, FILE_ATTRIBUTE_NORMAL, NULL);
3) WriteFile(hFile2, DataBuffer,
             dwBytesToWrite, &dwBytesWritten, NULL);
4) WriteFile(hFile1, DataBuffer,
             dwBytesToWrite, &dwBytesWritten, NULL);
Listing 5-3

Identifying API Sequences by Correlating Shared Handles Between API Calls

While analyzing malware, you find a lot of API calls, and a good first step is to identify sequences using indicators like shared handles. This technique of using handles can identify sequences across a vast range of APIs.
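The correlation done by eye in Listing 5-3 can be sketched in a few lines of Python. The record layout below (id, api, returns, handle_arg) and the handle values are made up for illustration and do not correspond to the output format of any particular tool.

```python
from collections import defaultdict

# Hypothetical API log modeled on Listing 5-3. Handle values are invented.
calls = [
    {"id": 1, "api": "CreateFile", "returns": 0x1A4, "handle_arg": None},
    {"id": 2, "api": "CreateFile", "returns": 0x1B0, "handle_arg": None},
    {"id": 3, "api": "WriteFile", "returns": 1, "handle_arg": 0x1B0},
    {"id": 4, "api": "WriteFile", "returns": 1, "handle_arg": 0x1A4},
]


def group_by_handle(api_calls):
    """Group call ids into sequences keyed by the handle they share."""
    sequences = defaultdict(list)
    for call in api_calls:
        # A call that takes a handle belongs to that handle's sequence;
        # a call that takes none (CreateFile here) starts one with its return value.
        handle = call["handle_arg"] if call["handle_arg"] is not None else call["returns"]
        sequences[handle].append(call["id"])
    return dict(sequences)


for handle, ids in group_by_handle(calls).items():
    print(hex(handle), ids)
```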

Windows Registry

Windows Registry is a tree-based hierarchical database available on Windows systems. It holds information and settings. Many of the OS components and services started on the system are driven by config/settings held in the registry. Not just the OS, but most software uses the registry to hold its own config/settings. Some parts of the registry are stored on disk, while others are created dynamically in memory by Windows after it boots up. In the next few sections, we dig into the Windows Registry and work our way around this maze.

Logical View of Registry

Windows provides a built-in registry viewer tool/software called Registry Editor, which you can start by clicking the Windows logo at the bottom left of your screen and typing regedit.exe, as seen in Figure 5-5.
Figure 5-5

Opening the Registry Editor tool on Windows

As seen in the registry editor, the registry entries are arranged in a tree structure with top-level roots known as hives, as illustrated in Figure 5-6. A good analogy for the registry is the file system, with the hives being the top-level root folders and the keys and values under them being the subfolders and files that contain various information.
Figure 5-6

The Hives as seen in the registry using the Registry Editor tool

Registry Hives

Hives are the root directories in the registry structure. There are five root hives.
  • HKEY_CLASSES_ROOT (HKCR) stores information about installed programs, like file associations (associated programs for file extensions) and application shortcuts. The contents of this hive are also found under HKEY_CURRENT_USER\Software\Classes and HKEY_LOCAL_MACHINE\Software\Classes.

  • HKEY_LOCAL_MACHINE (HKLM) stores information that is common to all users on the system. It includes information related to hardware as well as software settings of the system.

  • HKEY_USERS (HKU) this hive contains the profiles of the users loaded on the system, with one subkey per user. The list of user profiles is also recorded under HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList

  • HKEY_CURRENT_CONFIG (HKCC) this hive contains the hardware profile that the system uses at startup.

  • HKEY_CURRENT_USER (HKCU) this hive contains the information of the currently logged-in user. This hive is also stored on the disk at the location %UserProfile%\ntuser.dat, where UserProfile is the home directory of the currently logged-in user. You can print the value of UserProfile by typing the command listed in Figure 5-7 in the command prompt.

Figure 5-7

Command to obtain the value of the System Environment variable UserProfile
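Tools and reports frequently use the short hive names listed in this section (HKLM, HKCU, and so on) in place of the full names. The following Python sketch expands the abbreviation at the start of a registry path, which is handy when comparing paths from different sources. The function name expand_hive is hypothetical.

```python
HIVE_ABBREVIATIONS = {
    "HKCR": "HKEY_CLASSES_ROOT",
    "HKLM": "HKEY_LOCAL_MACHINE",
    "HKU": "HKEY_USERS",
    "HKCC": "HKEY_CURRENT_CONFIG",
    "HKCU": "HKEY_CURRENT_USER",
}


def expand_hive(registry_path: str) -> str:
    """Replace a leading hive abbreviation with the full hive name."""
    root, separator, rest = registry_path.partition("\\")
    full_root = HIVE_ABBREVIATIONS.get(root.upper(), root)
    return full_root + separator + rest


print(expand_hive("HKLM\\SOFTWARE\\Microsoft"))
print(expand_hive("HKCU"))
```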

Data Storage in Registry

The data is stored in the hives under keys and subkeys as name-value pairs. Each key or subkey can hold one or more values. Regedit displays the name of each value, the data type of the value, and the data stored in the value, as seen in Figure 5-8.
Figure 5-8

Data stored in the registry using name-value pairs under keys and subkeys

Adding Storage to the Registry

You can add your own data to the registry or modify existing data using the Registry Editor. You can also do so programmatically using Win32 APIs, such as the RegCreateKey, RegSetValue, and RegDeleteKey APIs listed earlier in this chapter. There are many more APIs for querying, adding, and modifying registry data. We leave it as an exercise to search MSDN for the various Win32 APIs that deal with the registry. Malware often uses the registry to set and modify key values, so it is important to know these APIs well.

Figure 5-9 shows how to add a new key or a name-value under a key by right-clicking a key. As you can see, it offers six data types for the values: String Value, Binary Value, DWORD (32-bit) Value, QWORD (64-bit) Value, Multi-String Value, and Expandable String Value. As an exercise, you can play around by adding new keys and subkeys, adding new name-values under the keys using the various data types, and even modifying existing registry name-values.
Figure 5-9

You can add new keys and name values under existing keys using Registry Editor
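In API logs and in the Win32 registry APIs, these value types appear as REG_* constants rather than as the Registry Editor menu labels. The Python sketch below maps one to the other and shows how the two numeric types are laid out as bytes. The constant values are those defined in the Windows SDK headers; the helper name encode_number is hypothetical.

```python
import struct

# Registry Editor menu label -> (Win32 constant name, numeric value).
REGEDIT_TYPES = {
    "String Value": ("REG_SZ", 1),
    "Expandable String Value": ("REG_EXPAND_SZ", 2),
    "Binary Value": ("REG_BINARY", 3),
    "DWORD (32-bit) Value": ("REG_DWORD", 4),
    "Multi-String Value": ("REG_MULTI_SZ", 7),
    "QWORD (64-bit) Value": ("REG_QWORD", 11),
}


def encode_number(reg_type: str, number: int) -> bytes:
    """Encode a number the way REG_DWORD and REG_QWORD data is stored (little-endian)."""
    formats = {"REG_DWORD": "<I", "REG_QWORD": "<Q"}
    return struct.pack(formats[reg_type], number)


print(REGEDIT_TYPES["DWORD (32-bit) Value"])
print(encode_number("REG_DWORD", 1).hex())
print(encode_number("REG_QWORD", 1).hex())
```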

Malware and Registry Love Affair

The registry holds rich information on the system, including the various tools installed on it, which makes it a perfect information source for malware. Malware also frequently modifies the registry by altering existing keys and name-values and by adding new keys and name-values of its own.

Altering Registry Information

Malware can modify registry information to alter system behavior in its favor, and it does so using Win32 APIs. The most common modification seen in malware is altering the registry values that execute software during system boot or user login, called run entries. Malware modifies these values so that the system automatically starts the malware at system boot or login. These techniques are called persistence mechanisms, and we cover them in detail in Chapter 8. Malware is also known to alter the registry to disable administrative and security software.
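When reviewing registry writes in a log, it helps to flag those that touch a run entry. The following Python sketch checks a key path against the two best-known run keys, Run and RunOnce under Software\Microsoft\Windows\CurrentVersion. It is deliberately incomplete, since Windows has other autostart locations, and the function name is_run_key is hypothetical.

```python
# The two classic run keys, compared case-insensitively since registry
# key names are not case-sensitive.
RUN_KEY_SUFFIXES = (
    "\\software\\microsoft\\windows\\currentversion\\run",
    "\\software\\microsoft\\windows\\currentversion\\runonce",
)


def is_run_key(key_path: str) -> bool:
    """Return True if the registry key path ends with a known run key."""
    normalized = key_path.lower().rstrip("\\")
    return normalized.endswith(RUN_KEY_SUFFIXES)


print(is_run_key("HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Run"))
print(is_run_key("HKLM\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Uninstall"))
```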

Querying Information in Registry

We already know that the registry stores a variety of system-related information, including details of the system hardware and the software tools installed on the system. If your OS is installed on a virtual machine like your analysis VM, traces of the virtual machine are present in the registry.

Malware can query these registry keys to find out if the victim OS is installed on a virtual machine. If so, the malware can assume that it is possibly being analyzed in a malware analysis VM, since VMs are more commonly used by power users like malware analysts and software developers. In this case, the malware might not exhibit its real behavior and can fool the analyst. We cover such tricks in Chapter 19.
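From the analyst's side, registry queries that mention virtualization products are worth flagging in a log. The Python sketch below does this with a simple substring match. The marker strings and the sample key paths are illustrative assumptions chosen for this example, not a complete or authoritative list of what malware checks.

```python
# Illustrative substrings that suggest a query is probing for a virtual machine.
VM_MARKERS = ("vmware", "virtualbox", "vbox")


def vm_related_queries(queried_keys):
    """Return the queried registry keys that mention a virtualization product."""
    return [key for key in queried_keys
            if any(marker in key.lower() for marker in VM_MARKERS)]


# Hypothetical list of registry keys queried by a sample.
queried = [
    "HKLM\\SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion",
    "HKLM\\SOFTWARE\\VMware, Inc.\\VMware Tools",
    "HKLM\\SOFTWARE\\Oracle\\VirtualBox Guest Additions",
]

for key in vm_related_queries(queried):
    print(key)
```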

Important Directories on Windows

A default installation of Windows has a lot of system files that are necessary for the OS to run. These files are placed in particular directories which the operating system is well aware of. The directory structure is very important so that system files and user files can be segregated and stored in an organized manner.

Malware, when executing, is known to take a deceptive approach by copying itself into various folders/directories on the system and naming itself after OS system files so that it stays on the system without getting noticed. It is useful for an analyst to know some of the important directory names and what they should contain so that they can catch any such malware behavior. Let’s go through some of these important folders on the system and what they are supposed to hold.

system32

system32, or the path C:\Windows\system32, holds most of the system programs and tools, including Notepad.exe, which we use to open text files on Windows. smss.exe, svchost.exe, services.exe, winlogon.exe, and calc.exe are some of the system programs placed in this directory by Windows. One notable exception is explorer.exe, which resides in C:\Windows.

Program Files

Program Files, or the path C:\Program Files or C:\Program Files (x86), contains software that is meant to be used by users. Whenever you install new software, it usually gets installed in this folder. Tools like Microsoft Office, browsers like Chrome and Firefox, and Adobe PDF Reader choose this directory by default during their installation process.

User Document and Settings

This category covers a set of directories that applications use to store user-specific data. Malware is known to copy itself into some of these folders, like AppData and Roaming, and execute from there. The following lists some of the folders, where <user> is your user login account name.
  • My Documents C:\Users\<user>\Documents

  • Desktop C:\Users\<user>\Desktop

  • AppData C:\Users\<user>\AppData

  • Roaming C:\Users\<user>\AppData\Roaming

Some of these paths like the AppData and Roaming are hidden by Windows and are not visible unless you enable the option to show hidden files and folders as described in the “Show Hidden Files and Folders” section in Chapter 2. Alternatively, you can access these folders by manually typing in the path in the Windows Explorer top address bar, as seen in Figure 5-10.
Figure 5-10

Accessing hidden folders directly by typing in the Path in Windows Explorer

What to Look for as a Malware Analyst

Malware commonly misuses system files and directories for its nefarious purposes in order to trick users and analysts. Such behavior of an executing malware can be observed in the analysis process by using tools like ProcMon. As an analyst, watch out for any such anomalous behavior. Keeping your knowledge updated on the real names of OS system programs, their folders, and their paths helps you quickly point out any anomaly in the standard behavior and zero in on malware.

One such behavior includes malware dropping their payloads and files into various system folders on the system to hide from users as well as analysts.

Malware is known to name itself after OS system programs to mislead analysts. But the original path where most of these programs are located is C:\Windows\system32. From a malware analysis perspective, if you see a process whose name matches one of the OS system programs, verify the path of the program to make sure it is located in C:\Windows\system32. If it runs from any other directory, the process is most likely malicious, masquerading as a system process.

Malware is also known to give itself a name similar to that of a system program, but with minor variations in spelling, to trick users and analysts. An example is svohost.exe, which looks very similar to the system program/process svchost.exe.
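Near-miss names like this can be caught automatically by comparing process names against a list of known system program names. The following Python sketch uses the standard difflib module for the comparison. The list of names is a small sample, the 0.85 similarity cutoff is an arbitrary choice for illustration, and the function name lookalike_of is hypothetical.

```python
import difflib

# A small sample of genuine system program names.
SYSTEM_NAMES = ["csrss.exe", "explorer.exe", "lsass.exe", "services.exe",
                "smss.exe", "svchost.exe", "winlogon.exe"]


def lookalike_of(process_name: str, cutoff: float = 0.85):
    """Return the system program a name imitates, or None.

    An exact match is not a lookalike, so it returns None as well.
    """
    name = process_name.lower()
    if name in SYSTEM_NAMES:
        return None
    matches = difflib.get_close_matches(name, SYSTEM_NAMES, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(lookalike_of("svohost.exe"))
print(lookalike_of("svchost.exe"))
```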

Windows Processes

By default, your Windows OS runs many system processes that are needed by it for the smooth functioning of the system. Most of these processes are created off programs located in system32. Malware can run on a system by masquerading as a system process, or in other cases, modifying existing running system processes to carry out its malicious intentions by techniques like code injection and process hollowing. It’s important for a malware analyst to identify newly created processes or make out changes in attributes of existing legitimate processes running on the system to identify malware traces.

We look at how malware modifies an existing running process in Chapter 10. Now let’s look and identify some of the important system processes and their basic attributes, which can help us set a baseline on what clean system processes and their attributes are so that we can find anomalies that identify malicious processes. The following lists some of the important system processes.
  • smss.exe

  • wininit.exe

  • winlogon.exe

  • explorer.exe

  • csrss.exe

  • userinit.exe

  • services.exe

  • lsm.exe

  • lsass.exe

  • svchost.exe

Let’s look at some of the unique and basic attributes of these system processes that uniquely identify them.

Attributes of a Process and Malware Anomalies

A process can have many attributes, some of which we have already come across in Chapter 4, like PID, parent process, the path of the executable, and virtual memory. There are more attributes that can aid our analysis. To view them, we can use tools like Windows Task Manager, Process Explorer, Process Hacker, CurrProcess, and so forth. The features of each tool are different. You might find some of the attributes available via one tool and not the other. You might have to use a combination of tools when analyzing malware. Let’s now configure Process Hacker to show additional important attributes, like session ID and path, alongside the columns it shows by default. To add or remove a column, right-click the column bar, as seen in Figure 5-11.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig11_HTML.jpg
Figure 5-11

Right-click the column bar in Process Hacker to add new attributes/columns

If you select the Choose Columns option (see Figure 5-11), it should open a window that lets you select and add/remove new attributes, as seen in Figure 5-12.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig12_HTML.jpg
Figure 5-12

Choose Columns window in Process Hacker that lets you add new attributes

Make sure that the active columns are PID, CPU, Session ID, File name, User name, Private bytes, Description, and I/O total rate, as seen in Figure 5-12. After adding the columns, you can move the columns laterally by dragging them so that they appear in the same order that we mentioned and as seen in Figure 5-13.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig13_HTML.jpg
Figure 5-13

Process Hacker after we have configured the columns and ordered them

As an exercise, play around with Process Hacker. Open the processes in the tree view (if they are not displayed as a tree, you can double-click the Name column to enable it). Go through the list of processes, check out the various session IDs, check how many processes are running with the same name, check out their paths, and so forth.

In the next few sections, let’s look at what these attributes mean and what we should look for as a malware analyst.

Process Image Path

This is the path of the program from which the process is created. By default, the binaries of the system processes should be in C:\Windows\system32. Now we know that the system32 folder contains OS system programs. While analyzing malware, if you find a process that has the name of an OS system process, but with an image file path that is not in the C:\Windows\system32 folder, you should treat it as suspicious and investigate it further.

For example, malware names itself after the system program svchost.exe, but it is copied and run by the malware from a folder that is not C:\Windows\system32, which is a dead giveaway that the process is malicious and that we have a malware infection.
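
The check described above can be sketched in a few lines of Python. This is a minimal illustration, not a detection tool: it assumes Windows is installed on C:, and the set of process names is a small illustrative subset chosen for this example.

```python
import ntpath

# Assumptions for this sketch: Windows lives on C:, and these names are an
# illustrative subset of the system processes expected to run from system32.
SYSTEM32 = r"C:\Windows\system32"
SYSTEM32_PROCESSES = {
    "smss.exe", "csrss.exe", "wininit.exe", "winlogon.exe",
    "services.exe", "lsass.exe", "svchost.exe",
}


def is_suspicious_image_path(image_path):
    """Return True if a system-process name runs from outside system32."""
    folder, name = ntpath.split(image_path)
    if name.lower() not in SYSTEM32_PROCESSES:
        return False  # not a name we keep a baseline for
    # Windows paths are case-insensitive, so compare normalized forms.
    actual = ntpath.normcase(ntpath.normpath(folder))
    return actual != ntpath.normcase(SYSTEM32)
```

For instance, is_suspicious_image_path(r"C:\Users\bob\AppData\Roaming\svchost.exe") flags the process, while the same name under C:\Windows\System32 does not.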

Process ID (PID)

The PID is a unique ID assigned to a process. You cannot infer much from it because its value is assigned by the OS when the process is created and varies from run to run. But two of the system processes have fixed PIDs: SYSTEM IDLE PROCESS has a PID of 0, and SYSTEM has a PID of 4. Only one instance of each of these processes should be running on the system. So if you notice a process with one of these names but with a PID other than the expected 0 or 4, treat the process as suspicious and investigate it further.
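
As a rough sketch of this rule, the following Python function compares a process name against the two fixed PIDs mentioned above. The lowercase names used as dictionary keys are an assumption of this example; different tools display these two processes under slightly different names.

```python
# Fixed PIDs stated in the text. The key spelling is an assumption of this
# sketch, since tools differ in how they display these two process names.
FIXED_PIDS = {"system idle process": 0, "system": 4}


def has_unexpected_pid(name, pid):
    """Return True if a fixed-PID system process name has a different PID."""
    expected = FIXED_PIDS.get(name.strip().lower())
    return expected is not None and pid != expected
```

A process named System with PID 1532 would be flagged, whereas any process whose name is not in the table is simply ignored by this check.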

Sessions (Session ID)

Windows is a multiuser operating system, and multiple users can log in at the same time. A session is created for each user who logs in the system, identified by Session ID, the fourth column in Figure 5-13.

But before Windows assigns a session to a newly logged-in user, it creates a default session 0 during startup, which is a non-interactive session. Session 1 is created for the first user who logs in, and any further user logins are assigned session numbers in increasing numerical order. No user can log in to session 0.

Now all the important startup Windows services and system programs are started under session 0. Session 0 is started when the system boots prior to user login. Most Windows system processes like svchost.exe run under session 0. The processes winlogon.exe, one of the two csrss.exe, explorer.exe, and taskhost.exe belong to the user session, while the rest of the system processes belong to session 0. This is illustrated by the process tree in Figure 5-14.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig14_HTML.jpg
Figure 5-14

Hierarchy of system processes

As a malware analyst, if you see a supposed system process like svchost.exe, smss.exe, or services.exe, or any other process that is meant to run under session 0, running under a different session, it is highly suspicious and warrants further investigation.
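
A minimal Python sketch of this session check follows. The input format (tuples of name, PID, and session ID, as you might copy them from Process Hacker) and the baseline set are assumptions of this example. The baseline reflects the system described in this chapter, and other Windows versions may differ, so treat a hit as a lead to investigate rather than a verdict.

```python
# Assumption: system processes that this chapter places in session 0.
SESSION0_ONLY = {
    "smss.exe", "wininit.exe", "services.exe",
    "lsm.exe", "lsass.exe", "svchost.exe",
}


def find_session_anomalies(processes):
    """Return the (name, pid, session_id) records that break the baseline.

    processes: iterable of (name, pid, session_id) tuples.
    """
    return [
        (name, pid, session_id)
        for name, pid, session_id in processes
        if name.lower() in SESSION0_ONLY and session_id != 0
    ]
```

Given a snapshot with one svchost.exe in session 0 and another in session 1, only the second record is returned.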

Parent Process

SYSTEM IDLE PROCESS is the first process in the system, and its direct child process is SYSTEM; they have PIDs 0 and 4, respectively. The rest of the processes are their descendants. If you draw a tree of system processes in their launch order, it should look like Figure 5-14. Do note that some of these processes, like svchost.exe, can have multiple instances running.

The figure shows some of the important Windows processes, their parents, and the session in which they are created. When you view this hierarchy in Process Hacker, you might find that some user processes have no parent process, because their parents have already exited. The only task of such parent processes is to start their children, set them up, and exit.

As a malware analyst, you might find that some malware programs run under the same name as one of the OS system programs. But you also learned that there is a tree hierarchy that should be satisfied, where some of the system processes have very specific parent processes. If you see a process with the same name as a system process, but its parent doesn’t match the process/parent tree hierarchy in Figure 5-14, the process is very likely malware.

Now you can make a counter-argument that we can also catch this by using the process image path (i.e., even though the process has the same name, its program can’t have the same image path as a system program in system32). But malware can get around this check as well: with techniques like process hollowing, even the image path of the malicious process is that of an actual system program in system32. Regardless of whether malware uses process hollowing or not, the process/parent tree hierarchy can help us figure out if there is a malicious process running on the system.
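
The parent check can be sketched in Python as follows. The EXPECTED_PARENT table is an assumption of this example: it lists a few well-known parent/child pairs rather than reproducing Figure 5-14 in full, and it covers only processes whose parents normally stay alive. The dictionary-based process records are also a hypothetical input format.

```python
# Assumption: a partial baseline of well-known parent/child pairs.
# It does not reproduce the full tree shown in Figure 5-14.
EXPECTED_PARENT = {
    "smss.exe": "system",
    "services.exe": "wininit.exe",
    "lsass.exe": "wininit.exe",
    "lsm.exe": "wininit.exe",
    "svchost.exe": "services.exe",
}


def find_parent_anomalies(processes):
    """Return (pid, name, parent_name) for processes with the wrong parent.

    processes: list of dicts with the keys 'pid', 'ppid', and 'name'.
    """
    by_pid = {p["pid"]: p for p in processes}
    anomalies = []
    for proc in processes:
        expected = EXPECTED_PARENT.get(proc["name"].lower())
        if expected is None:
            continue  # no baseline for this process name
        parent = by_pid.get(proc["ppid"])
        parent_name = parent["name"].lower() if parent else None
        if parent_name != expected:
            anomalies.append((proc["pid"], proc["name"], parent_name))
    return anomalies
```

A svchost.exe whose parent is explorer.exe is reported, even if its image path points into system32, which is exactly the case the image path check alone would miss.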

Number of Instances of a System Process

Most of the system processes have only one instance executing at any point in time. There are two exceptions: csrss.exe, which has two instances running, and svchost.exe, which can have multiple instances running. This makes svchost.exe a soft target for malware. A lot of malware names itself svchost.exe, with the idea that its process gets lost among the clean instances and thereby escapes detection by the user/analyst.

As a malware analyst, we can use the number of instances of a system process to catch malware, except in the case of svchost.exe. If we find more than two instances of csrss.exe or more than one instance of any other system process (except svchost.exe), then most likely the extra process instance(s) is malware and warrants further investigation.
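
Counting instances is easy to automate. The sketch below assumes a single logged-in user and uses an illustrative table of limits based on the numbers given above; svchost.exe is left out on purpose because it has no fixed limit.

```python
from collections import Counter

# Assumption: limits from the text, for a system with one logged-in user.
# svchost.exe is omitted because any number of instances is normal.
MAX_INSTANCES = {
    "csrss.exe": 2,
    "smss.exe": 1,
    "wininit.exe": 1,
    "services.exe": 1,
    "lsm.exe": 1,
    "lsass.exe": 1,
}


def find_extra_instances(process_names):
    """Return {name: count} for system processes that exceed their limit."""
    counts = Counter(name.lower() for name in process_names)
    return {
        name: counts[name]
        for name, limit in MAX_INSTANCES.items()
        if counts[name] > limit
    }
```

With two csrss.exe, two lsass.exe, and several svchost.exe entries in the input, only lsass.exe is reported.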

Windows Services

Services are special processes that run in the background and are managed by the OS, which can start them automatically on boot, restart them if they crash, and so on. Some services may also be launched before the user logs into the system, since they are tasked with setting up the system. You can consider services as equivalent to daemon processes on Linux.

You can see all the services registered on your system by using the Services tool, as seen in Figure 5-15.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig15_HTML.jpg
Figure 5-15

Opening Services tool on Windows that lists and manages services

With the Services tool, you can view and manage the properties of all the services registered on the system, as seen in Figure 5-16.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig16_HTML.jpg
Figure 5-16

Services tool can view and manage registered services

Now each service that is registered can either be an executable file or a DLL file. All services registered are run by the services.exe process, which takes each registered service and launches it either directly, in the case of an executable file, or by using the svchost.exe process, as seen in Figure 5-17.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig17_HTML.jpg
Figure 5-17

All services are run using svchost.exe wrapper process, with parent services.exe

Executable Service Under SVCHOST.EXE

When an executable file is registered as a service, you can view the path of this executable service using the Services tool by double-clicking the service. This opens the Properties window for the registered service, which gives you the path of the executable file that should be launched as a service, as seen in Figure 5-18.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig18_HTML.jpg
Figure 5-18

The path to an Executable File registered as a service as seen in its Properties

For an executable file that is registered as a service, you see it launched as a separate child process under svchost.exe, just like the WmiPrvSE.exe process you saw in Figure 5-17.

DLL Services under Svchost

Services can also be hosted as DLLs under svchost.exe. You can think of svchost.exe as an outer wrapper around the actual service’s DLL file that you register. If the registered service is a DLL, you will not see a separate child process under svchost.exe. Instead, the service DLL is run as a part of a new or one of the existing svchost.exe process instances, which loads the DLL into its memory and uses a thread to execute it.

To list the service DLLs that are run by a single instance of svchost.exe, you can double-click a svchost.exe instance in Process Hacker and go to the Services tab, as seen in Figure 5-19.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig19_HTML.jpg
Figure 5-19

List of DLL services currently executed by this svchost.exe process

But how do services.exe and svchost.exe get the paths to the DLLs registered as services that they should load and execute? All the registered DLL services are entered in the registry under the HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Svchost key, categorized by Service Groups, as seen in Figure 5-20.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig20_HTML.jpg
Figure 5-20

List of Service Groups that are registered on the system

The netsvcs service group is registered and holds multiple service DLLs. This netsvcs service group is the same service group that Process Hacker identifies in Figure 5-19. Each service group has a list of DLL services registered under it, as you can see in its value: AeLookupSvc CertPropSvc.

The path to the DLL registered for each service in this Service Group can be obtained from the ServiceDll value under the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\<service_name>\Parameters key, as seen in Figure 5-21, where <service_name> can be AeLookupSvc, CertPropSvc, and so forth.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig21_HTML.jpg
Figure 5-21

List of Service Groups registered
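
The two registry lookups described above can be chained together with Python’s standard winreg module. The following is a sketch under stated assumptions: the function names are made up for this example, read_group_dlls runs only on Windows, and a service listed in a group without a ServiceDll value is simply recorded as None.

```python
SVCHOST_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Svchost"
SERVICES_KEY = r"SYSTEM\CurrentControlSet\Services"


def service_parameters_key(service_name):
    """Build the HKLM subkey that holds the ServiceDll value of a service."""
    return "\\".join([SERVICES_KEY, service_name, "Parameters"])


def read_group_dlls(group):
    """Windows only: map each service in a service group to its ServiceDll."""
    import winreg  # imported here so the helper above works on any platform

    hklm = winreg.HKEY_LOCAL_MACHINE
    with winreg.OpenKey(hklm, SVCHOST_KEY) as key:
        # The group value is a multi-string, returned as a list of names.
        services, _ = winreg.QueryValueEx(key, group)
    dlls = {}
    for name in services:
        try:
            with winreg.OpenKey(hklm, service_parameters_key(name)) as key:
                dlls[name], _ = winreg.QueryValueEx(key, "ServiceDll")
        except OSError:
            dlls[name] = None  # listed in the group, but no ServiceDll found
    return dlls
```

On a Windows analysis VM, read_group_dlls("netsvcs") returns a dictionary you can scan for DLL paths that look out of place.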

Malware as Windows Services

Malware commonly registers itself as a service, either as an executable service or a DLL service. They do this because services provide an OS native method of managing the malware, making sure that it can start on system boot, restart if it crashes, and so on. Services provide a tried-and-tested persistence mechanism for malware. It is an added bonus if it is loaded by svchost.exe, which is a system process, thereby escaping the curious eyes of casual users and analysts.

The three most popular ways that malware registers services are by using the regsvr32.exe command, the sc.exe command, or programmatically by using Win32 APIs. The usage of the regsvr32.exe and sc.exe commands to register a service is shown in Listing 5-4.
sc.exe create SNAME start= auto binpath= <path_to_service_exe>
where, SNAME is the name of the new service
regsvr32.exe <path_to_service_dll>
Listing 5-4

Command-Line Tools to Register a Service

The following are some of the registry keys in which service entries are made by the system.
  • HKLM\SYSTEM\CurrentControlSet\services

  • HKLM\Software\Microsoft\Windows\CurrentVersion\RunServicesOnce

  • HKLM\Software\Microsoft\Windows\CurrentVersion\RunServices

As an exercise, let’s now try registering a service, Sample-5-1 from the samples repo. This service opens Notepad.exe as a process. To carry out this exercise, add the .exe extension to the sample and copy the file into C:\, after which it has the path C:\Sample-5-1.exe. To create and start the service, run the commands shown in Figure 5-22. Make sure that you open the command prompt in administrator mode before running the commands.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig22_HTML.jpg
Figure 5-22

Registering and starting a service using sc.exe command

You can confirm that the service is now registered from the Services tool, as seen in Figure 5-23.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig23_HTML.jpg
Figure 5-23

You can verify that BookService is registered in the Services tool

You can also verify that our service entry has been made in the registry at the HKLM\SYSTEM\CurrentControlSet\services\BookService path, as seen in Figure 5-24.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig24_HTML.jpg
Figure 5-24

You can verify that the registry entry for BookService is now created

You can right-click the BookService entry in the Services tool in Figure 5-23, and then click Start. Then open Process Hacker to verify that the service created Notepad.exe as a process. Do note that you can’t see the Notepad.exe GUI (graphical user interface) because, for security reasons, Windows doesn’t allow services to interact with the user via a GUI. Since Notepad.exe is a graphical tool, Windows creates the Notepad.exe process but without displaying it. We can still confirm the creation of Notepad.exe as a process using Process Hacker, as seen in Figure 5-25.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig25_HTML.jpg
Figure 5-25

Process Hacker displays that Notepad.exe was started by BookService

When analyzing a malware sample, watch if the sample registers itself as a service using any commands like regsvr32.exe and sc.exe. If it does, trace the exe path or the DLL path to the file registered as a service. Usually, malware registers secondary payloads/binaries as a service, and these secondary components may need to be analyzed separately. The use of these commands by malware to register a service can be obtained by tools like ProcMon or by looking at the strings in memory, which we explore in a later chapter.
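
Once you have collected command lines (from a process monitor log or from strings in memory), spotting a service registration and extracting the registered binary can be scripted. The following Python sketch handles only the sc.exe create form from Listing 5-4; the regular expression and the function name are assumptions of this example, and real-world command lines may need more cases.

```python
import re

# Matches "sc create <name> ... binpath= <path>" as in Listing 5-4.
# The name and the path may each be wrapped in double quotes.
SC_CREATE = re.compile(
    r'\bsc(?:\.exe)?\s+create\s+(?P<name>"[^"]+"|\S+)'
    r'.*?\bbinpath=\s*(?P<path>"[^"]+"|\S+)',
    re.IGNORECASE,
)


def parse_sc_create(command_line):
    """Return (service_name, binary_path), or None if nothing matches."""
    match = SC_CREATE.search(command_line)
    if match is None:
        return None
    return match.group("name").strip('"'), match.group("path").strip('"')
```

For a hypothetical command line such as sc.exe create BookService start= auto binpath= C:\Sample-5-1.exe, the function returns the service name BookService and the path C:\Sample-5-1.exe, which is the file you would then pick up for separate analysis.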

Also, keep an eye out for any of the service-related Win32 APIs that can register a service. The Win32 APIs used by malware can be obtained with an API tracer like APIMiner.
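
As a small illustration, the following Python sketch filters a list of API names for service-related Win32 APIs. The input is assumed to be a plain list of API names that you have already extracted from a tracer’s log; the log format itself is not modeled here, and the API set is a short, non-exhaustive selection.

```python
# A short, non-exhaustive selection of service-related Win32 APIs,
# written without the trailing A (ANSI) or W (wide) suffix.
SERVICE_APIS = {
    "OpenSCManager", "CreateService", "StartService", "ChangeServiceConfig",
}


def service_api_calls(api_names):
    """Return the service-related API names found in the input, in order."""
    hits = []
    for api in api_names:
        # Treat CreateServiceA and CreateServiceW as the same API.
        base = api[:-1] if api.endswith(("A", "W")) else api
        if base in SERVICE_APIS:
            hits.append(api)
    return hits
```

Seeing OpenSCManager followed by CreateService and StartService in this filtered list is a sequence worth a closer look at the arguments, especially the binary path passed to CreateService.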

Syscall

The kernel is the core functional unit of an OS, and it is the part of the operating system that interacts directly with the hardware. Writing code that interacts with hardware is a tedious task for programmers, who might need to know a lot of details about the hardware, like its specifications, before writing such code. The OS usually talks to the hardware via device drivers, which are usually loaded in kernel mode.

Now user space programs are not allowed to interact with these devices directly since it is dangerous. At the same time, access to this hardware must be shared across multiple users/processes on the system. To allow user space to talk to these devices, the kernel makes syscalls available. Syscalls talk to the actual hardware resources via the drivers, but in a controlled manner, thereby protecting them. Using a syscall as the communication interface protects against incorrect usage of important resources of the system and the OS, since the kernel validates the input parameters to the syscall and makes sure they are acceptable to the resource. The transition from user space code to kernel space is illustrated in Figure 5-26.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig26_HTML.jpg
Figure 5-26

The User to Kernel Mode transition using SYSCALLS

Mutants/Mutex

In the section Using Handle To Identify Sequences earlier in this chapter, we describe objects and handles. Everything in Windows is an object. One such important object is a Mutex.

A mutex is a synchronization object with which two or more processes or threads can synchronize their operations and execution. For example, a program wants to make sure that at any point in time only a single instance of its process is running. It achieves this by using a mutex. As the process starts, it programmatically checks if a mutex with a fixed name (e.g., MUTEX_TEST) exists. If it doesn’t, it creates the mutex. Now, if another instance of the same program comes up, its check finds that the mutex named MUTEX_TEST already exists, since the first instance, which created the mutex, is still running. This causes the second instance to exit.

Malware uses mutexes for the exact use case we just described. A lot of malware doesn’t want multiple instances of itself to run, probably because it doesn’t want to reinfect the same machine again. The bigger reason is that it is pointless to have multiple instances of the malware running.
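
The single-instance logic above can be modeled with a short Python simulation. This is a toy model, not Windows code: the NamedMutexNamespace class is a made-up stand-in for the OS’s table of named objects. On Windows, a program would typically call CreateMutex and then check whether GetLastError returns ERROR_ALREADY_EXISTS.

```python
class NamedMutexNamespace:
    """Toy stand-in for the OS's table of named mutex objects."""

    def __init__(self):
        self._names = set()

    def create_mutex(self, name):
        """Return True if the mutex was newly created, False if it existed."""
        already_exists = name in self._names
        self._names.add(name)
        return not already_exists


def start_instance(namespace, mutex_name="MUTEX_TEST"):
    """Mimic a program that allows only one running instance of itself."""
    if namespace.create_mutex(mutex_name):
        return "running"
    return "exiting: another instance is already running"
```

Calling start_instance twice against the same namespace returns "running" for the first call and the exit message for the second, mirroring the behavior of the two instances described above.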

When analyzing malware, we watch out for mutexes created by looking at the Handles tab in Process Hacker, where if a mutex is present, it lists it as a handle. Alternatively, under dynamic analysis, you can figure out if malware is using a mutex when it calls certain Win32 APIs.

As an example, let’s try Sample-5-2 from the samples repository. Add the .exe extension to this sample, and double-click Sample-5-2.exe to run it as a process. The output is seen in the upper half of Figure 5-27.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig27_HTML.jpg
Figure 5-27

The output from the first and second instances of Sample-5-2.exe

It succeeds in creating the mutex and holds onto it. The Handles tab in Process Hacker also shows this mutex, as seen in Figure 5-28.
../images/491809_1_En_5_Chapter/491809_1_En_5_Fig28_HTML.jpg
Figure 5-28

The mutex handle visible in the Handles tab in Process Hacker for Sample-5-2.exe

Now run the same Sample-5-2.exe again by double-clicking it, but without killing the previous instance of Sample-5-2.exe. The output of this second instance can be seen in the bottom half of Figure 5-27, which shows that it failed to run because it found that another running instance had already created the mutex.

Summary

The chapter continues from where we left off in Chapter 4. In this chapter, we covered Win32 APIs and how to obtain the documentation for a Win32 API using MSDN. We have also covered how to obtain information for undocumented Win32 APIs, which are commonly used by malware.

You learned about the Windows Registry, the database used for storing settings, and other information provided by Windows. We explored how to alter/modify the registry and how malware misuses the registry for its operations. You learned about the various system programs and directories available on the system and how they are misused by malware to hide in plain sight. We have also covered the various attributes of system processes using which we can establish a baseline to identify malicious processes running on the system. You learned about Windows Services, another feature provided by Windows OS that malware use to manage their processes and persist on the system.

We covered objects, handles, and mutexes. You learned how to identify mutexes by using tools like Process Hacker. Finally, we covered system calls and how user space programs talk to the kernel space by using them.
