Chapter 12. Memory Scanning and Disinfection

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 12. Memory Scanning and Disinfection

“ Have no fear of perfection, you'll never reach it.”

—Salvador Dali

Memory scanning is a must for all operating systems. After a virus has executed and is active in memory, it has the potential to hide itself from scanners by using stealth techniques¹. Even if the virus does not use a stealth technique, removing the virus from the system becomes more difficult when the virus is active in memory because such a virus can infect previously disinfected objects again and again. In addition, a file cannot be deleted from the disk as long as it is loaded in memory as a process. Similarly, a Registry key related to a malicious program cannot be deleted if the malicious code puts the same key back into the Windows Registry as soon as the keys are removed by the antivirus program.

As discussed in Chapter 5, “Classification of In-Memory Strategies,” many viruses use the directory stealth technique under Windows 95 and Windows NT. We have also seen the first implementations of Windows 95 full-stealth viruses.

In early 1998, Mikko Hypponen, Ismo Bergroth², and I discussed possible future threats for which we needed to prepare. One of the most worrying threats was the idea of a computer worm that never hits the disk. Even on-access scanners would be unable to protect systems from them because no files would be created on the disk before the worm was executed on the system. We figured that such a worm would probably use the HTTP protocol, exploiting a vulnerability of a Web server. However, our basic problem statement was even simpler: Web browsers such as Microsoft Internet Explorer render HTML content before saving files to disk. As a result, malicious code might be invoked before on-access scanners could block them.

In 2001, the W32/CodeRed worm proved this theory, followed by W32/Slammer, which used a similar approach. Without memory scanning, such threats cannot be detected by antivirus software, although some might argue that antivirus software is not the right solution to stop these threats, preferring another technology, such as intrusion prevention. Even more importantly, the memory scanner needs to make sure that an active worm copy is detected on the system. When CodeRed sends itself even to nonvulnerable Microsoft IIS systems, the body of the worm's code will be on the heap of IIS at exactly the same location where the active copy would run. I have seen AV solutions from major vendors that terminated IIS because an inactive worm copy was detected in the process address space of a nonvulnerable installation!

This chapter discusses the different ways that 32-bit viruses stay in memory as a particular process and describes possible ways of detecting and deactivating them.

At the end of 1998, we saw the first implementation of a native Windows NT virus that runs as a service: WinNT/RemEx³. Although it is possible to detect such a virus in memory even from a user-mode application, the problem becomes more difficult with a native Windows NT/2000/XP/2003 virus implemented as a device driver running in kernel mode. Such viruses cannot be detected in memory in user mode—only in kernel mode—because the system address space is protected from read and write access under Windows NT–based systems, unlike under Windows 95. This is probably the most important reason that a memory scanner should be implemented under Windows NT as a kernel-mode driver. In this chapter, I will discuss both user and kernel-mode implementations of a memory scanner under Windows NT–based systems.

12.1 Introduction

It did not take long for virus writers to realize that a virus can replicate faster if it stays active in memory, intercepting operating system calls. In fact, the very first viruses, such as Brain and Jerusalem, already stayed resident in memory.

Most successful file and boot sector infectors use all kinds of hooking strategies. A non-TSR virus has a much smaller chance of becoming in the wild under DOS. By hooking the file system functions, a virus can easily “see” the access to a particular program or system area and infect it on the fly. Of course, this means that most of the important, frequently used applications and system areas become infected very quickly. Therefore, the chance that such a virus can pass from one system to another before it is noticed by the user is much greater. Another advantage of a resident virus is that it can use stealth techniques to hide itself from scanners and integrity checkers. Full-stealth functionality is implemented in many old viruses such as Frodo, and it will be used in future 32-bit and 64-bit Windows viruses.

The Tremor virus was one of the first 16-bit DOS, full-stealth polymorphic viruses. When it is active in memory, it hides itself completely. The size of an infected application and its content both remain “virtually” the same, as long as the virus is active in the memory. While the virus is active in the memory, virus scanners cannot easily detect the infected files. An additional problem is that on-demand virus scanners access all important applications and system areas when scanning them, so the active virus can replicate to those objects during the scanning itself. (The viruses that infect files being accessed for whichever reason are called fast infectors.) Thus it was obvious to antivirus product developers that memory scanning and disinfection had to be implemented in virus scanner products.

Memory scanning was a relatively simple task to perform on DOS. Because DOS uses the Intel processors in real mode, it cannot access more than 1MB of physical memory, and it does not support virtual memory at all. Furthermore, DOS does not implement any protection mechanism for the operating system code. The actual DOS kernel and all the applications should share the same limited memory and can interfere with (accidentally overwrite) each other because they have the very same rights on the machine.

Memory scanning was easy to develop for DOS because the memory can be directly addressed and accessed by a virus scanner for both read and write operations. Most scanners did not even check whether the actual memory region had any active loaded code or data at all, but did a full signature scanning of the full physical memory, byte by byte. A few years later, thousands of virus signatures had to be located in memory, and antivirus products tried to search the active areas of memory for most signatures to speed up scanning and avoid false positives. Such memory scanners walk through the MCB (memory control block) chain. Under DOS, memory is allocated in arenas (sections of memory). Each arena begins with an arena header called MCB. Getting a pointer to the first MCB is possible only by an undocumented DOS interrupt (Int 21h/52h function). This function was first considered a DOS “internal” function and was undocumented at that time. Sadly, the need to figure out the undocumented interfaces is an everyday issue with Microsoft systems. (Not surprisingly, many undocumented interfaces also must be discovered to implement an efficient memory scanner for Windows NT–based systems.)

It was relatively easy to develop a memory scanner for DOS; it is more difficult to do the same for Windows 95 and particularly complex to implement for Windows NT. Windows NT–based systems manage “virtually unlimited” memory. The virtual address space is 4GB in total (except on 64-bit architectures). Of course, today's NT-based systems (2000/XP/2003) use around 128 to 256MB physical memory on average. Home systems with 1GB physical memory are not uncommon—they're great systems for games.

The rest of memory is virtual only, managed by the operating system by using the less costly (but much slower) storage on a hard disk. A real Windows NT memory scanner should scan the virtual address space of all running processes. Because the virtually continuous memory is not necessarily physically contiguous, a Windows NT memory scanner should scan by using virtual addresses instead of physical addresses, as DOS memory scanners used to do.

Because the Windows NT virus scanner used to be a port of the “original” DOS scanning engine, it could happen (in fact, I have seen it happen) that a fairly good Windows NT programmer blindly ported the DOS memory-scanning engine under NT. Even if it is not obvious to implement, scanning the first physical 1MB memory under an NT-based system for viruses is, of course, insufficient by itself. In the following section we will look at the basics of Windows NT virtual memory management to provide a fair understanding to how memory scanning is implemented in antivirus software.

12.2 The Windows NT Virtual Memory System

You could ask, “Why is virtual memory useful?” It certainly is not necessary; many operating systems do not use virtual memory and still manage to work. DOS does not support virtual memory, but even so, it survived on the market for almost two decades. A constant problem for developers, however, has always been the limitations of physical memory. In fact, it seems that nothing is ever enough when it comes to memory. Applications are getting larger and larger, so a number of techniques have had to be developed to handle limited physical memory situations. One of the best-known techniques is the overlay mechanism: A particular program is divided to several chunks, and only one can be actively accessed at a time. Whenever a chunk of the program is needed, it is read into physical memory, overwriting the previously loaded one in memory. The virtual memory management of the operating system is supposed to solve these problems for all running applications by dividing the memory into a set of pages. Thus a particular application need not take care of its memory management by using the old techniques.

Virtual memory has other benefits:

• Process isolation: Processes have separate address spaces and therefore do not interfere with each other.

• Memory protection: The processor is used in two modes, thus the operating system is clearly separated from the user applications.

• No memory limitation: Pages that are currently not in use should not be allocated; data can be shared between applications.

How does Windows NT implement virtual memory? Modern processors support virtual memory (VM) management. VM could be developed without processor support, but it would be very slow. When the processor is running in virtual memory mode, all addresses are assumed to be virtual addresses and must be translated to physical addresses each time the processor executes a new instruction. This is why CPU support for VM is crucial for fast system performance.

On 4GB VM systems, the CPU looks at a 32-bit address as though it were made up of three parts:

• A directory offset

• A page table offset

• A page offset

(The PAE, or Physical Address Extension, mode adds a fourth layer of indirection.)

Translating a virtual address from page directory to page frame is similar to traversing a b-tree structure where the page directory is the root, page tables are the immediate descendants of the root, and page frames are the page tables' descendants. Figure 12.1 illustrates this organization.

Figure 12.1. Page directory.

The first step in translating the virtual address is to extract the higher-order 10 bits to serve as the first offset. This offset is used to index a 32-bit value in a page of memory called the page directory. Each process has a single, unique page directory under Windows NT (which is mapped to the 0xc0300000 address under Windows NT 4 Intel platforms). The page directory itself is a 4K page, segmented into 1,024 four-byte values called page directory entries (PDEs). The 10 bits provide the exact number of bits necessary to index each PDE in the page directory (2¹⁰ bits=1,024 possible combinations).

Each PDE is then used to identify another page of memory called a page table. The second 10-bit offset is subsequently used to index a 4-byte page-table entry (PTE) in exactly the same way that the page directory does. PTEs identify pages of memory called page frames. The remaining 12-bit offset in the virtual address is used to address a specific byte of memory in the page frame identified by the PTE. With 12 bits, the final offset can index all 4,096 bytes in the page frame.

Through three layers of indirection, Windows NT offers virtual memory that is unique to each process. On IA32, the page directory has up to 1,024 PDEs, or a maximum of 1,024 page tables (without PAE enabled). Each page table contains up to 1024 PTEs, with a maximum of 1,024 page frames per page table. Each page frame has its own 4,096 one-byte locations of actual data. That gives 4GB of address space (1,024 * 1,024 * 4,096).

12.3 Virtual Address Spaces

In Windows NT, the virtual address space of the system is divided into two parts: the low 2GB user address space and the high 2GB system space (see Figure 12.2). When the CPU is running in user mode, only pages of the user address space are accessible, so applications cannot interfere with the operating system components that are accessible only in kernel mode. When a user-mode application (such as WINWORD.EXE, NOTEPAD.EXE, and others) calls an API, it first calls into a subsystem DLL. The subsystem DLL API translates the documented function to an undocumented one in the native API set as part of NTDLL.DLL. When necessary, the native API calls the Windows NT executive, and the processor is switched to kernel mode. The Windows NT standard 32-bit linear address space division is illustrated in Figure 12.2.

Figure 12.2. Standard address space division on 32-bit Windows NT–based systems.

The 4GB address space division can be changed by using Windows NT Enterprise Edition and a special boot.ini option. In this case, the user address space is 3GB, which leaves 1GB for the system address space. This is done to support applications that use very large databases and can work more efficiently this way. Windows NT Enterprise Edition address space division (/3GB)⁴ is illustrated in Figure 12.3.

Figure 12.3. Windows NT Enterprise Edition loaded with the /3GB option.

One of the new features of Windows 2000 on Alpha APX systems is the extension of the VM address space to a total of 32GB, rather than the current 4GB, called VLM⁵^, ⁶ (see Figure 12.4). The upper user space is not paged and can be used only for data, not for code.

Figure 12.4. The VLM memory layout.

In all of these models, the user address space maps a particular process at a time. Each time a user-mode application is executed, NT creates a virtual address space for the new process. The same virtual address can be used by any number of applications, but the virtual address does not necessarily point to the same physical page in memory. When process A accesses a page at 0x00400000 (the usual base address of applications) process B's page at 0x00400000 may not even be valid at all. Process A cannot interfere with process B by using the same address because it is valid only in its own context. On a single CPU, only one virtual-to-physical mapping can be in use. Each time a particular thread is scheduled for execution, a context switch occurs, changing the actual virtual-to- physical mappings to the process context in which the scheduled thread is running. To provide kernel-mode components (and drivers) with an environment in which they know that their memory references are always valid for the upper 2GB of address space, NT provides a portion of page tables that hold the same information in each context.

The Virtual Memory Manager handles the system address space differently from the user address space. (See Figure 12.5, which illustrates⁴ a normal system address space layout on IA32.) In that address space, Windows NT's code components are loaded together with all the kernel-mode drivers. Because kernel-mode drivers have the same privilege and view of the system address space, they can interfere with the operating system's code or with one another. A sample list of loaded system components is shown in Listing 12.1.

Listing 12.1. Partial List of Loaded Drivers and Their Base Addresses in a 32-bit Address Space

Figure 12.5. Normal layout of kernel-mode memory space.

Note that NTDLL.DLL (native API) appears on the loaded driver list. Though this DLL is loaded in user mode, it is strongly related to the transition of several functions to kernel mode. The native API acts as a “middle man.”

Probably the most demanding problem of virtual memory management is the paging mechanism. Windows NT has the ability to reclaim pages of memory that are no longer needed. To reclaim a page, the Memory Manager changes individual entries in the page table, marking them as invalid. If the page belongs to an executable and is not a dirty page (a page whose content has changed during execution), nothing else needs to be done but marking the page as invalid. Otherwise the changed page must be written into a file, most likely to the page file (pagefile.sys).

When the page is accessed again, a page fault is generated. Then the actual virtual address is checked, if it is available. If it is mapped in from a file (like most DLLs and applications), the page is read from the particular file in which the data exists. Otherwise, the information from a file or from a page file will be read in, and Windows NT will rerun the instruction that generated the fault.

Windows NT can share a physical page of memory between several processes. This means that several copies of an executable application will not reserve the exact same amount of memory each time. Instead, the same physical pages are seen from all views. When the contents of a page should change, not every process context will change, only the instance that needs the change. This is done by reserving a new physical page and moving the data from the copy-on-write page to the new page. Thus the change happens in the new copy.

12.4 Memory Scanning in User Mode

The first question of memory scanning is how to access a particular process' data in memory. As discussed previously, process A cannot interfere with process B. How can a user-mode scanner read the contents of all the other processes? The answer is an API called ReadProcessMemory(). This API is usually used by debuggers to control the execution of the traced application by the debugger. The ReadProcessMemory() API needs a handle to a process, which can be gotten by the OpenProcess() API and the PROCESS_VM_READ access right. OpenProcess() needs the ID of a process. From where do we get a process ID?

The answer was not obvious for quite some time because the actual DLL (PSAPI.DLL) in which documented process enumeration APIs have been placed is not part of the standard Windows NT environment. (It was introduced later in Windows 2000.) The lack of PSAPI.DLL and the missing documentation suggested to me how NT actually does this itself. Because Task Manager and several other applications can display all the running processes and their IDs, it was obvious that it is possible to do so without PSAPI.DLL. In fact, it turns out that most APIs in PSAPI.DLL are just wrappers around native service APIs placed in NTDLL.DLL, such as NtQuerySystemInformation().

The native API set is not documented by Microsoft and is mostly used by subsystems. Most applications do not link to NTDLL.DLL directly for this reason. In fact, Microsoft suggests using the documented interfaces. However, Task Manager (TASKMGR.EXE) is linked to NTDLL.DLL directly, even if the information could be obtained by using performance data. Oh, well!

Task Manager uses the NtQuerySystemInformation() native API to get a list of all running processes and their process IDs. A user-mode application can link itself to NTDLL.DLL or simply use GetProcAddress() to get the address of the API to call it.

When the process ID of a particular process is available, ReadProcessMemory() can be used to read the actual address space of that particular application. To do so, a memory scanner should know the exact location of the used pages of an application. Fortunately, the VirtualQueryEx() function provides information about the range of pages within the virtual address space of a specified process. It needs an open handle to a process and returns the attributes and the sizes of regions.

It also needs PROCESS_QUERY_INFORMATION access for this operation. Free and reserved pages can easily be eliminated with this function, and those should not be accessed, but the rest must be checked. This can be done by using the ReadProcessMemory() API on those pages.

12.4.1 The Secrets of NtQuerySystemInformation()

NtQuerySystemInformation() (NtQSI) is not documented by Microsoft, and it is not necessary to use it because a user-mode application can link itself to PSAPI.DLL, which in turn will call NtQSI. As we will see later on, however, this function can be useful in a kernel-mode implementation of a memory scanner, so it is worth talking about it a bit.

NtQSI has four 32-bit (DWORD or ULONG) parameters.

The first parameter could be named SystemInformationClass. This parameter specifies the type of information to be returned by the function. (It has several possible values; 5 specifies the running process list query.)

The second parameter is the address of the returned buffer, which should be allocated by the caller; let's call this SystemInformationBuffer.

The third parameter is the allocated size in bytes. The fourth parameter is an optional value, PULONG BytesWritten, which is the number of bytes returned in the caller.

NtQSI() returns an NTSTATUS value. When the returned value is not STATUS_SUCCESS (0), it is usually STATUS_INFO_LENGTH_MISMATCH, which means that the allocated buffer length does not match the length required for the specified information class. Therefore, NtQSI() must be called with bigger and bigger buffers in a loop until the information can be placed into the allocated buffer completely by the Windows NT kernel.

On correct return, the necessary information is placed in the buffer in the form of a linked list. The first DWORD value specifies the relative pointer of the next process block information from the start of the buffer. The DWORD value at offset 0x44 of each block is the process ID (of course this position is dependent on the platform and is different on IA64). With this ID, several additional APIs can be called, which is why it is the most important.

After all of this, here is the “hand-made” definition for NtQuerySystemInformation():

Other important information, such as the loaded images (EXE and DLLs) and their base addresses, can be examined by other native API calls by using this process ID, such as RtlQueryProcessDebugInformation(), which uses allocated buffers created by RtlCreateQueryDebugBuffer() and deallocated by RtlDestroyQueryDebugBuffer() APIs. Of course, these are all undocumented native APIs.

12.4.2 Common Processes and Special System Rights

On a typical Windows NT–based system, several processes are already running, even if the user has not logged in. The most important of these processes are the System Idle Process, the System Process, SMSS.EXE, CSRSS.EXE, WINLOGON.EXE, and SERVICES.EXE. A Windows NT scanner should scan all of these address spaces and also any other running processes executed by the user.

The trick is that some of these processes cannot be opened by OpenProcess() to get a handle for the other APIs with the necessary access. In Microsoft Press documentation⁷ (such as the Advanced Windows NT Third Edition), it is stated that some of the processes are secure processes and therefore cannot be opened for QUERY_INFORMATION or VM_WRITE operations. (These processes include WINLOGON.EXE, CLIPSRV.EXE, and EVENTLOG.EXE.) Such processes first need an additional system security privilege to be adjusted. (This information is missing from the Microsoft documentation.)

In particular, the SeDebugPrivilege privilege value must be adjusted to the SE_PRIVILEGE_ENABLED attribute. The SeDebugPrivilege is available to administrators and equivalent users or to anyone who has been granted this privilege by an administrator. However, even under an administrator account, the default attribute of this privilege is not enabled, so OpenProcess() will fail to open secured processes. To enable this privilege, the OpenProcessToken() function must specify TOKEN_ADJUST_PRIVILEGES, and then LookupPrivilegeValue() can be used to check whether the user has the privilege at all. If the user has the rights to do it, this privilege can be set to SE_PRIVILEGE_ENABLED by the AdjustTokenPrivileges() API.

It makes very good sense to protect some standard applications this way. For instance, Windows NT simply crashes if WINLOGON.EXE stops working. A modification caused by any user-mode application inside a random location of WINLOGON.EXE's address space could cause the system to crash! Of course, this would not be great. In any case, WINLOGON.EXE can even be written in memory when this privilege is enabled, but the privilege would have to be granted to all users first to scan such applications in memory. If WINLOGON.EXE were infected, the infected process could not be detected. Giving debug privileges to all users would definitely not make the system more secure. This is why a memory scanner is much better developed as a kernel-mode driver, where PROCESS_ALL_ACCESS is easily gained because drivers are running with the highest rights on a Windows NT machine.

12.4.3 Viruses in the Win32 Subsystem

This section introduces the different ways that viruses can become active as part of a particular process. Most 32-bit user-mode applications run in the Win32 subsystem, which is the most important subsystem of Windows NT. It is created and used by default and unlike the other subsystems, cannot be disabled. This is the subsystem in which Win32 viruses can be active.

The Win32 subsystem consists of the following major components: CSRSS.EXE (the environment subsystem process); the kernel-mode device driver WIN32K.SYS; and subsystem DLLs (such as USER32.DLL, ADVAPI32.DLL, GDI32.DLL, and KERNEL32.DLL), which translate documented Win32 API functions into the appropriate undocumented kernel-mode system service calls to NTOSKRNL.EXE and WIN32K.SYS. There is one other very important part of the Win32 subsystem: NTDLL.DLL, primarily for subsystem DLLs. NTDLL.DLL is used in the other subsystems of Windows NT also and by native applications that do not run in a subsystem. (Listing 12.2 shows some of the system processes—the loaded DLLs with their base addresses and sizes.)

Listing 12.2. Some System Executables with Their DLLs

12.4.4 Win32 Viruses That Allocate Private Pages

Some Win32 viruses allocate private pages for themselves with the PAGE_EXECUTE_READWRITE attribute. When the infected application is loaded, the virus code is activated from the executed application code. The virus then allocates new pages for its own use and moves its code there. Write access to those pages is important for the virus because it stores data in itself that must change, and read-only pages cannot be written to.

For instance, W32/Cabanas.3014.A⁸ allocates a 12,232-byte block that is represented as three pages (3*4,096=12,888 bytes) from the address space of the infected process (see Listing 12.3). Because Cabanas uses the MEM_TOP_DOWN flag when it allocates memory with the VirtualAlloc() function, the actual three pages will be available at the very end of the user address space, usually somewhere around the 0x7FFA0000 address.

Listing 12.3. W32/Cabanas at the Very End of the User Address Space

Cabanas hooks some of the KERNEL32.DLL APIs to itself by patching the import table entries of the host program to its own routines. Whenever the host application calls any of the hooked APIs, the virus has the chance to replicate on the fly to another application or to call its directory stealth routines.

The W32/Parvo.13857⁹ virus allocates 132,605 bytes from the address space of the infected process (more exactly: 33 pages, 135,168 bytes) because it needs a lot of memory for its polymorphic engine and for its communication modules.

W32/Parvo does not use the MEM_TOP_DOWN flag, so its allocated pages will be reserved from the first free gap of the user address space that is large enough (at address 0x002F0000 in the infected NOTEPAD.EXE in this particular example, as shown in Listing 12.4).

Listing 12.4. W32/Parvo Inside NOTEPAD's Address Space

The virus code will be active with the name of the original infected and executed application. Only one copy of the virus is active at a time. The original host will be executed as the child process of the infected application under a random name, as shown in Listing 12.5.

Because the host program will be executed almost immediately, the virus can silently infect other applications from its own process and propagate itself to other locations with its communication module, based on the use of WSOCK32.DLL APIs.

Listing 12.5. W32/Parvo Runs Original NOTEPAD.EXE as JRWK.EXE

12.4.5 Native Windows NT Service Viruses

A new class of Windows NT viruses activate by dropping executable images loaded as a native Windows NT service, as done by WNT/RemEx³ (commonly known as the RemoteExplorer). The RemEx virus runs as a user-mode service called ie403r.sys, as shown in Listing 12.6. The virus sleeps for a while and then wakes up and tries periodically to infect other applications.

Listing 12.6. WNT/RemEx Running as ie403r.sys Service

12.4.6 Win32 Viruses That Use a Hidden Window Procedure

A few viruses such as { W32,W97M} /Beast.41472.A¹⁰ install a hidden window procedure for their own use and use a timer. Timers were available back in 16-bit Windows versions, and they were sometimes used to simulate multithreaded functionality. As explained in Chapter 3, “Malicious Code Environments,” this virus runs as a complete process and uses OLE APIs to inject embedded macros and executable code (the binary virus code itself) into Office 97 documents. Because the virus can infect Office 97 documents from its active process, a macro virus-specific scanner and disinfector has a hard time removing it from documents if it cannot detect and terminate the virus in memory first.

12.4.7 Win32 Viruses That Are Part of the Executed Image Itself

W32/Heretic.1986.A was the first virus to infect KERNEL32.DLL correctly under Windows NT. KERNEL32.DLL is used by most applications; most of the crucial Win32 APIs are exported from it. When KERNEL32.DLL is infected, most executed applications will be attached to it because they need to call APIs from it.

Heretic patches the export address table of KERNEL32.DLL so that the CreateProcessA() and CreateProcessW() functions will point to the last section of the DLL where the virus code is placed, as shown in Listing 12.7.

Listing 12.7. W32/Heretic.1986.A Modifies the Export Address of CreateProcess APIs

When these functions are called by the host program, the virus has the chance to infect other applications on the fly. The virus enlarges the last section (.reloc) of KERNEL32.DLL and puts its code there, modifying the characteristics of that section to both MEM_EXECUTE and MEM_WRITE types. Listing 12.8 shows the virus code in memory at the end of an infected KERNEL32.DLL.

Listing 12.8. W32/Heretic.1986.A at the End of Infected KERNEL32.DLL in Memory

Another class of Win32 viruses stay active as part of an infected executable image, as done by the W32/Niko.5178 virus. (See Listing 12.9 for an illustration). The W32/Niko virus is activated from an infected portable executable (PE) application. The virus adds itself to the last section of the PE application and modifies the characteristics of the last section to MEM_WRITE. This allows the virus code to be modified in memory. The virus does not allocate memory for its full code but only for small data blocks whenever they are needed.

Listing 12.9. W32/Niko.5178 Virus in an Infected ASD.EXE Application in Page 0x0040F000

Niko is one of the first computer viruses to be multithreaded. The virus creates two threads for its own use, as shown in Listing 12.10. One is the trigger thread, which is supposed to display a message on a particular day; the other is the infection thread. The host program is executed after the virus creates the threads.

As long as the host program is running, the virus's infection thread will also be active. If the host application (main thread) terminates, all threads of the process will be killed by Windows NT, so the virus will be no longer active. The virus can replicate to other files only from those applications that are running and used for a longer time. In such a situation, the infection thread will infect other applications from the background.

Listing 12.10. W32/Niko.5178 Virus Creates Two Threads (68 and 123 in This Example)

12.5 Memory Scanning and Paging

With certain restrictions, a user-mode memory scanner can be developed by using the functions described previously. The scanner should be able to distinguish between the committed pages and the free pages and must do a full scan on each running process' committed pages because virus code could be placed in any of them.

Because Windows NT's Memory Manager reclaims unused pages and pages are not read in memory until they are accessed, the speed of the memory scanning will largely depend on the size of physical memory. The more physical memory a particular computer has, the faster the memory scanner will be—the number of page faults will be much higher if the computer has very limited physical memory. Figure 12.6 shows that unused pages, pages for which the access flag was cleared by the Memory Manager for some time, are reclaimed from all applications. For instance, WINLOGON.EXE's Mem usage is only 356KB, as shown in the example.

Figure 12.6. Checking memory usage before memory scanning.

Figure 12.6A shows how the memory usage of all running processes changed when SCANPROC.EXE (a user-mode memory scanner) scanned them. WINLOGON.EXE's Mem usage went up as much as 7,792K, and the number of page faults caused in the process grew to a few thousand (see Figure 12.6B and Figure 12.7B). This is a short-term side effect of memory scanning.

Whenever SCANPROC.EXE accesses a new page that is not yet in the physical memory, it causes a page fault. At that point, the Memory Manager will read the page into the physical memory, causing the memory usage (Mem Usage) to grow also. Of course, the memory usage of a process will become smaller and smaller because most pages will not be accessed again after some time, so they will be reclaimed. Windows NT's Memory Manager has several worker threads to maintain the balance of the memory usage among processes. Fortunately, memory scanning does not cause critical problems for Windows NT's memory management.

Figure 12.7. Checking memory usage during memory scanning.

12.5.1 Enumerating Processes and Scanning File Images

An alternative solution is to enumerate the running processes on the system and scan the actual files from which the content of the executables are mapped. This technique works effectively against most Win32 threats, but it cannot deal with injected code, such as CodeRed.

12.6 Memory Disinfection

This chapter would not be complete without some words about the deactivation possibilities of different virus types. A memory scanner should work closely with an on-access virus scanner and should always know the same set of viruses that are known by the file scanner components of the antivirus product. The on-access virus scanner can detect most known viruses, even if the virus code is active in some processes. But it cannot stop the virus from infecting new objects because the active virus can infect the disinfected object again. Typically antivirus software cannot detect a virus in applications before the virus code is written to them; however, a new copy of the known virus code cannot be executed as a process because the on-access scanner will be active.

A particular virus can probably become active on a machine in the following situations:

• The virus scanner has not been installed on the computer, but the virus code has already executed.

• The virus is new, and the scanner needs an update installed to detect it.

12.6.1 Terminating a Particular Process That Contains Virus Code

Probably the easiest way to deactivate the virus in memory is to kill the particular task in which the virus code is detected by the memory scanner. This can be done easily by using TerminateProcess() API and the appropriate rights (PROCESS_TERMINATE access is needed). Terminating a task is a risky procedure, however, and should be used with great care. Because active virus code is most likely attached to a user application, important user data could be lost if the infected process were simply killed. Any application could keep several database files open, which most likely could not be kept consistent if the process were killed. Consequently, TerminateProcess() should be used in situations in which the virus code is active as a separate process, such as the WNT/RemEx or W32/Parvo viruses.

Some viruses, such as W32/Semisoft variants, try to avoid termination by executing two different virus processes. Whenever one virus process is terminated, the active copy of the virus will restart the terminated one, protecting itself very efficiently. This is why memory scanning should assume an on-access virus scanner in the background that will not allow the new virus task to be executed again.

12.6.2 Detecting and Terminating Virus Threads

If a virus creates its own threads in a process, the memory scanner should be able to eliminate the threads belonging to the virus itself and terminate those threads in the process. The previously mentioned W32/Niko virus (Listing 12.10) creates two threads for itself. One thread is used for the trigger routine and will terminate by itself. The infection thread will be active as long as the process (with at least one thread of its own) is running. A thread handle is needed with the necessary THREAD_TERMINATE access to terminate a particular thread of a process.

OpenThread() is not available in the subsystem DLLs on most NT-based systems. The function is undocumented and available only from the NTDLL.DLL as NtOpenThread(). Listing 12.11 is my own, “hand-made” declaration.

Listing 12.11. "Handmade" API Definition for NtOpenThread()

To eliminate the virus threads from the clean application threads, the memory scanner should check the Win32StartAddress of each thread. Win32StartAddress is available in the performance data, but it is easier to get by using another, undocumented API. This API, called NtQueryInformationThread(), has five parameters:

• The first is a thread handle with THREAD_QUERY_INFORMATION access.

• The second parameter is the QueryWin32StartAddress class value, which is 9.

• The third parameter is the address of the return value.

• The fourth parameter is the size in bytes (four) of the information to be returned.

• The last parameter is BytesWritten, a PULONG optional value that can be NULL.

NtQueryInformationThread() will return the correct start address of a particular thread, as shown by the tlist.exe application (available in the Windows NT resource kit). (Listing 12.9 is an output of tlist.exe used on a process in which Win32/Niko virus is active.) In the example, the starts of the two virus threads are 0x0040f021 and 0x0040f01c, respectively. Both of these addresses point into the active virus image, each to a jump instruction (0xe9) that will in turn give control to the entry points of the virus thread functions.

By checking the Win32StartAddress of a thread, the memory scanner can determine whether or not a thread belongs to a virus because the start address of the thread will point into the active virus image in memory. In the case of Niko, the virus code is executed as the main thread of the host application, so the Win32StartAddress (0x0040f000) of the main thread (entry point) should not be terminated because that same thread is used by the host program. The final step is to terminate the thread with the TerminateThread() API and THREAD_TERMINATE access.

Essentially, the preceding procedure can be used safely to detect and kill CodeRed threads in the process address space of Microsoft IIS.

Listing 12.12 is a partial log of the threads inside the INETINFO.EXE process (Microsoft's IIS) after infection by both CodeRed I and CodeRed II on the same system. Any thread is identified as an active one and detected based on the signature of the virus code found at a thread start address. This ensures avoidance of potential ghost positives. (Ghost positives could result because unsuccessful worm attacks could still place worm code on the application heap in inactive form.) Attempts to freeze the detected CodeRed threads were successful in stopping the worm from spreading further and in gaining sufficient CPU time for patch installation processing.

Note the high context switch number for worm-related threads, even after only a few seconds of infection. CodeRed II infections were fresh and have a lower context switch number. Note that most CodeRed II threads have almost identical context switch values.

Listing 12.12. Two W32/CodeRed Variants and Some of Their Threads

In some tricky cases, the threads cannot be killed immediately. An increasingly common trick is to inject a thread into a standard Windows process to prevent the killing of another worm process. If the protection thread is terminated, then, the worm process immediately reinjects the thread. In this case, the thread needs to be frozen first and the process of the worm terminated before the frozen thread can be killed. But of course there are even bigger complications than this, for which there are no simple solutions.

12.6.3 Patching the Virus Code in the Active Pages

The most difficult case of deactivation is when the virus is active as part of a loaded EXE or DLL image or the virus allocates pages for itself on a per-process basis and hooks some imports of the host application to itself. In these situations, the active virus code must be patched in memory so that the virus is deactivated. This procedure must be very carefully developed because an incorrect patch of the virus code in memory could cause a new variant to be created accidentally by the memory disinfection itself.

When the virus hooks APIs to itself by patching the host application's import address table (IAT), the IAT should be fixed in each of the infected processes. This will remove the virus code from the API chain. This operation must be done very quickly. Perhaps the safest way is to suspend each thread of an infected process at the time of this fix. When the IAT is fixed, threads can be resumed. WriteProcessMemory() can be used to write into the necessary pages in this situation. The disinfection should be done from instance to instance of the virus. The protection flags of each page that need modification must first be checked. If the page has PAGE_READONLY access, the protection flag should be changed to PAGE_READWRITE. The VirtualProtectEx() function can be used with PROCESS_VM_OPERATION access in such cases.

A much more difficult case is when a particular subsystem DLL is infected by the virus, as in the case of W32/Heretic. Some other worms patch the socket communication library (WSOCK32.DLL), as done by the W32/Ska.A virus¹¹.

In the case of the W32/Heretic virus, KERNEL32.DLL is infected so that the export addresses of two APIs are patched in the file itself (not in memory only). When a particular process gets the address of such an API with the GetProcAddress() function, it will get a pointer to the virus code. Because some applications determine the addresses of certain APIs during initialization, they will “remember” such addresses as long as they are running. This is why the export address table of KERNEL32.DLL should not be fixed during memory disinfection; in some situations, the virus could be activated again regardless of this particular fix. Instead of fixing the export table, the disinfector should patch the active virus code in memory very carefully. This can be done by modifying the virus code at the entry point of its hook routines, so the control will be given to the exit of the hook functions where the virus calls the original API entry point. That way, the virus can no longer replicate. Of course, this procedure is virus-specific and needs exact identification of the virus code.

12.6.4 How to Disinfect Loaded DLLs and Running Applications

A loaded subsystem DLL is shared in memory and cannot be written to. The image can be disinfected in memory but not in the file itself because the disinfector cannot open the file for writes. The easiest solution to this particular problem is to build a list of such applications and ask the user to reboot. For instance, the disinfection can be done by a native disinfector even from user mode. A list of native Windows NT applications is executed even before any subsystem is loaded. Some of the standard Windows NT applications, such as AUTOCHK.EXE, are native applications.

An alternative solution is to build a scanner and disinfection system on top of Windows PE (Microsoft Windows Preinstallation Environment), which allows easy access to NTFS disks with clean memory. In fact, Windows PE allows many features that other systems cannot; however, WinPE needs a special license.

Yet another alternative is Bart Lagerweij's BartPE (also known as PE builder)¹².

12.7 Memory Scanning in Kernel Mode

Memory scanning in kernel mode is very similar to user mode implementation in its basic functionality. It will always be safer to perform memory scanning in kernel mode. Furthermore, a kernel-mode memory scanner can scan the upper 2GB of kernel address space for viruses. Currently only a few viruses have kernel-mode components on NT-based systems, but it is very likely that more such viruses will be developed in the future as file system filter drivers. This section explains the major problems in developing a kernel-mode memory scanner for current Win32 viruses running in user mode. I will introduce the basic procedures that are important in scanning the upper 2GB of address space for kernel-mode viruses.

12.7.1 Scanning the User Address Space of Processes

In kernel mode, the user address space scanning of each process can be done similarly to user-mode memory scanning. In fact, many system functions can be used by adapting them in kernel mode. There are several ways to get the process IDs of each running application. One possibility is to use the NtQuerySystemInformation() API, which is exported from NTOSKRNL.EXE by name and therefore is as easily callable as ZwQuerySystemInformation() (ZwQSI) from a kernel-mode driver. Of course, the function is undocumented, so the necessary declarations must be specified and included first; otherwise, the linker cannot link the driver correctly.

12.7.2 Determining NT Service API Entry Points

Unfortunately, some of the important APIs needed for memory scanning are not exported by name from the kernel (NTOSKRNL.EXE) for the use of a kernel-mode driver. When a user-mode application calls the VirtualQueryEx() API in KERNEL32.DLL, the call is redirected to the NtQueryVirtualMemory() API in NTDLL.DLL.

Surprisingly, this API is not available from the kernel (NTOSKRNL.EXE). The function is there for the use of the NTOS, but it is not exported for other drivers. Evidently, NT's designers did not consider situations when “messing” with the Virtual Manager's operations is necessary.

A driver can solve this problem in two different ways. It can be linked against NTDLL.DLL, which is the easiest way. The other possibility is to develop a function similar to the user-mode GetProcAddress()—with some important differences—that can get the function ID of a particular NT service by traversing the export table of the NTDLL.DLL in the system context. Such a function can pick up the NT service function ID, which is placed into the EAX register with a MOV instruction at the entry point on IA32 systems. This way the driver can specify the correct address of the function inside the Windows NT executive (NTOSKRNL.EXE ) as KeServiceDescriptorTable+NtServiceID.

Listing 12.13 is an example of an INT 2E function call in NTDLL.DLL, NtCreateFile().

Listing 12.13. A Sample Service Call on NT on IA32

Windows XP implements similar stubs in NTDLL.DLL in IA32, but the code uses dynamically created “trampolines.” The syscall sequence will not use an INT 2E if the processor supports the sysenter instruction. In such a case, the NTDLL functions instead call into one of the last pages of the user-mode process to execute code. The content of this page is previously generated on the fly according to the features of the processor, as shown in Listing 12.14. Indeed, this page is not part of any DLLs on the system.

Intel implemented a new instruction called sysenter in Pentium II processors. It is a faster way to switch to kernel mode, so XP saves a few CPU clocks in millions of API calls, making the system faster.

Listing 12.14. A Sample Service Call on Pentium II Processors

Note that the ID still remains available at the native API entry points (27h in this example).

12.7.3 Important NT Functions for Kernel-Mode Memory Scanning

Several functions are very useful for scanning the memory of processes.

NtQueryVirtualMemory() queries the pages of a particular process. This function is not documented, but it is only a translation of the VirtualQueryEx() API to ZwQueryVirtualMemory(), which is placed in the kernel (NTOSKRNL.EXE). Its name is shown by the Windows NT kernel debugger because the debug information contains the name of the function. This function (like several others), however, is not exported by name from the kernel (NTOSKRNL.EXE).

Other useful functions are NtTerminateProcess(), NtOpenThread(), NtSuspendThread(), NtResumeThread(), and NtProtectVirtualMemory(). Most of these functions are translations of their user-mode equivalents but remain undocumented. The header declarations must be done one by one for each of these functions. Furthermore, ZwOpenProcess() can be used to gain a handle to the processes.

12.7.4 Process Context

In NT, kernel-mode drivers run in three different classes of context⁴:

• System process context

• Specific thread (and process) context

• Arbitrary thread (and process) context

Depending on the circumstances, the lower 2GB of virtual memory maps any user process or no user process at all. The memory scanner should be able to switch to the context of a particular process to map the process to the lower 2GB of the virtual memory. One way to do this is to use the undocumented KeAttachProccess(). The necessary header declaration of this API is

VOID KeAttachProcess(
IN PEPROCESS Process
);

This kernel API first needs a PEPROCESS parameter (a pointer to an EPROCESS structure). This can be converted by another undocumented API called PsLookupProccessByProccessId() by passing a normal process ID as the first parameter¹³:

NTSTATUS
PsLookupProcessByProcessId(
IN ULONG Process_ID,
OUT PVOID *EProcess);

Whenever the kernel-mode memory scanner needs to read a page, it should switch the context to the particular process it wants to access. KeDetachProcess() returns from any context to the system context:

VOID KeDetachProcess(
VOID
);

The query function must be carefully developed to work correctly in all problematic circumstances. Because the process pages can be queried as previously described, unavailable pages should not be accessed. Otherwise, the memory scanning would be terribly slow with far too many exceptions slowing down the system.

An alternative is simply to use the ZwOpenProcess() function to get a handle to each process to be scanned.

12.7.5 Scanning the Upper 2GB of Address Space

The upper 2GB of the address space contains executable code, such as the NT executive, system drivers, and third-party drivers. The list of drivers can be queried using Object Manager functions. Alternatively, NtQuerySystemInformation() can be used with the information query class 11 (0x0B), which returns the list of loaded drivers with their base addresses.

It is not very easy to query the pages of that area because there are no API interfaces to do so. It would be feasible to query the page tables, but that leads to service pack–dependent coding and further stability concerns. The easiest solution is to check the base address of each driver and parse their structures directly in memory. Because any driver has complete access to the upper 2GB of address space, this is possible and can be done easily by parsing the section header table of each driver in memory. In principle, this is what SoftIce Debugger does to show the loaded drivers list.

Scanning the paged and nonpaged pool area is not trivial, either. The easiest solution is to find a reference to the virus code, such as a hook routine on a handler that points to the virus code from a fixed location.

12.7.6 How Can You Deactivate a Filter Driver Virus?

Such a question might sound strange because no existing virus is known to use this approach. But the method is definitely possible, and we can be sure that such a virus will be developed. (In this section, I assume that the reader has basic knowledge of Windows NT drivers.)

The problem is that filter drivers cannot be unloaded—at least this is the suggestion of Microsoft, so it should be considered a very strong opinion. File system filter drivers are attached to the device object of a particular file system driver (ntfs.sys, fastfat.sys, and so on), or they are attached to another filter driver's device object, building up a chain of filter drivers. In fact, a particular filter can be attached to many device objects of other drivers. (Figure 12.8 shows an example.)

Figure 12.8. A sample chain of filter drivers shown by OSR's DeviceTree utility.

A filter driver can be easily detached from the end of the list, but it is not safe to do so. An additional problem is that a filter driver between two other filter drivers, or between a file system driver and a filter driver, cannot be detached because this would simultaneously detach all drivers after itself on the chain. Therefore, it was necessary to find another solution. After several attempts, I found an approach that works.

The execution of a driver begins in its DriverEntry function. Within this function, filter drivers typically create a new device object (a hook device) and then attach it to the device object of the device to be filtered by calling the IoAttachDevice(), IoAttachDeviceToDeviceStack(), or AttachDeviceByPointer() functions.

File system filter drivers must support fast I/O so that they implement a FAST_IO_DISPATCH table with function pointers to their own fast I/O entry points. After performing the fast I/O filtering in a particular fast I/O hook routine, the filter driver must call the original fast I/O entry point of the driver to which the filter driver's hook device was attached. Interestingly, Windows NT itself does not save the pointer to the lower device object. Each driver must save these pointers, and it is recommended to keep this pointer in the DeviceExtension of the hook device. The DeviceExtension, however, is an absolutely driver-specific structure, and each driver can define it to its own preferred format—or not use it at all. All this makes our task more difficult.

It seems the only way to safely “deactivate” a filter driver is to “filter it” in a nonstandard way that does not let the driver receive control in any of its filtering routines. Instead, the driver to which the particular filter driver was attached must be called. To do this, the refiltering driver (DeactivatorDriver) must patch the filter driver's driver object (VirusDriver). All MajorFunction[] entries of the VirusDriver should instead point to the HookDispatch routine of the DeactivatorDriver. Additionally, the FastIoDispatch field of the VirusDriver should point to the fast I/O table of the DeactivatorDriver.

When this patch is performed correctly, the fast I/O entries of the DeactivatorDriver will get control instead of the VirusDriver's own. The major problem is that each fast I/O routine of the DeactivatorDriver should call the fast I/O routine under the VirusDriver by traversing the device object chain of the VirusDriver. The AttachedDevice field of all file system drivers' device objects must be checked to see whether a VirusDriver's hook device is attached to them. When the AttachDevice field of a file system driver's device object is equal to any of the VirusDriver's hook device object pointers, the device object pointer of the file system driver should be saved. Whenever the DeactivatorDriver's fast I/O is called, the fast I/O can be redirected to the driver to which the VirusDriver was attached. This is because the saved device object pointer will point to a device object that has a pointer to the owner's driver object. If that driver object has a fast I/O entry point for the fast I/O that has been filtered by the VirusDriver's fast I/O routine, it should be called by passing the incoming parameters to it without any modification. From then on, the fast I/O of the VirusDriver will be refiltered and deactivated.

In a similar manner, the Dispatch routine of the DeactivatorDriver must complete the Interrupt Request Packets (IRPs) of the VirusDriver or pass the IRPs to the corresponding device object with the IoCallDriver() routine.

Complicated? No doubt about it! Certainly this could be done more easily if the NT-based systems filter driver model were organized slightly better.

12.7.7 Dealing with Read-Only Kernel Memory

Windows 2000 implemented read-only kernel memory. If read-only memory is on, non-writeable pages, such as code sections of drivers, cannot be changed. This is to protect the OS kernel (and its data) and drivers from each other. However, this feature also helps computer viruses, requiring extremely careful removal.

It turns out that this feature is only active if the system has 128MB or less physical memory. In this case, the virtual memory is managed with 4KB pages, but if more memory is available, the system switches to large page mode. So far, the protection is not available in that mode.

Nevertheless, there are a couple of ways to deal with read-only memory. For example, the WP flag of the CR0 control register of the IA32 processor could be flipped during writes. This can be done in kernel mode but must be performed with special care (it is definitely a hack!). When WP is off, all pages can be written into.

12.7.8 Kernel-Mode Memory Scanning on 64-Bit Platforms

Most of the 32-bit Windows viruses can already infect 64-bit Windows systems. This is because 64-bit Windows supports 32-bit executables by default. However, 64-bit viruses have already begun to appear. It is expected that virus writers will create a lot more viruses on AMD64 and EM64T (the IA32 with 64-bit extension) systems because programming on those systems is simpler, and such systems are relatively cheap, so attackers will more likely gain access to them. Somewhat contradicting, the first 64-bit viruses appeared on the Itanium processor¹⁴.

The 32-bit processes are linked against 32-bit DLLs only and implemented as a WOW (Windows-on-Windows) system. NTDLL.DLL is 32-bit in the 32-bit process but eventually switches to a 64-bit kernel (NTOSKRNL.EXE).

In the system process, NTDLL.DLL is 64-bit. Porting the 32-bit memory scanner to 64-bit is straightforward. You can decode the entry points of the 64-bit NTDLL.DLL exports to choose the ID that is equivalent in function to the EAX value on IA32. This is what you need to decode to get an NtServiceID for memory scanning if you want to follow the 32-bit approach described in this chapter. Listing 12.15 is a 64-bit Windows syscall on the Itanium.

Listing 12.15. A System Service Call on IA64

This code can be confusing to someone unfamiliar with the Itanium processor. The actual NtServiceID is moved to the r8 register (it is 6 in this example). The long 64-bit value is moved to the r2 register. After that, you have a do-nothing operation.

This is not junk, though. The Itanium processor encodes instructions into a bundle. There can be up to three slots, three instructions in one bundle. Therefore the compiler needs to fill the space in the slot with NOPs if the next instruction cannot be encoded there. The code execution goes from bundle to bundle via IP, the instruction pointer. The instruction slots are decoded according to a mask.

Finally, the code branches to b6 (branch register), which has the value of the r2 register to complete the service call. To decode the NtServiceID, someone must decode the mov r8=6 operation that is encoded into the same bundle as the following MOVL and NOP opertations. This is the easy part.

After you have the NtServiceID, you need to understand how the GP (global pointer) register works on the Itanium. The GP is a preassigned value for accessing data within a load module. There is no global pointer on x86 architecture. It was already used, however, on RISC machines, and NT defined it long ago for the Power PC.

When a standard call is made, GP must be set by the caller. The GP value is available in the load module's header via IMAGE_DIRECTORY_ENTRY_GLOBALPTR.

To call an NTAPI function, you need to get the GP of the kernel (such as NTOSKRNL.EXE). That is a simple task because you can use ZwQuerySystemInformation() to get the base of the module easily.

You also need to know how to define a function pointer. On IA64, each API and function is defined as PLABEL_DESCRIPTOR-s (PLD)¹⁵:

typedef struct _ PLABEL_DESCRIPTOR {
ULONGLONG EntryPoint;
ULONGLONG GlobalPointer;
} PLABEL_DESCRIPTOR, *PPLABEL_DESCRIPTOR;

Thus the API you need to call dynamically must be defined as a PLD. Before making a call to the function, you need to set the GP to the kernel's (NTOSKRNL's) GP and set the EntryPoint to the corresponding address in the service descriptor table entry, which you can get with the decoded ID from NTDLL.DLL. In this way, calls to nonexported APIs become a trivial task.

Note

The AMD64 and EM64T processors do not use a GP register.

Scanning the driver spaces can be solved in a way similar to IA32 systems. See Listing 12.16 for a map snippet of the 64-bit NTOS and loaded drivers on IA64. The System32 folder is a remnant directory name that stores the 64-bit NTOS image. NTDLL.DLL remains to be loaded at the “bottom”of the user address space.

Listing 12.16. A Sample Kernel and Loaded Driver Map on 64-bit Windows on IA64

12.8 Possible Attacks Against Memory Scanning

Unfortunately, memory scanning is subject to several possible attacks. The following points illustrate a number of possible attacks, and also note some solutions.

• Encryption is a main problem, even under other operating systems such as DOS. Viruses might decrypt themselves in such a way that only a tiny window of decrypted code is available at a time.

• Attackers can use in-memory polymorphic code to confuse scanners. For example, viruses such as Whale and DarkParanoid¹⁶ used this method on DOS, and W32/Elkern¹⁷ variants used it on 32-bit Windows systems. Such viruses can be detected only by algorithmic in-memory scanning.

• Metamorphic viruses pose a similar problem. The code of such viruses also must be detected algorithmically in memory.

• An attacker can implement viral code that jumps around in the process address space of a single application or injects itself into new processes and clears itself from the previous place—like a rabbit. This confuses on-demand memory scanners. On-access memory scanning can prevent this kind of attack.

• An attacker could place virus code in multiple processes at once. In most current cases, this is an approach of retro viruses that fight back and do not allow termination. Consider an attack that has fragments of polymorphic or metamorphic routines running inside multiple host processes. The problem in both cases is that the scanner needs to have access to multiple process address spaces at the same time. Thus simultaneous access to all running process address spaces must be implemented. In this way, an algorithmic scanner can check process A and process B at the same time to make a correct decision.

• A worm can run multiple copies of itself, each one keeping an eye on the other(s). Alternatively, a single thread is injected into another process that keeps an eye on the worm process. An example of the first attack is a variant of W32/Chiton. An example of the second attack is W32/Lovegate@mm. (The first variation of this attack is based on the self protection mechanism of the “Robin Hood and Friar Tuck” programs that, according to anecdotes, were developed at Motorola in the mid-1970s¹⁸.

• The attacker can use in-memory stealth techniques by hooking the interfaces that the antivirus software will use. Some rootkits use this idea to avoid showing a malicious process on the process list. Similarly, worms can hide themselves using this approach. For example, several members of the Gaobot worm family hide their process names on the Task List, the Service Control Manager List, and even the worm image on the disk.

12.9 Conclusion and Future Work

Memory scanning and disinfection are very challenging tasks under NT-based systems. The multitasking, multithreaded environment is much more complex than DOS, so most Windows viruses are also very complex. As the number of Win32 viruses grows, the antivirus world will face more and more difficult problems. It is extremely important to study the upcoming Win32 and Win64 viruses in detail to be equipped to deal with them correctly. Scanning of the 64-bit address space on IA64, AMD64 and EM64T systems is feasible. Disinfection of the system is analogous to the challenges in Core Wars.

Among other security features, Microsoft NGSCB (Next Generation Secure Computing Base) systems¹⁹ will support sealed memory—curtaining areas of physical memory (though it remains a question when exactly Microsoft will release it). Because of this uncertainty, detail discussion of NGSCB is beyond the scope of this work. In NGSCB, the hardware is modified to allow code (so-called Nexus Agents) to run in a protected range of memory. The idea is to make it possible to hide information (secrets) from other running components on the system.

It is difficult to predict whether or not antivirus software will be able to scan the in-memory content of Nexus Agents (NCAs) because this could violate the purpose of the curtained memory. If, however, antivirus software cannot scan curtained memory, malicious code will easily enjoy the protection. Thus, if a CodeRed-like threat could exploit an NCA, it could not be detected in memory. This risk is further minimized by the NX (nonexecutable) pages featured on modern CPUs, but it might not be completely eliminated. In addition, NCAs cannot use additional DLLs, and the NCA runtime might have very limited functionality—perhaps not enough to allow an attacker to implement a computer worm.

The outcome of NGSCB remains to be seen. (Consider Figure 12.9 for illustration.)

Figure 12.9. A high-level view of NGSCB based on preliminary information from Microsoft.

References

1. Peter Szor, “Memory Scanning Under Windows NT,” Virus Bulletin Conference, pp. 325-346.

2. Ismo Bergroth and Mikko Hypponen (Data Fellows), personal communication.

3. Eugene Kaspersky (Kaspersky Labs), personal communication.

4. Peter G. Viscarola and W. Anthony Mason, “Windows NT Device Driver Development,” MachMillan Technical Publishing, 1998. ISBN: 1-57870-058-2.

5. “Virtually Unlimited Memory,” The NT Insider, March-April 1998.

6. “Virtual Memory,” The NT Insider, January-February 1999.

7. Jeffrey Richter, “Advanced Windows NT,” Microsoft Press, Redmond, Washington, 1994, ASIN: 1572315482.

8. Peter Szor, “Attacks on Win32,” Virus Bulletin Conference, 1999.

9. Peter Szor, “Parvo—One Sick Puppy,” Virus Bulletin, January 1999.

10. Peter Szor, “Beast Regards,” Virus Bulletin, June 1999.

11. Peter Szor, “Happy Gets Lucky?” Virus Bulletin, April 1999.

12. BartPE available at http://www.nu2.nu/pebuilder.

13. Sergey Belov (Kaspersky Labs), personal communication.

14. Peter Ferrie and Peter Szor, “64-bit Rugrats,” Virus Bulletin, July 2004, pp. 4-6.

15. Matt Pietrek, “Programming for 64-bit Windows,” MSDN Magazine, November 2000.

16. Eugene Kaspersky, “DarkParanoid—Who Me?,” Virus Bulletin, January 1998, pp. 8-9.

17. Peter Ferrie, “Un combate con el Kernado,” Virus Bulletin, 2002, pp. 8-9.

18. “Robin Hood and Friar Tuck,” http://catb.org/~esr/jargon/html/meaning-of-hack.html.

19. “Next Generation Secure Computing Based,” 2003, http://msdn.microsoft.com.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 12. Memory Scanning and Disinfection

Create new playlist

Sign In

Sign Up

Chapter 12. Memory Scanning and Disinfection

12.1 Introduction

12.2 The Windows NT Virtual Memory System

12.3 Virtual Address Spaces

12.4 Memory Scanning in User Mode

12.4.1 The Secrets of NtQuerySystemInformation()

12.4.2 Common Processes and Special System Rights

12.4.3 Viruses in the Win32 Subsystem

12.4.4 Win32 Viruses That Allocate Private Pages

12.4.5 Native Windows NT Service Viruses

12.4.6 Win32 Viruses That Use a Hidden Window Procedure

12.4.7 Win32 Viruses That Are Part of the Executed Image Itself

12.5 Memory Scanning and Paging

12.5.1 Enumerating Processes and Scanning File Images

12.6 Memory Disinfection

12.6.1 Terminating a Particular Process That Contains Virus Code

12.6.2 Detecting and Terminating Virus Threads

12.6.3 Patching the Virus Code in the Active Pages

12.6.4 How to Disinfect Loaded DLLs and Running Applications

12.7 Memory Scanning in Kernel Mode

12.7.1 Scanning the User Address Space of Processes

12.7.2 Determining NT Service API Entry Points

12.7.3 Important NT Functions for Kernel-Mode Memory Scanning

12.7.4 Process Context

12.7.5 Scanning the Upper 2GB of Address Space

12.7.6 How Can You Deactivate a Filter Driver Virus?

12.7.7 Dealing with Read-Only Kernel Memory

12.7.8 Kernel-Mode Memory Scanning on 64-Bit Platforms

Note

12.8 Possible Attacks Against Memory Scanning

12.9 Conclusion and Future Work

References

Table of Contents for
Chapter 12. Memory Scanning and Disinfection