2.3 Components of an antivirus engine

The file scanner is one of the most important components of an antivirus engine. It can identify various file formats (see section 2.1 File Format in Chapter 1, Malware from Fun to Profit) and parse them to retrieve more data. In simple terms, a file scanner performs static analysis (see section 1. Static Analysis in Chapter 2, Malware Analysis Fundamentals) on a file.

Every file, including executables, has static properties. The static properties of an executable are those that you can view without executing the file. The static properties of a Windows PE executable can be viewed with many tools. CFF Explorer is one such tool, and it lets you explore many of these properties:

Static properties in CFF Explorer

CFF Explorer shows many properties of a PE file, such as the Optional Header and the Section Headers. It also embeds other tools: Resource Editor can be used to view the resource section of a PE file, and Quick Disassembler shows the disassembly of the file as it is stored on disk.

The disassembly seen in memory (when the executable is running) can differ from the disassembly of the file on disk. This happens when the file contains packed or encrypted data that is unpacked or decrypted at runtime.

An antivirus has a disassembler engine and a PE file format parser that can extract disassembled code and PE fields from a PE executable. It matches these against the signatures it has been provided with. An antivirus engine also contains code to parse other file formats, such as Java, PDF, ELF, DOC, and so on.
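To make the idea of extracting PE fields concrete, here is a minimal sketch in Python that reads a few static fields from a PE image held in memory. It is not how any particular antivirus engine is implemented; the function name parse_pe_header and the returned dictionary keys are chosen here purely for illustration, and only the DOS header, the PE signature, the COFF file header, and the section names are read.

```python
import struct


def parse_pe_header(data: bytes) -> dict:
    """Extract a few static fields from a PE image (illustrative helper)."""
    # The DOS header starts with the magic bytes "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ magic")

    # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")

    # The 20-byte COFF file header follows the 4-byte signature.
    (machine, n_sections, timestamp, _symtab, _nsyms,
     opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)

    # The section table follows the optional header; each entry is 40 bytes
    # and begins with an 8-byte, NUL-padded section name.
    table = pe_offset + 4 + 20 + opt_size
    sections = []
    for i in range(n_sections):
        raw = data[table + i * 40: table + i * 40 + 8]
        sections.append(raw.rstrip(b"\x00").decode("ascii", errors="replace"))

    return {
        "machine": machine,
        "timestamp": timestamp,
        "characteristics": characteristics,
        "sections": sections,
    }
```

A signature could then be expressed as a condition over these fields, for example an unusual section name combined with a particular timestamp.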

Well, why are we talking about this?

A malware signature in an antivirus is usually written using a combination of static properties. With the help of packers, a single piece of malware can generate many executable files that vary in their static properties. If a malware signature is based on a hash (we talked about hashes in the Hash algorithms section) of the complete file, altering even a single character in the file changes the hash calculated for it, so the file evades that signature. We have come across malware that just appends a few characters to the end of the file in order to evade the signature.
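The weakness of a whole-file hash signature can be sketched in a few lines of Python. The sample bytes below are stand-ins rather than real malware, SHA-256 is picked only as an example hash, and the helper names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the whole file content as hex."""
    return hashlib.sha256(data).hexdigest()


def matches_hash_signature(data: bytes, known_bad_hashes: set) -> bool:
    """A whole-file hash signature matches only byte-identical files."""
    return sha256_hex(data) in known_bad_hashes


# Stand-in bytes playing the role of a known malicious file.
original = b"MZ" + b"\x90" * 64
signatures = {sha256_hex(original)}

# The same content with a few characters appended at the end.
modified = original + b"AAAA"

print(matches_hash_signature(original, signatures))
print(matches_hash_signature(modified, signatures))
```

The appended bytes do not change what the file does, yet the digest is completely different, so the lookup in the signature set fails for the modified copy.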

Then the question is, why doesn't antivirus use behavior signatures?

Retrieving behavior information about an executable on a desktop involves a lot of programming complexity and consumes a lot of resources. We mentioned in Chapter 2, Malware Analysis Fundamentals, that decrypted malware contents can be seen in virtual memory. Most antivirus products can scan process memory, but since this method consumes a lot of computer resources, the memory scanner module is triggered only under certain conditions.

An antivirus engine can also have an unpacker that can unpack an executable without executing it. You might have seen that 7-Zip or WinRAR can extract files from a zip archive. The unpacker engine works in a similar way: it extracts the unpacked executable from a packed executable without executing it. Usually, the unpacker does this by recognizing the packer and its compression algorithm and then applying the corresponding decompression algorithm. So, in order to unpack a sample, the unpacker engine needs a signature that identifies the packer and its algorithm; without one, unpacking fails. Unfortunately, there are lots of packers with various compression algorithms, so it's hard to write an unpacker for each of them in an antivirus engine.
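The recognize-then-decompress flow can be illustrated with a toy example. The sketch below uses general-purpose compression formats (gzip, bzip2, xz) from the Python standard library as stand-ins for executable packers, so it is an analogy rather than a real unpacker; the table KNOWN_FORMATS and the function static_unpack are names invented for this example.

```python
import bz2
import gzip
import lzma

# Each entry: (magic bytes that identify the format, name, decompressor).
# These play the role of "packer signatures" in this toy example.
KNOWN_FORMATS = [
    (b"\x1f\x8b", "gzip", gzip.decompress),
    (b"BZh", "bzip2", bz2.decompress),
    (b"\xfd7zXZ\x00", "xz", lzma.decompress),
]


def static_unpack(blob: bytes):
    """Identify the format by its signature and decompress without executing.

    Returns (format_name, unpacked_bytes), or None when no signature matches,
    which mirrors an unpacker failing on an unknown packer.
    """
    for magic, name, decompress in KNOWN_FORMATS:
        if blob.startswith(magic):
            return name, decompress(blob)
    return None
```

Supporting a new format means adding both a signature and a matching decompression routine to the table, which is why covering every packer in circulation is costly.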

An antivirus also has an anti-rootkit engine. We talked a little about rootkits and API hooking in Chapter 1, Malware from Fun to Profit.

A lot of rootkits rely on API hooking. Anti-rootkit engines try to identify these hooks in order to detect a rootkit. When a malicious hook is installed, a malware module is installed in the system. The hooking code in the API redirects control to this malicious module and, after the malicious operation has been performed, control returns to the original API:

API hook

The preceding diagram shows how an API hook works. The main program tries to call the FindFirstFile() API in kernel32.dll. The hook transfers control to the malicious DLL, which the malware has injected into the main program. The malicious code is then executed, and control is transferred back to the main program. To identify a hook, the anti-rootkit engine tries to find out whether the code in the API transfers control to another module, in this case the injected DLL.
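The check described above can be sketched for one simple case: an inline hook that overwrites the first bytes of an API with an x86 near jump (opcode 0xE9 followed by a 32-bit relative offset). The sketch assumes a 32-bit process and that the caller has already read the function's first bytes and knows the address range of the module that owns the function; the function name find_inline_hook and its parameters are hypothetical, and real engines look at other hook types as well.

```python
import struct


def find_inline_hook(prologue: bytes, func_addr: int,
                     module_base: int, module_size: int):
    """Return the jump target if the function begins with a JMP rel32 that
    leaves its own module, otherwise None (illustrative helper)."""
    # 0xE9 is the x86 opcode for a near JMP with a 32-bit relative offset.
    if len(prologue) < 5 or prologue[0] != 0xE9:
        return None

    # The offset is signed and relative to the next instruction,
    # which starts 5 bytes after the beginning of the JMP.
    (rel,) = struct.unpack_from("<i", prologue, 1)
    target = (func_addr + 5 + rel) & 0xFFFFFFFF  # assumes 32-bit addresses

    # A jump that stays inside the owning module is not treated as a hook.
    inside = module_base <= target < module_base + module_size
    return None if inside else target
```

A non-None result only says that control leaves the module at the very first instruction; deciding whether the destination is malicious still requires identifying which module owns the target address.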

When malware is identified by a particular signature, a cleaning procedure (code) is written specifically for that signature. The cleaning procedure can include the following:

Delete or quarantine the malicious file and any files created by the malware
Terminate the malicious process
Clean registry entries created by the malware, such as run entries
Remove hooks or rootkits related to malware
In the case of a file infector, try to remove the additional code (for file infectors, refer to Chapter 1, Malware from Fun to Profit)

Traditional antivirus relies on signatures based on the static properties of a file, which malware can easily alter. This has been the biggest weakness of antivirus products. Some products now use machine learning to detect malware. While a conventional pattern-matching signature looks for an exact match of the pattern, a machine learning algorithm looks for the closest match rather than an exact one. Hence, machine learning can catch more malware. It has disadvantages of its own, though: a machine learning model is usually hard to train, and a huge dataset is needed to train it.
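The difference between an exact match and a closest match can be shown with a toy example. The sketch below is not a production detection model: it is a one-nearest-neighbour lookup over byte bigrams using Jaccard similarity, chosen here only because it is short, and the sample strings and labels are invented for illustration.

```python
def bigrams(data: bytes) -> set:
    """Set of all 2-byte sequences in the data, used as simple features."""
    return {data[i:i + 2] for i in range(len(data) - 1)}


def jaccard(a: set, b: set) -> float:
    """Similarity between two feature sets, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def exact_match(data: bytes, pattern: bytes) -> bool:
    """Conventional signature: the pattern must appear byte for byte."""
    return pattern in data


def nearest_label(data: bytes, labelled: list) -> str:
    """Return the label of the most similar labelled sample (1-NN)."""
    feats = bigrams(data)
    best = max(labelled, key=lambda item: jaccard(feats, bigrams(item[0])))
    return best[1]


# Invented stand-in samples with labels.
training = [
    (b"connect; download payload; add run key", "malicious"),
    (b"open document; print page; close", "benign"),
]
pattern = b"download payload"

# A slightly altered variant of the first sample.
variant = b"connect; download pay1oad; add run key!"

print(exact_match(variant, pattern))
print(nearest_label(variant, training))
```

The one-character change is enough to defeat the exact pattern, while the variant still shares most of its features with the malicious sample and is therefore assigned that label.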

Here are a few points related to prevention using antivirus:

An antivirus should be updated regularly. During a malware outbreak, antivirus updates can arrive at short intervals of a few hours. Administrators should stay alert and make sure the antivirus is updated. Antivirus updates usually consist of malware signatures.
Real-time protection should be enabled in an antivirus. As mentioned earlier, the antivirus scans a file before it is even written to disk or executed. If a signature for the malware is available, the antivirus prevents further infection. The following screenshot shows the real-time protection provided by Windows Defender, the default antivirus that ships with the Windows operating system. Other antivirus products can also be installed alongside it:

Windows Defender real-time protection

A full system scan should be run regularly. At least once a day is recommended.
Contact your antivirus vendor if you see anything suspicious that the antivirus is not able to catch.

The following are a few references for students who are keen to understand the internals of antivirus software. We won't go into the details of these, as doing so requires an understanding of many other concepts related to operating system internals and programming:

The ClamAV project is a famous open source antivirus. Here is the link: https://www.clamav.net/.
Here is another sample project from Microsoft that comes with a Microsoft Driver Development kit: https://github.com/Microsoft/Windows-driver-samples/tree/master/filesys/miniFilter/scanner. This can be used as a framework to build an antivirus with real-time protection.
If you want to learn how to write an anti-rootkit, ARKIT is an open source anti-rootkit tool. Here is a reference on Google Code: https://code.google.com/archive/p/arkitlib/. An understanding of Windows internals and driver programming is recommended before reading this code.
