© Abhijit Mohanta, Anoop Saldanha 2020
A. Mohanta, A. Saldanha, Malware Analysis and Detection Engineering, https://doi.org/10.1007/978-1-4842-6193-4_15

15. Malware Payload Dissection and Classification

Abhijit Mohanta (1) and Anoop Saldanha (2)
(1) Independent Cybersecurity Consultant, Bhubaneswar, Odisha, India
(2) Independent Cybersecurity Consultant, Mangalore, Karnataka, India
 

A poisonous snake bites a person. What is the procedure to treat a snakebite victim? You take the patient to the hospital. First, there must be an assurance that the victim has been bitten by a snake and not by any other animal. Next, the patient is given an antidote, but not any antidote. A victim bitten by cobra cannot be treated by the antidote for a black mamba’s venom. So, before you can give the antivenom, you need to identify the snake that has bitten the victim.

The world of malware and remediating malware attacks is the same. The snakebite case arises when a computer is infected by malware. You need to classify the malware by figuring out its category so that you can provide the right treatment to neutralize the malware infection and disinfect the system from the infection. And classification is the technique that aids us in achieving this goal by helping us to identify, categorize, and name malware.

In this chapter, we are going to talk about payloads, the core of the malware. We are going to cover some of the more prevalent categories of malware payloads and explore techniques to classify them. But before we get there, in the next set of sections, let’s cover some basic terminology relevant to malware classification and why the classification of malware is so important.

Malware Type, Family, Variant, and Clustering

A malware type is a high-level categorization of malware based on its functionality. As an example of what that means, let’s start with a scenario where two attackers, Attacker-A and Attacker-B, who do not know each other, create their own malware. Attacker-A creates Malware-A, which encrypts files on a victim’s machine using XOR and asks for $100 in return for decrypting the encrypted files. Attacker-B creates Malware-B, which encrypts the files on a victim’s machine using the RC4 encryption algorithm and asks for $500 in return for decrypting the encrypted files.

What is the common functionality between both pieces of malware? They both encrypt the files and ask for money in return. Do you know what we call this malware? The answer is ransomware, and the money they are seeking is called a ransom. As malware analysts, we can say that both Malware-A and Malware-B belong to the malware type or category called ransomware.

The story of minting malware does not end here. Attacker-A wants to earn a lot of money through extortion, and it doesn’t cut it for him if he just infects one victim. If he can send Malware-A to many other targets, there is a good chance he has a lot more victims, which translates to more money.

Now from a detection perspective, there are good chances that antivirus vendors get hold of Malware-A created by Attacker-A and have created detection for it. So, the next time that Malware-A appears on a target’s machine, there’s a good chance that it won't be able to infect the system if it has an antivirus installed, which catches Malware-A.

So practically speaking, it is hard for Attacker-A to victimize a larger audience with a single piece of his Malware-A. To counter this, the attacker creates several unique instances of the same Malware-A using a tool like a polymorphic cryptor/packer, which we covered in Chapter 7. These instances look different from each other, but internally they are the same malware, and all of them behave in the same manner when executed. They vary by their hash values, size, icons, section names, and so forth, but in the end, all the instances encrypt files on the victim’s machine with the same XOR encryption as the original Malware-A. Technically, the different instances created from Malware-A belong to a single malware family, which let’s call Malware-A-Family. Antivirus and other security vendors use various properties, fields, string values, and functionality values to name a malware family when they see a new one in the wild.

Now we know that there are multiple instances of the same Malware-A. Let’s view the problem from the angle of a detection engineer. As detection engineers, it is hard for us to get hold of every malware instance in the wild for this Malware-A-Family and then individually write a detection method/signature for each of them. Instead, we want to write detection that covers all the instances of this family, or at least many of them. But to do this, we first need to collect as many instances of Malware-A as possible. How do we do this?

To identify malware belonging to a single family, we use a technique called malware clustering. In malware clustering, we start with just one or two instances of malware belonging to the same malware family, analyze them, and figure out their common functionalities and their unique attributes and traits. Armed with this data, we then search for other malware samples that share these same traits and attributes, thereby enabling us to create clusters of malware that have similar attributes and functionalities.
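The clustering idea described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming each sample has already been reduced to a set of trait strings. The sample names, the trait values, and the 0.5 threshold are all hypothetical, and Jaccard similarity is just one possible similarity measure, not necessarily what any particular vendor uses.

```python
def jaccard(a, b):
    """Similarity of two trait sets: shared traits divided by all traits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_samples(seed_traits, candidates, threshold=0.5):
    """Return the names of candidates whose traits resemble the seed sample."""
    return sorted(
        name for name, traits in candidates.items()
        if jaccard(seed_traits, traits) >= threshold
    )


# Hypothetical traits extracted from an already analyzed seed sample.
seed = {"mutex:ExampleMutex", "file:log.txt", "api:CryptEncrypt", "ext:.locked"}

# Hypothetical traits of samples that are not yet classified.
unknown = {
    "sample_1": {"mutex:ExampleMutex", "file:log.txt", "api:CryptEncrypt"},
    "sample_2": {"api:SetWindowsHookEx", "file:keys.log"},
    "sample_3": {"mutex:ExampleMutex", "ext:.locked", "api:CryptEncrypt", "file:a.tmp"},
}

print(cluster_samples(seed, unknown))
```

Samples that share most traits with the seed land in the same cluster, while a sample with unrelated traits is left out. In practice the traits would come from the static and dynamic analysis described later in this chapter.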

To elaborate a bit more on the terminology we introduced, take the example of a banking trojan created by one group of attackers, which is going to differ from a banking trojan created by another group. A banking trojan created by one attacker group might target Bank_A, while another may target Bank_B. In addition, one group might have coded its trojan in C while another used .NET. Thus, banking trojans can be further subclassified. The same holds for other malware types as well. Based on its unique properties, we need to give the malware a proper name that provides more specific information about it. We call this the malware’s family name.

Like regular software, malware needs to be updated with time. Updates may be needed to patch its flaws or add some additional features. To achieve this, attackers release new variants or versions of their malware.

Nomenclature

Classification helps in naming malware. Anti-malware products need to provide names for the malware they detect, and antiviruses name malware based on certain properties. Naming the malware helps to correctly identify the threat and the potential damage caused by it. It also helps antivirus users draw a proper conclusion about the infection. CARO (Computer Antivirus Research Organization) is an organization established to study computer viruses, and it set standards for naming viruses. With the advent of new kinds of malware, anti-malware companies have now set their own standards for naming malware as well, which might vary across vendors. As a result, malware from the same family is often given different family names by different anti-malware vendors. For example, the WannaCry malware was also called Wanna Decryptor, WannaCrypt, and so forth.

Microsoft also follows its own naming convention. The format example is type:Platform/Family.Variant!Suffixes. You can read more about Microsoft’s naming convention by searching “Microsoft malware naming convention” in Google, which should show you the Microsoft resource for its naming convention, which at the time of writing this book is located at https://docs.microsoft.com/en-us/windows/security/threat-protection/intelligence/malware-naming. Table 15-1 lists some of the naming conventions set by Microsoft for some of the malware categories.
Table 15-1

Some of the Naming Conventions Set by Microsoft for Malware Categories

| Malware Type | Microsoft Name Format | Example |
|---|---|---|
| Trojan | Trojan:Win32/<Family><variant> | Trojan:Win32/Kryptomix |
| Virus | Virus:Win32/<Family><variant> | Virus:Win32/Sality |
| Ransomware | Ransom:Win32/<Family><variant> | Ransom:Win32/Tescrypt |
| Adware | PUA:Win32/<Family><variant> | PUA:Win32/CandyOpen |
| Worm | Worm:Win32/<Family><variant> | Worm:Win32/Allaple.O |
| Backdoor | Backdoor:Win32/<Family><variant> | Backdoor:Win32/Dridexed |
| Stealer | PWS:Win32/<Family><variant> | PWS:Win32/Zbot |
| Downloader | TrojanDownloader:Win32/<Family><variant> | TrojanDownloader:Win32/Banload |
| Spying | TrojanSpy:Win32/<Family><variant> | TrojanSpy:Win32/Banker.GB |

Do note that some malware families might not be given the category name that they should ideally have or that you expect. For example, a lot of antivirus vendors name and classify some malware categories, like banking malware, as Trojan or TrojanSpy. Also, it might be difficult for an antivirus engineer to come up with a family name for a piece of malware, either because there were not enough unique properties or because the sample could not be classified accurately. In that case, generic names can be given to the malware. For example, TrojanSpy:Win32/Banker tells us only that the sample is a banking trojan and does not tell us the name of the malware family to which the sample belongs, like Tinba or Zeus.
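When you triage many detection names, it can help to split them into their fields programmatically. The following Python sketch assumes the Type:Platform/Family.Variant!Suffixes layout mentioned earlier, with the variant and suffixes treated as optional. The regular expression and the function name parse_detection_name are our own illustration, not an official Microsoft parser, and real-world names may not always fit this pattern.

```python
import re

# Type:Platform/Family, optionally followed by .Variant and !Suffixes
NAME_RE = re.compile(
    r"^(?P<type>[^:]+):(?P<platform>[^/]+)/(?P<family>[^.!]+)"
    r"(?:\.(?P<variant>[^!]+))?(?:!(?P<suffixes>.+))?$"
)


def parse_detection_name(name):
    """Split a detection name into its fields, or return None if it does not fit."""
    match = NAME_RE.match(name.strip())
    return match.groupdict() if match else None


print(parse_detection_name("TrojanSpy:Win32/Banker.GB"))
print(parse_detection_name("Trojan:Win32/Kryptomix"))
```

For a generic name like TrojanSpy:Win32/Banker.GB, the family field comes back as Banker, which is a reminder that the family slot does not always carry a specific family such as Tinba or Zeus.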

Importance of Classification

Classification of malware is not only important for malware analysts but can be useful for threat hunting and developing antivirus detection solutions and signature creation. Let’s go through some of the important needs that show us why the classification of malware is important.

Proactive Detection

As malware analysts and security researchers, it is not only important to analyze malware but equally important to be in a position to detect it, so that you can predict and detect future malware attacks and keep your customers protected. As malware analysts, we need to gather further intelligence on malware to stop attacks in the future.

This especially comes to the fore if you are responsible for the development of an anti-malware product, in which case you need to follow a proactive model in detecting threats. For that, it is important to classify malware that we come across. Classifying malware and tagging them to a category as well as family is important to write an effective detection. As you will learn in Chapter 22, that’s how antivirus engineers write detection. To write detection on malware samples, samples are classified to create clusters of similar samples together, where these clusters are created by finding patterns and attributes that are common to these malware samples.

These common patterns to cluster malware can be derived from static and dynamic analysis including network connections, files dropped, registry operations executed, strings in memory, and so forth. Most detection solutions rely on using common patterns to group malware samples into a cluster, with an expectation that similar patterns will be present in future strains of malware that belong to the same malware family/cluster. This ability to cluster samples is only possible if we can classify malware in the first place and create clusters of them so that we can write detection solutions and signatures for samples in the cluster.

Correct Remediation

Malware is designed to carry out certain malicious activity on the victim machine. Malware can be a keylogger, a botnet, ransomware, a banking trojan, or a combination of them. When malware infects a system, it usually makes certain changes to the system, which need to be undone by anti-malware software.

Most malware has common functionality, including creating run entries, code injection, and so forth, which can be handled by an anti-malware solution generically. But then come other functionalities that differentiate one malware or malware family from another. For example, take the case of ransomware. The encryption algorithm used to encrypt the files on a victim machine may not be the same for all the ransomware out there. WannaCry ransomware may encrypt files with one encryption algorithm and CryptoLocker with another. So, if the antivirus wants to decrypt the files encrypted by the ransomware, it must use the decryption algorithm that matches each family. To have such targeted, fine-tuned remediation solutions, the anti-malware solution should first know the category and the exact malware family it is dealing with. Hence it is important to know the malware type as well as the family to write a proper remediation solution.

Intelligence

Often the same hacker groups create different kinds of malware. Properly classifying malware based on how they are programmed, their origin, modules used, any common strings, and so forth can help us correlate malware to existing malware families and thereby to the attackers who created them. Malware analysts should build and maintain a database of this information so that it can help them predict and detect attacks and can even help in tracking down attackers.

Intention and Scope of Attack

Attackers program malware with different intentions. A ransomware’s goal is to encrypt files so that it can extort a ransom from the victim. A banking trojan aims to steal banking credentials, while keyloggers and other info stealers aim to steal critical information from the victims, and so on. In certain other cases, malware attacks might be targeted, for example, at the HR department, the finance department, the CEO, and so forth.

From a company perspective, these kinds of malware don’t inflict the same kind of damage. They inflict damage to the network and the customer in different ways, and many times, damages can have a ripple effect on companies, including damaging their brand value and stock market value. To deal with damages and provide damage limitation for the company, it is important to classify them and figure out who the attacker is and the intention of the attacker, so that you can start preparing yourselves to deal with the damage caused to your brand and reputation after an infection.

Classification Basis

Most real-world malware is packed, and you need to unpack it or extract the payload to classify them. Also, most malware works successfully only in appropriate environments that they are targeted to run on. A POS or ATM malware won’t successfully execute unless it sees the presence of a POS device or ATM device. Many of them work only after receiving certain data/commands from the C&C server. Some of the malware may not successfully execute its final intention if executed in a malware analysis environment.

As you see here, there are many caveats to successfully analyzing a malware sample, and this is why dynamic analysis doesn’t always work. In dynamic analysis, we simply expect the malware to run, but there are many more cases than the ones we mentioned that prevent a sample from executing successfully or from exhibiting its full set of behaviors.

Hence reverse engineering is the only way to truly extract the exact behavior of a sample and classify it. We get to reverse engineering in Part 5 of this book, which should help you to a much greater extent when analyzing malware. Until then, in this chapter, we use string analysis, API analysis, and other dynamic analysis tricks to extract the behavior of the malware and classify it. This avoids the time-consuming process of reverse engineering.

The classification of malware can largely be done using various combinations of data; the most important are listed next.
  • API calls

  • Author of the malware

  • API hooks

  • Debug information

  • Reused code

  • Library dependencies

  • Format strings

  • Mutex names

  • Registry key names and values

  • IP addresses, domain names, and URLs

  • File names

  • Unique strings

  • API calls

    API calls or rather specific sequences of API calls often define a functionality. For example, ransomware and a file infector are going to call file modification APIs continuously. A POS malware can use APIs like ReadProcessMemory to read the memory of processes to search for credit card numbers and other banking details.

  • Creator

    Sometimes malware writers may leave behind their names, their email IDs, their handles in the malware binaries they create. The reason why they leave these details can range from an open challenge to the security industry to identify/locate them, all the way to maintaining a brand uniqueness for themselves in the hacker world.

  • API Hooks

    Different functionalities require different types of APIs to be hooked and can be used as a great indicator of malware functionality. For example, banking trojans and information-stealing malware hook networking APIs in applications to intercept network communication. Similarly, rootkits can hook file browsing APIs and process listing APIs to hide their artifacts. The type of APIs hooked reveal the intention of the hook and thereby the malware.

  • Debug Information

    Software developers, including malware authors, often use debug statements like printf() for troubleshooting purposes. These usually don’t make their way into production releases of their programs/malware because they are removed or commented out. But if the developers forget to remove or comment out these statements, they end up getting compiled into the final software and can be visible in the compiled binary.

Apart from that, they also use sensible human-readable names for the variables in their code. As an example, they can use a variable named credit_card for storing a credit card number, like char credit_card[100]. When these programs are compiled in debug mode, these variable names are added along with the code as debug symbols so that they can be used for debugging the code later.

Debug information embedded in malware, often left unintentionally by malware authors when they forget to remove debug statements or compile their malware code in debug mode, is a great way for us to understand more about the malware and the malware author.
  • Reused Code

    Malware authors often share code and libraries across the various malware they write, which might belong to the same malware family or to different malware families. Similarly, a lot of them use specific third-party libraries across all variants of the malware they write. When analyzing malware samples, if we discover code or a specific library that we have, from our experience, seen in other malware we previously analyzed/reversed, we can correlate the two and conclude that the current malware might belong to that same malware family or might have been created by the same attacker.

  • Library Dependencies

    Many third-party libraries and frameworks are available for use by software developers, some of which have very specific functionalities that reveal the intention of their user. Malware also uses third-party libraries to implement its functionality, which gives us a glimpse into the intention of the malware that we can infer from the functionality of the library. For example, ransomware uses crypto libraries, cryptominers use various open source cryptomining libraries, and ATM malware uses a library called Extensions for Financial Services (XFS) provided by Microsoft.

  • Format Strings

    Format string patterns appear in all kinds of software, including malware, which uses them to build meaningful strings such as C&C URLs and other variable data. Format strings can be located by searching for the = and % symbols, including combinations of them. As an example, Listing 15-1 shows a format string used by malware to create an output string that holds various fields like botid, os, and so forth, which is then sent to the malicious server.

botid=%s&ver=1.0.2&up=%u&os=%03u&rights=%s&ltime=%s%d&token=%d&cn=test
Listing 15-1

Example format string whose fields are filled by the malware to generate a final output to be sent to the C2 server
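Searching for = and % combinations, as suggested above, can be automated over the strings extracted from a sample. The Python sketch below is a rough heuristic, assuming printf-style conversions and key=%spec pairs. The regular expression covers only common conversions, and the two-field threshold is an arbitrary choice of ours.

```python
import re

# key=%<printf conversion>, for example botid=%s or os=%03u
FIELD_RE = re.compile(
    r"([A-Za-z_]\w*)=%[-+ #0]*\d*(?:\.\d+)?(?:hh|h|ll|l)?[sduxXci]"
)


def format_fields(s):
    """Return the field names that are filled in through a format specifier."""
    return FIELD_RE.findall(s)


def looks_like_format_string(s, min_fields=2):
    """Heuristic: at least min_fields key=%spec pairs in one string."""
    return len(format_fields(s)) >= min_fields


listing = "botid=%s&ver=1.0.2&up=%u&os=%03u&rights=%s&ltime=%s%d&token=%d&cn=test"
print(format_fields(listing))
print(looks_like_format_string(listing))
```

Fields with fixed values, such as ver and cn in Listing 15-1, are not reported because they carry no format specifier.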

The following lists various other examples of format strings seen in malware.
  • Mutex names

    Malware uses mutexes for synchronization purposes, as you learned in Chapter 5, so that no two instances of the same malware run at the same time. The mutexes created by malware might have names that are unique to the malware and malware variants belonging to the same malware family. For example, in Chapter 14, we used the mutex name 2gvwnqjz1 to determine that the malware executed belonged to the Asprox family. The following is a list of mutex names found in some malware.
    • 53c044b1f7eb7bc1cbf2bff088c95b30

    • Tr0gBot

    • 6a8c9937zFIwHPZ309UZMZYVnwScPB2pR2MEx5SY7B1xgbruoO

    • TdlStartMutex

  • IP Addresses, Domain Names, and URLs

    As part of malware string analysis, both static and dynamic, you might see IP addresses, URLs, and C2 domain names used by the malware for network communication. These might be specific to the threat actor groups, APT groups, and underground groups that use them for that specific malware family or across multiple families. With these strings in hand, you can check publicly available analysis reports on the web as well as your own analysis reports, which can shed light on these strings and help classify the malware sample.

  • File Names

    Malware drops various files to the file system, including executables, config, or data files. They might also create text files on the system to log stolen data. A lot of these files created by the malware have patterned names specific to malware in that malware family. For example, if you analyze Sample-7-2, as we did in Chapter 7, you can see that the malware creates the marijuana.txt file, and this filename is specific to the Wabot malware family.

At the same time, you don’t need to run the sample and wait for the malware to create these files to obtain their names. Some of the file names can also be obtained from string analysis, static or dynamic. To search for file name related strings, look for file extension strings like .txt, .exe, .config, .dat, .ini, .xml, .html, and other extensions in the strings retrieved from the malware.
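The extension search described above is easy to script. The following Python sketch assumes the strings have already been extracted from the sample (for example, with BinText) and pulls out anything that looks like a file name ending in one of the listed extensions. The input strings here are hypothetical, apart from marijuana.txt, which is mentioned in this chapter.

```python
import re

EXTENSIONS = ("txt", "exe", "config", "dat", "ini", "xml", "html")

# A run of file-name characters followed by one of the extensions above.
FILENAME_RE = re.compile(
    r"[\w\-.]+\.(?:" + "|".join(EXTENSIONS) + r")\b", re.IGNORECASE
)


def filename_candidates(strings):
    """Return file-name-like substrings, in order of first appearance."""
    found = []
    for s in strings:
        for name in FILENAME_RE.findall(s):
            if name not in found:
                found.append(name)
    return found


# Hypothetical strings dumped from a sample.
sample_strings = [
    "marijuana.txt",
    r"C:\Temp\update.exe",
    "kernel32.dll",
    "GetProcAddress",
    "settings.ini and cache.dat",
]

print(filename_candidates(sample_strings))
```

The candidates still need a manual look, since many benign programs contain similar names. The list of extensions can be extended as needed.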
  • Unique Strings

    Finding unique strings in malware helps give a family name to the malware. This might be a bit hard, and sometimes you might not be successful in finding these unique strings. You probably need more than one malware belonging to the same family to find a unique string. Unique string means it should not be an API or DLL name that can be common in all kinds of Win32 executables. Rather it should be unique to the malware family, like mutex names, IP addresses, URLs, unique files created by the malware, and so forth.

For example, the string YUIPWDFILE0YUIPKDFILE0YUICRYPTED0YUI1.0 is found only in Fareit or Pony malware. If we see this string while conducting string analysis on any other sample, we can classify that sample as Fareit/Pony malware. Another example is the string Krab.txt, which is unique to the GandCrab malware family.
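Once you have collected a few such unique strings, a lookup table can do the first pass of classification for you. The Python sketch below uses only indicators quoted in this book (the Fareit/Pony string, Krab.txt, marijuana.txt, and the Asprox mutex name). The table layout and the function name guess_families are our own, and plain substring matching like this can produce false positives on short indicators.

```python
# Indicators quoted in this book, mapped to the family they point to.
FAMILY_INDICATORS = {
    "YUIPWDFILE0YUIPKDFILE0YUICRYPTED0YUI1.0": "Fareit/Pony",
    "Krab.txt": "GandCrab",
    "marijuana.txt": "Wabot",
    "2gvwnqjz1": "Asprox",
}


def guess_families(strings):
    """Map each matching family to the indicators that were seen for it."""
    hits = {}
    for s in strings:
        for indicator, family in FAMILY_INDICATORS.items():
            if indicator in s and indicator not in hits.get(family, []):
                hits.setdefault(family, []).append(indicator)
    return hits


# Hypothetical strings from a sample under analysis.
print(guess_families([r"C:\Users\test\Krab.txt", "kernel32.dll"]))
```

A hit should be treated as a lead to confirm with further analysis rather than a final verdict.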

In the next set of sections, let’s put our knowledge to the test to classify and identify various types and categories of malware.

KeyLogger

Keylogging is one of the oldest methods of stealing data. A keylogger logs the keystrokes on your machine. A keylogger does not limit itself to logging the keys but also sends the logged keystrokes to the attacker. Keyloggers can also be a part of other information-stealing malware and can be used in critical APT attacks.

There can be several ways to create a keylogger on a Windows OS. Windows has provided some well documented APIs, with which attackers can create keyloggers very easily. Next, we explore two mechanisms that create keyloggers on Windows and mechanisms that we can employ to identify the presence of a keylogger.

Hooking Keyboard Messages

One mechanism to create a keylogger works by hooking keyboard messages. Several events occur in a system, including key presses and mouse clicks. These events are collected by the system and delivered to processes or applications using messages. Along with keyboard events, the keystroke itself is also transmitted in these messages.

To subscribe to these event messages, Win32 provides the SetWindowsHookEx API, as seen in Listing 15-2, which can be used by attackers to create a keylogger.
HHOOK WINAPI SetWindowsHookEx(
  __in  int idHook,
  __in  HOOKPROC lpfn,
  __in  HINSTANCE hMod,
  __in  DWORD dwThreadId
);
Listing 15-2

SetWindowsHookEx API Which Can Create a Keylogger on Windows

The API takes four parameters.
  • idHook: Specifies what kind of hook you want to subscribe to. For intercepting keystrokes this parameter can be either WH_KEYBOARD_LL or WH_KEYBOARD.

  • lpfn: Specifies the user-defined callback function, which is called with the intercepted events. With malware keyloggers, this function is tasked with consuming the intercepted keystrokes and logging them. The function is also called a hook procedure.

  • hMod: Handle to the module/DLL that contains the lpfn hook procedure.

  • dwThreadId: The ID of the thread with which the hook procedure is to be associated. If you wish to intercept events for all threads across all programs on the system, this parameter should be set to 0.

With this, creating a keylogger is as simple as invoking this API from a program, like the example in Listing 15-3, which creates a global hook for all the applications running on the system and subscribes to all keyboard events. The system then sends the keyboard events to the KeyboardProc hook procedure.
/* Thread ID 0 with WH_KEYBOARD_LL installs a system-wide keyboard hook */
SetWindowsHookEx(WH_KEYBOARD_LL,
                 (HOOKPROC)KeyboardProc,
                 GetModuleHandle(NULL),
                 0);
Listing 15-3

Registering a Hook Using the SetWindowsHookEx API

To detect malware samples that use this keylogging mechanism, check for the presence/usage of the SetWindowsHookEx API and other related APIs (CallNextHookEx, GetMessage, TranslateMessage, and DispatchMessage). The APIs used by the malware can be obtained using APIMiner or other such API logging tools.

Getting Keyboard Status

Another way of logging keystrokes is to continuously obtain the state of a key in a loop. This can be achieved by calling the GetAsyncKeyState Win32 API in a loop. The API tells whether a key is down at the time of the call and whether the key was pressed after the previous call to the API. It takes a virtual-key code as a parameter and returns -32767 when the key is down and was pressed since the previous call. The VirtualKeyCode parameter can be any of the 256 virtual-key codes. Listing 15-4 shows sample code that gets keystrokes by using the API.
while (1) {
     /* -32767: the key is down and was pressed since the previous call */
     if (GetAsyncKeyState(VirtualKeyCode) == -32767) {
         switch(VirtualKeyCode) {
             case VK_RIGHT:
                 printf("<right> key pressed");
                 break;
             /* ... cases for the other virtual-key codes ... */
         }
     }
}
Listing 15-4

Example of the GetAsyncKeyState() API That Creates a Keylogger on Windows
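The value -32767 compared against in Listing 15-4 looks arbitrary until you view the return value as a 16-bit bit field. GetAsyncKeyState returns a SHORT in which the most significant bit indicates that the key is currently down and the least significant bit indicates that the key was pressed since the previous call. The short Python calculation below works through the arithmetic. The constant and function names are ours, chosen for illustration.

```python
def to_signed_short(value):
    """Interpret a 16-bit value as a signed SHORT (two's complement)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


KEY_DOWN_NOW = 0x8000        # most significant bit: key is currently down
PRESSED_SINCE_LAST = 0x0001  # least significant bit: pressed since previous call

print(to_signed_short(KEY_DOWN_NOW | PRESSED_SINCE_LAST))  # -32767
print(to_signed_short(KEY_DOWN_NOW))                       # -32768
```

This is useful during analysis because the same check may appear in a disassembly as a comparison with 0x8001 or as a test of the individual bits rather than as the decimal value -32767.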

Keyloggers that use this mechanism can be recognized by their use of these and other related APIs, which we can obtain using API logging tools like APIMiner. The following lists the common Win32 APIs that help identify the presence of a keylogger.
  • GetWindowThreadProcessId

  • CallNextHookEx

  • GetMessage

  • GetKeyboardState

  • GetSystemMetrics

  • TranslateMessage

  • GetAsyncKeyState

  • DispatchMessage

  • SetWindowsHookEx
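The API list above can be turned into a quick check over the API names collected from a log. The Python sketch below assumes you have already extracted the called API names from a tool such as APIMiner. The function names are ours, and the handling of the A/W name suffixes is a simplification. Keep in mind that benign GUI applications call many of these APIs as well, so the result is a hint rather than proof.

```python
# Keylogger-related Win32 APIs listed in this chapter.
KEYLOGGER_APIS = {
    "GetWindowThreadProcessId", "CallNextHookEx", "GetMessage",
    "GetKeyboardState", "GetSystemMetrics", "TranslateMessage",
    "GetAsyncKeyState", "DispatchMessage", "SetWindowsHookEx",
}


def base_name(api):
    """Map ANSI/Unicode variants such as GetMessageW to the listed name."""
    if api not in KEYLOGGER_APIS and api[-1:] in ("A", "W") and api[:-1] in KEYLOGGER_APIS:
        return api[:-1]
    return api


def keylogger_hints(logged_apis):
    """Return the listed APIs that were seen and the mechanisms they suggest."""
    seen = {base_name(a) for a in logged_apis} & KEYLOGGER_APIS
    mechanisms = []
    if "SetWindowsHookEx" in seen:
        mechanisms.append("keyboard hook")
    if "GetAsyncKeyState" in seen:
        mechanisms.append("key-state polling")
    return sorted(seen), mechanisms


# Hypothetical API names taken from a log.
print(keylogger_hints(["CreateFileW", "SetWindowsHookExW", "GetMessageW", "CallNextHookEx"]))
```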

Other than the API logs from tools like APIMiner, we can also identify keyloggers through string analysis. Malware usually uses strings to represent special keys on the keyboard like Ctrl, Alt, Shift, Caps, and so forth. A left arrow key may be represented by [Arrow Left] or [Left Arrow], and so on. The strings that identify keystrokes may vary between keyloggers but are likely to contain similar words like caps, lock, and so forth.

The following list includes the strings that represent special keys that are part of the keylogger component of Xtreme RAT from Sample-15-1 in our samples repo. This sample is packed using UPX, and you can unpack it to generate the unpacked file on disk using CFF Explorer by using its UPX utility. After clicking the Unpack button in the UPX utility, you can click the Save icon to save the unpacked file to disk on which you can carry out static string analysis using BinText. Some of the strings seen statically in this unpacked file are listed next.
  • Backspace

  • Numpad .

  • Numpad /

  • Caps Lock

  • Delete

  • Arrow Down

  • Esc

  • Execute

  • Numpad *

  • Finish

  • Copy

  • Back Tab
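A simple way to use such key labels is to count how many of them appear among the strings of a sample. The Python sketch below uses the labels listed above from the Xtreme RAT sample. The threshold of five hits is an arbitrary value of ours, and other keyloggers may use different labels, so the list would need to grow as you analyze more samples.

```python
# Special-key labels seen in the unpacked Xtreme RAT sample (Sample-15-1).
KEY_LABELS = [
    "Backspace", "Numpad .", "Numpad /", "Caps Lock", "Delete", "Arrow Down",
    "Esc", "Execute", "Numpad *", "Finish", "Copy", "Back Tab",
]


def key_label_hits(strings):
    """Return the known key labels that appear as whole strings."""
    present = {s.strip() for s in strings}
    return [label for label in KEY_LABELS if label in present]


def looks_like_keylogger_strings(strings, min_hits=5):
    """Heuristic: several special-key labels in one binary is suspicious."""
    return len(key_label_hits(strings)) >= min_hits


# Hypothetical subset of strings extracted from a sample.
dumped = ["Backspace", "Caps Lock", "Arrow Down", "Esc", "Numpad *", "kernel32.dll"]
print(key_label_hits(dumped))
print(looks_like_keylogger_strings(dumped))
```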

After obtaining the keystrokes, malware can store the logged keystrokes in a file on disk or in memory. Both ways of storing keystrokes have their pros and cons. If the keystroke is stored in files, a tool like ProcMon might be able to identify that the file is updated at regular intervals, which gives away the intention of the file and the presence of the keylogger malware.

Many times, you can find the names of .txt or .log files, which might be meant for logging keystrokes, using string analysis, or even dynamic event analysis, again easily giving away the presence of the keylogger. But if the keystrokes are stored in memory by the malware, they cannot be detected easily, but then the downside is that they may be lost if the system is logged off.

Information Stealers (PWS)

A computer user, whether in an organization or an individual, uses a lot of applications. A browser like Firefox is used for browsing websites. An FTP client like FileZilla accesses FTP servers. An email client like MS Outlook accesses emails. Many of these applications save credentials as well as history to make them easier to use. All these applications store their data in certain files or local databases. Information stealers work by trying to steal these saved credentials along with the rest of the data, which they then send to the attackers.

Before looking at how this data is stolen, let’s see how some applications store their data. The Mozilla Firefox browser saves its data (i.e., the URLs, the form data, credentials, and so forth) in the profile folder located at C:\Users\<user name>\AppData\Roaming\Mozilla\Firefox\Profiles\<random name>.default. The folder name ends with .default, and <user name> is the username of the user on the system.

Older versions of Firefox stored passwords in a database file called signons.sqlite. The passwords are stored in encrypted form, but once the attackers catch hold of this data, they are somehow going to find ways to decrypt it. The signons.sqlite has a table called moz_logins, which has the saved credentials. To identify info stealer malware that steals data from Firefox SQLite DB, you can search for the presence of strings related to SQL queries from the strings in the malware sample.

Similarly, the FileZilla FTP client has information stored in various files like sitemanager.xml, recentservers.xml, and filezilla.xml. There are many other applications like GlobalScape, CuteFTP, FlashFXP, and so forth, which also save credentials in various files, which malware tries to access and steal. Similarly, malware is also known to hunt for cryptocurrency-related wallet credentials.

From an analysis perspective, it is important to arm ourselves with the knowledge of how the applications usually targeted by malware store their data and credentials. In the next set of sections, let’s explore how we can identify info stealers using both static and dynamic techniques.

Dynamic Events and API Logs

As you learned in the previous section, various applications store their data and credentials across various files on the disk. Info stealing malware can be identified if you can identify the presence of events that indicate access to credentials files and data files of applications.

Obtaining events that indicate access to these files can be done using tools like APIMiner, which for info stealers might end up logging API calls like CreateFile, GetFileAttributes, or other file access related APIs. Alternatively, you can also identify the events through dynamic analysis tools like ProcMon.

As an exercise, run Sample-15-2 from the samples repo using APIMiner. If you go through your logs, you see APIs very similar to the ones seen in Listing 15-5. The malware uses the GetFileAttributesExW Win32 API to probe for directories and files related to Ethereum, Bitcoin, Electrum, and FileZilla. None of these files or directories exists on our system, but it looks like the malware is trying to find this information.
<file>-<0,0x00000000> GetFileAttributesExW([info_level]0, [filepath]"C:\Users\<username>\AppData\Roaming\FileZilla\recentservers.xml",[filepath_r]"C:\Users\<username>\AppData\Roaming\FileZilla\recentservers.xml")
<file>-<0,0x00000000> GetFileAttributesExW([info_level]0, [filepath]"C:\Users\<username>\AppData\Roaming\Ethereum\keystore",[filepath_r]"C:\Users\<username>\AppData\Roaming\Ethereum\keystore")
<file>-<0,0x00000000> GetFileAttributesExW([info_level]0, [filepath]"C:\Users\<username>\AppData\Roaming\mSIGNA_Bitcoin\wallets",[filepath_r]"C:\Users\<username>\AppData\Roaming\mSIGNA_Bitcoin\wallets")
<file>-<0,0x00000000> GetFileAttributesExW([info_level]0, [filepath]"C:\Users\<username>\AppData\Roaming\Electrum\wallets",[filepath_r]"C:\Users\<username>\AppData\Roaming\Electrum\wallets")
<file>-<0,0x00000000> GetFileAttributesExW([info_level]0, [filepath]"C:\Users\<username>\AppData\Roaming\Bitcoin\wallets",[filepath_r]"C:\Users\<username>\AppData\Roaming\Bitcoin\wallets")
Listing 15-5

API logs obtained from APIMiner for Sample-15-2 that show various credentials-related files accessed by the sample, indicating that the sample is an info stealer
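The check described above can be sketched in a few lines of Python. This is a minimal sketch that assumes the log lines look like the ones in Listing 15-5; the marker list and the function name flag_credential_access are our own illustrative choices and not part of APIMiner.

```python
import re

# Substrings that hint at credential or wallet stores. This list is an
# illustrative assumption based on the paths shown in Listing 15-5.
SENSITIVE_MARKERS = ("filezilla", "ethereum", "bitcoin", "electrum")

# Matches the [filepath]"..." argument as it appears in Listing 15-5.
FILEPATH_RE = re.compile(r'\[filepath\]"([^"]*)"')


def flag_credential_access(log_lines):
    """Return the logged file paths that touch a sensitive location."""
    hits = []
    for line in log_lines:
        match = FILEPATH_RE.search(line)
        if match and any(m in match.group(1).lower() for m in SENSITIVE_MARKERS):
            hits.append(match.group(1))
    return hits


# Hypothetical log lines in the style of Listing 15-5.
sample_log = [
    r'<file>-<0,0x00000000> GetFileAttributesExW([info_level]0, [filepath]"C:\Users\u\AppData\Roaming\Ethereum\keystore",[filepath_r]"C:\Users\u\AppData\Roaming\Ethereum\keystore")',
    r'<file>-<0,0x00000000> GetFileAttributesExW([info_level]0, [filepath]"C:\Windows\System32\kernel32.dll",[filepath_r]"C:\Windows\System32\kernel32.dll")',
]
print(flag_credential_access(sample_log))
```

A hit from a check like this is only a hint. Legitimate software also touches these locations, so treat it as one signal among several.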

String Analysis of Info Stealers

You learned that info stealers search for various files and directories in which applications store their data and credentials. You can use the presence of the names of these files and directories in string analysis to classify the sample as an info stealer.

As an exercise, analyze Sample-15-2, Sample-15-3, Sample-15-4, Sample-15-5, Sample-15-6, and Sample-15-7 from the samples repo, all of which belong to the same info stealing malware family. Some of these samples run, but none of them are packed, and you can see various strings in them statically using BinText, some of which we have listed in Table 15-2.
Table 15-2

Strings Obtained from String Analysis on Sample-15-2, Which Identify the Sample as an Info Stealer

FileZilla           FileZilla.xml  filezilla.xml  Bitcoin\wallets
sitemanager.xml     FlashFXP       Sites.dat      mSIGNA_Bitcoin\wallets
Quick.dat           History.dat    Sites.dat      Electrum\wallets
NCH Software\Fling  Accounts       Frigate3       mSIGNA_Bitcoin\wallets\*.dat
FtpSite.XML         FTP Commander  ftplist.txt    Electrum\wallets\*.dat
SmartFTP            Favorites.dat  TurboFTP       Ethereum\keystore\*
Ethereum\keystore                  Bitcoin\wallets\*.dat

From the strings, it is not hard to conclude that the samples try to access the credentials and data files of various applications, indicating that they are info stealers.
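If you want to automate this kind of string check, a minimal sketch could look like the following. It extracts printable ASCII runs from a file’s bytes (BinText also handles Unicode strings, which this sketch ignores) and matches them against a small indicator list. The indicator list and function names are our own assumptions, chosen from the strings in Table 15-2.

```python
import re

# Indicator substrings picked from Table 15-2 (an illustrative subset).
INDICATORS = [b"filezilla", b"sitemanager.xml", b"flashfxp",
              b"smartftp", b"wallets", b"keystore"]


def extract_strings(data, min_len=4):
    """Return printable ASCII runs of at least min_len bytes."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)


def info_stealer_hits(data):
    """Return the sorted indicator substrings found in the extracted strings."""
    found = set()
    for s in extract_strings(data):
        lowered = s.lower()
        for indicator in INDICATORS:
            if indicator in lowered:
                found.add(indicator.decode())
    return sorted(found)


# Hypothetical byte blob standing in for the contents of an unpacked sample.
blob = b"\x00\x01sitemanager.xml\x00\xffFileZilla\x00abc\x00Electrum\\wallets\\*.dat\x00"
print(info_stealer_hits(blob))
```

For a real sample, you would read the file with open(path, "rb").read() inside your analysis VM and pass the bytes to info_stealer_hits.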

For Sample-15-3 to Sample-15-7, if you sift through the strings, you also find a unique string YUIPWDFILE0YUIPKDFILE0YUICRYPTED0YUI1.0. If you search for this string on the web, you see that it is related to Fareit or Pony malware. Look at this unique string again. Does it look like junk? Observe closely, and you find some hidden words in it. If you replace YUI with a space, you get the following strings: PWDFILE0, PKDFILE0, and CRYPTED0 1.0, which now makes some sense, as PWD seems to represent password.
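The same observation can be reproduced with a one-line split in Python, treating YUI as a delimiter. The variable names here are ours.

```python
marker = "YUIPWDFILE0YUIPKDFILE0YUICRYPTED0YUI1.0"

# Split on the repeating "YUI" delimiter and drop the empty leading token.
tokens = [token for token in marker.split("YUI") if token]
print(tokens)
```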

The following is a list of some popular PWS malware. As an exercise, try obtaining samples for each of the malware families and apply both string and other dynamic analysis techniques you learned in this chapter and see if you can identify any info stealer components in them.
  • Loki

  • Zeus

  • Kronos

  • Pony

  • Cridex

  • Sinowal

Banking Malware

We saw that information stealers can retrieve saved passwords from browsers. But most banks these days might not allow users to save passwords in a browser. Other than usernames and passwords, banks may require second-factor authentication, which could be one-time passwords (OTP), CAPTCHAs, or number grids, to complete authentication and, in some cases, the transaction as well. This data is always dynamic, and saving it in the browser is useless, as it is valid only for a single session.

Hence the session needs to be intercepted during a live banking session. Since banking transactions happen through the browser, the malware needs to intercept the transaction from within the browser. Such malware are called banking trojans, and these attacks are often called man-in-the-browser (MITB) attacks.

Let’s go through the sequence of APIs a browser uses to perform an HTTP transaction. The transaction is started by establishing a TCP connection with the server, for which a browser client uses a sequence of APIs that includes InternetOpen and InternetConnect. After the TCP connection is established, an HTTP connection can be established using HttpOpenRequest, after which an HTTP request is sent from the browser using HttpSendRequest. The InternetReadFile API reads the response from the HTTP server.

Now a banking trojan works by hooking these APIs. These API hooks are specific to the Internet Explorer browser. There can be hooks that are related to other browsers too. In the next set of sections, you see how to identify banking trojans.

API Logs and Hook Scanners

Banking trojans work by hooking APIs in the browser, and you can use dynamic analysis tools like APIMiner to log the APIs used by these malware samples to classify them. You can similarly classify them by using hook scanning tools like GMER and NoVirusThanks API Hook Scanner, which we introduced in Chapter 11. While you are analyzing samples, combine both these sets of tools to identify if the sample is a banking trojan.

As you learned in Chapters 10 and 11, hooking requires code injection, so you are likely to see code injection-related APIs in your API logs, like OpenProcess, VirtualAlloc, VirtualProtect, WriteProcessMemory, and so forth.

If the target of the hook is Internet Explorer, you see the APIs that we specified in the previous chapter, which we have listed again. The following are APIs hooked by banking trojans when hooking the Internet Explorer browser.
  • InternetConnectA

  • InternetConnectW

  • HttpOpenRequestA

  • HttpOpenRequestW

  • HttpSendRequestA

  • HttpSendRequestW

  • HttpSendRequestExA

  • HttpSendRequestExW

  • InternetReadFile

  • InternetReadFileExA

The following is the list of APIs hooked if the target application for hooking by the banking trojan is Firefox browser.
  • PR_OpenTCPSocket

  • PR_Connect

  • PR_Close

  • PR_Write

  • PR_Read

The following are the APIs hooked if the target application for hooking by the banking trojan is the Chrome browser.
  • ssl_read

  • ssl_write

One common misconception related to banking trojans is that encryption prevents them from stealing our credentials and data. This is not true. Banking trojans hook various APIs that intercept data in your applications and browsers before it gets encrypted. Similarly, they also hook APIs that receive data from the servers after decryption, thereby giving them access to unencrypted streams of data.
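When you have a list of hooked API names from a hook scanner, the three lists above can be turned into a small lookup that suggests which browser the hooks target. The following is a minimal sketch; the function name guess_hook_targets and the input format (a plain list of API names) are our own assumptions and do not reflect the output format of any specific tool.

```python
# API names taken from the three lists in this section.
HOOK_TARGETS = {
    "Internet Explorer": {
        "InternetConnectA", "InternetConnectW",
        "HttpOpenRequestA", "HttpOpenRequestW",
        "HttpSendRequestA", "HttpSendRequestW",
        "HttpSendRequestExA", "HttpSendRequestExW",
        "InternetReadFile", "InternetReadFileExA",
    },
    "Firefox": {"PR_OpenTCPSocket", "PR_Connect", "PR_Close", "PR_Write", "PR_Read"},
    "Chrome": {"ssl_read", "ssl_write"},
}


def guess_hook_targets(hooked_apis):
    """Map each browser to the hooked APIs that belong to its list."""
    hooked = set(hooked_apis)
    return {
        browser: sorted(apis & hooked)
        for browser, apis in HOOK_TARGETS.items()
        if apis & hooked
    }


# Hypothetical hook scanner output.
print(guess_hook_targets(["PR_Write", "HttpSendRequestW", "NtCreateFile"]))
```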

String Analysis on Banking Trojans

Similar to how we used strings to identify info stealers in the previous section, we can use the same technique to identify banking trojans.

As an exercise, check out Sample-15-8, Sample-15-9, Sample-15-10, and Sample-15-11 from the samples repo. None of these samples is packed, and you can obtain the strings for these samples using BinText, as you learned in the previous chapters of this book.

If you analyze the strings in these samples, you see the list of APIs imported by these samples, also partially seen in Figure 15-1, which are common targets of banking trojans that target Internet Explorer for hooking.
Figure 15-1

Various APIs obtained from the strings of our exercise samples, which are commonly hooked by banking trojans targeting Internet Explorer

You also find other strings like the ones listed next. If you search for these strings, they point to the web injects config file used by Zeus malware.
  • set_url

  • data_before

  • data_after

  • data_end

  • data_inject

You might also see banking URLs in the strings, and for our samples, you see one: ebank.laiki.com.

From the strings seen so far, we were able to conclude that this might be a banking trojan. Some of the strings also point to the config file used by malware that belongs to the Zeus malware family, revealing the family of the malware as Zeus. Let’s see if we can find more data that relates the samples to the Zeus family. We need to find some common strings that are also unique.

The following lists some strings that are unique to our exercise malware sample set. If you Google the third string in the list, you find that it could be related to the Zeus banking trojan.
  • id=%s&ver=4.2.5&up=%u&os=%03u&rights=%s&ltime=%s%d&token=%d

  • id=%s&ver=4.2.7&up=%u&os=%03u&rights=%s&ltime=%s%d&token=%d&d=%s

  • command=auth_loginByPassword&back_command=&back_custom1=&

  • id=1&post=%u

  • &cvv=

  • &cvv=&

  • &cvv2=

  • &cvv2=&

  • &cvc=

  • &cvc=&

The following is a list of some popular banking trojan families. As an exercise, try obtaining samples for each of the families and apply both string and other dynamic analysis techniques you learned in this chapter, and see if you can identify the samples as banking trojan and also the family it belongs to.
  • Zbot

  • Dridex

  • UrSnif

  • TrickBot

  • BackSwap

  • Tinba

Point-of-Sale (POS) Malware

All of us have come across point-of-sale (POS) devices in shops, cinema halls, shopping malls, grocery shops, medicine stores, and restaurants, where we swipe our payment cards (debit and credit cards) to make payments. These POS devices are targeted by a category of malware called POS malware, which aims to steal our credit card numbers and other banking-related details for malicious purposes. Before we go into depth on how POS malware works and how to identify it, let’s see how a POS device works.

How POS Devices Work

A POS device is connected to a computer, which may be a regular computer or a computer that has a POS specific operating system. The computer has a POS scanner software installed on it, which can be from the vendor who created the POS device. The POS scanner software can read the information of the swiped payment card on the POS device and can extract information like card number, validity, and so forth, and can even validate the card by connecting to the payment processing server.

Now the information is stored in our payment cards in a specific manner. Our payment cards have a magnetic stripe on them, which is divided into three tracks: track 1, track 2, and track 3. They contain various kinds of information, such as the primary account number (PAN), the cardholder’s name, the expiry date, and so forth, required to make a payment. Track 1 of the card has a format that is illustrated in Figure 15-2.
Figure 15-2

The format of track 1 of a payment credit/debit card

The various fields in the track format are described in Table 15-3.
Table 15-3

The Description for Various Fields in Track 1 of the Payment Card

Field   Description
%       Indicates the start of track 1
B       Indicates a credit or debit card
PN      Indicates the primary account number (PAN) and can hold up to 19 digits
^       Separator
LN      Indicates the last name
/       Separator
FN      Indicates the first name
^       Separator
YYMM    Indicates the expiry date of the card in year and month format
SC      Service code
DD      Discretionary data
?       Indicates the end of track 1

An example track that uses the format should look like the one in Listing 15-6.
%B12345678901234^LAST_NAME/FIRST_NAME^2203111001000111000000789000000?
Listing 15-6

An Example Track1 Based on Track1 Format Described in Figure 15-2
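To make the field layout concrete, the following minimal sketch splits the example track from Listing 15-6 into its fields with a regular expression. It assumes a three-digit service code placed directly after the four-digit expiry date; the pattern and the function name parse_track1 are our own illustration, not something taken from POS software.

```python
import re

# Track 1 layout: % B PAN ^ LAST/FIRST ^ YYMM SC DD ?
# Assumption: the service code is exactly three digits after YYMM.
TRACK1_RE = re.compile(
    r"^%B(?P<pan>\d{1,19})"
    r"\^(?P<last_name>[^/^]*)/(?P<first_name>[^^]*)"
    r"\^(?P<expiry>\d{4})(?P<service_code>\d{3})"
    r"(?P<discretionary>[^?]*)\?$"
)


def parse_track1(track):
    """Return a dict of track 1 fields, or None if the format does not match."""
    match = TRACK1_RE.match(track)
    return match.groupdict() if match else None


fields = parse_track1(
    "%B12345678901234^LAST_NAME/FIRST_NAME^2203111001000111000000789000000?"
)
print(fields["pan"], fields["last_name"], fields["expiry"], fields["service_code"])
```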

Now the POS software can read this information from the card that is swiped on the POS device and store the information in its virtual memory. The POS software then uses this information stored in memory to carry out the payment process, which includes the authentication followed by the transaction. Now that we know how POS devices work, let’s see how POS malware works.

How POS Malware Work

POS software stores the information retrieved for the payment card from the POS device in its virtual memory. This information for the payment card most of the time is present in memory in an unencrypted format. This is what the malware exploits. Malware can scan the virtual memory of the POS software and retrieve the credit/debit card information, as illustrated in Figure 15-3.
Figure 15-3

The POS device and the POS software setup, which is the target of malware

To retrieve the credit card information from memory, POS malware searches the virtual memory of the POS software process for specific patterns that match the track 1 format of the payment card that we explored in Figure 15-2. With the track 1 contents retrieved from memory, it checks whether the credit card number is a possibly valid credit card number using Luhn’s algorithm and can then transfer it to the attacker’s CnC server for other malicious purposes.
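An analyst can apply the same two steps to a memory dump to check whether track data is exposed. The following is a minimal sketch of a pattern match followed by a Luhn check. The regular expression is a simplified assumption about the track 1 layout, and the number 4111111111111111 is a widely used test card number, not real card data.

```python
import re

# Simplified track 1 pattern: %B, 13 to 19 digits, name field, 4-digit expiry.
TRACK1_RE = re.compile(rb"%B(\d{13,19})\^[^^]{2,26}\^\d{4}")


def luhn_valid(number):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:      # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(buffer):
    """Return Luhn-valid PANs found in track 1 shaped data inside buffer."""
    numbers = (m.group(1).decode() for m in TRACK1_RE.finditer(buffer))
    return [n for n in numbers if luhn_valid(n)]


# Hypothetical memory buffer with one valid and one invalid number.
buffer = (b"junk%B4111111111111111^DOE/JOHN^22031110000?more"
          b"%B12345678901234^LAST_NAME/FIRST_NAME^2203111?")
print(find_card_numbers(buffer))
```

The Luhn check filters out random digit runs that merely look like track data, which is why the example PAN from Listing 15-6 is rejected here.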

Identifying and Classifying POS

A POS malware can be identified by the set of APIs it uses, and this can be obtained from dynamic analysis using APIMiner as we did in the earlier sections for other malware.

As we know, POS malware needs to scan the memory of the POS software process running on the POS system. To that end, it first needs to search the system for the presence of the POS software. With the POS software process found, it then opens a handle to this process and then reads its memory.

You can recognize these activities of POS malware in your API logs by searching for the presence of the sequence of APIs listed.
  • CreateToolhelp32Snapshot

  • Process32FirstW

  • Process32NextW

  • NtOpenProcess

  • ReadProcessMemory

In your API logs, you see continuous calls to ReadProcessMemory after the NtOpenProcess call. This is because the memory blocks are sequentially read and then scanned for the credit card number.

As an exercise, we have a POS malware in our samples repo Sample-15-12 which you can execute in your analysis VM using APIMiner. As described earlier, if you check the API logs, you see multiple calls to ReadProcessMemory by the sample for various processes on the system, as seen in Listing 15-7.
ReadProcessMemory([process_handle]0x000001A4,
                  [base_address]0x00010000)
ReadProcessMemory([process_handle]0x000001A4,
                  [base_address]0x00020000)
ReadProcessMemory([process_handle]0x000001A4,
                  [base_address]0x0012D000)
ReadProcessMemory([process_handle]0x000001A4,
                  [base_address]0x00140000)
Listing 15-7

The API logs for Sample-15-12 show the sample reading the contents of other processes’ memory with the goal of scanning for credit card information
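Checking a long API log for this sequence by eye is tedious. The following minimal sketch tests whether the APIs appear in the expected order, with other calls allowed in between. It assumes you have already reduced the log to a list of API names; the function name contains_in_order is ours.

```python
# The API sequence listed in this section.
POS_SEQUENCE = [
    "CreateToolhelp32Snapshot",
    "Process32FirstW",
    "Process32NextW",
    "NtOpenProcess",
    "ReadProcessMemory",
]


def contains_in_order(api_calls, sequence=POS_SEQUENCE):
    """Return True if sequence occurs in api_calls as an ordered subsequence."""
    remaining = iter(api_calls)
    # Each lookup continues from where the previous one stopped.
    return all(any(call == wanted for call in remaining) for wanted in sequence)


# Hypothetical list of API names pulled from an API log.
calls = [
    "LoadLibraryA", "CreateToolhelp32Snapshot", "Process32FirstW",
    "Process32NextW", "Process32NextW", "NtOpenProcess",
    "ReadProcessMemory", "ReadProcessMemory",
]
print(contains_in_order(calls), calls.count("ReadProcessMemory"))
```

Debuggers and system tools also produce this sequence, so combine a match with the other indicators discussed in this section before calling a sample POS malware.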

Strings In POS Malware

POS Malware can also be identified by using strings obtained from either static or dynamic string analysis, as you learned in our earlier chapters, and like we did in our earlier sections.

As an example, check out Sample-15-13, Sample-15-14, Sample-15-15, and Sample-15-16 from the samples repo. All these samples belong to the same malware family. Extract the strings for all these samples using BinText.

Now from the strings obtained from these samples, you find the ones that we have listed next, which are the names of well-known software programs that run on the system. Such a list could indicate that the sample is POS malware. But how? For POS malware, scanning every system process can be expensive from a CPU consumption perspective. Instead, the malware can keep a blacklist, which it uses to skip well-known processes like the ones in the list, as they are not going to be POS scanner software.
  • explorer.exe

  • chrome.exe

  • firefox.exe

  • iexplore.exe

  • svchost.exe

  • smss.exe

  • csrss.exe

  • wininit.exe

  • steam.exe

  • skype.exe

  • thunderbird.exe

  • devenv.exe

  • services.exe

  • dllhost.exe

  • pidgin.exe

There can be more processes that POS malware blacklists from scanning. From the strings, you can also find a list of the processes the malware would specifically like to scan. Some POS vendors have specific process names for their POS scanning programs. We can call this a whitelist of processes, which the POS malware specifically wants to scan. The following is a list of some of the POS scanning software names obtained from the strings of our samples.
  • pos.exe

  • sslgw.exe

  • sisad.exe

  • edcsvr.exe

  • calsrv.exe

  • counterpoint.exe
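To use these two lists during string analysis, you can bucket the process names found in a sample, as in the following minimal sketch. The threshold of three skipped names and the function names are our own assumptions, chosen only for illustration.

```python
# Well-known processes the malware skips (from the first list above).
SKIP_LIST = {
    "explorer.exe", "chrome.exe", "firefox.exe", "iexplore.exe",
    "svchost.exe", "smss.exe", "csrss.exe", "wininit.exe", "steam.exe",
    "skype.exe", "thunderbird.exe", "devenv.exe", "services.exe",
    "dllhost.exe", "pidgin.exe",
}
# POS scanner process names the malware targets (from the second list).
TARGET_LIST = {
    "pos.exe", "sslgw.exe", "sisad.exe", "edcsvr.exe",
    "calsrv.exe", "counterpoint.exe",
}


def triage_process_names(strings):
    """Bucket the strings that match the skip list or the target list."""
    names = {s.strip().lower() for s in strings}
    return {
        "skipped": sorted(names & SKIP_LIST),
        "targeted": sorted(names & TARGET_LIST),
    }


def looks_like_pos_scanner(strings, min_skipped=3):
    """Heuristic: several skip-list names plus at least one POS process name."""
    result = triage_process_names(strings)
    return len(result["skipped"]) >= min_skipped and bool(result["targeted"])


# Hypothetical strings extracted from a sample.
found = ["explorer.exe", "svchost.exe", "CSRSS.EXE", "pos.exe", "hello"]
print(triage_process_names(found), looks_like_pos_scanner(found))
```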

Beyond these strings, other strings indicate that the sample is POS malware.
  • track 1

  • track 2

  • pos

  • master

We were able to classify from the strings that these are POS malware. But we still have a task left, and that is identifying the malware family these samples belong to. The following is a list of some unique strings obtained from the provided samples that may help us identify the family. If you Google them, the results point to the samples belonging to the Alina POS malware family.
  • /jkp/loading.php

  • \\.\pipe\Katrina

  • /trinapanel/settings.php

  • chukky.xyz

  • /fyzeee/settings.php

  • /ssl/settings.php

  • updateinterval=

  • safetimes.biz

The following lists some of the popular POS malware families. As an exercise, obtain samples from the families and analyze them. See if you can classify and also figure out the family name for the samples using various analysis techniques we have learned so far.
  • Alina

  • VSkimmer

  • Dexter

  • Rdasrv

  • Backoff

  • FastPOS

ATM Malware

Automated teller machines, or ATMs, have always been a target for all kinds of criminals, from petty thieves to cybercriminals. There have been countless attempts to physically break into ATMs to extract cash. But these days, cybercriminals create malware and use it to extract cash from ATMs without even breaking them physically. Before we get down to understanding how ATM malware works, let’s have a basic understanding of how ATMs work.

ATMs have two main components: the cabinet and the safe. The cabinet consists of a computer that has many devices connected to it, while the safe stores the cash. The following is a list of devices connected to the computer in the cabinet.
  • Keypad: This is the number pad where we key in the PIN, amount, and so forth

  • Cash dispenser: This device dispenses the cash.

  • Card reader: This device is responsible for reading the debit (payment) card.

  • Network card: This one connects the ATM to the bank network.

Other than these devices, there are USB ports that can be used to troubleshoot the ATM. These devices are called peripherals. When a card is inserted into the card reader, the computer reads the card/account details from the card and then asks the user to key in the PIN. The user keys in the PIN on the keypad, and the computer reads the PIN and validates it by sending information to the bank server. Once the authentication process is complete, the computer asks the user to key in the amount. The user keys in the amount through the keypad, and after the validation, the cash is dispensed from the cash dispenser.

The peripherals are manufactured by different vendors. We saw in the previous paragraph that the computer needs to communicate with the other peripherals. So it is important to have a standard protocol for communication between the computer and the peripherals. XFS (extensions for financial services) is an architecture designed specifically for these purposes. The architecture ensures an abstraction that sees to the proper working of the system if a peripheral manufactured by one vendor is replaced by the peripheral manufactured by another vendor.

The peripherals also have embedded software in them for their functioning. The embedded software exposes APIs that can be invoked by the embedded OS installed in the ATM computer. These APIs are called XFS service provider interfaces (XFS SPIs). Most ATMs are known to use embedded versions of the Windows OS. Till 2014, 70% of the ATMs had Windows XP installed in them. Windows implements an XFS library in msxfs.dll and exposes the XFS API for use by software programs that need it. With the help of the APIs in msxfs.dll, we can communicate with the XFS interfaces of the peripherals without even knowing who the manufacturer is. Software that is meant to operate the ATM can directly call these APIs and need not implement the XFS APIs on its own. The same goes for ATM malware.

If it gains access to the ATM computer, malware can use the same msxfs.dll to carry out its malicious intentions, like forcing the ATM to dispense cash.

Analyzing and classifying ATM malware can be extremely easy if common sense is applied. Unless you are working for a bank or an ATM vendor, you are very unlikely to encounter ATM malware. ATM libraries like msxfs.dll are unlikely to be used in common software. They are used by either an ATM application (which is clean) or ATM malware. So the problem of classifying ATM malware can be narrowed down to finding any sample that imports msxfs.dll, as long as it is first identified as malware.
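A quick first check for this is to look for the library name in the bytes of the sample, as in the following minimal sketch. Note that this is only a string match and an assumption-laden shortcut: a proper check would parse the PE import table, and packed samples can hide the name. The function name references_msxfs is ours.

```python
def references_msxfs(data):
    """Return True if the bytes mention msxfs.dll in ASCII or UTF-16LE."""
    lowered = data.lower()  # lowercases ASCII letters, leaves other bytes alone
    ascii_hit = b"msxfs.dll" in lowered
    utf16_hit = "msxfs.dll".encode("utf-16-le") in lowered
    return ascii_hit or utf16_hit


# Hypothetical byte blobs standing in for file contents.
print(references_msxfs(b"\x00KERNEL32.dll\x00MSXFS.dll\x00"))
print(references_msxfs(b"\x00KERNEL32.dll\x00USER32.dll\x00"))
```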

RATs

RAT is the abbreviation for remote administration tool, also known as a remote access trojan. RATs are the most popular tools used in targeted or APT attacks. Remote administration or remote access, as it sounds, works like remote desktop sharing software, but the difference is that it does not seek permission from the victim before accessing and taking control of a remote computer.

RATs stay vigilant on the system and monitor all kinds of user activities. A RAT has two components: a server component that needs to be installed on the C&C server, and a client component, which is the RAT malware that needs to be installed on the victim machine. The server component looks out for connection requests coming from the RAT malware (clients), which connect to the server to receive commands to execute. This functionality is like a botnet but has many more capabilities. RATs make sure that the victim is under full control after a successful infection.

The following are some of the prominent features of RAT.
  • Turn on the webcam for video

  • Take screenshots

  • Log keystrokes

  • Download other malware and execute it

  • Send the files on the victim machine to the C&C server

  • Terminate other processes like antivirus

  • Execute operating system commands

Many RAT tools are freely available on the Internet for use by anybody. The attacker only needs to find a way to infect the victim with the RAT malware. Poison Ivy is one popular freely available RAT tool.

Identifying RATs

RATs can be identified using various techniques. Some of the popular RAT tools leave an open backdoor port, some of which are listed in Table 15-4. These port numbers listed in the table are standard fixed port numbers used by these RAT malware from these families. While analyzing malware samples dynamically, you can use the presence of listening sockets on these port numbers, as an indication that the sample listening on these port numbers belongs to these specific RAT families.
Table 15-4

Some Popular RATs and Port Numbers They Open a Backdoor On

RAT Family    Port
njRat         1177 and 8282
PoisonIvy     6868 and 7777
GravityRAT    46769
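Once you have the list of listening ports from your analysis VM (for example, from netstat output), the table can be applied as a simple lookup. The following is a minimal sketch; the function name rat_candidates is ours, and a port match is only a hint because these ports can be reconfigured or used by other software.

```python
# Port numbers taken from Table 15-4.
RAT_PORTS = {
    1177: "njRat",
    8282: "njRat",
    6868: "PoisonIvy",
    7777: "PoisonIvy",
    46769: "GravityRAT",
}


def rat_candidates(listening_ports):
    """Return the RAT families whose known ports appear in listening_ports."""
    return sorted({RAT_PORTS[port] for port in listening_ports if port in RAT_PORTS})


# Hypothetical listening ports observed while a sample runs.
print(rat_candidates([80, 1177, 7777]))
```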

RATs are also known to have keylogging functionality and can take screenshots and record audio and video. Table 15-5 lists some of the APIs that are associated with these functionalities. Using tools like APIMiner, you can obtain the API logs for a sample and check for the use of these APIs, which can help you identify if the sample in question is a RAT.
Table 15-5

APIs Associated with Various Functionalities Provided by RATs That Can Classify Them Using API Logs from Tools Like APIMiner

Functionality   Associated APIs
Screenshot      GetDC, BitBlt, CreateCompatibleDC, FCICreate, FCIAddFile, FDICreate
BackDoor        WSAStartup, WSASocket
Keylogger       GetAsyncKeyState, SetWindowsHook, and so forth (see the keylogger section)
Clipboard       OpenClipboard, GetClipboardData
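The table can be applied to an API log with a small lookup like the following minimal sketch. It uses prefix matching so that variants such as SetWindowsHookExW or WSASocketW count as hits; that matching rule and the function name rat_features are our own assumptions.

```python
# Functionality to API mapping taken from Table 15-5.
RAT_FEATURE_APIS = {
    "Screenshot": {"GetDC", "BitBlt", "CreateCompatibleDC",
                   "FCICreate", "FCIAddFile", "FDICreate"},
    "BackDoor": {"WSAStartup", "WSASocket"},
    "Keylogger": {"GetAsyncKeyState", "SetWindowsHook"},
    "Clipboard": {"OpenClipboard", "GetClipboardData"},
}


def rat_features(observed_apis):
    """Map each functionality to the table APIs seen in observed_apis."""
    features = {}
    for feature, apis in RAT_FEATURE_APIS.items():
        matched = sorted(
            api for api in apis
            if any(observed.startswith(api) for observed in observed_apis)
        )
        if matched:
            features[feature] = matched
    return features


# Hypothetical API names pulled from an API log.
print(rat_features(["BitBlt", "SetWindowsHookExW", "CreateFileW"]))
```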

Strings in RAT Malware

We can use string analysis to identify RATs, as we did for other malware types in the previous sections. As you learned in the previous section, RATs have various functionalities that can be easily identified by the APIs associated with these functionalities. With strings obtained from string analysis, you can check for the presence of the APIs listed in Table 15-5, which can indicate the possibility that the sample is a RAT. Apart from that, you might also find various other strings that point you to resources on the web that identify the sample as a RAT.

As an exercise, check out samples Sample-15-17, Sample-15-18, Sample-15-19, and Sample-15-20, all of which belong to the same RAT malware family. All these samples are UPX packed, and you need to unpack them using UPX unpacker or CFF Explorer to view their strings. Alternatively, you can also run the samples to obtain the strings after unpacking using Process Hacker, as we did in Chapter 7 and Chapter 13. From the string analysis of these samples, you are going to find strings related to keyloggers. Other than that, you are going to find the strings listed next, which, if you search for them on Google, reveal that the samples belong to the XtremeRat RAT family.
  • XTREMEBINDER

  • SOFTWARE\XtremeRAT

  • XTREMEUPDATE

  • Xtreme RAT

The following is a list of some of the popular RAT families. As an exercise, obtain samples from these malware families and analyze them. See if you can apply the various tricks you have learned to classify them as RATs and even identify the families they belong to.
  • njRat

  • Darkcomet

  • AlienSpy

  • NanoCore

  • CyberGate

  • NetWire

Ransomware

Ransomware is one of the most popular categories of malware that always seems to be trending these days. Ransomware has existed since 1989. Ransomware was rarely seen by the security industry for quite some time, but they poured in heavily from 2013 onward. Now there are probably thousands of ransomware, and many of them haven’t even been categorized into a family. The earliest ransomware only locked screens at system login. These screen-locking ransomware could easily be disabled by logging in as administrators and removing the persistence mechanism which launched the ransomware during logins.

In the current day scenarios, ransomware encrypts various important files on your system like documents, images, database files, and so forth, and then seeks a ransom to decrypt the files. This type of ransomware is popularly known as crypto-ransomware. The situation with this ransomware is similar to someone locking your door with an extremely strong and unbreakable lock and then demanding a ransom for giving you the key to unlock it.

Identifying Ransomware

Ransomware identification is relatively easy compared to other malware; usually, all it takes is running the sample to reveal that you are dealing with ransomware. Ransomware is loud and informs its victims through ransom notes that it was successful in hijacking the system. Most ransomware does not even bother to stay on the system using persistence and even deletes itself after execution, as its job is done after encrypting the files on the victim machine. Unlike other malware, ransomware has a clear visual impact. Screen-locker ransomware locks the desktop and asks the victim for ransom to unlock. Crypto-ransomware encrypts files and displays ransom messages. It’s pretty straightforward to identify them.

Ransomware can also be identified by ProcMon event logs or APIMiner logs. With ProcMon logs, you can easily identify ransomware, as you see a huge number of file access and modifications by the sample ransomware process, which is indicative of ransomware behavior. Most ransomware target files with extensions like .txt, .ppt, .pdf, .doc, .docx, .mp3, .mp4, .avi, .jpeg, and so forth.

Similarly, if we look into APIMiner logs, we see a lot of CreateFile and WriteFile API calls by the ransomware process for files with these file extensions.

As an exercise, run Sample-15-21 from our samples repo using APIMiner and ProcMon and check for the presence of file access and modifications and other file related API calls.

Another method to identify and classify a sample as ransomware is to use deception technology or use decoys. As an exercise, create some dummy files on your system in various directories like your Documents folder, the Downloads folder, the Pictures folder, the Videos folder, and so forth. These dummy files are decoy files whose goal is to lure a ransomware sample into encrypting them. Create multiple decoy files with different names and file extensions. We created decoy files with names decoy.txt, decoy.pdf, and decoy.docx and placed them in the following locations.
  • Created a folder named 1 in the Documents folder and placed some decoy files in it.

  • Repeated the same step in the My Pictures and C:\ folders

  • Repeated the same step in the current working directory from where we run our malware

You can create as many decoys as possible. Some ransomware might kill tools like ProcMon or Process Explorer or won’t execute in their presence. But if you create decoys, you won’t need these tools to analyze the ransomware. After creating this decoy setup, snapshot your analysis VM so that you can restore it if you want to test another ransomware sample later.
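Creating the decoys and checking them afterward can be scripted. The following minimal sketch plants decoy files in a folder, records their SHA-256 hashes, and later reports which decoys were modified, renamed, or deleted. The file names match the ones we used; the function names and the decoy content are our own assumptions.

```python
import hashlib
from pathlib import Path

DECOY_NAMES = ["decoy.txt", "decoy.pdf", "decoy.docx"]


def plant_decoys(folder):
    """Create decoy files in folder and return a {path: sha256} baseline."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    baseline = {}
    for name in DECOY_NAMES:
        path = folder / name
        path.write_bytes(b"decoy content for ransomware analysis\n")
        baseline[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
    return baseline


def check_decoys(baseline):
    """Return the decoy paths that are missing or whose contents changed."""
    tampered = []
    for path, digest in baseline.items():
        decoy = Path(path)
        if not decoy.exists() or hashlib.sha256(decoy.read_bytes()).hexdigest() != digest:
            tampered.append(path)
    return tampered
```

In the analysis VM, call plant_decoys for each folder before taking the snapshot, run the sample, and then call check_decoys with the saved baseline. A renamed file, such as one with an added extension, shows up as missing.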

Now run the ransomware Sample-15-21 from our samples repo. Once you run this sample, you notice that it encrypts the decoy files, adds a .doc file extension suffix to them, as shown in Figure 15-4, and leaves behind a file named Read__ME.html in the folder, which is the ransom note.
Figure 15-4

Our decoy files encrypted by Sample-15-21 and the ransom note file left behind

The file left behind Read__ME.html is the ransom note, as seen in Figure 15-5.
Figure 15-5

Ransom note displayed to the victim by Sample-15-21 by means of an HTML file

Most of the time, when a victim is infected by ransomware, there is little chance of decrypting the files encrypted by the ransomware unless the victim pays the ransom. A solution in such cases is to restore the files from backup. Windows has a feature called Volume Shadow Copy Service (VSS), which backs up files and volumes on the file system. Ransomware tends to delete these shadow copies so as not to leave the victim any option to restore the files on the system.

Ransomware can delete these volume copies using the vssadmin.exe command provided by Windows OS. The command usually takes the form vssadmin.exe Delete Shadows /All /Quiet. You see this process being launched by ransomware when you analyze it dynamically, with both ProcMon and APIMiner. You might also notice this command in strings in the unpacked malware. You might also see other commands, for example, bcdedit.exe /set {default} recoveryenabled no, which is meant to disable automatic recovery after the system boot.

Strings in Ransomware

Ransomware can be identified using strings obtained from either static or dynamic string analysis; use dynamic string analysis if the ransomware sample is packed. As an exercise, analyze Sample-15-22 for strings. As listed next, we can see strings containing commands frequently used by ransomware to delete backup file copies and disable automatic recovery on boot, as discussed.
  • vssadmin.exe delete shadows /all /quiet

  • bcdedit.exe /set {default} recoveryenabled no

  • bcdedit.exe /set {default} bootstatuspolicy ignoreallfailures
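A search for these commands can be automated once you have the sample (or its unpacked memory dump) as a byte buffer. The following is a rough Python sketch under our own assumptions: the regular expressions are derived from the three commands listed above, and the function names extract_strings and recovery_tampering_hits are hypothetical. It looks at both ASCII and UTF-16LE strings, since Windows binaries often store text in either form.

```python
import re

# Patterns derived from the commands listed above; matching is
# case-insensitive and tolerant of extra whitespace.
RECOVERY_TAMPERING = [
    re.compile(rb"vssadmin(\.exe)?\s+delete\s+shadows", re.I),
    re.compile(rb"bcdedit(\.exe)?\s+/set\s+\{default\}\s+recoveryenabled\s+no", re.I),
    re.compile(
        rb"bcdedit(\.exe)?\s+/set\s+\{default\}\s+bootstatuspolicy\s+ignoreallfailures",
        re.I,
    ),
]


def extract_strings(data, min_len=5):
    """Return printable ASCII and UTF-16LE strings found in a byte buffer."""
    ascii_hits = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
    wide_hits = re.findall(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len, data)
    # Normalize wide strings to plain ASCII bytes so one set of patterns fits both.
    return ascii_hits + [w.decode("utf-16le").encode("ascii") for w in wide_hits]


def recovery_tampering_hits(data):
    """Return the strings that look like backup-deletion or recovery-disabling commands."""
    hits = []
    for s in extract_strings(data):
        if any(p.search(s) for p in RECOVERY_TAMPERING):
            hits.append(s.decode("ascii"))
    return hits
```

A hit here is only a hint that the sample tampers with backups and recovery; legitimate administration scripts can contain the same commands, so treat it as one signal among several.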

Other than these sets of strings, ransomware also keeps a list of the file extensions that it wants to encrypt, which usually shows up in string analysis as a run of consecutive file extensions. Ransomware can also contain strings related to ransom notes and ransom note file names.

Many times, ransom note file names or other strings in the sample can even point you to the exact ransomware family. A case in point is Sample-15-23, Sample-15-24, Sample-15-25, and Sample-15-26, all of which belong to the GandCrab ransomware family. If you analyze these samples for strings, you find the strings listed next, which indicate that the malware family is GandCrab.
  • CRAB-DECRYPT.txt

  • gand

  • GandCrab!

  • -DECRYPT.html

  • GDCB-DECRYPT.txt

  • RAB-DECRYPT.txt

  • GandCrab
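Once you have collected marker strings like these for the families you care about, mapping a sample's strings to a family is a simple lookup. The sketch below assumes a hand-maintained dictionary; only the GandCrab markers come from the list above, and the dictionary layout and the guess_family name are illustrative.

```python
# Marker strings copied from the list above. The dictionary layout and the
# function name are illustrative, not part of any existing tool.
FAMILY_MARKERS = {
    "GandCrab": ["CRAB-DECRYPT.txt", "GDCB-DECRYPT.txt", "GandCrab", "-DECRYPT.html"],
}


def guess_family(strings):
    """Return (family, matched_markers) pairs, best match first."""
    results = []
    for family, markers in FAMILY_MARKERS.items():
        matched = sorted(
            {m for m in markers if any(m.lower() in s.lower() for s in strings)}
        )
        if matched:
            results.append((family, matched))
    return sorted(results, key=lambda r: len(r[1]), reverse=True)
```

The more markers that match, the more weight the guess deserves; a single short marker on its own is weak evidence.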

The following lists some popular ransomware families. As an exercise, download samples from these malware families and apply the tricks and the techniques we discussed in this chapter to classify and identify their families.
  • CovidLock

  • CryptoLocker

  • CTB-Locker

  • TorrentLocker

  • SamSam

  • WannaCry

Cryptominer

You have likely heard about cryptocurrencies. The most popular ones are Bitcoin, Monero, Ethereum, Litecoin, Dash, Ethereum Classic, Bitcoin Gold, and Dogecoin. Mining cryptocurrencies is resource-intensive and needs a lot of computing power. Attackers who want to make a quick buck by mining cryptocurrencies have found a source of free computing power: their victims' computers, on which they install malware (a.k.a. cryptominers) that runs and mines cryptocurrencies. Free computing power and no electricity bill is awesome. Infect thousands of computers with cryptocurrency mining malware, and you have computing power equivalent to a supercomputer at your fingertips.

Most cryptomining malware makes use of free and open source tools to mine cryptocurrencies. One popular method to identify cryptominers is to check whether the sample uses open source cryptomining tools, by looking at the events generated by ProcMon and the API logs from APIMiner. Alternatively, you can carry out string analysis on the sample and search the strings for references to these open source cryptomining tools, which should help you classify the sample as a cryptominer.

As an exercise, run Sample-15-27 from the samples repo with the help of ProcMon and APIMiner, and you notice that it drops the open source tool xmrig.exe and runs it using the command seen in Listing 15-8. The string 49x5oE5W2oT3p97fdH4y2hHAJvANKK86CYPxct9EeUoV3HKjYBc77X3hb3qDfnAJCHYc5UtipUvmag7kjHusL9BV1UviNSk/777 in the command is the attacker's cryptocurrency wallet ID.
$ xmrig.exe -o stratum+tcp://xmr-eu1.nanopool.org:14444 -u 49x5oE5W2oT3p97fdH4y2hHAJvANKK86CYPxct9EeUoV3HKjYBc77X3hb3qDfnAJCHYc5UtipUvmag7kjHusL9BV1UviNSk/777 -p x --donate-level=1 -B --max-cpu-usage=90 -t 1
Listing 15-8. xmrig Open Source Mining Tool Dropped and Run by Cryptominer Sample-15-27
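When you capture a command line like this from ProcMon or APIMiner, the mining pool and the wallet are the two fields worth recording as indicators. The following Python sketch pulls them out of the command. It assumes the xmrig-style options -o/--url, -u/--user, and -p/--pass; the function name parse_miner_command and the output keys are our own choices.

```python
import shlex

# xmrig-style options mapped to the field names used in the result
# (the field names are our own choice).
FLAGS = {
    "-o": "pool", "--url": "pool",
    "-u": "user", "--user": "user",
    "-p": "password", "--pass": "password",
}


def parse_miner_command(cmdline):
    """Extract the pool URL, user (wallet), and password from a miner command line."""
    # posix=False keeps backslashes intact, which matters for Windows paths.
    tokens = shlex.split(cmdline, posix=False)
    found = {}
    i = 0
    while i < len(tokens):
        flag, sep, value = tokens[i].partition("=")
        if flag in FLAGS:
            if sep:                        # form: --url=VALUE
                found[FLAGS[flag]] = value
            elif i + 1 < len(tokens):      # form: -o VALUE
                found[FLAGS[flag]] = tokens[i + 1]
                i += 1
        i += 1
    return found
```

For the command in Listing 15-8, this yields the nanopool.org stratum URL as the pool and the long 49x5o... string as the user, which is the wallet ID discussed above.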

As an exercise, analyze Sample-15-28, Sample-15-29, and Sample-15-30 for strings. Of these three samples, Sample-15-28 and Sample-15-29 are UPX packed, and you can statically unpack them using CFF Explorer's UPX utility to generate unpacked files, on which you can then carry out static string analysis. Sample-15-30 is packed as well, but you have to carry out dynamic string analysis on this sample by running it and extracting the strings using Process Hacker, as you learned in Chapter 7 and Chapter 13. If you analyze the strings in these samples, you notice various strings related to mining pools, cryptocurrency wallets, and cryptocurrency algorithms, which are enough to classify the samples as cryptominers. The following cryptomining-related strings, obtained from string analysis of the preceding samples, point to our samples being cryptominers.
  • minergate

  • monerohash

  • suprnova

  • cryptonight

  • dwarfpool

  • stratum

  • nicehash

  • nanopool

  • Xmrpool

  • XMRIG_KECCAK

  • Rig-id

  • Donate.v2.xmrig.com

  • aeon.pool.

  • .nicehash.com

  • cryptonight/0
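Given the strings dumped from a sample, a quick triage pass can count how many of these keywords appear and pull out any stratum pool URLs. This is a minimal sketch: the keyword list is a lowercase subset of the strings above, while the stratum URL pattern and the miner_indicators name are our own assumptions.

```python
import re

# Lowercase subset of the indicator strings listed above.
MINER_KEYWORDS = [
    "minergate", "monerohash", "suprnova", "cryptonight", "dwarfpool",
    "stratum", "nicehash", "nanopool", "xmrpool", "xmrig",
]

# Assumed shape of a pool URL: stratum+tcp://host:port (or stratum+ssl://).
STRATUM_URL = re.compile(r"stratum\+(?:tcp|ssl)://[\w.-]+:\d+", re.I)


def miner_indicators(strings):
    """Return (matched_keywords, pool_urls) found in a list of strings."""
    text = "\n".join(strings).lower()
    keywords = sorted(k for k in MINER_KEYWORDS if k in text)
    pools = sorted(set(STRATUM_URL.findall(text)))
    return keywords, pools
```

Several keyword hits together with a pool URL make a strong case for classifying the sample as a cryptominer; a lone hit such as stratum deserves a closer look before you decide.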

Virus (File Infectors)

Viruses, also known as file infectors or parasites, were the first malware to be created in the world of malware. Viruses work by modifying clean or healthy executables on the system and transforming them into viruses. The once-healthy executable is now a virus itself and is capable of infecting other healthy files. File infectors were more popular in the Windows XP days and have declined a lot since, and most antivirus products catch 100% of the samples of some of these file infector families, but it's always good to know how to classify them.

As we all know, Windows executables follow the PE file format, and a PE file starts execution from its entry point. A PE file infector can append malicious code to a clean PE file and then alter the entry point so that it points to the added malicious code. This process is called PE infection, as illustrated in Figure 15-6.
Figure 15-6. How a PE infection from a virus transforms a healthy file into malware

As seen in the diagram, the malware modifies a healthy PE file by adding a new malware section to it. It then modifies the entry point in the PE header of the infected file, which earlier pointed to the .text section, so that it now points to the malware code in the newly added section. When the user executes the infected file, the code in the malware section executes first. Control is then redirected to the code in the .text section, which is what would have executed at startup before the file was infected.
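One way to check a suspicious executable for this pattern is to find which section contains the entry point. The following Python sketch parses just enough of the PE headers to do that, using only the standard library. The function names are hypothetical, and the rule that an entry point in the last section is suspicious is only a heuristic: packed and otherwise unusual but clean files can show the same layout, and infectors that patch code elsewhere will not trigger it.

```python
import struct


def entry_point_section(data):
    """Return (section_name, index, section_count) for the section holding the entry point."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    pe_off = struct.unpack_from("<I", data, 0x3C)[0]         # e_lfanew
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    num_sections, = struct.unpack_from("<H", data, pe_off + 6)
    opt_size, = struct.unpack_from("<H", data, pe_off + 20)   # SizeOfOptionalHeader
    entry_rva, = struct.unpack_from("<I", data, pe_off + 24 + 16)
    table = pe_off + 24 + opt_size                            # first section header
    for i in range(num_sections):
        off = table + i * 40                                  # each header is 40 bytes
        name = data[off:off + 8].rstrip(b"\0").decode("ascii", "replace")
        vsize, vaddr, raw_size = struct.unpack_from("<III", data, off + 8)
        if vaddr <= entry_rva < vaddr + max(vsize, raw_size):
            return name, i, num_sections
    return None, -1, num_sections


def looks_like_appended_entry(data):
    """Heuristic: the entry point lives in the last of several sections."""
    _, index, count = entry_point_section(data)
    return count > 1 and index == count - 1
```

To use it, read the file with open(path, "rb").read() and pass the bytes in. Comparing the result against a known clean copy of the same executable is more reliable than the heuristic on its own.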

During the execution of a healthy file infected by a virus, both the malicious and the clean code in the file are executed. So if a victim starts a copy of Notepad that has been infected by a file infector, they see only Notepad and do not realize that the file infector code has also executed.

There can be many types of file infectors, and not all of them patch only the entry point. The malware can also patch, modify, or add code in other parts of a healthy PE executable, as long as those parts lie in the execution path of the program's code. You might be wondering how this is all different from code injection. The difference is that code injection occurs in the virtual memory of a live process, whereas PE file infection occurs on the raw file on disk.

Here is how you should look at identifying file infectors.
  • File infectors, just like ransomware, alter a lot of files, but the difference lies in that they only alter executable files and are least interested in other types of files. This should manifest as file access/modification events in tools like ProcMon and APIMiner.

  • Unlike ransomware, which alters the extensions of the files that it modifies/encrypts, file infectors leave the extensions of the files they alter unchanged.

  • You need to compare the clean version of the executable file with the modified version to see the changes made by the infector.

  • Reverse Engineering is an option.

  • Looking for strings in a file infector sample can be deceiving, because the file you have may be one that was originally clean and was later infected by a virus. The strings you see may come from the clean code and data portions of the file, and you might end up identifying the sample as a clean system file.
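The first two points can be checked mechanically once you have a list of the file paths that the sample modified, for example, exported from ProcMon. The following is a small sketch of that check; the set of executable extensions and the summarize_modified_files name are assumptions made for illustration.

```python
from pathlib import PureWindowsPath

# Assumed set of extensions treated as executables for this check.
EXECUTABLE_EXTS = {".exe", ".dll", ".scr", ".sys"}


def summarize_modified_files(paths):
    """Count how many modified files are executables (matched case-insensitively)."""
    exe = [p for p in paths if PureWindowsPath(p).suffix.lower() in EXECUTABLE_EXTS]
    return {
        "total": len(paths),
        "executables": len(exe),
        # True when every modified file is an executable, as a file infector would leave it.
        "only_executables": bool(paths) and len(exe) == len(paths),
    }
```

If only executables were modified and they kept their names, the sample behaves like a file infector; a mix of documents and images with new extensions points toward ransomware instead.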

Do note that virus or file infection is a technique to spread malware and stay persistent. A separate payload is also executed, and it might contain the true functionality of the malware.

Summary

Malware is plentiful, and the antivirus industry has devised ways to classify it into various categories, along with naming schemes that group it into families. In this chapter, you learned how this classification of malware into various categories and families is accomplished. We went through the various use cases showing why classification is important both for malware analysts and for anti-malware vendors. Using hands-on exercises, we explored the workings of various types of malware and learned tricks and techniques that we can apply to classify them and identify the families they belong to.
