CHAPTER 1

Malware Analysis 101

 

So you want to learn how to analyze malware? Well, you picked up the right book. But before I go into the meat of analyzing malware, it is important to know and understand several things that will be key in effectively analyzing malware.

This chapter will get you started on the right path to malware analysis by establishing the needed foundational knowledge to effectively analyze malware.

The chapter will tackle the two types of malware analysis, as well as its purpose, its limitations, and the malware analysis process itself. The chapter will then conclude by discussing what is needed to become an effective malware analyst.

Malware Analysis


Malware analysis is the process of extracting information from malware through static and dynamic inspection by using different tools, techniques, and processes. It is a methodical approach to uncovering a malware’s main directive by extracting as much data from malware as possible while it is at rest and in motion.

LINGO

A malware at rest is a malware that is not running in a target environment, while a malware in motion is a malware that is running in a target environment.

Data is extracted from malware through the use of data extracting and monitoring tools. The techniques and processes needed to successfully gather data from malware differ depending on the malware’s capability; they adapt to the changing malware landscape. This is why malware analysis is considered an art. The tools are your paintbrushes, the techniques and processes are your drawing style, and the malware is your subject. How effectively you use those tools and how well the techniques and processes are applied and refined will reflect how well the malware subject is pictured. It’s a picture you create to show the malware’s true identity stripped of all its protective mechanisms and revealing only its darkest and deepest malicious directive. The artist becomes better at her craft through practice and by gaining knowledge, skills, and experience as time goes by. The same concepts apply to the malware analyst and researcher. With continuous practice and exposure to different malware, the malware analyst becomes more knowledgeable, skillful, and experienced.

Malware Analysis and Reverse Engineering

I always separate malware analysis from reverse engineering, although many consider reverse engineering to be malware analysis. The way I see it, the two require different sets of skills. Malware analysis is more about the mastery of different tools, techniques, and processes to extract as much information as possible from malware without disassembling or decompiling it. It is also about making malware function in a controlled environment for the purpose of monitoring and collecting data that can be used to understand the malware’s true directive.

LINGO

Disassembling is the process of breaking down a binary into low-level code such as assembly code, while decompiling is the process of breaking down a binary into high-level code such as C or C++.

Malware reverse engineering, on the other hand, is the process of breaking down malware into low-level lines of code, usually assembly code, to fully understand its function. It requires that you can read and understand low-level code. This is where knowledge of assembly language becomes crucial. It is needed to decipher and read low-level code. Without this knowledge, it will be impossible to trace each line of code’s execution flow let alone understand what each line of code actually means.

Malware reverse engineering complements malware analysis. It is usually the last resort when malware analysis fails to extract the needed data from malware to paint an accurate picture. As we will discuss later in the “Limitations of Malware Analysis” section, reverse engineering is an important activity that serves as an addition to the malware analysis process.

NOTE

A combination of malware analysis and reverse engineering usually exposes new malware techniques that are otherwise not visible through malware analysis alone.

Types of Malware Analysis

As previously defined, malware analysis is the process of extracting information from malware through static and dynamic inspection by using different tools, techniques, and processes. Given that inspection or analysis can take place regardless of whether the malware is static or dynamic, it is only appropriate that the two types of malware analysis are called static analysis and dynamic analysis.

LINGO

Static malware is malware at rest. Dynamic malware is malware in motion.

Static Analysis

Static analysis is the process of extracting information from malware while it is not running. Typically, malware is subjected to different static analysis tools that are designed to extract as much information as possible from it; these tools will be detailed in succeeding chapters of the book. The information that is collected can range from the simplest, such as the malware’s file type, to the most complicated, such as identifying maliciousness based on non-encrypted code or strings found in the malware. Static analysis is the easiest and least risky malware analysis process. It is the easiest because there are no special conditions needed for analysis. The malware is simply subjected to different static analysis tools. It is as easy as clicking some buttons or using a command line. It is the least risky because the malware is not running during static analysis; therefore, there is no risk of an infection occurring while analysis is taking place.

Another thing that makes static analysis less risky is the availability of tools in other operating systems, such as Linux, that can be used to statically extract information from Windows files. Statically inspecting malware in an operating system where it is not intended to execute eliminates the malware’s ability to run and wreak havoc in the system.

But not all static analysis tools available on Windows have a counterpart or equivalent tool in other operating systems. This leaves the researcher no choice but to use those Windows-based tools. If this is the case, and as an added precaution, Linux-based systems can still be used to run Windows-based static analysis tools by using WINE. WINE is short for Wine Is Not an Emulator. It makes it possible to run Windows programs under Linux-based systems, such as Ubuntu and Debian, by acting as a middleman and translator between the Windows-based program and the Linux-based operating system.

TIP

Using WINE in Linux to run Windows static analysis tools also means that the subject malware can run using WINE. Utmost care must still be practiced at all times even if the malware’s capability is limited because of the absence of some aspects of Windows that the malware needs to achieve its directive.

Static analysis is considered to be a low-risk, low-reward process. It is low risk, but it yields less promising results because information gathering is based solely on what can be seen while the malware is inactive. The information gathered is limited and does not reveal that much about the malware’s directive. Most of the time, malware reveals its true nature only while it is running.
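As a minimal sketch of the kind of information static analysis collects, the following Python snippet inspects a byte buffer without ever executing it. It guesses a file type from the leading bytes, computes a hash, and pulls out printable strings. The function names, the small magic-byte table, and the sample bytes are assumptions made for illustration; they are not the tools covered later in the book.

```python
import hashlib
import re

# Illustrative, incomplete table of leading "magic" bytes (an assumption for this sketch).
MAGIC_BYTES = {
    b"MZ": "Windows executable (MZ header)",
    b"\x7fELF": "ELF executable",
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP archive",
}

def guess_file_type(data: bytes) -> str:
    """Guess the file type from the first few bytes of the file."""
    for magic, name in MAGIC_BYTES.items():
        if data.startswith(magic):
            return name
    return "unknown"

def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII characters at least min_length long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]

def static_report(data: bytes) -> dict:
    """Collect static information; the data is only read, never run."""
    return {
        "file_type": guess_file_type(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "strings": extract_strings(data),
    }

# Made-up bytes standing in for a malware file at rest.
sample = b"MZ\x90\x00\x03\x00http://example.invalid/gate.php\x00\x01\x02cmd.exe /c del\x00"
report = static_report(sample)
print(report["file_type"])
print(report["strings"])
```

Even this small amount of data hints at behavior (a URL and a delete command), but only because the strings in the sample are not encrypted.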

Dynamic Analysis

Dynamic analysis is the process of extracting information from malware while it is running. Unlike the limited view static analysis provides of the malware being analyzed, dynamic analysis offers a more in-depth view into the malware’s functions because it is collecting information while the malware is executing its functions and directives.

To conduct dynamic malware analysis, two things are needed:

•   Malware test environment

•   Dynamic analysis tools

A malware test environment is a system where malware is executed for the purpose of analysis. It is designed to satisfy most, if not all, of the conditions for a malware to run. It must consist of an operating system that the malware is written for and must have most, if not all, of the dependencies the malware needs to execute properly.

Dynamic analysis tools, also known as system monitoring tools, are the ones monitoring the malware test environment for any changes made by the malware to the target system. Some of the changes that are monitored and recorded include changes in the file system, modifications in configuration files, and any other relevant changes that are triggered by the malware’s execution. The dynamic analysis tools also monitor inbound and outbound network communications and any operating system resources used by the malware. With these tools, the analyst is able to understand what the malware is trying to do to the target system.
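One simple way to picture file system monitoring is a before-and-after comparison. The sketch below is an assumption about how such a comparison could work, not a description of any particular monitoring tool: it hashes every file under a directory, lets some activity happen, hashes again, and reports what was created, deleted, or modified. The file writes in the demonstration merely stand in for malware activity inside a throwaway directory.

```python
import hashlib
import os
import tempfile

def snapshot(root: str) -> dict:
    """Map each file path under root to the SHA-256 hash of its contents."""
    state = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                state[path] = hashlib.sha256(handle.read()).hexdigest()
    return state

def diff_snapshots(before: dict, after: dict) -> dict:
    """Compare two snapshots and list created, deleted, and modified files."""
    return {
        "created": sorted(p for p in after if p not in before),
        "deleted": sorted(p for p in before if p not in after),
        "modified": sorted(p for p in after if p in before and after[p] != before[p]),
    }

with tempfile.TemporaryDirectory() as root:
    hosts = os.path.join(root, "hosts.txt")
    with open(hosts, "w") as handle:
        handle.write("127.0.0.1 localhost\n")
    before = snapshot(root)

    # Simulated activity: a configuration file is altered and a new file is dropped.
    with open(hosts, "a") as handle:
        handle.write("203.0.113.5 bank.example\n")
    with open(os.path.join(root, "dropped.bin"), "wb") as handle:
        handle.write(b"\x00payload")

    after = snapshot(root)
    changes = diff_snapshots(before, after)
    # Keep only file names so the report does not depend on the temporary path.
    changes = {kind: [os.path.basename(p) for p in paths] for kind, paths in changes.items()}

print(changes)
```

A snapshot comparison only shows the end state. Changes that the malware makes and then undoes between snapshots would be missed, which is one reason real monitoring tools record events as they happen.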

A fully implemented malware test environment with the appropriate dynamic analysis tools is also known as a malware sandbox. A malware sandbox is where an analyst can run and observe a malware’s behavior. A malware sandbox can be a single system or a network of systems designed solely to analyze malware during runtime.

LINGO

Malware sandbox, malware test environment, and dynamic analysis lab are different names given to a system where malware is executed for the purpose of analysis.

Unlike static analysis, dynamic analysis is a high-risk, high-reward process. The risk of infection or something wrong happening is high because the malware is running; the reward is high because the malware reveals more of itself during execution. But do not let the high-risk part of dynamic analysis scare you because this is manageable and just takes common sense. There are precautions that can be taken to minimize, if not completely eliminate, any risk of infection. One of them is to make sure that the system used for dynamic analysis is fully isolated from any production systems and networks. I’ll talk more about dynamic analysis precautions in Chapter 8 when I explain how to build your own dynamic analysis lab or malware test environment.

Purpose of Malware Analysis

Malware analysis is an important skill to have in today’s interconnected world. It is a nice skill to have for anyone who uses a computer and an essential skill to have for information security professionals. With the influx of malware, malware analysis has become an indispensable tool for information technology professionals. This is because it serves an important purpose: to understand malware in order to stop it.

From this understanding of malware behavior, you can accomplish many things, such as formulating a solution to prevent the malware from spreading further or compromising new target systems, detecting the malware’s presence on compromised systems, and remediating the infection caused by the malware by completely eradicating the malware’s hold on the infected systems.

Aside from formulating a solution for the specific malware, data collected can be used to gain a deeper understanding of malware in general. It gives researchers the knowledge to come up with proactive solutions to combat the onslaught of malware. And in some cases, these data can be used to formulate actionable intelligence that can be used by different law enforcement agencies to investigate and capture the people behind the attack.

Prevent the Spread of Malware

Preventing the malware from spreading further is usually made possible by understanding how the malware gets into the system. If you understand the infection vectors used by malware, you can intercept or block the malware’s main infection method of compromising the target system.

LINGO

Infection vector refers to the technology used by attackers to deliver or deploy malware into a target system.1 The most common technology used is e-mail.

Preventing the spread of malware is a good first step when malware is discovered in a target system. It is like first-aid: Stop the bleeding first while waiting for the paramedics to arrive.

Identifying the malware’s infection vector is easier said than done. In most cases, it is hard to determine how the malware got in. In cases like this, the only way to prevent the spread of malware is to understand how it infects other systems. Does it require a network share? Does it rely on a specific e-mail client to spread? The information to answer these questions can be pieced together from the data gathered during malware analysis.

Detect the Presence of Malware

The data collected from malware analysis can be used as signatures or fingerprints to detect the malware in the network and in the host. When it comes to detecting the malware’s presence in the host, the following are the most common:

•   Host changes

•   Code snippet collected from the malware code

Host changes made by the malware can be used to spot malware infection or detect the presence of malware, especially if the detection technology used to look for possible infections is a system scanner. A system scanner is any tool that scans the whole system for changes in the file system, registry, or operating system configuration files.

For example, a researcher analyzed a malware that drops a copy of itself in the Windows Startup folder with the name IAMNOTAVIRUS.EXE. The researcher adds this data to a signature database that is used by a system scanner. If during a routine scan, the system scanner finds a file in the Windows Startup folder named IAMNOTAVIRUS.EXE, it will recognize this as a match in its signature database, and it will alert the user of an infection.
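The IAMNOTAVIRUS.EXE scenario can be sketched in a few lines of Python. This is a simplified illustration, not how any specific system scanner is implemented: the signature database is just a set of file names, and a temporary directory stands in for the Windows Startup folder.

```python
import os
import tempfile

# Hypothetical signature database: file names known to be dropped by malware.
# Names are stored in uppercase so that matching can ignore case.
FILENAME_SIGNATURES = {"IAMNOTAVIRUS.EXE"}

def scan_folder(folder: str) -> list:
    """Return the names of files in folder that match a known signature."""
    return sorted(
        name for name in os.listdir(folder)
        if name.upper() in FILENAME_SIGNATURES
    )

# A temporary directory stands in for the Windows Startup folder.
with tempfile.TemporaryDirectory() as startup_folder:
    for name in ("iamnotavirus.exe", "updater.lnk"):
        open(os.path.join(startup_folder, name), "w").close()
    matches = scan_folder(startup_folder)

for name in matches:
    print("Possible infection:", name)
```

A file name alone is a weak indicator because malware can rename itself easily, so a scanner would normally combine it with other host changes.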

Using a code snippet to create a malware signature is the most common practice in the industry. File and memory scanners utilize this kind of signature database that contains malware code snippets. This is the foundation of antivirus (AV) products. The catch here is that the code snippet must come from unencrypted malware code; otherwise, it will cause a lot of false alarms.

LINGO

Scanner errors are divided into two types: false positives and false negatives. A false positive, also called a false alarm, means that the scanner identified a benign file as malware. A false negative means that the scanner failed to identify a malicious file as malware.

TIP

The best code snippet signature is from an unencrypted malware code.

File and memory scanners scan every file in the system, and every file loaded in memory, for a possible match in their signature database. If a match is found, the scanner alerts the user of an infection.
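The matching step can be sketched as a search for known byte sequences. In this minimal example, the signature names and byte values are made up for illustration, and the scanner works on in-memory byte buffers rather than real files; production scanners use far more efficient matching than a simple substring search.

```python
# Hypothetical signature database: malware name -> byte sequence taken from
# unencrypted malware code. The byte values here are made up for illustration.
CODE_SIGNATURES = {
    "Example.Dropper.A": bytes.fromhex("e8000000005d81ed"),
    "Example.Keylog.B": bytes.fromhex("6a00ff15deadbeef"),
}

def scan_bytes(data: bytes) -> list:
    """Return the names of all signatures whose byte sequence appears in data."""
    return sorted(name for name, pattern in CODE_SIGNATURES.items() if pattern in data)

# Two made-up buffers: one containing a known snippet, one harmless.
infected = b"\x90" * 16 + bytes.fromhex("e8000000005d81ed") + b"\xc3"
clean = b"just a harmless text file"

print(scan_bytes(infected))
print(scan_bytes(clean))
```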

LINGO

Signature database, scan patterns, and antivirus definitions all mean the same thing.

When it comes to detecting the presence of malware in the network, the most common indicator is its network communication. The data collected from network sniffers such as Wireshark is a good source for detecting the presence of malware while it is traversing the network or while it is communicating with the different network resources it needs.

For example, during dynamic analysis, say a researcher was able to collect the domains that the subject malware communicated with. These domains, upon further research and analysis, are determined to be network resources being used by malware, so they can be used to spot systems in the network that are possibly infected by malware. Any system that shows any sign of communication to and from these domains can be flagged as a possible compromised system.
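The domain-matching idea can be sketched as a lookup over network log records. The log format below, a list of (host, queried domain) pairs, and the domain names are assumptions for illustration; real logs would first have to be parsed into this shape.

```python
# Hypothetical domains collected during dynamic analysis (reserved example names).
MALICIOUS_DOMAINS = {"c2.example.net", "update.example.org"}

def flag_hosts(dns_log: list) -> dict:
    """Map each internal host to the set of malicious domains it queried."""
    flagged = {}
    for host, domain in dns_log:
        # Normalize case and a trailing dot so equivalent names compare equal.
        normalized = domain.lower().rstrip(".")
        if normalized in MALICIOUS_DOMAINS:
            flagged.setdefault(host, set()).add(normalized)
    return flagged

# Made-up log records: (internal host, domain it looked up).
dns_log = [
    ("10.0.0.5", "www.example.com"),
    ("10.0.0.7", "C2.Example.NET."),
    ("10.0.0.7", "update.example.org"),
    ("10.0.0.9", "mail.example.com"),
]
flagged = flag_hosts(dns_log)

for host, domains in flagged.items():
    print(host, "is possibly compromised:", sorted(domains))
```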

These are just a few examples of how data from malware analysis can be used to detect the presence of malware. The main idea here is that every footprint the malware leaves behind can be used as an indicator of its presence in the host and in the network.

Remediate the Malware Infection

After the presence of malware has been detected, the next step is to remove it from the system. This is where the data regarding host changes becomes really important. If the analyst or researcher is able to identify the host changes, there is a chance that they can be reversed and the operating system restored to a “pre-infected” or clean state. This is the ideal scenario. In most cases, infection is hard to reverse. The malware has embedded itself so much into the system that removing it will corrupt the operating system, rendering the system unusable. This leaves system administrators little choice but to restore the system using a backup or to completely rebuild the compromised system from scratch.

Clean tools use host changes to reverse the effect of malware. There are usually two types of clean tools: generic and specific. Generic clean tools reverse host changes that are common to most malware, while specific clean tools are tailored to reverse changes done by a specific malware family or its variant. Specific clean tools are often created for malware that is involved in an outbreak or has affected numerous systems worldwide.

Advance Malware Research

Data gathered from malware analysis helps researchers determine new techniques malware uses to compromise the target system, new malware technologies being used by the attackers, and new vulnerabilities being exploited by malware. These data sets are used to get ahead of malware by understanding the current threat landscape and predicting how it will look in the near term and long term. This enables researchers to come up with new ways of preventing, detecting, and remediating malware infections. Academia is full of research papers discussing new ideas on how to stop malware, and most of these papers are good examples of how data gathered from malware analysis can be used to further advance research on the war against malware.

Produce Actionable Intelligence

In addition to using this gathered data for advanced malware research, law enforcement agencies have special interest in malware analysis data if it is enough to produce actionable intelligence that can be used as evidence against threat actors. In some cases, threat actors are sloppy enough that data collected from malware analysis can link them to an attack campaign.

LINGO

Threat actors are also known as attackers.

Limitations of Malware Analysis

Although malware analysis is key to understanding malware’s true nature, it also has its limitations. In static analysis, data gathering is effective only if the malware is in its true form, free from any type of encryption or obfuscation. This is why malware decryption, deobfuscation, and unpacking are so important: they eliminate, or at least minimize, the static protection that hinders static analysis. To mitigate this limitation, encrypted malware is subjected to different tools and methods that help reveal the decrypted malware code before static analysis begins.

LINGO

Encrypted malware is a catchphrase that includes not only encrypted malware but also obfuscated and packed malware.

Dynamic analysis is all about making the malware successfully execute in a controlled environment. Its limitations therefore stem from the different dependencies that malware needs to run successfully in a target system. They are the following:

•   Program dependencies

•   User dependencies

•   Environment dependencies

•   Timing dependencies

•   Event dependencies

If one of these dependencies is not satisfied, the malware may not execute any or all of its functions. If the malware does not run because of an unsatisfied condition or dependency, no data will be collected during dynamic analysis. If the malware does run but some of its routines or functions are dependent on some of the factors mentioned, no data will be collected from those functions, which can result in an incomplete picture of the malware’s behavior. This is the main reason why some malware analysis systems produce little data compared to other analysis systems. The malware analysis system that can satisfy more malware dependencies will produce more malware analysis data.

For example, automated dynamic analysis systems, also known as automated malware sandboxes, can record only the behavior that the malware exhibits during the short amount of time it is running. Some functions require special conditions, such as a user logged in to Facebook, an e-mail client running, or a particular network connection (home network versus corporate network), and others need hours before they start executing their directive. These functions will not be executed by the malware; thus, no visible behavior can be observed, resulting in zero data gathered for them. If the analyst opts not to use an automated sandbox but instead uses an interactive malware environment that she controls and can keep active for as long as needed, it will still be hard for her to see all possible execution paths of the malware, especially if she does not have knowledge of the triggers or dependencies.

NOTE

Reversing, a short term for malware reverse engineering, uncovers these special conditions and helps improve malware sandbox implementations.

The Malware Analysis Process

Malware analysis is an art. Depending on the researcher’s knowledge and skill, she might approach analyzing malware using different techniques and methods. The approach an analyst takes is often influenced by the experience she has gained through years of analyzing different kinds of malware. One researcher’s approach might differ slightly or greatly from another researcher’s but yield similar results. One thing is certain: no matter what techniques and methods a researcher or an analyst employs, the malware analysis process can be represented succinctly, as shown in Figure 1-1.

Figure 1-1   Malware analysis process.

Figure 1-1 shows that the malware file, represented as VX (a term commonly used in the industry to symbolize viruses in the past and now malware as a whole), is subjected to static analysis. The data gathered from static analysis, which can be used to generate usable information, is added to the artifacts, code, and other information collected about the malware. The end goal here is to identify the malware’s main directive and to create a solution that prevents the malware from spreading, detects its presence, and remediates, if possible, the malware infection. In most cases, static analysis is not enough, so the malware is subjected to dynamic analysis, where it undergoes further scrutiny while it is running on a target system. The data gathered from dynamic analysis is added to the data collected from static analysis. If the data gathered from both static and dynamic analyses is enough to produce actionable intelligence to prevent the spread of malware, detect its presence, and possibly remediate compromised systems, then the analysis process usually ends. But if the malware proves to be difficult and static analysis and dynamic analysis yield poor results, reverse engineering is done to get to the bottom of what the malware really is.

TIP

When analyzing malware, there is no need to choose between the two types of analysis, static and dynamic. They complement each other and should be done one after the other.

As previously stated, the first step in the analysis process is to subject the malware to static analysis. If the collected information is enough to understand the malware and formulate a solution for it, then subjecting the malware to dynamic analysis becomes an optional or nice-to-have task. But this is rarely the case. Almost all the time, dynamic analysis is needed to collect more information to determine the malware’s directive and formulate a solution based on dynamically gathered data from the malware sandbox. But if static and dynamic analyses prove to be not enough to understand the malware because of its complexity and sophistication, then reverse engineering becomes the last resort.

Manual Malware Analysis

During the early years of malware, when everything was still called viruses, malware analysis was mostly done by hand. A handful of tools, a single isolated system, and a lot of patience were all that was needed. During that time, this was the best way to analyze malware. Fast-forward to the present, and in my humble opinion, this is still the best way to analyze malware. Most of the techniques, methods, and concepts are still the same. The tools are better, and the test environment has expanded. The single system is now an isolated network of systems and, if needed, has a restricted Internet connection. When I was at Trend Micro, we used the term infect machine to describe the single isolated system used to analyze malware during the DOS era and the term superlab to describe the network of systems used to manually analyze modern malware that you see today.

Performing manual malware analysis is always required, especially if the malware is noteworthy. This gives the analyst time and total control of the environment where malware is being executed for the purpose of analysis.

LINGO

Noteworthy malware is malware that exhibits new technology or is currently found in the wild. In the wild describes malware that is currently and actively infecting systems.

The beauty of manual malware analysis is that the researcher has full control of the timing. The researcher can choose how long a malware must be executed and is not limited by the time constraints used in automated malware analysis systems. In automated malware systems, the execution time for malware can range from 30 seconds to a few minutes. This is not helpful, especially if the malware has a sleep function. A sleep function is a malware routine that allows it to be dormant for a period of time before it starts executing. It can be minutes, or it can be days. A malware with a sleep function is the attacker’s defense against automated malware analysis.
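The effect of a sleep function on an observation window can be illustrated with a toy model. Everything below is hypothetical: the timeline of behaviors is invented, and the function only filters that timeline; it does not run or monitor anything.

```python
def observed_behaviors(timeline: list, window_seconds: int) -> list:
    """Return the behaviors that start within the observation window."""
    return [name for start, name in timeline if start <= window_seconds]

# Hypothetical malware timeline: (seconds after launch, behavior).
# The malware sleeps for about ten minutes before doing anything revealing.
timeline = [
    (1, "copy self to Startup folder"),
    (600, "wake from sleep and contact remote server"),
    (605, "download second-stage payload"),
]

# An automated sandbox with a 30-second window versus an hour of manual analysis.
automated = observed_behaviors(timeline, window_seconds=30)
manual = observed_behaviors(timeline, window_seconds=3600)

print(len(automated), "behavior(s) seen by the automated sandbox")
print(len(manual), "behavior(s) seen during manual analysis")
```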

Manual malware analysis also enables the researcher to be more interactive with malware. The researcher can execute programs and even log in to banking sites, web-based e-mails, or social media. This is helpful especially if the malware has program dependencies or works only when a user tries to log in to an online resource. This is mostly true for keyloggers. The information-stealing routine of a keylogger gets activated if the malware believes that the user is logging in to a website that it is targeting to steal credentials from.

TIP

Just to be clear, when logging in to a supposed banking site, do not use a real banking site and do not use real login credentials. An internally controlled fake bank site must be set up for this. For webmails and social media, a dummy account will work.

Automated Malware Analysis

As the years go by, the onslaught of malware has become alarming. The number of malware samples seen every day is astounding. Figure 1-2 shows that the number of malware discovered by mid-2014 already exceeded that discovered in 2013.

Figure 1-2   Number of malware discovered from 1984 to 2014. (Source: AV-Test.ORG.)

As of June 16, 2014, the number of malware that has been discovered is about 230 million. This equates to about 1.4 million unique malware samples per day, and that is already about 50 million more than all malware discovered in 2013. Note that this is discovered malware. It does not account for malware that has not been discovered yet. There could be millions more out there that are still enjoying the luxury of not being found.

With this fact, manual malware analysis is not feasible anymore. It does not scale to handle this amount of malware, and even if all the researchers in the world combined to tackle this amount of malware on a daily basis, our efforts would still not be enough. This is why the process of malware analysis became automated. Manual malware analysis is now called upon only when a malware is considered noteworthy or if the automated malware analysis systems are not able to produce any results.

An automated malware analysis system consists of multiple malware test environments or malware sandboxes. Malware samples are thrown at these sandboxes, where the malware is executed and monitored for a specific amount of time. As mentioned in the previous section, this can range from 30 seconds to a few minutes. The more sandboxes there are, the more malware the automated system can process. The processing is done in parallel, so if an automated system has 10 sandboxes and each is configured to run malware for 30 seconds, then it can process 10 malware samples every 30 seconds, which equates to 28,800 malware samples processed per day (assuming an ideal situation where each system is fully utilized and there is no downtime).
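The throughput arithmetic above can be checked with a short calculation. This sketch keeps the same ideal assumptions as the text (full utilization, no downtime, and no time spent resetting a sandbox between samples); the function name is illustrative.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

def samples_per_day(sandboxes: int, seconds_per_sample: int) -> int:
    """Ideal daily capacity: every sandbox runs samples back to back all day."""
    runs_per_sandbox = SECONDS_PER_DAY // seconds_per_sample
    return sandboxes * runs_per_sandbox

# 10 sandboxes, 30 seconds per sample, as in the text.
capacity = samples_per_day(sandboxes=10, seconds_per_sample=30)

# Sandboxes needed to keep pace with about 1.4 million new samples per day.
needed = math.ceil(1_400_000 / samples_per_day(sandboxes=1, seconds_per_sample=30))

print(capacity)  # 28800
print(needed)    # 487
```

Under these ideal conditions a single sandbox handles 2,880 samples per day, so keeping up with the daily volume cited earlier would take several hundred sandboxes. Longer run times or reset overhead would raise that number.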

LINGO

Automated malware analysis systems are also known as automated sandbox systems or simply sandbox. The term sandbox is widely used to describe automated systems because it is expected that a sandbox is always part of an automated malware analysis system.

Static analysis complements dynamic analysis. I cannot stress this enough, but unfortunately some automated malware analysis systems do not utilize static analysis and proceed directly to dynamic analysis. In my humble opinion, this is a waste of sandbox resources. Static analysis is still needed not only to gather static information from malware but also to provide intelligence to the whole automated malware analysis system. One of the ways it does this is by determining whether the malware needs a special sandbox implementation. For example, if the analyst has different sandbox flavors and implementations, it is important to know which of those flavors and implementations will work well for the malware. This intelligence can be provided by static analysis. So instead of subjecting the malware blindly to the next available sandbox, the automated system, through the intelligence provided by the static analysis, can assign the malware to the appropriate sandbox, thus increasing the chances of a successful dynamic analysis session. Figure 1-3 shows an automated sandbox implementation taking advantage of static analysis.

Images


Figure 1-3   Automated sandbox implementation taking advantage of static analysis.

Before going into a detailed discussion of the automated sandbox implementation shown in Figure 1-3, it is important to note that not all files subjected to malware analysis, be it manual or automated, are malware. Most of the files are suspicious files that end up being proven to be a non-malware or benign file. This is the most important use case of malware analysis: to determine whether a file is malicious and, if it is malicious, gather as much data as possible to generate important information and actionable intelligence that will enable the analyst to prevent the spread of, detect the presence of, and remediate infection caused by malware.

In a typical scenario, a suspicious file is subjected to static analysis first. In this stage of the analysis process, all data that is collected from the file is processed. Typical static analysis preprocessing determines the following:

•   Has the file been processed before?

•   Does it match any known benign files?

•   Does it match any known malicious files?

•   Does it require any special sandbox implementations?

The first two are the most common reasons for a file to be dropped and not processed by the automated malware analysis system anymore. Answering the first two questions is made possible by identifying duplicate files and having a whitelist database as part of the automated malware analysis implementation. A whitelist database is a database of file hashes that are known to be benign. The hashes usually come from files of different operating systems and popular software.

TIP

A whitelist database is not perfect, so it is always advisable to use other indicators of being benign to reinforce file determination.

The third question also causes a file to be dropped, especially if there is an exact match or duplicate that has already been processed. An exact match here can be an exact hash match; i.e., the files have the same MD5 hashes.

NOTE

A typical automated malware analysis system drops files that are duplicates regardless of whether they match a whitelist or a database of known malware hashes.
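The first three preprocessing questions can be sketched as hash lookups. This is a minimal illustration under several assumptions: the databases are in-memory sets seeded with made-up file contents, exact MD5 matching is used because that is the example given above, and a file with an exact match to known malware is dropped, which is only one of the possible configurations.

```python
import hashlib

def md5_of(data: bytes) -> str:
    """Return the MD5 hash of the data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()

# Hypothetical hash databases, seeded here with made-up file contents.
WHITELIST = {md5_of(b"known benign system file")}
KNOWN_MALWARE = {md5_of(b"known malicious file")}

def triage(data: bytes, already_processed: set) -> str:
    """Decide whether a submitted file is dropped or sent on for dynamic analysis."""
    digest = md5_of(data)
    if digest in already_processed:
        return "drop: duplicate"
    already_processed.add(digest)
    if digest in WHITELIST:
        return "drop: known benign"
    if digest in KNOWN_MALWARE:
        return "drop: known malicious"
    return "send to sandbox"

seen = set()
results = [
    triage(b"known benign system file", seen),
    triage(b"known malicious file", seen),
    triage(b"never seen before", seen),
    triage(b"never seen before", seen),  # same file submitted a second time
]
print(results)
```

MD5 is prone to collisions, so a stronger hash such as SHA-256 is a reasonable substitute in a sketch like this.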

But when it comes to matching with known malicious files, some implementations use malware families and classes.

LINGO

A malware family is a group of malware that behaves in the same way. A family can be divided into different variants, especially if a new generation has different functionality than the previous ones. For example, the Conficker family of malware has several known variants, including Conficker.A, Conficker.B, and Conficker.C.

LINGO

A malware class or malware type is a group of malware with similar malicious behaviors or directives. For example, a malware that deletes all files on a hard disk and another that formats the hard drive can both be classified as a Trojan because of their destructive nature.

If a file matches a family of malware, an automated sandbox can be configured to skip it because indicators of compromise (IOCs) were already collected from previous family members.

LINGO

Indicators of compromise (IOCs) are host and network footprints of malware that can be used to determine whether a system has been infected or compromised.

An automated sandbox can also be configured to process the file to determine whether it is a new variant or whether it has new features that were not present in other family members that were processed before.

If a file matches a malware class, the sample is seldom dropped. The malware class usually helps to determine what to monitor or look for during dynamic analysis. In some cases, it can be used to determine a specific sandbox implementation. For example, if a sample has been determined statically to have mass-mailing capabilities, it can be thrown into a sandbox that has different mail clients installed.

The main idea is to send a suspicious file deemed to be malware to a sandbox implementation that satisfies all of its known dependencies, which gives dynamic analysis the best chance of producing useful results. The intelligence provided by static analysis data saves time and sandbox resources. The savings in resources and cycle time are significant when you are processing hundreds of thousands of suspicious files daily, and they improve the efficiency of the whole automated malware analysis system.
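To make the routing idea concrete, here is a small Python sketch of how static analysis findings might drive the choice of sandbox. Everything named here is hypothetical: the `StaticFindings` record, the `skip_known_families` switch, the capability labels, and the sandbox profile names are assumptions made for illustration and do not correspond to any specific product.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical mapping from a statically detected capability to a sandbox profile.
SANDBOX_FOR_CAPABILITY = {
    "mass_mailing": "sandbox-with-mail-clients",
}
DEFAULT_SANDBOX = "sandbox-default"


@dataclass
class StaticFindings:
    family: Optional[str] = None                         # matched malware family, if any
    capabilities: Set[str] = field(default_factory=set)  # statically detected capabilities


def route(findings: StaticFindings, skip_known_families: bool) -> Optional[str]:
    """Return the sandbox profile to use, or None if the sample is dropped."""
    # A family match can mean the IOCs were already collected, so the
    # system may be configured to skip the sample entirely.
    if findings.family is not None and skip_known_families:
        return None
    # Otherwise pick the first sandbox that satisfies a known dependency.
    for capability in sorted(findings.capabilities):
        if capability in SANDBOX_FOR_CAPABILITY:
            return SANDBOX_FOR_CAPABILITY[capability]
    return DEFAULT_SANDBOX
```

Setting `skip_known_families` to `False` corresponds to the alternative configuration in which known family members are still processed to look for new variants or features.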

NOTE

Static analysis can easily be beaten by packed and encrypted files. This is why file unpacking and decryption are important in the fight against malware.

Static analysis also determines whether a file is packed or encrypted. If it is, appropriate actions are taken to statically unpack or decrypt the file so static analysis data can be gathered. If this is not possible, no static analysis can be done, and the file is sent to the next available sandbox, where the researcher can attempt to capture memory images of the unpacked file and then subject those memory images to static analysis. This is an example of dynamic analysis feeding samples back to static analysis for the purpose of system improvement.
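One widely used heuristic for spotting packed or encrypted content is byte entropy, since compressed and encrypted data tends to look nearly random. The following Python sketch is only an illustration and not a method prescribed in this chapter; the function names and the 7.2 bits-per-byte threshold are assumptions chosen for the example, and a real system would tune the threshold and combine it with other indicators.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform data."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_packed(data: bytes, threshold: float = 7.2) -> bool:
    """Heuristic only: high entropy suggests packing or encryption.

    The threshold is an assumed value for illustration, not a standard.
    """
    return shannon_entropy(data) >= threshold
```

High entropy alone does not prove that a file is packed, because legitimately compressed resources score high as well, which is why the result is best treated as a hint for choosing the next step.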

In most cases, regardless of whether the file is packed, dynamic analysis yields promising results because the dynamic analysis system does not care whether a file is packed. The file will still function, and the dynamic analysis system can still capture data from the running file. But there will always be files or samples that will not yield any data through static and dynamic analyses. A well-designed automated system will flag these samples for review by malware researchers and analysts, and whatever is learned from these samples is then applied as new technology to the automated malware analysis system so it can tackle those file samples the next time around. This is how an automated malware analysis system evolves. As new malware employs evasion techniques to keep data from being gathered through static and dynamic analyses, researchers take a closer look at those samples to understand the evasion techniques and find ways of thwarting them.
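The flagging step can be pictured as a simple check at the end of the pipeline. The sketch below is a hypothetical illustration: the `AnalysisResult` record, its fields, and the review queue are assumptions made for the example, not part of any specific system.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict


@dataclass
class AnalysisResult:
    sample_id: str
    static_data: Dict[str, str] = field(default_factory=dict)   # data from static analysis
    dynamic_data: Dict[str, str] = field(default_factory=dict)  # data from dynamic analysis


def needs_manual_review(result: AnalysisResult) -> bool:
    """True when neither static nor dynamic analysis yielded any data."""
    return not result.static_data and not result.dynamic_data


def enqueue_if_empty(result: AnalysisResult, review_queue: Deque[str]) -> bool:
    """Queue the sample for researchers and analysts if the system learned nothing."""
    if needs_manual_review(result):
        review_queue.append(result.sample_id)
        return True
    return False
```

Whatever the researchers learn from the queued samples can then be fed back into the system as new unpacking, decryption, or monitoring capability.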

The Effective Malware Analyst


The main goal of this book is to help you become an effective malware analyst. To do that, I have identified three characteristics that will help you achieve that goal.

•   Familiarization with malware

•   Familiarization with analysis tools

•   Patience

Familiarization with Malware

To effectively analyze malware, you must first be familiar with what it is. An analyst must be familiar with how malware behaves, how it stays persistent, how it protects itself, and how it manipulates the target environment to execute its directives.

In reality, malware analysis may not reveal all the information about the malware because of the known limitations of malware analysis and because of the sophistication and difficulty level of the malware. The analyst may get only bits and pieces of data that she needs to connect and make sense of. In such cases, familiarity with different malware characteristics enables the analyst to make an educated guess about how the malware behaves, given the data extracted during static and dynamic analyses. This is especially helpful when the malware is extremely difficult to analyze and there is little time available to understand what it is doing.

Familiarization with malware enables the malware researcher and analyst to formulate information from bits and pieces of data and not come out empty-handed. This comes with experience and education. The malware can then be tested again to either prove or disprove the conclusion the researcher or analyst drew from the data gathered.

TIP

Reading malware blogs, white papers, and detailed malware technical reports helps in increasing familiarity with different malware characteristics.

Part I of the book is all about malware. It is designed to serve as an introduction for novice researchers and analysts and as a refresher for seasoned professionals. I will discuss the different classes of malware, how malware is deployed, how malware protects itself, and what dependencies malware has in order to function as designed by the attacker.

Familiarization with Analysis Tools

An effective malware analyst is someone who has the right skills and has the right tools. It has to be both. An analyst with skills but without the proper tools is like a carpenter trying to hammer a nail using a screwdriver. An analyst without skills but with the proper tools will not know what to do with them. Since you picked up this book, I would say that you are already working on your skills. To sweeten the pot, I will throw in the analysis tools as well.

This book will discuss the tools needed to analyze malware. I will start by showing how to set up a malware research lab and then go through the different static and dynamic analysis tools that will help you in becoming an effective malware analyst.

Patience

Malware analysis is not for those who get frustrated easily. As a malware analyst, you have to understand that analyzing malware will not always go the way you want it to go. There will be hiccups along the way. There will be unforeseen circumstances and challenges that might slow you down. Your limits and patience will be tested. But the key here is to recognize that this will happen, and you have to prepare yourself for it. There will be instances wherein nothing is going right and no data is being extracted from the malware. If this happens, you need to pause for a bit and give yourself some time to relax and then start tinkering again. Do not be afraid to try different things, different tool combinations, or different methods. Malware analysis is an art after all.

TIP

It is important to remember that malware analysis is not a set process, wherein you just follow a series of steps and arrive at your destination. Nothing is set in stone. Every malware analysis case can be different. The best thing to do is to recognize patterns of analysis so you can apply them as a mental template when faced with a malware analysis problem.

Recap


Malware analysis is a fun and exciting activity. Discovering a new malware technology and using it against the malware can be an overwhelmingly good feeling. This chapter served as a brief introduction to malware analysis, aimed at warming you up before your journey begins.

In this chapter, I discussed the two types of malware analysis, which are the following:

•   Static analysis

•   Dynamic analysis

I then proceeded to discuss the purpose of malware analysis, which includes the following:

•   Preventing the spread of malware

•   Detecting the presence of malware

•   Remediating the malware infection

•   Conducting advanced malware research

•   Producing actionable intelligence

Understanding how the malware operates enables you to achieve all of these.

You also recognized the fact that malware analysis has its limitations, which I described so you fully understand what malware analysis can and cannot do.

I also touched on the two types of malware analysis processes, which are the following:

•   Manual malware analysis

•   Automated malware analysis

I gave an overview of each process and discussed how each of them can be used to your advantage in solving the malware problem in general.

Then I concluded by discussing the characteristics of an effective malware analyst, that is, the knowledge, skills, and attitude needed. I summarized them as follows:

•   Familiarization with malware

•   Familiarization with analysis tools

•   Patience

Now let’s begin.

1 Malware, Rootkits & Botnets by Christopher C. Elisan, published by McGraw-Hill.
