Levels of Abstraction

In traditional computer architecture, a computer system can be represented as several levels of abstraction that create a way of hiding the implementation details. For example, you can run the Windows OS on many different types of hardware, because the underlying hardware is abstracted from the OS.

Figure 4-1 shows the three coding levels involved in malware analysis. Malware authors create programs at the high-level language level and use a compiler to generate machine code to be run by the CPU. Conversely, malware analysts and reverse engineers operate at the low-level language level; we use a disassembler to generate assembly code that we can read and analyze to figure out how a program operates.

Code level examples

Figure 4-1. Code level examples

Figure 4-1 shows a simplified model, but computer systems are generally described with the following six different levels of abstraction. We list these levels starting from the bottom. Higher levels of abstraction are placed near the top with more specific concepts underneath, so the lower you get, the less portable the level will be across computer systems.

  • Hardware. The hardware level, the only physical level, consists of electrical circuits that implement complex combinations of logical operators such as XOR, AND, OR, and NOT gates, known as digital logic. Because of its physical nature, hardware cannot be easily manipulated by software.

  • Microcode. The microcode level is also known as firmware. Microcode operates only on the exact circuitry for which it was designed. It contains microinstructions that translate from the higher machine-code level to provide a way to interface with the hardware. When performing malware analysis, we usually don’t worry about the microcode because it is often specific to the computer hardware for which it was written.

  • Machine codeThe machine code level consists of opcodes, hexadecimal digits that tell the processor what you want it to do. Machine code is typically implemented with several microcode instructions so that the underlying hardware can execute the code. Machine code is created when a computer program written in a high-level language is compiled.

  • Low-level languages. A low-level language is a human-readable version of a computer architecture’s instruction set. The most common low-level language is assembly language. Malware analysts operate at the low-level languages level because the machine code is too difficult for a human to comprehend. We use a disassembler to generate low-level language text, which consists of simple mnemonics such as mov and jmp. Many different dialects of assembly language exist, and we’ll explore each in turn.

    Note

    Assembly is the highest level language that can be reliably and consistently recovered from machine code when high-level language source code is not available.

  • High-level languages. Most computer programmers operate at the level of high-level languages. High-level languages provide strong abstraction from the machine level and make it easy to use programming logic and flow-control mechanisms. High-level languages include C, C++, and others. These languages are typically turned into machine code by a compiler through a process known as compilation.

  • Interpreted languages. Interpreted languages are at the top level. Many programmers use interpreted languages such as C#, Perl, .NET, and Java. The code at this level is not compiled into machine code; instead, it is translated into bytecode. Bytecode is an intermediate representation that is specific to the programming language. Bytecode executes within an interpreter, which is a program that translates bytecode into executable machine code on the fly at runtime. An interpreter provides an automatic level of abstraction when compared to traditional compiled code, because it can handle errors and memory management on its own, independent of the OS.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.140.198.173