A quick review on how native executables are loaded by the OS

For better understanding on how packers modify files, let us have a quick review of how executable files are loaded by the operating system. Native executables are better known as PE files for Windows and ELF files for Linux. These files are compiled down to their low-level format; that is, using assembly language like x86 instructions. Every executable is structured with a header, code section, data section, and other pertinent sections. The code section contains the actual low-level instruction codes, while the data section contains actual data used by the code. The header contains information about the file, the sections, and how the file should be mapped as a process in the memory. This is shown in the following diagram:

The header information can be classified as raw and virtual. Raw information consists of appropriate information about the physical file, such as file offsets and size. The offsets are relative to file offset 0. While virtual information consists of appropriate information regarding memory offsets in a process, virtual offsets are usually relative to the image base, which is the start of the process image in memory. The image base is an address in the process space allocated by the operating system. Basically, the header tells us how the operating system should map the file (raw) and its sections to the memory (virtual). In addition, every section has an attribute which tells us whether the section can be used for reading, writing, or executing. In chapter 4, Static and Dynamic Reversing, under Memory Regions and Mapping of a Process, we showed how a raw file gets mapped in virtual memory space. The following figure shows how the file on a disk (left) would look when mapped in virtual memory space (right):

The libraries or modules containing functions required by the code are also listed in a portion of the file that can be seen in sections other than the code and data sections. This is called the import table. It is a list of API functions and the libraries it is from. After the file is mapped, the operating system loads all the libraries in the same process space. The libraries are loaded in the same manner as the executable file but in a higher memory region of the same process space. More about where the libraries are loaded can be found in Chapter 4, Static and Dynamic Reversing, under Memory Regions and Mapping of a Process.

When everything is mapped and loaded properly, the OS reads the entry point address from the header then passes the code execution to that address.

There are other sections of the file that make the operating system behave in a special manner. An example of this is the icons displayed by the file explorer, which can be found in the resource section. The file can also contain digitally signed signatures which are used as indicators if the file is allowed to run in the operating system. The CFF Explorer tool should be able to help us to view the header information and these sections, as shown in the following screenshot:

We have covered the basics so far but all these structures are well documented by Microsoft and the Linux community. The structure of the Windows PE file can be found in the following link: https://docs.microsoft.com/en-us/windows/desktop/debug/pe-format. While the structure for a Linux ELF file can be found in the following link: http://refspecs.linuxbase.org/elf/elf.pdf.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.224.95.38