Processor Module Architecture

As you set about designing processor modules, one of the things you will need to consider is whether the processor will be closely coupled with a specific loader or can be decoupled from all loaders. For example, consider the x86 processor module. This module makes no assumptions about the type of file that is being disassembled. Therefore, it is easily incorporated and used in conjunction with a wide variety of loaders such as the PE, ELF, and Mach-O loaders.

In a similar manner, loaders show versatility when they are capable of handling a file format independently of the processor used with the file. For example, the PE loader works equally well whether the file contains x86 code or ARM code; the ELF loader works equally well whether the file contains x86, MIPS, or SPARC code; and the Mach-O loader works fine whether the file contains PPC or x86 code.

Real-world CPUs lend themselves to the creation of processor modules that do not rely on a specific input file format. Virtual machine languages, on the other hand, pose a much larger challenge. Whereas a wide variety of loaders (such as ELF, a.out, and PE) may be used to load code for execution on native hardware, a virtual machine typically acts as both a loader and a CPU. The net result is that, for virtual machines, the file format and the underlying byte code are intimately related. One cannot exist without the other. We bumped up against this limitation several times in the development of the Python processor module. In many cases, it simply was not possible to generate more readable output without a deeper understanding of the structure of the file being disassembled.

In order for the Python processor to have access to the additional information that it requires, we could build a Python loader that configures the database in a manner very specific to the Python processor so that the Python processor knows exactly where to find the information it needs. In this scenario, a significant amount of loader state data would need to pass from the loader to the processor. One approach is to store such data in database netnodes, where that data could later be retrieved by the processor module.
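The handoff described above can be pictured with a small standalone model. This sketch does not use the SDK's netnode class; it stands in a std::map for the database, and the node name, slot numbers, and function names are all hypothetical. The point is only that the loader and the processor must agree in advance on where each piece of state is stored.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Stand-in for database netnodes: named nodes holding index -> value pairs.
// This only models the idea; the real netnode class lives in the SDK.
using node_t = std::map<uint32_t, uint64_t>;
static std::map<std::string, node_t> g_database;

// Hypothetical node name and slot numbers that the loader and the
// processor module must agree on in advance.
static const char *const PY_NODE = "$ python loader state";
enum py_slot_t : uint32_t { PY_CONSTS_OFFSET = 0, PY_NAMES_OFFSET = 1 };

// Loader side: record where each table was found in the input file.
void loader_save_state(uint64_t consts_off, uint64_t names_off) {
   node_t &n = g_database[PY_NODE];
   n[PY_CONSTS_OFFSET] = consts_off;
   n[PY_NAMES_OFFSET] = names_off;
}

// Processor side: fetch a value saved by the loader.
// Returns false if the loader never stored that slot.
bool processor_get_state(py_slot_t slot, uint64_t *out) {
   assert(out != nullptr);
   auto node = g_database.find(PY_NODE);
   if (node == g_database.end()) {
      return false;
   }
   auto it = node->second.find(slot);
   if (it == node->second.end()) {
      return false;
   }
   *out = it->second;
   return true;
}
```

The drawback of this design is visible even in the model: every new piece of state requires a change on both sides of the contract.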

An alternative approach is to build a loader that does nothing other than recognize .pyc files and then tell the processor module that it should handle all of the other loading tasks, in which case the processor will surely know how to locate all of the information needed for disassembling the .pyc file.
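As a rough sketch of how little such a loader needs to do, the recognition step might amount to a check like the following. The helper names are hypothetical and the code does not use the SDK; a real loader would first read the leading bytes of the input file and then run a test of this kind. It relies on the fact that a CPython .pyc file begins with a 4-byte magic value: a 2-byte little-endian version tag followed by a carriage return and a line feed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of the IDA SDK.
// Returns true if buf looks like the start of a CPython .pyc file.
bool looks_like_pyc(const uint8_t *buf, size_t len) {
   assert(buf != nullptr || len == 0);
   // The magic value is 4 bytes: a 2-byte version tag, then '\r' '\n'.
   if (len < 4) {
      return false;
   }
   return buf[2] == '\r' && buf[3] == '\n';
}

// Extracts the little-endian version tag from the first two bytes
// (62211, or 0xF303, for CPython 2.7). The caller must supply at
// least two bytes.
uint16_t pyc_version_tag(const uint8_t *buf) {
   return static_cast<uint16_t>(buf[0] | (buf[1] << 8));
}
```

A check this loose will accept some files that are not .pyc files, so a practical loader would probably also compare the version tag against the set of tags it knows how to handle.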

IDA facilitates the construction of tightly coupled loaders and processor modules by allowing a loader to defer all loading operations to an associated processor module. This is how the SDK’s included Java loader and Java processor are constructed. In order for a loader to defer loading to the processor module, the loader should first accept a file by returning a file type of f_LOADER (defined in ida.hpp). If the loader is selected by the user, the loader’s load_file function should ensure that the proper processor type has been specified by calling set_processor_type (idp.hpp) if necessary before sending a loader-notification message to the processor. To build a tightly coupled Python loader/processor combination, we might build a loader with the following load_file function:

void idaapi load_file(linput_t *li, ushort neflag, const char *) {
   if (ph.id != PLFM_PYTHON) {  //shared processor ID
      set_processor_type("python", SETPROC_ALL|SETPROC_FATAL);
   }
   //tell the python processor module to do the loading for us
   //by sending the processor_t::loader notification message
   if (ph.notify(processor_t::loader, li, neflag)) {
      error("Python processor/loader failed");
   }
}

When the processor module receives the loader notification, it takes responsibility for mapping the input file into the database and making sure that it has access to any information that will be required in any of the ana, emu, and out stages. A Python loader and processor combination that operates in this manner is available on the book’s companion website.
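The processor side of this exchange can be modeled in standalone form as follows. This is a simplified sketch rather than SDK code: linput_t is reduced to a placeholder struct, and the event codes, load_pyc, and g_loaded are hypothetical names. It assumes the same convention that the load_file function above relies on, namely that a nonzero return value from the notification signals failure.

```cpp
#include <cassert>
#include <cstdarg>

// Simplified stand-ins for SDK types; hypothetical, for illustration only.
struct linput_t { const char *name; };
enum event_t { ev_other = 0, ev_loader = 1000 };

static bool g_loaded = false;   // set once the file has been mapped

// Hypothetical routine where the processor module would create segments
// and record whatever the ana, emu, and out stages need later.
static bool load_pyc(linput_t *li, unsigned short neflag) {
   (void)neflag;   // unused in this model
   if (li == nullptr) {
      return false;
   }
   g_loaded = true;
   return true;
}

// Processor-side handler: zero means success, nonzero means failure.
int notify(int code, ...) {
   va_list va;
   va_start(va, code);
   int rc = 0;
   if (code == ev_loader) {
      linput_t *li = va_arg(va, linput_t *);
      // A ushort is promoted to int when passed through "..."
      unsigned short neflag = static_cast<unsigned short>(va_arg(va, int));
      rc = load_pyc(li, neflag) ? 0 : 1;
   }
   va_end(va);
   return rc;
}
```

Because the arguments travel through a variable argument list, the handler must extract them in exactly the order and with exactly the types that the loader passed them.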
