2.1 File format

File formats are one of the most important concepts to grasp before you can understand malware. The following simple exercise illustrates what a file format is:

  1. Open WordPad on a Windows machine by typing wordpad in the Windows search box.
  2. Type this is my text in the newly opened WordPad window, then save the file as test.rtf. When you save, a window pops up asking whether you want to save the file in Rich Text Format (RTF). Just give it the name test.rtf and save it.
  3. Now open test.rtf with Notepad. You can do this by right-clicking test.rtf, going to Open with, and browsing to notepad.exe. What do you see?
[Figure: test.rtf opened in Notepad]

Your text lies toward the end, and the file starts with {\rtf1. This is how WordPad has saved what we typed into it: in what is called the RTF file format. The file holds other information as well. For example, information about the font is stored in a group that starts with {\fonttbl. Here, the font used is Calibri, as you can see in the screenshot. When you open the file with WordPad, the program parses the file format and displays only the meaningful data to the user. In short, the RTF file format tells WordPad how it should display the stored text. File formats are complex structures that can contain multiple substructures arranged in a hierarchical order.
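The same inspection can be done in code. The following Python sketch is only an illustration: the sample document is hand-written and far smaller than real WordPad output, and the helper name describe_rtf is an assumption, not part of any library. It checks the opening control word, reads the font name from the font table group, and strips the control words to recover the visible text.

```python
import re

# A hand-written, minimal RTF document (an assumption for illustration;
# WordPad's real output carries many more control words than this).
SAMPLE_RTF = (
    r"{\rtf1\ansi\deff0"
    r"{\fonttbl{\f0\fnil Calibri;}}"
    r"\f0\fs22 this is my text\par}"
)


def describe_rtf(data: str) -> dict:
    """Return a few facts a parser would read from a simple RTF document."""
    # An RTF file starts with a group opener followed by the \rtf control word.
    is_rtf = data.startswith(r"{\rtf")
    # Font names sit inside the {\fonttbl ...} group, each terminated by ';'.
    fonts = re.findall(r"\\f\d+\\f[a-z]+ ([^;]+);", data)
    # Drop the font table, then drop control words such as \ansi or \fs22.
    plain = re.sub(r"\{\\fonttbl.*?\}\}", "", data)
    plain = re.sub(r"\\[a-z]+-?\d* ?", "", plain)
    text = plain.strip("{}").strip()
    return {"is_rtf": is_rtf, "fonts": fonts, "text": text}


info = describe_rtf(SAMPLE_RTF)
print(info)
```

This is not a general RTF parser. It handles only the flat structure of the sample, but it shows the idea: the program reading the file separates formatting instructions from the data meant for the user.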

There are numerous file formats for different programs. Microsoft Office applications can parse legacy DOC and XLS files, which follow the Object Linking and Embedding (OLE) compound file format, as well as DOCX files, which are ZIP-based Office Open XML packages. Similarly, the Adobe and Foxit PDF readers can read the PDF file format.
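Many formats can be recognized from their first few bytes, often called a signature or magic number. The sketch below is a minimal illustration rather than a complete identifier. The table covers only a handful of well-known signatures, and the names SIGNATURES, identify, and identify_file are chosen here for the example.

```python
# Leading "magic" bytes for a few common document formats.
SIGNATURES = [
    (b"{\\rtf", "RTF"),
    (b"%PDF-", "PDF"),
    (bytes.fromhex("d0cf11e0a1b11ae1"), "OLE compound file (legacy DOC/XLS)"),
    (b"PK\x03\x04", "ZIP container (for example DOCX)"),
]


def identify(header: bytes) -> str:
    """Guess a file format from the first bytes of the file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read just enough of a file on disk to compare against SIGNATURES."""
    with open(path, "rb") as handle:
        return identify(handle.read(8))


print(identify(b"{\\rtf1\\ansi"))
print(identify(b"%PDF-1.7\n"))
```

Calling identify_file("test.rtf") on the file created in the exercise should report RTF whatever the file is named, because the check looks at the content and ignores the extension.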

A binary or executable in Windows follows the Portable Executable (PE) file format. Windows has a component called the loader, which parses the .exe according to the PE file structure. The loader finds out details such as which code needs to be executed first (this is called the entry point) and how the executable should be placed in virtual memory. Similarly, a Linux executable follows the Executable and Linkable Format (ELF).
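To make the loader's first steps concrete, the Python sketch below reads the entry point and preferred load address from a PE header using only the standard struct module. It is a simplified illustration, not how the Windows loader is implemented. The function names read_entry_point and build_fake_pe are assumptions for this example, and the header is a synthetic one built in memory so that the code runs without a real .exe.

```python
import struct


def read_entry_point(data: bytes) -> dict:
    """Parse the few PE header fields needed to locate the entry point."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The 4-byte field at offset 0x3C (e_lfanew) points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = pe_offset + 4          # 20-byte COFF file header
    opt = coff + 20               # optional header follows it
    (magic,) = struct.unpack_from("<H", data, opt)
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    if magic == 0x20B:            # PE32+ (64-bit): 8-byte ImageBase
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:                         # PE32 (32-bit): 4-byte ImageBase
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    return {
        "entry_point_rva": entry_rva,
        "image_base": image_base,
        "entry_point_va": image_base + entry_rva,
    }


def build_fake_pe() -> bytes:
    """Build a synthetic PE32 header (illustrative values, not a real program)."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)       # PE header right after DOS header
    coff = struct.pack("<HHIIIHH", 0x14C, 1, 0, 0, 0, 0xE0, 0x0102)
    opt = bytearray(0xE0)
    struct.pack_into("<H", opt, 0, 0x10B)         # PE32 magic
    struct.pack_into("<I", opt, 16, 0x1000)       # AddressOfEntryPoint
    struct.pack_into("<I", opt, 28, 0x400000)     # ImageBase
    return bytes(dos) + b"PE\0\0" + coff + bytes(opt)


info = read_entry_point(build_fake_pe())
print({name: hex(value) for name, value in info.items()})
```

For a real file, the same function could be called on the bytes read from disk in binary mode. The entry point is stored as an offset relative to the image base (a relative virtual address), so the address of the first instruction in memory is the sum of the two, assuming the image is loaded at its preferred base.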

There is an exhaustive list of file formats on Wikipedia at https://en.wikipedia.org/wiki/List_of_file_formats.
