Deadlisting

We may be able to find the password among the text strings stored in the file itself. To get a list of strings from the file, we'll use Strings from the Sysinternals Suite (https://docs.microsoft.com/en-us/sysinternals/downloads/strings). Strings is a console-based tool that prints the strings it finds to the console.

We should redirect the output to a text file by running it as strings.exe passcode.exe > strings.txt:

None of these strings works as the password when we try them out. That being said, the strings do show us that a successful attempt would most likely display the message "correct password. bye!". The list also shows a lot of APIs that the program uses. However, knowing that this was compiled using MinGW with Dev-C++, it is likely that most of the APIs listed are part of the program's initialization code.
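To get a feel for what a tool such as Strings does, here is a minimal Python sketch that pulls runs of printable ASCII characters out of a byte buffer. This is not how Strings itself is implemented. The function name extract_ascii_strings, the minimum length of 4, and the sample bytes are our own assumptions, and the sketch ignores the Unicode strings that the real tool also reports.

```python
import re


def extract_ascii_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII characters that are at least min_len long."""
    # Printable ASCII spans 0x20 (space) through 0x7E (~).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


if __name__ == "__main__":
    # Hypothetical bytes standing in for a slice of an executable file.
    sample = b"\x00\x01what is the password? \x00\x90\x90ab\x00correct password. bye!\x00"
    for text in extract_ascii_strings(sample):
        print(text)
```

To try it on a real file, the buffer could be read with open("passcode.exe", "rb").read() instead of using the sample bytes.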

Disassembling the file using the 32-bit IDA Pro disassembler, we get to see the main function code. You can download and install IDA Pro from https://github.com/PacktPublishing/Mastering-Reverse-Engineering/tree/master/tools/Disassembler%20Tools. Since we are working in a Windows 32-bit environment, install the 32-bit idafree50.exe file. These installers were pulled from the official IDA Pro website and are hosted in our GitHub repository so that they remain available.

This file is a PE file, or Portable Executable. It should be opened as a Portable Executable so that IDA Pro reads the executable code of the PE file. If it is opened as an MS-DOS executable instead, the resulting code will be the 16-bit MS-DOS stub:
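The difference between the two loaders comes down to the file's headers. As a rough sketch (this is not IDA Pro's actual logic), the following Python function checks the two markers that a PE file carries: the MZ signature of the MS-DOS header, and the PE signature that is located through the DWORD at offset 3Ch. The names find_pe_header and make_fake_image, and the synthetic image itself, are hypothetical.

```python
import struct


def find_pe_header(data: bytes):
    """Return the file offset of the PE signature, or None if data is not a PE image."""
    # Every PE file starts with an MS-DOS header whose first two bytes are "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # The DWORD at offset 0x3C (e_lfanew) holds the file offset of the PE header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # The PE header begins with the 4-byte signature "PE\0\0".
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        return None
    return e_lfanew


def make_fake_image(pe_offset: int = 0x80) -> bytes:
    """Build a tiny synthetic buffer that only contains the two signatures."""
    image = bytearray(pe_offset + 4)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, pe_offset)
    image[pe_offset:pe_offset + 4] = b"PE\x00\x00"
    return bytes(image)


if __name__ == "__main__":
    print(hex(find_pe_header(make_fake_image())))
```

Everything between the MS-DOS header and the offset returned here is the 16-bit stub that shows up when the file is loaded as an MS-DOS executable.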

IDA Pro was able to identify the main function. It is located at the address 0x004012B8. Scrolling down to the Graph overview shows the branching of the blocks and may give you an idea of how the program's code will flow when executed. To view the code in plain disassembly, that is, without the graphical representation, just change to Text view mode:

Knowing that this is compiled C code, we only need to focus our analysis on the _main function. We will try to make pseudocode out of the analysis. The information that we will gather consists of the APIs, in the order they are used in the flow of the code, the conditions that drive the jump branches, and the variables used. There might be some compiler-specific code injected into the program that we may have to identify and skip:

Quickly inspecting the functions sub_401850 and sub_4014F0, we can see that the _atexit API was used here. The atexit API is used to set the code that will be executed once the program terminates normally. atexit and similar APIs are commonly used by high-level compilers to run cleanup code. This cleanup code is usually designed to prevent possible memory leaks, close opened and unused handles, de-allocate allocated memory, and/or realign the heap and stack for a graceful exit:

The parameter passed to _atexit points to sub_401450, which contains the cleanup code.

Continuing, we get to a call to the printf function. In assembly language, calling an API requires that its parameters are placed in sequence from the top of the stack. The push instruction is what we commonly use to store data in the stack. This code achieves the same result with a mov instruction. If you right-click on [esp+88h+var_88], a drop-down menu will pop out, showing a list of possible variable structures. The instruction line can be better understood as mov dword ptr [esp], offset aWhatIsThePassw:

This does the same as push offset aWhatIsThePassw. The square brackets are used to define a data container. In this case, esp is the address of the container where the address of the "what is the password?" string gets stored. There is a difference between using push and mov, though: the push instruction decrements the stack pointer, esp, before storing the value, while mov leaves esp unchanged. Either way, printf gets the parameter it needs to display the message on the console.
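The difference between the two instructions can be pictured with a toy model of the stack. The TinyStack class below is purely illustrative, and the starting esp value and the string address are made-up numbers, not values taken from the program.

```python
class TinyStack:
    """A toy model of a 32-bit stack: a dict of address -> DWORD plus an esp register."""

    def __init__(self, esp: int):
        self.esp = esp
        self.memory = {}

    def push(self, value: int) -> None:
        # push first moves esp down by 4 bytes, then writes at the new top.
        self.esp -= 4
        self.memory[self.esp] = value

    def mov_to_top(self, value: int) -> None:
        # mov dword ptr [esp], value writes at the current top; esp is unchanged.
        self.memory[self.esp] = value


if __name__ == "__main__":
    STRING_ADDRESS = 0x00403000  # hypothetical address of the prompt string

    with_push = TinyStack(esp=0x0022FF00)
    with_push.push(STRING_ADDRESS)
    print("push:", hex(with_push.esp))

    with_mov = TinyStack(esp=0x0022FF00)
    with_mov.mov_to_top(STRING_ADDRESS)
    print("mov: ", hex(with_mov.esp))
```

In both cases the callee finds its first parameter at [esp]. The mov form only works because the stack space it writes to was already reserved earlier in the function.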

The next API is scanf. scanf requires two parameters: the format of the input and the address where the input gets stored. The first parameter, the format of the input, is located at the top of the stack, followed by the address where the input will be placed. After revising the variable structure, the code should look like this:

The format given is "%30[0-9a-zA-Z ]", which means that scanf will read at most 30 characters from the start of the input and will stop at the first character that is not in the set within the square brackets. The accepted characters are only "0" to "9", "a" to "z", "A" to "Z", and the space character. This type of input format is used to prevent the input from exceeding 30 characters. It is also used to prevent the rest of the code from processing non-alphanumeric characters, with the exception of the space character.
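To experiment with what this format accepts, the scanset can be approximated with a regular expression. This is only a sketch of the matching rule, not of scanf itself; the names SCANSET and emulate_scanf are our own.

```python
import re

# Mirrors the scanset in "%30[0-9a-zA-Z ]": at most 30 characters, each one
# a digit, an ASCII letter, or a space.
SCANSET = re.compile(r"[0-9a-zA-Z ]{1,30}")


def emulate_scanf(user_input: str):
    """Return the prefix that would be stored, or None if the first character is rejected."""
    match = SCANSET.match(user_input)
    return match.group() if match else None


if __name__ == "__main__":
    for attempt in ["hello world", "abc!def", "!abc", "a" * 40]:
        print(repr(attempt), "->", repr(emulate_scanf(attempt)))
```

Anything after the first rejected character, or beyond the 30th character, never reaches the password buffer.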

The second parameter, placed at [esp+4], should be the address where the input will be stored. Tracing back, the eax register is set to the address of [ebp+var_28]. Let's take note that var_28 is where the inputted password is stored.

The strlen API comes right after and requires only one parameter. Tracing back the value of eax, var_28, the inputted password, is the string that strlen will be using. The resulting length of the string is stored in the eax register. The string size is compared to a value of 11h, or 17. After a cmp, a conditional jump is usually expected. Here, the jnz instruction is used. The red line is followed if the jump condition is false, that is, when the jump is not taken. A green line is followed when the condition is true. A blue line simply leads to the next code block, as shown here:

Since jnz jumps when the compared values are not equal, following the red line means that the string length is equal to 17. At this point, our pseudocode is as follows:

main()
{
    printf("what is the password? ");
    scanf("%30[0-9a-zA-Z ]", &password);
    password_size = strlen(password);
    if (password_size == 17)
    { ... }
    else
    { ... }
}

It is more than likely that if the size of the password is not 17, it will say wrong password. Let's follow the green path first:

The green line goes down to the loc_4013F4 block, followed by the loc_401400 block that ends the _main function. The instruction at loc_4013F4 is a call to sub_401290. This function contains code that indeed displays the wrong password message. Take note that a lot of lines point to loc_4013F4:

Here's the continuation of building our pseudocode with this wrong password function:

wrong_password()
{
    printf("wrong password. try again! ");
}

main()
{
    printf("what is the password? ");
    scanf("%30[0-9a-zA-Z ]", &password);
    password_size = strlen(password);
    if (password_size == 17)
    { ... }
    else
    {
        wrong_password();
    }
}

One good technique in reverse engineering is to find the shortest exit path possible, since this makes it easier to picture the whole structure of the code. However, spotting that path takes practice and experience.

Now, let's analyze the rest of the code under a 17 character string size. Let's trace the branching instructions and work backwards with the conditions:

The condition for jle is a comparison between the value at var_60 and 0. var_60 is set with a value of 5, which came from var_5c. Since 5 is greater than 0, the jump is not taken and the code follows the red line, like so:

Zooming out, the code we are looking at is actually a loop that has two exit points. The first exit point is a condition that the value at var_60 is less than or equal to 0. The second exit point is a condition where the byte pointed to by register eax should not be equal to 65h. If we inspect the variables in the loop further, the initial value, at var_60, is 5. The value at var_60 is being decremented in the loc_401373 block. This means that the loop will iterate 5 times.

We can also see var_8 and var_5c in the loop. However, since the start of the main code, var_8 was never set. var_5c was also used not as a variable, but as part of a calculated address. IDA Pro helped to identify possible variable usage as part of the main function's stack frame and set its base in the ebp register. This time, we may need to undo this variable identification by removing the variable structure only on var_8 and var_5c in the loop code. This can be done by choosing the structure from the list given by right-clicking the variable names:

To calculate the value in eax, we begin from the lea instruction line. The value stored in edx is the address ebp minus 8 itself. Unlike the mov instruction, lea here does not read the value stored at ebp-8. The value stored in ebp is the value of the esp register after entering the main function. This makes ebp the stack frame's base address. Variables in the stack frame are referenced through ebp. Remember that the stack grows downward from a high memory address. This is the reason why referencing variables from the ebp register requires subtracting from it:

Now, in the add instruction line, the value to be stored in edx will be the sum of edx and the value stored at a calculated address. This calculated address is ebp+eax*4-5Ch. eax is the value from var_60, which decrements from 5 down to 0. But since the loop terminates when var_60 reaches 0, eax in this line will only have values from 5 down to 1. Calculating all five addresses, we should get the following output:

[ebp+5*4-5ch] -> [ebp-48h] = 10h
[ebp+4*4-5ch] -> [ebp-4Ch] = 0eh
[ebp+3*4-5ch] -> [ebp-50h] = 7
[ebp+2*4-5ch] -> [ebp-54h] = 5
[ebp+1*4-5ch] -> [ebp-58h] = 3

It also happens that the values stored at these stack frame addresses were set before calling the first printf function. At this point, given the value of eax from 5 down to 1, edx should have the resulting values:

eax = 5; edx = ebp-8+10h; edx = ebp+8
eax = 4; edx = ebp-8+0eh; edx = ebp+6
eax = 3; edx = ebp-8+7; edx = ebp-1
eax = 2; edx = ebp-8+5; edx = ebp-3
eax = 1; edx = ebp-8+3; edx = ebp-5

The resulting value of edx is then stored in eax by the mov instruction. However, right after this, 20h is subtracted from eax:

from eax = 5; eax = ebp+8-20h; eax = ebp-18h
from eax = 4; eax = ebp+6-20h; eax = ebp-1ah
from eax = 3; eax = ebp-1-20h; eax = ebp-21h
from eax = 2; eax = ebp-3-20h; eax = ebp-23h
from eax = 1; eax = ebp-5-20h; eax = ebp-25h
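The arithmetic above is easy to get wrong by hand, so here is a small Python sketch that repeats it. All offsets are written relative to ebp, and the table values are the ones read from the stack frame. The names TABLE, PASSWORD_BASE, and checked_offsets are our own, and the last column assumes, as noted earlier, that the input buffer starts at var_28 (ebp-28h).

```python
# Stack-frame values, keyed by their offset from ebp.
TABLE = {-0x58: 0x03, -0x54: 0x05, -0x50: 0x07, -0x4C: 0x0E, -0x48: 0x10}
PASSWORD_BASE = -0x28  # the input buffer, var_28, relative to ebp


def checked_offsets():
    """Return (counter, ebp-relative offset, index into the input) per loop pass."""
    rows = []
    for counter in range(5, 0, -1):         # var_60 runs from 5 down to 1
        entry = TABLE[counter * 4 - 0x5C]   # value stored at [ebp+eax*4-5Ch]
        edx = -0x08 + entry                 # ebp-8 plus the table value
        eax = edx - 0x20                    # 20h is subtracted afterwards
        rows.append((counter, eax, eax - PASSWORD_BASE))
    return rows


if __name__ == "__main__":
    for counter, offset, index in checked_offsets():
        print(f"var_60={counter}: ebp{offset:+#x} -> input[{index}]")
```

Notice that -8 and -20h add up to -28h, the start of the input buffer, so each table value ends up being used directly as an index into the inputted string.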

The next two lines of code are the second exit condition for the loop. The cmp instruction compares 65h with the byte stored at the address pointed to by eax. The equivalent ASCII character of 65h is "e". If the byte at any of the addresses pointed to by eax does not match 65h, the code exits the loop. When such a mismatch happens, following the red line ends up with a call to sub_401290, which happens to be the wrong password function. The addresses being compared with the character "e" must be part of the input string.

If we made a map out of the stack frame in a table, it would look something like this:

      0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
-60h                          03 00 00 00 05 00 00 00
-50h  07 00 00 00 0e 00 00 00 10 00 00 00
-40h
-30h                          X  X  X  e  X  e  X  e
-20h  X  X  X  X  X  X  e  X  e
-10h
ebp

 

We have to consider that scanf stored the input password at ebp+var_28, that is, ebp-28h. Knowing that there are exactly 17 characters in a correct password, we marked these input locations with X. We also marked with "e" the addresses that have to match "e" for the code to proceed. Remember that the string begins at offset 0, not 1.

Now that we're good with the loop, here's what our pseudocode should look like by now:

wrong_password()
{
    printf("wrong password. try again! ");
}

main()
{
    e_locations[] = {3, 5, 7, 0x0E, 0x10};
    printf("what is the password? ");
    scanf("%30[0-9a-zA-Z ]", &password);
    password_size = strlen(password);
    if (password_size == 17)
    {
        for (i = 5; i > 0; i--)    // var_60 counts from 5 down to 1
            if (password[e_locations[i - 1]] != 'e')
            {
                wrong_password();
                goto goodbye;
            }
        ...
    }
    else
    {
        wrong_password();
    }
goodbye:
}

Moving on, after the loop, we will see another block that uses strcmp. This time, we corrected some of the variable structures to get a better grasp of what our stack frame would look like:

The first two instructions read DWORD values from ebp-1Ah and ebp-25h and combine them with a bitwise AND. Looking at our stack frame, both locations are within the inputted password string area. A bitwise AND is then applied again, this time between the resulting value and 0FFFFFFh. The final value is stored at ebp-2Ch. strcmp is then used to compare the string stored at ebp-2Ch with the string "ere". If the strings do not match, the green line goes to the wrong password code block.

Using the AND instruction with 0FFFFFFh means that the result is limited to 3 characters. The AND of the two DWORDs from the password string can only produce "ere" if both DWORDs have every bit of "ere" set, and the simplest way to satisfy this is for both locations to hold the same 3 characters. Thus, we take ebp-1Ah and ebp-25h to both contain "ere":

      0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
-60h                          03 00 00 00 05 00 00 00
-50h  07 00 00 00 0e 00 00 00 10 00 00 00
-40h
-30h              e  r  e     X  X  X  e  r  e  X  e
-20h  X  X  X  X  X  X  e  r  e
-10h
ebp
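The AND-then-strcmp step can be replayed in Python to see what it accepts. This is a sketch built only from the description above: the helper names dword_at and and_check are hypothetical, and the password indexes 3 and 14 come from ebp-25h and ebp-1Ah being 3 and 14 bytes past the start of the input at ebp-28h.

```python
def dword_at(buf: bytes, index: int) -> int:
    """Read 4 bytes starting at index as a little-endian DWORD (zero-padded at the end)."""
    return int.from_bytes(buf[index:index + 4].ljust(4, b"\x00"), "little")


def and_check(password: bytes) -> bool:
    """Replay the AND of the two DWORDs and the comparison with "ere"."""
    # C strings carry a terminating NUL, so append it before reading near the end.
    buf = password + b"\x00"
    combined = dword_at(buf, 3) & dword_at(buf, 14) & 0x00FFFFFF
    # strcmp sees the three low bytes followed by a zero byte.
    return combined.to_bytes(4, "little") == b"ere\x00"


if __name__ == "__main__":
    for candidate in [b"XXXereXXXXXXXXere", b"XXXeveXXXXXXXXeze", b"XXXXXXXXXXXXXXXXX"]:
        print(candidate.decode("ascii"), and_check(candidate))
```

The second candidate is worth a look: "v" (76h) AND "z" (7Ah) also gives 72h, "r", so the AND on its own does not strictly force both locations to hold "ere". Using the same 3 characters in both places is simply the most direct way to pass this check.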

Let's move on to the next set of code, following the red line:

All green lines point to the wrong password code block. So, to keep moving forward, we'll have to follow the conditions that go with the red line. The first code block in the preceding screenshot uses the XOR instruction to validate that the characters at ebp-1Eh and ebp-22h are equal. The second block adds both character values from the same offsets, ebp-1Eh and ebp-22h. The sum should be 40h. Since the two characters are equal and add up to 40h, each one must have an ASCII value of 20h, a space character.
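We can confirm this reasoning by brute force over the characters that scanf lets through. The following Python sketch is only a check of the arithmetic; ALLOWED and candidate_pairs are our own names.

```python
import string

# The characters accepted by the "%30[0-9a-zA-Z ]" scanset.
ALLOWED = string.digits + string.ascii_letters + " "


def candidate_pairs():
    """Return every pair of allowed characters whose XOR is 0 and whose sum is 40h."""
    return [
        (a, b)
        for a in ALLOWED
        for b in ALLOWED
        if (ord(a) ^ ord(b)) == 0 and ord(a) + ord(b) == 0x40
    ]


if __name__ == "__main__":
    print(candidate_pairs())
```

Only one pair survives both conditions, and it is made of two space characters.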

The third block reads a DWORD value from ebp-28h and then uses the AND instruction to only take the first 3 characters. The result is compared with 647541h. If translated to ASCII characters, it is read as "duA".

The fourth block does the same method as the third but takes the DWORD from ebp-1Dh and compares it with 636146h, or "caF".

The last block takes a WORD value from ebp-20h and compares it with 7473h, or "ts".
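The constants read backwards because x86 stores multi-byte values in little-endian order, with the lowest byte first in memory. As a quick sketch, Python's int.to_bytes can lay each constant out the way it sits in the password buffer; the helper name constant_to_text is our own.

```python
def constant_to_text(value: int, size: int) -> str:
    """Lay out an integer constant as little-endian bytes and decode them as ASCII."""
    return value.to_bytes(size, "little").decode("ascii")


if __name__ == "__main__":
    for value, size in [(0x647541, 3), (0x636146, 3), (0x7473, 2)]:
        print(f"{value:X}h -> {constant_to_text(value, size)!r}")
```

Read in memory order, the three constants spell the fragments "Aud", "Fac", and "st".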

These values have to be written down in our stack frame table in little endian, lowest byte first. The two cells left empty inside the string, at ebp-22h and ebp-1Eh, hold the space characters:

      0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
-60h                          03 00 00 00 05 00 00 00
-50h  07 00 00 00 0e 00 00 00 10 00 00 00
-40h
-30h              e  r  e     A  u  d  e  r  e     e
-20h  s  t     F  a  c  e  r  e
-10h
ebp

 

The password should be "Audere est Facere". If successful, it should run the correct password function:

To complete our pseudocode, we have to compute the string's relative offsets from ebp-28h. ebp-28h is the password string's offset 0, while the last offset in the string, offset 16, should be at ebp-18h:

wrong_password()
{
    printf(" wrong password. try again! ");
}

correct_password()
{
    printf(" correct password. bye! ");
}

main()
{
    e_locations[] = {3, 5, 7, 0x0E, 0x10};
    printf("what is the password? ");
    scanf("%30[0-9a-zA-Z ]", &password);
    password_size = strlen(password);
    if (password_size == 17)
    {
        for (i = 5; i > 0; i--)    // var_60 counts from 5 down to 1
            if (password[e_locations[i - 1]] != 'e')
            {
                wrong_password();
                goto goodbye;
            }
        // DWORDs at ebp-25h and ebp-1Ah, ANDed together and limited to 3 bytes
        tmp = *(password+3) & *(password+14) & 0x0FFFFFF;
        if ( strcmp(&tmp, "ere") == 0 )
        if ( (password[6] ^ password[10]) == 0 ) // ^ means XOR
        if ( (password[6] + password[10]) == 0x40 )
        if ( ( *(password+0) & 0x0FFFFFF ) == 'duA' )  // 647541h
        if ( ( *(password+11) & 0x0FFFFFF ) == 'caF' ) // 636146h
        if ( ( *(password+8) & 0x0FFFF ) == 'ts' )     // 7473h
        {
            correct_password();
            goto goodbye;
        }
    }
    wrong_password();
goodbye:
}
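As a final sanity check, the pseudocode can be turned into a small runnable Python model. This is a sketch of the checks as we understood them from the disassembly, not the program's actual source; the names check_password, dword, and E_LOCATIONS are our own, and the model includes the AND and strcmp step with "ere" described earlier.

```python
E_LOCATIONS = [3, 5, 7, 0x0E, 0x10]


def dword(buf: bytes, index: int) -> int:
    """Read a little-endian DWORD starting at index."""
    return int.from_bytes(buf[index:index + 4], "little")


def check_password(password: str) -> bool:
    """Replay the checks recovered from the _main function."""
    if not password.isascii() or len(password) != 17:
        return False
    # NUL terminator plus padding so that DWORD reads near the end stay in range.
    buf = password.encode("ascii") + b"\x00\x00\x00"
    if any(buf[i] != ord("e") for i in E_LOCATIONS):
        return False
    if (dword(buf, 3) & dword(buf, 14) & 0xFFFFFF) != int.from_bytes(b"ere", "little"):
        return False
    if (buf[6] ^ buf[10]) != 0 or buf[6] + buf[10] != 0x40:
        return False
    if (dword(buf, 0) & 0xFFFFFF) != 0x647541:    # "Aud" in memory
        return False
    if (dword(buf, 11) & 0xFFFFFF) != 0x636146:   # "Fac" in memory
        return False
    if int.from_bytes(buf[8:10], "little") != 0x7473:  # "st" in memory
        return False
    return True


if __name__ == "__main__":
    for attempt in ["Audere est Facere", "Audere est facere", "wrong"]:
        print(repr(attempt), check_password(attempt))
```

A model like this is handy for testing guesses quickly before trying them against the real executable.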