We cannot interact physically with the systems (humans are not very well equipped to see and produce precise and fast electrical signals, are they?) and we may not want to risk our main computer platform by connecting it directly to a device under test (DUT). We will need a specialized tool for this.
In this chapter, we will look at the main tool we will use to actively attack our targets. The bluepill board we are going to use is very cheap, accessible, and can be programmed with an entirely open source toolchain. We will review what it is exactly, its hardware, its variants, and how to program it (with a little introduction to C) before actually using it to attack protocols and chips in the next chapters.
In this chapter, we will cover the following topics:
In order to be able to program and use the bluepill, it is essential to have the following:
For the examples, you will require the following:
You may want to also buy or find components that are using the same protocol but that are slightly different, so as to train yourself in adapting the examples.
In terms of the compilation of programs and flashing, install the following (for a Debian-based system):
Note
Please be aware that the version that your distribution sports may not be sufficiently new. If this is the case, it could have a problem with the cheaper clones (in that case, install from source by following the instructions here: https://github.com/texane/stlink/blob/master/doc/compiling.md).
You can refer to the code used in this chapter at the following link:
https://github.com/PacktPublishing/Practical-Hardware-Pentesting
Check out the following link to see the Code in Action video:
A board to do what? What is the board? What can it do? How much does it cost? Why this one? Where is the documentation? Yes, you surely have plenty of questions! These questions are exactly what we are going to answer in the following sub-headings. Because you will sometimes need a reminder while testing or doing the exercises, I will also point to the chip's documentation.
Well, we will need to interface the board with the circuit we will want to attack. Since a general-usage PC doesn't really have a readily accessible interface board to connect with the most common protocols, we will use a bluepill to do so.
The bluepill is a colloquial name for many different boards that have the following characteristics:
The STM32F103C8T6 is a quite capable (32 bits, 72 MHz) microcontroller produced by STMicroelectronics that comes with a wide range of typical general-use peripherals:
We can now use these to interface with our target systems. Also, in quite practical terms, it is possible to program it directly in C (which we will use in the book) or use the Arduino IDE and API to program.
Important note
Some vendors are selling boards that have a clone of the STM32F103C8T6 on it. These should be fine, but the programming software may complain about it.
The C programming language has a reputation for being hard to use and complex. Trust me, it is not. This reputation comes from the fact it doesn't come with a lot of the convenience functions of more modern languages. The simplicity that comes with this language makes it shine when the resources are constrained and when the execution needs to be really efficient, like on a microcontroller!
While I am quite sure that most of the examples in the book could be written using the Arduino IDE and API, it would do the following:
All of this (unless you actually have a degree in electrical engineering or experience in programming embedded systems) would hinder your ability to understand your actual targets, because you would not learn some fundamental concepts about the way in which microcontrollers work and are used on those targets!
Aside from that, you definitely should buy an Arduino and play around with it, but I will not focus on that here. You can even use the STM32duino libraries on this platform!
The datasheet has a scope that is restricted to the model itself. Like most chip manufacturers, STMicroelectronics names its chips using a nomenclature that allows us to decipher the capabilities of the chip that is soldered on the bluepill. For example, let's look at the nomenclature for STM32F103C8T6:
In STMicro vocabulary, the document that will provide you with the detailed information of the family is a "reference manual." It will give you the addresses of the different memory-projected registers. It also explains the way in which the peripherals are programmed and all the things that are shared across the family members, irrespective of how much memory they have, how many leads are available on this package or that package, and so on.
Tip
The datasheet can be found here: https://www.st.com/resource/en/datasheet/stm32f103c8.pdf. The reference manual can be found here: https://www.st.com/resource/en/reference_manual/cd00171190.pdf.
In the reference manual, you will find a description of all the peripherals that are on the chip. While reading the documentation for a peripheral, you should expect to always find the same following sequence:
- How the peripheral type behaves in general
- What the available functionalities of the peripheral type are
- How to initialize and configure the peripheral type
- How to use the internal peripheral behavior (what the interrupts are, how they play out together, which bit is flipped by which events, and so on)
Like most (if not all) programming languages, the main thing C does is make the CPU core move values from memory locations to other memory locations. To let a program control the hardware, the chip has special memory regions where the locations are not generic storage but special storage units ("the registers") that alter the chip's behavior according to the value stored in them. Some registers set the behavior of the chip itself (such as its clock, or turning peripherals on and off); others alter the behavior of the peripherals around the CPU. This concept is called memory-mapped (or memory-projected) registers and is the basis of the operation of MCUs and CPUs. Let's now dive into how this is translated into a binary that defines the MCU's behavior.
We will use a set of tools to transform a high-level language (yes, I wrote that, C is a high-level language) into the binary code that the chip understands, laid out in a file that it can execute. For short, this is called compilation (compilation is actually only one step of the process, but it is a convenient shorthand). We will push this file to the chip and have it run our code. I will describe the tools involved in the following sections.
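To make the idea concrete, here is a minimal sketch in C. On the real chip, the pointer would hold a fixed address taken from the reference manual; here, an ordinary variable (fake_register, a name invented for this illustration) stands in for the register so that the code can be compiled on any computer. The helper names are assumptions too and do not come from any library.

```c
#include <stdint.h>

/* Stand-in for a hardware register (hypothetical: on a real MCU this
   would be a fixed address given by the reference manual). */
static volatile uint32_t fake_register = 0;

/* Set one bit of a register through a pointer. volatile tells the
   compiler that every access matters and must not be optimized away. */
void reg_set_bit(volatile uint32_t *reg, unsigned bit) {
    *reg |= (UINT32_C(1) << bit);
}

/* Clear one bit of a register through a pointer. */
void reg_clear_bit(volatile uint32_t *reg, unsigned bit) {
    *reg &= ~(UINT32_C(1) << bit);
}

/* Example use: start from 0, turn bit 3 on, then read the value back. */
uint32_t demo_register_write(void) {
    fake_register = 0;
    reg_set_bit(&fake_register, 3);
    return fake_register;
}
```

Writing a value to such a location is all it takes to change the chip's behavior, which is why the rest of the chapter spends so much time on addresses and bits.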
Under the generic compilation concept, the way it is understood by most people, we turn the code into something that can be executed by a computer. From the push of a button or a sternly typed command line, we see a file appear that we can run (a .exe file, a .elf file, or other formats). In reality, this is (of course) a little bit more complicated.
The goal of the compilation process is to turn a human-readable language (C, C++, assembly opcodes, Java, and so on) into a sequence of instructions that the decoding unit in the CPU can understand.
For the bluepill, we will use the GNU Compiler Collection (GCC) and, more specifically, a flavor (gcc-arm-none-eabi) that targets our architecture (arm) without any specific operating system (none) and uses the embedded ABI (eabi).
In order to be able to understand the process, we will perform this operation on our local machine since it is easier to see the result than on the bluepill, and the process is essentially the same.
First, let's compile a simple hello world code:
$ cat hello.c
#include <stdio.h>
int main(){printf("hello world!");}
$ gcc -c hello.c
$ file hello.o
hello.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
$ chmod u+x hello.o
$ ./hello.o
bash: ./hello.o: cannot execute binary file: Exec format error
Here, gcc -c means compile only. When we try to execute hello.o, the error tells us that this is not a binary file that our computer knows how to execute. This is because we need to put it in a format it understands.
If you need to include header files (header files describe the functions provided by a library or another .o file), use -I to provide the path to the header directory and use the #include directive in the source file.
Linking turns object files into a format that the operating system can load and execute. In our example, the printf() function is provided by an external library (the description of what the library provides comes from the #include <stdio.h> line), but the operating system cannot tell which library just by looking at the object file. It is the linker's job (we will use gcc to call the linker) to resolve this reference and put the relevant information into a file format that the operating system will understand:
$ gcc -o hello.elf hello.o
$ ./hello.elf
hello world!
This process may not look very important in our very small example, but it becomes essential as soon as a project is divided into multiple source files. Each source file becomes a .o object file, and the object files are then linked together into something usable.
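The split between what the compiler needs (a declaration) and what the linker needs (a definition) can be sketched in a single listing. The file names in the comments (counter.h, main.c, counter.c) and the function names are hypothetical and only mark where each part would live in a real multi-file project.

```c
/* --- what a header (a hypothetical counter.h) would contain --- */
int next_count(void);          /* declaration only: name and types */

/* --- what a hypothetical main.c would contain --- */
int count_twice(void) {
    next_count();              /* compiles thanks to the declaration */
    return next_count();       /* the linker supplies the actual code */
}

/* --- what a hypothetical counter.c would contain --- */
static int counter = 0;        /* static: private to this file */

int next_count(void) {         /* the definition */
    counter = counter + 1;
    return counter;
}
```

In a real project, main.c and counter.c would each be compiled with gcc -c, and the two resulting .o files would be passed together to the link step.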
Of course, a project can do the following:
That is why there are tools to drive the compilation process. The simplest and most ubiquitous one is Make. Make is driven by a description file called a Makefile.
A Makefile can be complex (if you look at a big file for a complex project) but is composed of very simple elements:
Let's have a look at a very simple Makefile to compile our hello world example:
CC=gcc
hello : hello.o
	$(CC) -o hello hello.o
hello.o : hello.c
	$(CC) -c -o hello.o hello.c
Let's discuss a few terms from this Makefile:
Important note
In Makefiles, each task line of a target (such as the $(CC) lines above) must begin with a tab character, not with spaces. If the make command tells you that a separator is missing, your editor has probably turned the tab into multiple spaces, and this will not work.
To illustrate the dependencies system, let's try a number of things:
$ make # (1)
make: 'hello' is up to date.
$ rm hello # (2)
$ make
gcc -o hello hello.o
$ touch hello.c # (3)
$ make
gcc -c -o hello.o hello.c
gcc -o hello hello.o
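Let's understand this code:
- (1) hello already exists and is newer than hello.o and hello.c, so Make has nothing to do.
- (2) hello has been removed but hello.o is still up to date, so Make only runs the link step.
- (3) touch updates the modification time of hello.c, so hello.o is now out of date; Make recompiles it and then relinks hello.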
Make is very powerful and allows much more than this simple example. I strongly encourage you to read some Makefiles to get used to its possibilities and, of course, read the documentation on Make's website:
https://www.gnu.org/software/make/
Now that we can build code, let's see how we can push it to the chip.
The easiest and most versatile software for STM32 chips on Linux is an open source implementation of ST's programming protocol. It is packaged in most modern distributions as the stlink-tools package.
Information box
For more information on the stlink-tools package, you can refer to the following link: https://github.com/texane/stlink.
It comes with different tools:
Now, enough with the examples, let's do the real thing.
In order to make our first program for our chips, we will need to do the following:
Before we start coding, we will need a library that provides the addresses of the different registers, along with constants that help set them up without constantly doing (usually quite error-prone) bitwise arithmetic with raw values. Additionally, the libopencm3 library comes with convenience functions to set up and use peripherals that we will use later on.
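To see why named constants help, consider the following sketch. The field position, the mode value, and all the names are invented for this illustration; they are not taken from the library or from the reference manual. The raw-value equivalent of the function body would be reg = (reg & 0xFF0FFFFF) | 0x00200000, which is much harder to read and to check.

```c
#include <stdint.h>

/* Hypothetical names for a 4-bit configuration field at bits 20..23. */
#define PIN_CFG_SHIFT   20u
#define PIN_CFG_MASK    (UINT32_C(0xF) << PIN_CFG_SHIFT)
#define PIN_CFG_OUTPUT  UINT32_C(0x2)   /* made-up mode value */

/* Return reg with the configuration field replaced by mode,
   leaving every other bit untouched (read-modify-write). */
uint32_t set_pin_cfg(uint32_t reg, uint32_t mode) {
    reg &= ~PIN_CFG_MASK;                            /* clear the old field */
    reg |= (mode << PIN_CFG_SHIFT) & PIN_CFG_MASK;   /* write the new field */
    return reg;
}
```

A library of such constants and helpers plays the same role for the real registers of the chip.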
Here is how to get the library:
$ git clone https://github.com/libopencm3/libopencm3.git
...
$ cd libopencm3
$ make
...
$ cd ..
At this point, the library is ready to be used.
The chip needs to be initialized for the following purposes:
The entire code and Makefile can be found in the book's Git repository in bluepill/ch5/blink (do not forget to clone it and its submodules with --recursive).
Try to read the Makefile and understand what it does, as well as what the different targets do:
Now that we've seen how code is transformed into a binary that can be transferred to the chip, let's look a bit more into the code and how it works.
C will be your bread and butter for developing your attacks. Yes, there are easier, more modern, less cumbersome languages, but the following is true:
So, pony up, and learn the language that makes the hardware run!
This is really intended as a crash course that will just allow you to understand the code that comes with this book. There are plenty of resources on C on the internet if you want to dig deeper (and trust me, you will want to).
C comes with most of the operators you are expecting:
You may already be familiar with the majority of the statements:
The comments can come in two forms:
Numeral bases as literals are also very straightforward:
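As a small illustration of numeric literals (the function name is invented for this example), the same value can be written in decimal, in hexadecimal with a 0x prefix, or in octal with a leading 0. Beware of the last one: a leading 0 silently changes the base.

```c
/* The same value written in three bases. */
int same_value_in_all_bases(void) {
    int dec = 42;      /* decimal */
    int hex = 0x2A;    /* hexadecimal: 0x prefix */
    int oct = 052;     /* octal: leading 0 */
    return (dec == hex) && (hex == oct);
}
```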
Variables have a type. This is so that the compiler knows what kind of operation to apply to the variable.
The main types in C are as follows:
int: an integer value, usually 4 bytes
short: a short integer, usually 2 bytes
char: enough to hold a character, usually a byte
float: a representation of a real (floating point) value, usually 4 bytes. Beware: the precision is limited!
That's it. There are no evolved types such as strings, lists, and hash maps out of the box. This is a very concise language where you have to create the evolved types you may need from the basic types. But don't underestimate C. The chances are that it is still the language that created the code managing the hardware in most of the devices you own. The majority of the kernels, the low-level libraries, are written in C because it is extremely efficient, both for size and for pure code performance.
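Because the sizes above are only "usual", embedded code often relies on the fixed-width types of the standard <stdint.h> header (uint8_t, uint16_t, uint32_t, and so on). The sketch below shows three consequences of types; the function names are invented for this example, and the last function assumes that float is a 32-bit IEEE 754 value, which is the common case.

```c
#include <stdint.h>

/* uint8_t is exactly 8 bits, so the result wraps around modulo 256. */
uint8_t add_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* Integer division drops the fractional part. */
int int_half(int x) {
    return x / 2;
}

/* Returns 1 if adding 1 to 2^24 is lost when stored in a float
   (assumption: float is a 32-bit IEEE 754 value). */
int float_loses_increment(void) {
    volatile float big = 16777216.0f;   /* 2^24 */
    volatile float sum = big + 1.0f;    /* 2^24 + 1 cannot be represented */
    return sum == big;
}
```

With these definitions, add_u8(250, 10) gives 4 and int_half(7) gives 3: the type decides what the operation does.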
Pointers, just by themselves, make people afraid of C, and this is somewhat ridiculous. Generations of students have been frustrated by the dreaded and mystical beast called "segmentation fault" (the error that usually comes from flawed pointer operations).
I cannot fathom why people are scared of pointers. They are easy.
A variable is held at a memory location. The pointer is the address of this location. Done ... finished. It is no more complicated than that. Of course, a pointer is itself a value, so it is also stored at a location in memory.
The notation for pointers is * (a pointer is a type and it points to a value with a type so that the compiler can perform a size calculation). The notation of "get address of" is &, while, within an expression, * is used as a dereference (that is, "this thing that is at the address I am applying the * to"):
int a = 5; // a holds 5, for example at address 632
int b = 8;
int * a_ptr = & a; // a_ptr, a pointer to an int, holds the value 632
*a_ptr = 6; // the address 632 now holds the value 6, and so does a (because it is at address 632)
b = b + *a_ptr; // b holds 14
*a_ptr = b + *a_ptr; // a holds 20, a_ptr still holds 632
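One common reason to use pointers is that C passes arguments by value: a function receives copies, so it can only change the caller's variables if it is given their addresses. Here is a minimal sketch (swap_ints is a name chosen for this example):

```c
/* Exchange the values stored at the two addresses. */
void swap_ints(int *x, int *y) {
    int tmp = *x;   /* read the value at address x */
    *x = *y;        /* overwrite it with the value at address y */
    *y = tmp;       /* store the saved value at address y */
}
```

It would be called as swap_ints(&a, &b); after the call, a and b have exchanged their values.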
In C, pointers are the way in which arrays are managed, either with dynamic allocation (almost never used in MCUs), or statically with the [] shorthand syntax:
int a[4];
for(int i=0;i<4;i++){ a[i] = i; } // we initialize the array with 0,1,2,3
a[0] = 1; // arrays are 0 based since the address of the array holds the first value
*(a+0) == a[1]; // is now true, a+0 is actually the address of the first value
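The equivalence between the [] notation and pointer arithmetic can be sketched with two functions that compute the same sum (both names are invented for this example). Since a function only receives the address of the first element, the number of elements has to be passed separately.

```c
#include <stddef.h>

/* Sum n ints starting at address values, using index notation. */
int sum_indexed(const int *values, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += values[i];
    }
    return total;
}

/* Same computation written with explicit pointer arithmetic. */
int sum_pointer(const int *values, size_t n) {
    int total = 0;
    for (const int *p = values; p < values + n; p++) {
        total += *p;
    }
    return total;
}
```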
Since the array is so easy to use, it is also used to hold strings:
char * s1 = "hello reader !"; // s1 holds the address of the first character,
// "" tells the compiler that the initial value is a 0
// terminated array of characters
char s2[15]; // declare a new array
char * s1_ptr = s1; // s1_ptr holds the address of the first character of s1
char * s2_ptr = s2; // s2_ptr holds the address of the first character of s2
while(*s1_ptr != 0){*s2_ptr++ = *s1_ptr++; };
/* strings are 0 (null character) terminated, and we use this to
copy to the target array; I used the ++ shorthand to do all
of this in one statement */
*s2_ptr = 0; // We 0 terminate our target string since the while didn't
// execute for the last 0
s2[0]='H'; // Change the first value of s2 to the character H, it is now "Hello reader !"
s1[0]='H'; // This will typically crash! String literals must not be modified (we will see why in the static reverse engineering chapter)
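The copy loop above only works because s2 is large enough for the 14 characters and the terminating 0. As a hedged sketch of a more defensive variant (copy_string is a name made up for this example, not a standard function), the size of the destination can be passed along so that the copy never writes past the end of the array:

```c
#include <stddef.h>

/* Copy src into dst, never writing more than dst_size bytes.
   Returns the number of characters copied (terminator excluded). */
size_t copy_string(char *dst, size_t dst_size, const char *src) {
    size_t n = 0;
    if (dst_size == 0) {
        return 0;               /* no room, not even for the terminator */
    }
    while (src[n] != 0 && n + 1 < dst_size) {
        dst[n] = src[n];
        n++;
    }
    dst[n] = 0;                 /* always 0 terminate */
    return n;
}
```

If the destination is too small, the string is truncated instead of overflowing the array.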
Like I said before, this is just a crash course, but for now, you are able to code for the bluepill, push code onto it, and start having fun!
Preprocessor directives are directives that a special piece of code in the compiler (the preprocessor) understands. They begin with # and are used by the preprocessor to do text replacement or file inclusion.
The most frequently used directives are the following:
Other directives exist, including #undef and #else.
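Here is a small sketch of text replacement and conditional compilation in action. LED_PIN, BIT, and led_mask are names invented for this example, and the pin number is arbitrary.

```c
#ifndef LED_PIN          /* only define it if nobody did already */
#define LED_PIN 13       /* hypothetical pin number */
#endif

#define BIT(n) (1u << (n))   /* function-like macro: pure text replacement */

#ifdef LED_PIN
/* This version is compiled because LED_PIN is defined. */
unsigned led_mask(void) { return BIT(LED_PIN); }
#else
/* This version would be compiled otherwise. */
unsigned led_mask(void) { return 0u; }
#endif
```

The compiler itself never sees BIT or LED_PIN: by the time it runs, the preprocessor has already replaced BIT(LED_PIN) with (1u << (13)).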
Declaring a function in C is very easy:
function_return_type function_name(type_arg1 arg1, type_arg2 arg2){
body of function
}
Then, the function_name identifier simply refers to the address of the machine code that implements the function. One consequence of this is that it is possible to use function pointers: variables that hold a reference to a function, which you can change and call dynamically.
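A minimal sketch of a function pointer follows; all the names (add, sub, binary_op, apply) are chosen for this example only.

```c
/* Two functions with the same signature. */
int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }

/* A type for "pointer to a function taking two ints and returning an int". */
typedef int (*binary_op)(int, int);

/* Call whichever function op currently points to. */
int apply(binary_op op, int a, int b) {
    return op(a, b);
}
```

A variable of type binary_op can be set to add, passed to apply, then set to sub and passed again: the same call site ends up running different code.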
In this chapter, we have programmed our main attack platform for the first time and then installed and compiled the library that will help us interact with its peripherals. We also had a brief introduction to the language we are going to use to program it: C.
In the next chapter, we will go through the most common protocols used in embedded systems, and learn how to find them, sniff them, and then attack them with our bluepills.
Read more about the C language:
Read more about GNU Make: