In the early days of microcomputers, like the Apple II, people wrote complete applications in Assembly Language, such as the first spreadsheet program, VisiCalc. Many video games were also written in Assembly to squeeze every bit of performance they could out of the hardware. These days, modern compilers like the GNU C compiler generate good code and microprocessors are much faster; as a result, most applications are written in a collection of programming languages, where each excels at a specific function. If you are writing a video game today, chances are you would write most of it in C, C++, or even C# and then use Assembly for performance-critical routines, or to access parts of the video hardware not exposed through the graphics library you are using.
In this chapter, we will look at using components written in other languages from our Assembly Language code and look at how other computer languages can make use of the fast, efficient code we are writing in Assembly.
Calling C Routines
If we want to call C functions, we must restructure our program. The C runtime has a _start label; it expects to be called first and to initialize itself before calling our program, which it does by calling a main function. If we leave our _start label in, we will get an error that _start is defined more than once. Similarly, we won’t call the Linux terminate program service anymore; instead we’ll return from main and let the C runtime do that along with any other cleanup it performs.
The simplest way to build is to let gcc drive the whole process: given myprogram.s, it calls as on the file and then runs the ld command with the C runtime included.
The C runtime gives us a lot of capabilities including wrappers for most of the Linux system services. There is an extensive library for manipulating NULL-terminated strings, routines for memory management, and routines to convert between all the data types.
Printing Debug Information
One handy use of the C runtime is to print out data to trace what our program is doing. We wrote a routine to output the contents of a register in hexadecimal, and we could write more Assembly code to extend this, or we could just get the C runtime to do it. After all, if we are printing out trace or debugging information, it doesn’t need to be fast; it just needs to be easy to add to our code.
For this example, we’ll use the C runtime’s printf function to print out the contents of a register in both decimal and hexadecimal format. We’ll package this routine as a macro, and we’ll preserve all the registers that might be corrupted. This way we can call the macro without worrying about register conflicts. The exception is the condition flags, which the macro can’t preserve, so don’t put these macros between an instruction that sets the flags and the instruction that tests them. We also provide a macro to print a string for either logging or formatting purposes.
Debug macros that use the C runtime’s printf function
Preserving State
First, we push registers X0–X18 and LR; we either use these registers or printf might change them. They aren’t saved as part of the function calling protocol. At the end, we restore these. This makes calling our macros as minimally disruptive to the calling code as possible.
It is unfortunate that each instruction can only save or restore two registers at a time, and since there are 19 corruptible registers along with LR, this means ten instructions to push all these registers and another ten to pop them all off of the stack.
Calling Printf
The printf format string is built from these conversion characters and modifiers:
c for character
d for decimal
x for hex
0 for zero padding
l for long, meaning 64 bits
A number specifying the length of the field to print
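To see how these pieces combine, here is a small C sketch that formats a register digit and a 64-bit value using the specifiers listed above. It is not the macro’s actual format string; the helper name format_reg and the exact layout of the output are assumptions chosen for illustration.

```c
#include <stdio.h>

/* Hypothetical helper (not from the chapter's listing): formats a
 * register digit and its value in decimal and zero-padded hex.
 *   %c      one character (the register digit)
 *   %ld     long in decimal (long is 64 bits on AArch64 Linux)
 *   %016lx  long in hex, zero-padded to a 16-character field
 * Returns the number of characters written, as snprintf reports it. */
int format_reg(char *buf, size_t size, char reg, long value)
{
    return snprintf(buf, size, "X%c = %ld, 0x%016lx",
                    reg, value, (unsigned long)value);
}
```

Calling format_reg(buf, sizeof buf, '2', 255) leaves the text X2 = 255, 0x00000000000000ff in buf. From Assembly, the same three arguments after the format string would travel in the next three parameter registers.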
It is important to move the value of the register to X2 and X3 first, since populating the other registers might wipe out the passed-in value if we are printing X0 or X1. If our register is X2 or X3, one of the MOV instructions does nothing. Luckily, the assembler doesn’t report an error or warning for this, so we don’t need a special case.
Now we look at the details of how we pass this format string to printf.
Passing a String
In the printStr macro, we pass in a string to print. Assembly doesn’t handle strings, so we embed the string in the code with an .asciz directive, then branch around it.
There is an .align directive right after the string, since Assembly instructions must be word aligned. It is good practice to add an .align directive after strings, since other data types will load faster if they are word aligned.
Generally, I don’t like adding data to the code section, but for our macro, this is the easiest way. The assumption is that the debug calls will be removed from the final code. If we add too many strings, we could make PC relative offsets too large to be resolved. If this happens, we may need to shorten the strings, or remove some.
Next, we need a program that needs to print something.
Adding with Carry Revisited
Updated addexamp2.s to print out the inputs and outputs
Makefile for updated addexamp2.s
Besides adding the debug statements, notice how the program is restructured as a function. The entry point is main, and it follows the function protocol of saving LR.
By just adding the C runtime, we gain a powerful tool chest that saves us time as we develop our full Assembly application. On the downside, notice our executable has grown to over 9 KB.
Now that we know how to call C routines from our Assembly Language code, let’s do the reverse and call Assembly Language from C.
Calling Assembly Routines from C
A typical scenario is to write most of our application in C, then call Assembly Language routines in specific use cases. If we follow the function calling protocol from Chapter 6, “Functions and the Stack,” C won’t be able to tell the difference between our functions and any functions written in C.
Main program to show calling our toupper function from C
Makefile for C and our toupper function
We had to change the name of our toupper function to mytoupper, since there is already a toupper function in the C runtime, and this led to a multiple definition error. This had to be done in both the C and the Assembly code. Otherwise, the function is the same as in Chapter 6, “Functions and the Stack.”
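Before calling our function, the C code must declare its parameters and return type to the compiler. This should be familiar to all C programmers, as you must do this for C functions as well. Usually, you would gather up all these declarations and put them in a header (.h) file.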
The string is in upper-case as we would expect, but the string length appears one greater than we might expect. That is because the length includes the terminating NULL character, which isn’t the C convention; strlen, for example, doesn’t count it. If we really wanted to use this a lot with C, we should subtract 1, so that our length is consistent with other C runtime routines.
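The length convention is easy to see in a C stand-in for the routine. The sketch below is not the Assembly code: mytoupper_ref is a hypothetical C equivalent that follows the contract described here (it upper-cases a string into an output buffer and returns the number of bytes written, terminator included), and mytoupper_len is an assumed wrapper that reports the length the way the C runtime does.

```c
#include <string.h>

/* Hypothetical C stand-in for the Assembly routine: copies instr to
 * outstr, upper-casing ASCII a-z, and returns the number of bytes
 * written INCLUDING the terminating NUL (the chapter's convention). */
int mytoupper_ref(const char *instr, char *outstr)
{
    char *p = outstr;
    char c;

    do {
        c = *instr++;
        if (c >= 'a' && c <= 'z')
            c -= ('a' - 'A');      /* shift lower-case to upper-case */
        *p++ = c;                  /* the NUL is copied and counted  */
    } while (c != '\0');

    return (int)(p - outstr);
}

/* Assumed wrapper following the C runtime convention instead: the
 * returned length excludes the terminator, matching strlen(outstr). */
int mytoupper_len(const char *instr, char *outstr)
{
    return mytoupper_ref(instr, outstr) - 1;
}
```

With the input "This is a test!", mytoupper_ref returns 16 while strlen on the result returns 15; the wrapper closes that gap. Because both functions follow the normal calling protocol, a C caller would use the Assembly version with exactly the same declaration shape.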
Packaging Our Code
We could leave our Assembly code in individual object (.o) files, but it’s more convenient for programmers using our library to package them together in a library. This way the user of our Assembly routines just needs to add one library to get all of our code, rather than possibly dozens of .o files. In Linux there are two ways to do this. The first way is to package our code together into a static library that is linked into the program. The second method is to package our code as a shared library that lives outside the calling program and can be shared by several applications.
Static Library
Makefile to build upper.s into a statically linked library
The only difference compared to the last example is that we first use as to assemble upper.s into upper.o and then use ar to build a library containing our routine. If we want to distribute our library, we include libupper.a, a header file with the C function declarations, and some documentation. Even if you aren’t selling, or otherwise distributing your code, building libraries internally can help organizationally to share code among programmers and reduce duplicated work. In the next section, we explore shared libraries, another Linux facility for sharing code.
Shared Library
Shared libraries are much more technical than statically linked libraries. They place the code in a separate file from the executable and are dynamically loaded by the system as needed. There are several issues, such as versioning and library placement in the file system, but we are only going to touch on them. If you decide to package your code as a shared library, this section provides a starting point and demonstrates that it applies to Assembly Language code as much as C code.
The shared library is created with the gcc command, giving it the -shared command line parameter to indicate we want to create a shared library and then the linker’s -soname parameter (passed through gcc with -Wl) to name it.
To use a shared library, it must be in a specific place in the filesystem. We can add new places, but we’re going to use a place created by the C runtime, namely, /usr/local/lib. After we build our library, we copy it here and create a couple of links to it. These steps are all required as part of the shared library versioning system.
Makefile for building and using a shared library
After the shared library is put in place, we must refresh the Linux shared library cache (the ldconfig command, run as root, does this). This causes Linux to search all the folders that hold shared libraries and update its master list. We must run this once after we successfully compile our library, or Linux won’t know it exists.
Placing -lup on the end of the command to build uppertst3, after the file that uses it, is important, or you will get unresolved externals when you build.
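The call to our routine in the compiled program goes through an indirection rather than branching straight to the function. gcc inserted this indirection into our code, so the loader can fix up the address when it dynamically loads the shared library.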
As a final technique, we will look at mixing Assembly Language and C code in the same source code file.
Embedding Assembly Code Inside C Code
The GNU C compiler allows Assembly code to be embedded right in the middle of C code. It contains features to interact with C variables and labels and cooperate with the C compiler for register usage.
Embedding our Assembly routine directly in C code
The asm statement is made up of the following parts:
AssemblerTemplate: A C string containing the Assembly code. There are macro substitutions that start with % to let the C compiler insert the inputs and outputs.
OutputOperands: A list of variables or registers returned from the code. The list can be left empty, but it’s expected that the routine does something, so a routine normally has at least one output. In our case, this is “=r” (len), where the =r means an output register and that we want it to go into the C variable len.
InputOperands: List of input variables or registers used by our routine. In this case “r” (str), “r” (outBuf) meaning we want two registers, one holding str and one holding outBuf. It is fortunate that C string variables hold the address of the string, which is what we want in the register.
Clobbers: A list of registers that we use and will be clobbered when our code runs. In this case “r4” and “r5”. This statement is the same for all processors, so it just means registers 4 and 5, which in our case are X4 and X5.
GotoLabels: A list of C program labels that our code might want to jump to. Usually, this is an error exit. If you do jump to a C label, you must warn the compiler with a goto asm-qualifier.
You can label the input and output operands. We didn’t, so the compiler assigns them the names %0, %1, …, which you can see used in the Assembly code.
Running the program produces the same output as the last section.
If you disassemble the program, you will find that the C compiler avoids using registers X4 and X5 entirely, leaving them to us. You will see it loads up our input registers from the variables on the stack, before our code executes and then copies our return value from the assigned register to the variable len on the stack. It doesn’t give the same registers we originally used, but that isn’t a problem.
This routine is straightforward and doesn’t have any side effects. If your Assembly code is modifying things behind the scenes, you need to add a volatile keyword to the asm statement to make the C compiler more conservative about any assumptions it makes about your code.
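The labeled form of the operands, which the chapter’s routine doesn’t use, can be sketched with a much smaller example. The function add_asm and the operand names sum, a, and b are hypothetical; the inline Assembly is only compiled when the target is AArch64, and the plain C fallback is an assumption added so the sketch still builds elsewhere.

```c
/* Adds two 64-bit integers with a single ADD instruction when built
 * for AArch64. On any other target it falls back to plain C, so the
 * example compiles everywhere but only exercises the asm statement
 * on AArch64. */
long add_asm(long a, long b)
{
#if defined(__aarch64__)
    long sum;

    __asm__("add %[sum], %[a], %[b]"      /* AssemblerTemplate        */
            : [sum] "=r" (sum)            /* OutputOperands           */
            : [a] "r" (a), [b] "r" (b)    /* InputOperands            */
            );                            /* no Clobbers, no labels   */
    return sum;
#else
    return a + b;   /* portable fallback, not inline Assembly */
#endif
}
```

The compiler picks the registers and substitutes them for %[sum], %[a], and %[b], so the code never names a specific register and needs no clobber list. Since the statement has an output and no side effects, it doesn’t need the volatile keyword.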
In the next section, we’ll look at calling our Assembly Language code from the popular Python programming language.
Calling Assembly from Python
If we write our functions following the Linux function calling protocol from Chapter 6, “Functions and the Stack,” we can follow the documentation on how to call C functions for any given programming language. Python has a good capability to call C functions in its ctypes module. This module requires we package our routines into a shared library.
Since Python is an interpreted language, we can’t link static libraries to it, but we can dynamically load and call shared libraries. The techniques we go through here for Python have matching components in many other interpreted languages.
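Loading a library at run time rests on the system’s dynamic loading interface, dlopen and dlsym, which is also what Python’s ctypes uses on Linux. The C sketch below shows the mechanism in isolation. The function call_by_name is hypothetical, and it looks up toupper from the C runtime that is already loaded rather than the chapter’s shared library, so that it runs without anything installed. On older glibc versions you may need to link with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: finds a function called `name` at run time
 * and calls it with `c`. Passing NULL to dlopen asks for a handle on
 * the running program plus the libraries loaded with it (including
 * the C runtime); a real caller would pass the file name of a shared
 * library instead. Returns -1 if the lookup fails. */
int call_by_name(const char *name, int c)
{
    void *handle = dlopen(NULL, RTLD_NOW);
    if (handle == NULL)
        return -1;

    /* dlsym hands back the routine's address as a void pointer,
     * which we convert to the function type we expect. */
    int (*fn)(int) = (int (*)(int))dlsym(handle, name);
    int result = (fn != NULL) ? fn(c) : -1;

    dlclose(handle);
    return result;
}
```

Nothing in this code knows at compile time which function it will call; the name is resolved when call_by_name runs. That is why an interpreted language can reach our Assembly routines only once they are packaged as a shared library.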
Python code to call mytoupper
The code is fairly simple; we first import the ctypes module so we can use it. We then load our shared library with the CDLL function. This is an unfortunate name since it refers to Windows DLLs, rather than something more operating system neutral. Since we installed our shared library in /usr/local/lib and added it to the Linux shared library cache, Python has no trouble finding and loading it.
The next two lines, which set the function’s argtypes and restype attributes, are optional, but good practice. They define the function parameters and return type to Python, so it can do extra error checking.
In Python, strings are immutable, meaning you can’t change them, and they are in Unicode, meaning a character can take up more than one byte. We need to provide the strings in regular buffers that we can change, and we need the strings in ASCII rather than Unicode. We can make a string ASCII in Python by putting a “b” in front of the string literal, which makes it a bytes object holding ASCII characters. The create_string_buffer function in the ctypes module creates a string buffer that is compatible with C (and hence Assembly) for us to use.
Summary
In this chapter, we looked at calling C functions from our Assembly code. We made use of the standard C runtime to develop some debug helper functions to make developing our Assembly code a little easier. We then did the reverse and called our Assembly upper-case function from a C main program.
We learned how to package our code for consumption as both static and shared libraries, and how to embed Assembly code directly in C code with the asm statement. We looked at how to call our upper-case function from Python, which is typical of high-level languages with the ability to call shared libraries.
In the next chapter, Chapter 10, “Interfacing with Kotlin and Swift,” we will see how to incorporate Assembly Language code into Android and iOS apps.
Exercises
1. Add a macro to debug.s to print a string given a register as a parameter that contains a pointer to the string to print.
2. Add a macro to debug.s to print a register, if it contains a single ASCII character.
3. In the printReg macro, set X0–X18 to known unusual values before the call to printf. Then step through the call to printf to see how many of these registers are clobbered.
4. Create a C program to call the lower-case routine from Chapter 6 (“Functions and the Stack”), Exercise 3, and print out some test cases.
5. Create static and shared library packages for the lower-case routine from Chapter 6, Exercise 3.
6. Take the lower-case routine from Chapter 6, Exercise 3, and embed it in C code using an asm statement.
7. Create a Python program to call the shared library from Exercise 5.