© Stephen Smith 2019
S. Smith, Raspberry Pi Assembly Language Programming, https://doi.org/10.1007/978-1-4842-5287-1_9

9. Interacting with C and Python

Stephen Smith
Gibsons, BC, Canada

In the early days of microcomputers, like the Apple II, people wrote complete applications in Assembly language, such as the first spreadsheet program, VisiCalc. Many video games were written in Assembly to squeeze every bit of performance out of the hardware. These days, modern compilers like the GNU C compiler generate fairly good code and microprocessors are much faster; as a result, most applications are written in a collection of programming languages, where each excels at a specific function. If you are writing a video game today, chances are you would write most of it in C, C++, or even C#, then use Assembly for performance-critical sections or to access parts of the video hardware not exposed through the graphics library you are using.

In this chapter, we will look at using components written in other languages from our Assembly language code and at how other languages can make use of the fast, efficient code we are writing in Assembly.

Calling C Routines

If we want to call C functions, we must restructure our program. The C runtime has its own _start label; it expects to be called first so it can initialize itself before calling our program, which it does by calling a main function. If we leave our _start label in, we will get an error that _start is defined more than once. Similarly, we won’t call the Linux service to terminate the program anymore; instead, we’ll return from main and let the C runtime do that along with any other cleanup it performs.

To include the C runtime, we could add it to the command-line arguments of the ld command in our makefile. However, it’s easier to build our program with the GNU C compiler (which includes the GNU Assembler), because it links in the C runtime automatically. To build our program, we will use
gcc -o myprogram myprogram.s

That will call as on myprogram.s and then do the ld command including the C runtime.

The C runtime gives us a lot of capabilities including wrappers for most of the Linux system services. There is an extensive library for manipulating NULL-terminated strings, routines for memory management, and routines to convert between all the data types.

Printing Debug Information

One handy use of the C runtime is to print out data to trace what our program is doing. We wrote a routine to output the contents of a register in hexadecimal, and we could write more Assembly code to extend this or we could just get the C runtime to do it. After all, if we are printing out trace or debugging information, it doesn’t need to be performant, rather just easy to add to our code.

For this example, we’ll use the C runtime’s printf function to print out the contents of a register in both decimal and hexadecimal format. We’ll package this routine as a macro, and we’ll preserve all the registers with push and pop instructions. This way, we can call the macro without worrying about register conflicts. The exception is the CPSR, which the macros can’t preserve, so don’t place them between an instruction that sets the CPSR flags and an instruction that tests them. We also provide a macro to print a string for either logging or formatting purposes.

The C printf function is mighty; it takes a variable number of arguments depending on the contents of a format string. There is extensive online documentation on printf; so for a fuller understanding, please have a look. We will call our collection of macros debug.s, and it contains the code from Listing 9-1.
@ Various macros to help with debugging
@ These macros preserve all registers.
@ Beware they will change cpsr.
.macro  printReg    reg
      push     {r0-r4, lr} @ save regs
      mov      r2, R\reg   @ for the %d
      mov      r3, R\reg   @ for the %x
      mov      r1, #\reg
      add      r1, #'0'    @ for %c
      ldr      r0, =ptfStr @ printf format str
      bl       printf @ call printf
      pop      {r0-r4, lr} @ restore regs
.endm
.macro      printStr    str
      push     {r0-r4, lr} @ save regs
      ldr      r0, =1f     @ load print str
      bl       printf @ call printf
      pop      {r0-r4, lr} @ restore regs
      b        2f          @ branch around str
1:    .asciz        "\str\n"
      .align        4
2:
.endm
.data
ptfStr: .asciz   "R%c = %16d, 0x%08x\n"
.align 4
.text
Listing 9-1

Debug macros that use the C runtime’s printf function

Preserving State

First, we push registers R0–R4 and LR; we either use these registers, or printf might change them. They aren’t saved as part of the function calling protocol. At the end, we restore these. This makes calling our macros as minimally disruptive to the calling code as possible.

Calling Printf

We call the C function with these arguments:
printf("R%c = %16d, 0x%08x\n", reg, Rreg, Rreg);
Since there are four parameters, we set them into R0–R3. In the printf format string, each sequence that starts with a percent sign (“%”) takes the next parameter and formats it according to the characters that follow:
  • c for character.

  • d for decimal.

  • x for hex.

  • 0 means 0 pad.

  • A number specifies the length of the field to print.
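As a quick way to experiment with these format codes without assembling anything, Python's % operator accepts the same conversion specifiers. The sketch below is only an illustration: the function name format_reg is hypothetical, and the masking step is an assumption needed because Python integers are not limited to 32 bits the way the registers are.

```python
# Illustrative sketch: format_reg is a made-up helper, not part of debug.s.
def format_reg(reg_num, value):
    """Build the same line printReg produces for a 32-bit register value."""
    digit = ord("0") + reg_num            # same trick as: add r1, #'0'
    unsigned = value & 0xFFFFFFFF         # bit pattern shown by %x
    # Reinterpret the 32-bit pattern as a signed number for %d.
    signed = unsigned - (1 << 32) if unsigned & 0x80000000 else unsigned
    return "R%c = %16d, 0x%08x" % (digit, signed, unsigned)

print(format_reg(2, 3))
print(format_reg(3, 0xFFFFFFFF))
```

The 16 in %16d pads the decimal value to a 16-character field, and the 08 in %08x pads the hexadecimal value with zeros to eight digits.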

Note

It is important to move the value of the register to R2 and R3 first since populating the other registers might wipe out the passed-in value if we are printing R0 or R1. If our register is R2 or R3, one of the MOV instructions does nothing. Luckily, we don’t get an error or warning, so we don’t need a special case.

Passing a String

In the printStr macro, we pass in a string to print. Assembly doesn’t handle strings, so we embed the string in the code with an .asciz directive, then branch around it.

There is an .align directive right after the string, since Assembly instructions must be word aligned. It is good practice to add an .align directive after strings, since other data types will load faster if they are word aligned.

Generally, I don’t like adding data to the code section, but for our macro, this is the easiest way. The assumption is that the debug calls will be removed from the final code. If we add too many strings, we could make PC relative offsets too large to be resolved. If this happens, we may need to shorten the strings or remove some.

Adding with Carry Revisited

In Chapter 2, “Loading and Adding,” we gave sample code to add two 64-bit numbers using ADDS and ADC instructions. What was lacking from this example was some way to see the output. Now we’ll take addexamp2.s and add some calls to our debug macros, in Listing 9-2, to show it in action.
@
@ Example of 64-bit addition with the ADD/ADC
@ instructions.
@
.include "debug.s"
.global main  @ Export main so the C runtime can call it
@ main routine to be called by C runtime
main:
      push {R4-R12, LR}
@ Load the registers with some data
@ First 64-bit number is 0x00000003FFFFFFFF
      MOV  R2, #0x00000003
      MOV  R3, #0xFFFFFFFF     @Assembler will change to MVN
@ Second 64-bit number is 0x0000000500000001
      MOV  R4, #0x00000005
      MOV  R5, #0x00000001
      printStr "Inputs:"
      printReg 2
      printReg 3
      printReg 4
      printReg 5
      ADDS  R1, R3, R5 @ Lower order word
      ADC   R0, R2, R4 @ Higher order word
      printStr "Outputs:"
      printReg 1
      printReg 0
      mov  r0, #0           @ return code
      @ restore registers and return by popping to PC
      pop  {R4-R12, PC}
Listing 9-2

Updated addexamp2.s to print out the inputs and outputs

The makefile, in Listing 9-3, for this is quite simple.
addexamp2: addexamp2.s debug.s
      gcc -o addexamp2 addexamp2.s
Listing 9-3

Makefile for updated addexamp2.s

If we compile and run the program, we will see:
pi@raspberrypi:~/asm/Chapter 9 $ make
gcc -o addexamp2 addexamp2.s
pi@raspberrypi:~/asm/Chapter 9 $ ./addexamp2
Inputs:
R2 =                3, 0x00000003
R3 =               -1, 0xffffffff
R4 =                5, 0x00000005
R5 =                1, 0x00000001
Outputs:
R1 =                0, 0x00000000
R0 =                9, 0x00000009
pi@raspberrypi:~/asm/Chapter 9 $
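The register values printed here can be checked by modeling the two instructions in a few lines of Python. This is a minimal sketch, assuming 32-bit registers; the helper names adds and adc are hypothetical stand-ins for the ARM instructions and only model the carry flag, not the other CPSR flags.

```python
MASK32 = 0xFFFFFFFF  # registers hold 32 bits

def adds(a, b):
    """Model ADDS: return the 32-bit sum and the carry out of bit 31."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total >> 32

def adc(a, b, carry):
    """Model ADC: 32-bit sum that also adds the incoming carry."""
    return ((a & MASK32) + (b & MASK32) + carry) & MASK32

# Same inputs as the listing: 0x00000003FFFFFFFF + 0x0000000500000001
r2, r3 = 0x00000003, 0xFFFFFFFF
r4, r5 = 0x00000005, 0x00000001

r1, carry = adds(r3, r5)   # lower-order word
r0 = adc(r2, r4, carry)    # higher-order word plus carry
print("R1 = 0x%08x, R0 = 0x%08x" % (r1, r0))
```

The lower words overflow to zero and set the carry, which is why the higher word comes out as 9 rather than 8.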

Besides adding the debug statements, notice how the program is restructured as a function. The entry point is main, and it follows the function calling protocol by saving registers. Since this is the main routine and is only called once, we save all the registers rather than try to track the ones we are really using. This is the safest approach, since we then don’t have to worry about it as we work on our program.

By just adding the C runtime, we bring a powerful tool chest to save us time as we develop our full Assembly application. On the downside, notice our executable has grown to over 8KB.

Calling Assembly Routines from C

A typical scenario is to write most of our application in C, then call Assembly language routines in specific use cases. If we follow the function calling protocol from Chapter 6, “Functions and the Stack,” C won’t be able to tell the difference between our functions and any other functions written in C.

As an example, let’s take our toupper function from Chapter 6, “Functions and the Stack,” and call it from C. Listing 9-4 contains the C code for uppertst.c to call our Assembly function.
//
// C program to call our Assembly
// toupper routine.
//
#include <stdio.h>
extern int mytoupper( char *, char * );
#define MAX_BUFFSIZE 255
int main()
{
      char *str = "This is a test.";
      char outBuf[MAX_BUFFSIZE];
      int len;
      len = mytoupper( str, outBuf );
      printf("Before str: %s\n", str);
      printf("After str: %s\n", outBuf);
      printf("Str len = %d\n", len);
      return(0);
}
Listing 9-4

Main program to show calling our toupper function from C

The makefile is in Listing 9-5.
uppertst: uppertst.c upper.s
      gcc -o uppertst uppertst.c upper.s
Listing 9-5

Makefile for C and our toupper function

We had to change the name of our toupper function to mytoupper, since there is already a toupper function in the C runtime, and this led to a multiple definition error. This had to be done in both the C and the Assembly code. Otherwise, the function is the same as in Chapter 6, “Functions and the Stack.”

We must declare the parameters and return type of our function to the C compiler. We do this with
extern int mytoupper( char *, char * );

This should be familiar to all C programmers, as you must do this for C functions as well. Usually, you would gather up all these declarations and put them in a header (.h) file.

As far as the C code is concerned, using this Assembly function is no different from using a function written in C. When we compile and run the program, we get
pi@raspberrypi:~/asm/Chapter 9 $ make
gcc -o uppertst uppertst.c upper.s
pi@raspberrypi:~/asm/Chapter 9 $ ./uppertst
Before str: This is a test.
After str: THIS IS A TEST.
Str len = 16
pi@raspberrypi:~/asm/Chapter 9 $

The string is in uppercase as we would expect, but the string length appears one greater than we might expect. That is because our length includes the terminating NULL character, which does not follow the C convention; strlen, for example, does not count it. If we really wanted to use this routine a lot with C, we should subtract 1, so that our length is consistent with other C runtime routines.
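To see where the 16 comes from, here is a small Python reference model of the routine's behavior as described in this section. It is a sketch under stated assumptions: the name mytoupper_model is hypothetical, and the model only mirrors the visible behavior (copy every byte including the terminator, convert a through z, return the number of bytes written), not the Assembly implementation itself.

```python
# Hypothetical reference model of the routine's observable behavior.
def mytoupper_model(src):
    """Copy src plus a NUL terminator, converting a-z to uppercase."""
    out = bytearray()
    for ch in src + b"\x00":              # the terminator is copied too
        if ord("a") <= ch <= ord("z"):
            ch -= ord("a") - ord("A")     # same adjustment as the SUB
        out.append(ch)
    return bytes(out), len(out)           # count includes the terminator

converted, length = mytoupper_model(b"This is a test.")
print(converted[:-1].decode("ascii"))
print(length, length - 1)
```

Subtracting 1 from the returned length gives the value a C programmer would expect from strlen on the same string.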

Packaging Our Code

We could leave our Assembly code in individual object (.o) files, but it is more convenient for programmers using our library to package them together in a library. This way, the user of our Assembly routines just needs to add one library to get all of our code, rather than possibly dozens of .o files. In Linux there are two ways to do this; the first way is to package our code together into a static library that is linked into the program. The second method is to package our code as a shared library that lives outside the calling program and can be shared by several applications.

Static Library

To package our code as a static library, we use the Linux ar command. This command takes a number of .o files and combines them into a single file, by convention named lib<ourname>.a, which can then be included in a gcc or ld command. To do this, we modify our makefile as demonstrated in Listing 9-6.
LIBOBJS = upper.o
all: uppertst2
%.o : %.s
      as $(DEBUGFLGS) $(LSTFLGS) $< -o $@
libupper.a: $(LIBOBJS)
      ar -cvq libupper.a upper.o
uppertst2: uppertst.c libupper.a
      gcc -o uppertst2 uppertst.c libupper.a
Listing 9-6

Makefile to build upper.s into a statically linked library

If we build and run this program, we get
pi@raspberrypi:~/asm/Chapter 9 $ make
as   upper.s -o upper.o
ar -cvq libupper.a upper.o
a - upper.o
gcc -o uppertst2 uppertst.c libupper.a
pi@raspberrypi:~/asm/Chapter 9 $ ./uppertst2
Before str: This is a test.
After str: THIS IS A TEST.
Str len = 16
pi@raspberrypi:~/asm/Chapter 9 $

The only difference from the last example is that we first use as to assemble upper.s into upper.o and then use ar to build a library containing our routine. If we want to distribute our library, we include libupper.a, a header file with the C function declarations, and some documentation. Even if you aren’t selling or otherwise distributing your code, building libraries internally can help you share code among programmers and reduce duplicated work.

Shared Library

Shared libraries are much more technical than statically linked libraries. They place the code in a separate file from the executable and are dynamically loaded by the system as needed. There are a number of issues, such as versioning and library placement in the filesystem, that we are only going to touch on. If you decide to package your code as a shared library, this section provides a starting point and demonstrates that the technique applies to Assembly code as much as to C code.

The shared library is created with the gcc command, giving it the -shared command-line parameter to indicate we want to create a shared library and then the -soname parameter to name it.

To use a shared library, it must be in a specific place in the filesystem. We can add new places, but we are going to use a standard location, namely, /usr/local/lib. After we build our library, we move it there and create a couple of links to it. These steps are all required as part of the shared library versioning scheme.

Then to use our shared library libup.so.1, we include -lup on the gcc command to compile uppertst3. The makefile is presented in Listing 9-7.
LIBOBJS = upper.o
all: uppertst3
%.o : %.s
      as $(DEBUGFLGS) $(LSTFLGS) $< -o $@
libup.so.1.0: $(LIBOBJS)
      gcc -shared -Wl,-soname,libup.so.1 -o libup.so.1.0 $(LIBOBJS)
      mv libup.so.1.0 /usr/local/lib
      ln -sf /usr/local/lib/libup.so.1.0 /usr/local/lib/libup.so.1
      ln -sf /usr/local/lib/libup.so.1.0 /usr/local/lib/libup.so
uppertst3: libup.so.1.0
      gcc -o uppertst3 -lup uppertst.c
Listing 9-7

Makefile for building and using a shared library

If we run this as a regular user, several commands will fail, because writing to /usr/local/lib requires root access; run make with the sudo command instead. The following sequence of commands builds and runs the program:
pi@raspberrypi:~/asm/Chapter 9 $ sudo make -B
as   upper.s -o upper.o
gcc -shared -Wl,-soname,libup.so.1 -o libup.so.1.0 upper.o
mv libup.so.1.0 /usr/local/lib
ln -sf /usr/local/lib/libup.so.1.0 /usr/local/lib/libup.so.1
ln -sf /usr/local/lib/libup.so.1.0 /usr/local/lib/libup.so
gcc -o uppertst3 -lup uppertst.c
pi@raspberrypi:~/asm/Chapter 9 $ sudo ldconfig
pi@raspberrypi:~/asm/Chapter 9 $ ./uppertst3
Before str: This is a test.
After str: THIS IS A TEST.
Str len = 16
pi@raspberrypi:~/asm/Chapter 9 $
Notice there is a call to the following command:
sudo ldconfig

before we run the program. This causes Linux to search all the folders that hold shared libraries and update its master list. We have to run this once after we successfully compile our library, or Linux won’t know it exists.

If you use objdump to look inside uppertst3, you won’t find the code for the mytoupper routine; instead, in our main code, you will find
 104c0:    ebffffb4 bl     10398 <mytoupper@plt>
which calls
00010398 <mytoupper@plt>:
   10398:  e28fc600 add    ip, pc, #0, 12
   1039c:  e28cca10 add    ip, ip, #16, 20  ; 0x10000
   103a0:  e5bcfc78 ldr    pc, [ip, #3192]! ; 0xc78

GCC inserted this indirection, a procedure linkage table (PLT) entry, into our code, so the loader can fix up the address when it dynamically loads the shared library.

Embedding Assembly Code Inside C Code

The GNU C compiler allows Assembly code to be embedded right in the middle of C code. It contains features to interact with C variables and labels and cooperate with the C compiler and optimizer for register usage.

Listing 9-8 is a simple example, where we embed the core algorithm for the toupper function inside the C main program.
//
// C program to embed our Assembly
// toupper routine inline.
//
#include <stdio.h>
extern int mytoupper( char *, char * );
#define MAX_BUFFSIZE 255
int main()
{
      char *str = "This is a test.";
      char outBuf[MAX_BUFFSIZE];
      int len;
      asm
      (
            "MOV R4, %2\n"
            "loop:    LDRB   R5, [%1], #1\n"
            "CMP  R5, #'z'\n"
            "BGT  cont\n"
            "CMP  R5, #'a'\n"
            "BLT  cont\n"
            "SUB  R5, #('a'-'A')\n"
            "cont:      STRB R5, [%2], #1\n"
            "CMP  R5, #0\n"
            "BNE  loop\n"
            "SUB  %0, %2, R4\n"
            : "=r" (len)
            : "r" (str), "r" (outBuf)
            : "r4", "r5"
      );
      printf("Before str: %s\n", str);
      printf("After str: %s\n", outBuf);
      printf("Str len = %d\n", len);
      return(0);
}
Listing 9-8

Embedding our Assembly routine directly in C code

The asm statement lets us embed Assembly code directly into our C code. Doing this, we could write an arbitrary mixture of C and Assembly. I stripped out the comments from the Assembly code, so the structure of the C and Assembly is a bit easier to read. The general form of the asm statement is
asm asm-qualifiers ( AssemblerTemplate
                : OutputOperands
                [ : InputOperands
                [ : Clobbers
                [ : GotoLabels ] ] ])
The parameters are
  • AssemblerTemplate: A C string containing the Assembly code. There are macro substitutions that start with % to let the C compiler insert the inputs and outputs.

  • OutputOperands: A list of variables or registers returned from the code. This is required, since it is expected that the routine does something. In our case this is “=r” (len) where the =r means an output register and that we want it to go into the C variable len.

  • InputOperands: A list of input variables or registers used by our routine, in this case “r” (str), “r” (outBuf) meaning we want two registers, one holding str and one holding outBuf. It is fortunate that C string variables hold the address of the string, which is what we want in the register.

  • Clobbers: A list of registers that we use and will be clobbered when our code runs, in this case “r4” and “r5”.

  • GotoLabels: A list of C program labels that our code might want to jump to. Usually, this is an error exit. If you do jump to a C label, you have to warn the compiler with a goto asm-qualifier.

You can give the input and output operands symbolic names. We didn’t, so the compiler refers to them by position as %0, %1, …, as you can see in the Assembly code.

Since this is a single C file, it is easy to compile with
gcc -o uppertst4 uppertst4.c

Running the program produces the same output as the last section.

If you disassemble the program, you will find that the C compiler avoids using registers R4 and R5 entirely, leaving them to us. You will see it load our input registers from the variables on the stack before our code executes and then copy our return value from the assigned register to the variable len on the stack. It doesn’t assign the same registers we originally used, but that isn’t a problem.

This routine is straightforward and doesn’t have any side effects. If your Assembly code is modifying things behind the scenes, you need to add the volatile keyword to the asm statement to make the C compiler more conservative about any assumptions it makes about your code.

Calling Assembly from Python

If we write our functions following the Raspbian function calling protocol from Chapter 6, “Functions and the Stack,” we can follow the documentation on how to call C functions for any given programming language. Python has a good capability to call C functions in its ctypes module. This module requires we package our routines into a shared library. Since Python is an interpreted language, we can’t link static libraries to it, but we can dynamically load and call shared libraries. The techniques we go through here for Python have matching components in many other interpreted languages.

The hard part is already done: we’ve built the shared library version of our uppercase function, so all we must do is call it from Python. Listing 9-9 is the Python code for uppertst5.py.
from ctypes import *
libupper = CDLL("libup.so")
libupper.mytoupper.argtypes = [c_char_p, c_char_p]
libupper.mytoupper.restype = c_int
inStr = create_string_buffer(b"This is a test!")
outStr = create_string_buffer(250)
length = libupper.mytoupper(inStr, outStr)
print(inStr.value)
print(outStr.value)
print(length)
Listing 9-9

Python code to call mytoupper

The code is fairly simple; we first import the ctypes module so we can use it. We then load our shared library with the CDLL function. This is an unfortunate name since it refers to Windows DLLs rather than something more operating system neutral. Since we installed our shared library in /usr/local/lib and added it to the Linux shared library cache, Python has no trouble finding and loading it.
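If loading fails, one way to check whether the system can resolve a library name is the find_library function in ctypes.util. The following is a minimal sketch: the helper name describe is hypothetical, and the result depends on the machine, since on Linux find_library relies on external tools such as ldconfig being available.

```python
from ctypes.util import find_library

# Hypothetical helper for illustration only.
def describe(short_name):
    """Report whether a library can be resolved from its -l style name."""
    resolved = find_library(short_name)   # a file name, or None if not found
    if resolved is None:
        return short_name + ": not found"
    return short_name + ": " + resolved

print(describe("c"))    # the C runtime library
print(describe("up"))   # our library; resolvable only once it is installed
```

A "not found" result for up suggests the library was not installed or that ldconfig has not been run since it was copied into place.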

The next two lines are optional, but good practice. They define the function parameters and return type to Python, so it can do extra error checking.

In Python, strings are immutable, meaning you can’t change them, and they are Unicode, meaning a character can take up more than 1 byte. We need to provide the strings in regular buffers that we can change, and we need the strings in ASCII rather than Unicode. Putting a “b” in front of a string literal makes it a bytes object made up of ASCII characters. The create_string_buffer function in the ctypes module creates a mutable character buffer that is compatible with C (and hence Assembly) for us to use.
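The buffer behavior can be explored on its own, without loading our shared library. This short sketch uses only the ctypes module; the variable names are illustrative and not taken from uppertst5.py.

```python
from ctypes import create_string_buffer, sizeof

text = "This is a test!"
raw = text.encode("ascii")            # same bytes as b"This is a test!"

in_buf = create_string_buffer(raw)    # initialized copy plus a NUL terminator
out_buf = create_string_buffer(250)   # 250 zero bytes, ready to be filled

in_buf[0] = b"t"                      # unlike str or bytes, buffers are mutable

print(sizeof(in_buf))                 # one more than len(raw), for the NUL
print(in_buf.value)                   # contents up to the first NUL
print(len(out_buf.raw))               # full size of the output buffer
```

The value attribute stops at the first NUL byte, while raw exposes the entire buffer, which is useful when checking what a C or Assembly routine actually wrote.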

We then call our function and print the inputs and outputs. Raspbian comes with the Thonny Python IDE preinstalled as shown in Figure 9-1, so we can use that to test the program.
Figure 9-1

Our Python program running in the Thonny IDE

Summary

In this chapter, we looked at calling C functions from our Assembly code. We made use of the standard C runtime to develop some debug helper functions to make developing our Assembly code a little easier. We then did the reverse and called our Assembly uppercase function from a C main program.

We learned how to package our code as both static and shared libraries for consumption by other programmers, and how to embed Assembly code directly inside C code. We looked at how to call our uppercase function from Python, which is typical of high-level languages with the ability to call shared libraries.

In the next chapter, Chapter 10, “Multiply, Divide, and Accumulate,” we will return to mathematics. We will cover multiplication, division, and multiply with accumulate.
