© Stephen Smith 2020
S. Smith, Programming with 64-Bit ARM Assembly Language, https://doi.org/10.1007/978-1-4842-5881-1_7

7. Linux Operating System Services

Stephen Smith, Gibsons, BC, Canada

In Chapter 1, “Getting Started,” we needed the ability to exit our program and to display a string. We used Linux to do this, invoking operating system services directly. In all high-level programming languages, there is a runtime library that includes wrappers for calling the operating system. This makes it appear that these services are part of the high-level language. In this chapter, we’ll look at what these runtime libraries do under the covers to call Linux and what services are available to us.

We will review the syntax for calling the operating system and the error codes returned to us. We’ll get some help from the GNU C compiler, utilizing some C header files to get the definitions we need for the Linux service call numbers, rather than using magic numbers like 64 and 93.

So Many Services

Linux is a powerful, full-featured operating system with over 25 years of development. Linux powers devices from watches all the way up to super-computers. One of the keys to this success is the richness and power of all the services that it offers.

There are slightly over 400 Linux service calls; covering all of these is beyond the scope of this book, and more the topic for a book on Linux System Programming. In this section, we cover the mechanisms and conventions for calling these services and some examples, so you know how to go from the Linux documentation to writing code quickly. Fortunately, the Linux documentation for all these services is quite good. It is oriented entirely to C programmers, so anyone else using it must know enough C to convert the meaning to what is appropriate for the language they are using.

Calling Convention

We’ve used two system calls: one to write ASCII data to the console and the second to exit our program. The calling convention for system calls is different from that for functions. It uses a software interrupt to switch context from our user-level program to the context of the Linux kernel.

The calling convention is

  1. X0–X7: Input parameters, up to eight parameters for the system call.
  2. X8: The Linux system call number.
  3. Call software interrupt 0 with “SVC 0”.
  4. X0: The return code from the call.

The software interrupt is a clever way for us to call routines in the Linux kernel without knowing where they are stored in memory. It also provides a mechanism to run at a higher security level while the call executes. Linux will check if you have the correct access rights to perform the requested operation and give back an error code like EACCES (13) if you are denied.

Although it doesn’t follow the function calling convention from Chapter 6, “Functions and the Stack,” the Linux system call mechanism will preserve all registers not used as parameters or the return code. When system calls require a large block of parameters, they tend to take a pointer to a block of memory as one parameter, which then holds all the data they need. Hence, most system calls don’t use that many parameters.

Now we need to know where to get those magic Linux system call numbers, so we can call all those useful services.

Linux System Call Numbers

We know 93 is the Linux system call number for exit and 64 is the number for write to a file. These seem rather cryptic. Where do we look these up? Can’t we use something symbolic in our programs rather than these magic numbers? The Linux system call numbers are defined in the C include file:
/usr/include/asm-generic/unistd.h
In this file, there are define statements such as the following:
#define __NR_write 64

This defines the symbol __NR_write to represent the magic number 64 for the write Linux system call.

Next, we need a similar method for the service return codes, so we know what went wrong if they fail.

Return Codes

The return code for these functions is usually zero or a positive number for success and a negative number for failure. The negative number is the negative of the error codes from the C include file:
/usr/include/errno.h
This file includes several other files; the main ones that contain most of the actual error codes are
/usr/include/asm-generic/errno.h
/usr/include/asm-generic/errno-base.h

We’ll see how to use the constants from these files in our code when we get to a sample program.

For example, the open call returns a file descriptor if it succeeds; a file descriptor is a small non-negative number. If the call fails, it returns a negative number, which is the negative of one of the constants in errno.h.

If you’ve programmed in C, you know many of the C runtime functions take structures as parameters. The Linux service calls are the same and we’ll look at dealing with these next.

Structures

Many Linux services take pointers to blocks of memory as their parameters. The contents of these blocks of memory are documented with C structures, so as Assembly programmers, we must reverse engineer the C and duplicate the memory structure. For instance, the nanosleep service lets the program sleep for a time given in seconds and nanoseconds; it is defined as
int nanosleep(const struct timespec *req, struct timespec *rem);
and then the struct timespec is defined as
   struct timespec {
               time_t tv_sec;      /* seconds */
               long   tv_nsec;     /* nanoseconds */
           };
We then must figure out that these are two 64-bit integers and define them in Assembly:
timespecsec:   .dword   0
timespecnano:  .dword   100000000
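To use them, we load the address of the block into the registers for the first two parameters. Both point at the same structure, so if the sleep is interrupted, the remaining time overwrites the request:
        ldr         X0, =timespecsec   // req: time to sleep
        ldr         X1, =timespecsec   // rem: receives any remaining time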

We’ll be using the nanosleep function in Chapter 8, “Programming GPIO Pins,” but this is typical of what it takes to directly call some Linux services.

Next, we need to decide how to make these calls easier to use. Do we wrap them in Assembly functions or use another method?

Wrappers

Rather than figure out all the registers each time we want to call a Linux service, we will develop a library of routines or macros to make our job easier. The C programming language includes function call wrappers for all the Linux services; we will see how to use these in Chapter 9, “Interacting with C and Python.”

Rather than duplicate the work of the C runtime library by developing wrapper functions, we’ll develop a library of Linux system calls using the GNU Assembler’s macro functionality. We won’t develop this for all the functions, just the functions we need. Most programmers do this; then over time their libraries become quite extensive.

A problem with macros is that you often need several variants with different parameter types. For instance, sometimes you might like to call the macro with a register as a parameter and other times with an immediate value.

Now that we understand the theory of using Linux services, let’s look at a complete program that uses a collection of these.

Converting a File to Upper-Case

In this chapter, we present a complete program to convert the contents of a text file to all upper-case. We will use our toupper function from Chapter 6, “Functions and the Stack,” and get practice coding loops and if statements.

To start with, we need a library of file I/O routines to read from our input file, then write the upper-case version to another file. If you’ve done any C programming, these should look familiar, since the C runtime provides a thin layer over these services. We create a file fileio.S containing Listing 7-1. Note the file extension is a capital S; this is important as this allows us to use C include files as we’ll discuss shortly.
// Various macros to perform file I/O
//
// The fd parameter needs to be a register.
// Uses X0-X3, X8.
// Return code is in X0.
#include <asm/unistd.h>
.equ  O_RDONLY, 0
.equ  O_WRONLY, 1
.equ  O_CREAT,  0100
.equ  O_EXCL,   0200
.equ  S_RDWR,   0666
.equ  AT_FDCWD, -100
.macro  openFile    fileName, flags
      mov         X0, #AT_FDCWD
      ldr         X1, =\fileName
      mov         X2, #\flags
      mov         X3, #S_RDWR          // RW access rights
      mov         X8, #__NR_openat
      svc         0
.endm
.macro  readFile   fd, buffer, length
      mov         X0, \fd      // file descriptor
      ldr         X1, =\buffer
      mov         X2, #\length
      mov         X8, #__NR_read
      svc         0
.endm
.macro  writeFile   fd, buffer, length
      mov         X0, \fd      // file descriptor
      ldr         X1, =\buffer
      mov         X2, \length
      mov         X8, #__NR_write
      svc         0
.endm
.macro  flushClose  fd
//fsync syscall
      mov         X0, \fd
      mov         X8, #__NR_fsync
      svc         0
//close syscall
      mov         X0, \fd
      mov         X8, #__NR_close
      svc         0
.endm
Listing 7-1. Macros to help us read and write files

Now we need a main program to orchestrate the process. We’ll call this main.S, again with the capital S file extension, containing the contents of Listing 7-2.
//
// Assembler program to convert a string to
// all upper case by calling a function.
//
// X0-X3, X8 - used by macros to call Linux
// X11 - input file descriptor
// X9 - output file descriptor
// X10 - number of characters read
//
#include <asm/unistd.h>
#include "fileio.S"
.equ  BUFFERLEN, 250
.global _start                    // Provide program starting address to linker
_start: openFile       inFile, O_RDONLY
       ADDS            X11, XZR, X0 // save file descriptor
       B.PL            nxtfil  // pos number file opened ok
       MOV             X1, #1  // stdout
       LDR             X2, =inpErrsz // Error msg
       LDR             W2, [X2]
       writeFile       X1, inpErr, X2 // print the error
       B               exit
nxtfil: openFile       outFile, O_CREAT+O_WRONLY
       ADDS            X9, XZR, X0   // save file descriptor
       B.PL            loop    // pos number file opened ok
       MOV             X1, #1
       LDR             X2, =outErrsz
       LDR             W2, [X2]
       writeFile       X1, outErr, X2
       B               exit
// loop through file until done.
loop:  readFile        X11, buffer, BUFFERLEN
       MOV             X10, X0       // Keep the length read
       MOV             X1, #0        // Null terminator for string
       // setup call to toupper and call function
       LDR             X0, =buffer   // first param for toupper
       STRB            W1, [X0, X10] // put null at end of string.
       LDR             X1, =outBuf
       BL              toupper
       writeFile       X9, outBuf, X10
       CMP             X10, #BUFFERLEN
       B.EQ            loop
       flushClose      X11
       flushClose      X9
// Setup the parameters to exit the program
// and then call Linux to do it.
exit:  MOV     X0, #0      // Use 0 return code
       MOV     X8, #__NR_exit
       SVC     0           // Call Linux to terminate
.data
inFile:  .asciz  "main.S"
outFile: .asciz      "upper.txt"
buffer:      .fill  BUFFERLEN + 1, 1, 0
outBuf:      .fill  BUFFERLEN + 1, 1, 0
inpErr: .asciz      "Failed to open input file. "
inpErrsz: .word  .-inpErr
outErr: .asciz      "Failed to open output file. "
outErrsz: .word     .-outErr
Listing 7-2. Main program for case conversion program

To build these source files, we add a new rule to our makefile, to build .S files with gcc rather than as, as shown in the next section.

Building .S Files

The makefile is contained in Listing 7-3.
UPPEROBJS = main.o upper.o
ifdef DEBUG
DEBUGFLGS = -g
else
DEBUGFLGS =
endif
all: upper
%.o : %.S
      gcc $(DEBUGFLGS) -c $< -o $@
%.o : %.s
      as $(DEBUGFLGS) $< -o $@
upper: $(UPPEROBJS)
      ld -o upper $(UPPEROBJS)
Listing 7-3. Makefile for our file conversion program

This program uses the upper.s file from Chapter 6, “Functions and the Stack,” that contains the function version of our upper-case logic.

We added a rule to compile our two .S files with gcc rather than as. Most people think of gcc as the GNU C compiler, but it actually stands for the GNU Compiler Collection and is capable of compiling several other languages in addition to C including Assembly Language. The clever trick that gcc supports when we do this is the ability to add C preprocessor commands to our Assembly code.

When we compile a .S (the capital is important) file with gcc, it will process all C #include and #define directives before processing the Assembly instructions and directives. This means we can include standard C include files for their symbols, as long as the files don’t contain any C code or conditionally exclude the C code when processed by the GNU Assembler.

The Linux kernel consists of both C and Assembly Language code. The kernel developers don’t want to define the constants used by both code bases in two places and risk errors from differences. Thus, all the Assembly Language code in the Linux kernel is in .S files and uses various C include files, including unistd.h.

Using this technique, our Linux function numbers are no longer magic numbers and will be correct and readable.

When we process a .s (lower-case) file with gcc, it assumes we want pure Assembly code and won’t run things through the C preprocessor first.

If you build this program, notice that it is only 3KB in size. This is one of the appeals of pure Assembly Language programming. There is nothing extra added to the program—we control every byte—no mysterious libraries or runtimes added.

Next, let’s look at the details of opening a file.

Opening a File

The Linux openat service is typical of a Linux system service. It takes four parameters:

  1. Directory File Descriptor: File descriptor of the folder that the filename is opened relative to. If this is the magic number AT_FDCWD, then the file is opened relative to the current folder.
  2. Filename: The file to open as a NULL-terminated string.
  3. Flags: To specify whether we’re opening it for reading or writing or whether to create the file. We included some .EQU directives with the values we need (using the same names as in the C runtime).
  4. Mode: The access mode for the file when we create the file. We included a define for the value we use; written in octal, these values are the same as the parameters to the chmod Linux command.

The return code is either a file descriptor or an error code. Like many Linux services, the call fits this in a single return code by making errors negative and successful results positive.

The C runtime has both open and openat routines; the open routine calls the openat Linux service with AT_FDCWD for the first parameter as we use here.

Error Checking

Books tend not to promote good programming practices for error checking. The sample programs are kept as small as possible, so the main ideas being explained aren’t lost in a sea of details. This is the first program where we test any return codes, partly because we first had to develop enough code to be able to do it and partly because error checking code tends not to reveal any new concepts.

File open calls are prone to failing. The file might not exist, perhaps because we are in the wrong folder, or we may not have sufficient access rights to the file. Ideally, you should check the return code of every system call or function you call, but practically speaking, programmers are lazy and tend to check only those that are likely to fail. In this program, we check the two file open calls. Checking every return code would make the code listings too long to include in this book, so don’t take this code as an example; do the error checking in your real code.

First of all, we have to copy the file descriptor to a register that won’t be overwritten, so we move it to X11. We do this with an ADDS instruction, so the condition flags will be set. It would be nice if there was a MOVS alias for ADDS, but since there isn’t, we add X0 to the zero register XZR and put the result in X11, and the condition flags are set accordingly.
ADDS   X11, XZR, X0 // save file descriptor
This means we can test if it’s positive, and if so, go on to the next bit of code:
B.PL   nxtfil  // pos number file opened ok
If the branch isn’t taken, then openFile returned a negative number. Here we use our writeFile routine to write an error message to stdout, then branch to the end of the program to exit.
MOV          X1, #1           // stdout
LDR          X2, =inpErrsz    // Error msg
LDR          W2, [X2]
writeFile    X1, inpErr, X2   // print the error
B            exit
In our .data section, we defined the error messages as follows:
inpErr: .asciz    "Failed to open input file. "
inpErrsz: .word  .-inpErr

We’ve seen .asciz before. For writeFile, we need the length of the string to write to the console. In Chapter 1, “Getting Started,” we counted the characters in our string and put the hard-coded number in our code. We could do that here too, but error messages start getting long and counting the characters seems like something the computer should do. We could write a routine like the C library’s strlen() function to calculate the length of a NULL-terminated string. Instead, we use a little GNU Assembler trickery. We add a .word directive right after the string and initialize it with “.-inpErr”. The “.” is a special Assembler symbol that contains the current address the Assembler is on as it works. Hence, the current address right after the string minus the address of the start of the string is the length, including the NULL terminator that .asciz appends. Now people can revise the wording of the error message to their heart’s content without needing to count the characters each time.

Most applications contain an error module, so if a function fails, the error module is called. Then the error module is responsible for reporting and logging the error. This way error reporting can be made quite sophisticated without cluttering up the rest of the code with error handling code. Another problem with error handling code is that it tends to not be tested. Often bad things can happen when an error finally does happen, and problems with the previously untested code manifest.

Looping

In our loop, we

  1. Read a block of 250 characters from the input file
  2. Append a NULL terminator
  3. Call toupper
  4. Write the converted characters to the output file
  5. If we aren’t done, branch to the top of the loop

We check if we are done with
CMP      X10, #BUFFERLEN
B.EQ     loop

X10 contains the number of characters returned from the read service call. If it equals the number of characters requested, then we branch to loop. If it doesn’t equal exactly, then either we hit end of file, so the number of characters returned is less (and possibly 0), or an error occurred, in which case the number is negative. Either way, we are done and fall through to the program exit.

Summary

In this chapter, we gave an overview of how to call the various Linux system services. We covered the calling convention and how to interpret the return codes. We didn’t cover the purpose of each call and referred the reader to the Linux documentation instead.

We presented a program to read a file, convert it to upper-case, and write it out to another file. This is our first chance to put together what we learned in Chapters 1–6 to build a full application, with loops, if statements, error messages, and file I/O.

In the next chapter, we will use Linux service calls to manipulate the GPIO pins on the Raspberry Pi board.

Exercises

  1. The files this program operates on are hard coded in the .data section. Change them, play with them, generate some errors to see what happens. Single step through the program in gdb to ensure you understand how it works.
  2. Modify the program to convert the file to all lower-case.
  3. Convert fileio.S to use callable functions rather than macros. Change main.S to call these functions.