© Thomas Mailund 2021
T. MailundPointers in C Programminghttps://doi.org/10.1007/978-1-4842-6927-5_13

13. Function Pointers

Thomas Mailund1  
(1)
Aarhus N, Denmark
 

We now leave data structures for a spell to talk about something new: pointers to functions. Like pointers to data, pointers to functions give us a level of indirection, so we can hold in a variable the address of an object rather than the object itself. Further, with a pointer, the same variable can refer to different objects over time.

When you write a function, you write the return type before the function name, then the parameters after the name, and finally the body. If you declare a function pointer instead, you should put the name in parentheses and put a * before the name. Thus, this defines a function, f, from void to void and a pointer, fp, to functions of that type:
void f(void) {}
void (*fp)(void) = f;
where, as for all non-const pointers, you can change the value a function pointer points at:
void f2(void) {}
fp = f2;
For a more complex example, we can define a function, g, that returns double and takes two arguments, the first of type int and the second of type float:
double g(int x, float y) { return x + y; }
A pointer to the same type looks like this:
double (*gp)(int, float) = g;
For both fp and gp, we assigned functions, f and g, to them. If we had data pointers, we would have taken the address
fp = &f;
gp = &g;
and that is valid syntax as well. Like arrays, if you use an object as a pointer, it becomes a pointer. So, you can assign a function to a function pointer without taking its address—you automatically get its address in that case. Also, as with arrays, this doesn’t mean that the address of a pointer to a function is the same as the pointer itself. While you can assign both &f and f to fp, you cannot assign
void (*fp2)(void) = &fp;

Here, fp is a pointer to a function, and &fp is a pointer to a pointer to a function.

If you need a function pointer type alone, for example, if you are casting one function type to another, you use normal cast notation: you put the type in parentheses. For the function pointer type, you leave out the name after *, so a cast will look like
fp = ( void (*)(void) ) g;
The notation for function types can make complex expressions hard to read. Consider this function:
void (* complex(int x, int (*p)(int)) )(void)
{
  printf("%d ", p(x));
  return f;
}
It is not at all clear that this part
void (* /*...*/ )(void)
specifies the return type of a function—it returns a void to void function —but that is what it says. The name of the function is complex and
(int x, int (*p)(int))
are its parameters. The parameter
int (*p)(int)
is an int to int function pointer. With expressions such as this, typedef is your friend, and to typedef a function type, you put the type name where you would put the pointer name:
typedef void (*void_fn)(void);
typedef int  (*int_fn)(int);
void_fn simple(int x, int_fn p)
{
  printf("%d ", p(x));
  return f;
}

The example also shows you how to call the function that a pointer points at: p(x). If you have a function pointer, you use it the same way as if you had a function.

Function pointers are in a different category in the C standard, and there are slightly different rules. Most importantly, you are not guaranteed that casting a function pointer to void * and back will work. The POSIX standard does give you this guarantee, but if your setup is merely compliant to standard C, it is not guaranteed. So don’t put pointers to functions in data structures that hold void *; it isn’t safe. You are, however, ensured that you can convert between any function pointers and back safely, so you can choose any function pointer type to store pointers. If you call a function through a pointer of the wrong type, however, you get undefined behavior.

Function Pointers for High-Order Functions

It is, I think, easiest to learn how to use function pointers through a few examples, and we start with the simplest situation where function pointers are needed: parameterizing behavior in other functions. We use function arguments to parameterize the behavior of a function, but when what we must parameterize is complex, like item comparison in qsort(), we cannot easily do so through data. Rather, again like with qsort(), we can ask the caller to provide a function that handles part of the computation our function must do. The qsort() function doesn’t know how to compare elements, so the caller must provide a function that can do that. With such a function in hand, qsort() can sort arrays of any type.

Functions that take function arguments are called higher-order functions , and C’s support for such functions is rudimentary compared to many modern programming languages. Still, with function pointers, we can implement some basic high-level functions. For the following examples, I will use doubly linked lists from Chapter 11, and to make the code more readable, I will use these two macros:
#define abort_append(x,v)             
  do {                                
    if (!append((x), (v))) {          
      free_list(x); return 0;         
    }                                 
  } while(0)
#define for_each(var,x)               
  for (struct link *var = front(x);   
    var != x; var = var->next)

The first checks if an append() operation fails, and if it does, it will free the list and return NULL. For all the following functions, that is the appropriate way to handle an allocation error. The second macro just makes it easier to write a loop through a list. It is the for-loop header we use when we want to iterate through all links in a list.

Now to our high-level functions . The first function we will write is map(). It takes a list and a function as arguments, and it creates a new list constructed by evaluating the function on all elements in the input list. So with a function such as this
int add2(int x) { return 2 + x; }
that adds two to its input, calling
y = map(x, add2);

will generate a list with all the numbers from x plus 2.

The map() function looks like this:
list map(list x, int (*f)(int))
{
  list y = new_list();
  if (!y) return 0;
  for_each(link, x) {
    abort_append(y, f(link->value));
  }
  return y;
}

The code is straightforward. We use the function pointer argument f as a function to create new values f(link->value) to append to y.

For the next function, filter(), we want a function that takes a list and a predicate—a function that returns a Boolean value—and gives us a new list with all the elements that satisfied the predicate. The function is as simple to write as map() and can look like this:
list filter(list x, bool (*p)(int))
{
  list y = new_list();
  if (!y) return 0;
  for_each(link, x) {
    if (p(link->value))
      abort_append(y, link->value);
  }
  return y;
}
With a function that tells us if a number is even:
bool is_even(int i) { return i % 2 == 0; }
calling
y = filter(x, is_even);

will give us the even numbers from x.

The fold() function is only slightly more complicated. When we call fold() with a list x and function f, we want to iteratively apply f to the elements in x, with the result of one call as the first argument to the next call of f. We need a way to specify what the first argument should be the first time we call f, and we take the easy road and make it a parameter of fold(). The implementation looks like this:
int fold(list x, int (*f)(int, int), int x0)
{
  list y = new_list();
  if (!y) return 0;
  int res = x0;
  for_each(link, x) {
    res = f(res, link->value);
  }
  return res;
}

The first call to f will be with f(x0,x1) where x1 is the first value in x. The next call will be f(f(x0,x1),x2) where x2 is the second value in x, and so on.

Using fold(), we can, for example, implement functions for summing and multiplying all elements in a list. We need to give fold() a function for adding two numbers and then sum with an initial value of zero, and we need a function for multiplying two numbers and then take the product with an initial value of one.
int add(int x, int y) { return x + y; }
int sum(list x)       { return fold(x, add, 0); }
int mul(int x, int y) { return x * y; }
int prod(list x)      { return fold(x, mul, 1); }

These are simple high-order functions, and it can get more involved. However, the way that you use function pointer arguments doesn’t get more complicated than this. You take a function pointer as an argument, and you use it as you would any other function. Call it when you need the caller’s function to do what you cannot do on your own. There is little beyond that to high-order functions in C.

Callbacks

We use function pointers for much more than high-order functions. In event-driven systems, such as graphical user interfaces, they are frequently used to decouple GUI code from application logic. One design is to hook up GUI elements to so-called callbacks , functions that the GUI framework will call when certain events happen.

Imagine a, somewhat simplistic, GUI framework where we have buttons, and buttons can have different events. Without tying our code too much in with the GUI handling, we want to be informed about the events that a given button experiences. The mechanism for this will be a function pointer. Every button holds a function pointer, and they will call that function for each event.
enum button_events {
  MOUSEDOWN, MOUSEUP,
  CLICKED, DBLCLICKED,
  // more...
};
struct button;
typedef void (*button_cb)(struct button *,
                           enum button_events);
struct button {
  char *text; // what the button says
  // lots of gui stuff here
  button_cb cb_func; // <- the callback function
};
The application programmer can create a button and install a callback, and after that, the framework will handle the GUI.
struct button *but = new_button("my button");
install_button_callback(but, my_callback);
In the callback , the application programmer can check which event happened and handle it accordingly:
void my_callback(struct button *but,
                 enum button_events event)
{
  switch (event) {
    case CLICKED:
      printf("button %s was clicked ", but->text);
      break;
    default:
      // nothing
      break;
  }
}

In most frameworks, the mechanism is more involved, so events, GUI objects, and callback functions are loosly coupled, but the principle is the same. If you want to be informed about specific events, you install a callback, and that callback is called when the event occurs. Callbacks aren’t restricted to GUIs either. In network programming, you might have callbacks to notify you when a package arrives at a port, or in a complicated workflow, you might want to hook in a bit of processing at specific steps in the pipeline.1

Callbacks (and hooks) are mostly used with frameworks, and not with simple programs of the length I can include in this book, so I am sorry, but I cannot give you a more realistic example than the one earlier. You will have to take my word for their usefulness. But in the following examples, I will show you how we combine data and function pointers in various ways with just as interesting results.

Generic String Iterator

In Chapter 7, we wrote two iterators for finding words and integers in a text, and we used macros to separate the generic code from the iterator-specific code. We needed the macros because we had different functions for classifying individual characters as numbers of part of a word, but by far the most of the code was generic, and the macros helped us avoid duplicating it. In the source code, at least, it is, of course, repeated once the macros are expanded.

Now, we are going to solve the same problem, but with function pointers to handle the varying parts of the code. We built our iterators on two functions, one for finding the next character of a class and one for skipping past characters of the class. For words, for example, we had
char *find_word(char *x)
{
  while (*x && !isalpha(*x))
    x++;
  return x;
}
char *skip_word(char *x)
{
  while (*x && isalpha(*x))
    x++;
  return x;
}

With these two functions, the first instance in a string x would be at y = find_word(x); and the following instances at y = find_word(skip_word(y));. The variable part is the character class, and we used macros to substitute different functions there, otherwise generating the same code.

Now, let us use a function pointer instead. Functions such as isalpha() have type int (*)(int), that is, they are integer to integer functions . We could easily update the preceding functions to take such a function as an argument:
char *find(int (*char_class)(int), char *x)
{
  while (*x && !char_class(*x))
    x++;
  return x;
}
char *skip(int (*char_class)(int), char *x)
{
  while (*x && char_class(*x))
    x++;
  return x;
}
To get an iterator, we can collect a character class function and a character pointer to the current location in a struct. We can initialize it with a function and a string and return the first occurrence (or NULL if there isn’t one).
typedef struct {
  char *x;
  int (*char_class)(int);
} find_skip_iter;
#define NULLIFY(x) ((*x) ? (x) : 0)
char *init_iter(find_skip_iter *iter, char *x,
                int (*char_class)(int))
{
  iter->char_class = char_class;
  iter->x = find(iter->char_class, x);
  return NULLIFY(iter->x);
}
#define init_word_iter(itr, x)
        init_iter((itr),(x), isalpha)
#define init_int_iter(itr, x)
        init_iter((itr),(x), isnumber)

The macros give us an easy way to initialize an iterator over words and numbers.

Each time we go for the next occurrence , we do a skip followed by a find:
char *next(find_skip_iter *iter)
{
  iter->x = skip(iter->char_class, iter->x);
  iter->x = find(iter->char_class, iter->x);
  return NULLIFY(iter->x);
}
In action, we can use the iterators like this:
int main(void)
{
  char *x = "123 sss 321 xxx 123";
  find_skip_iter itr;
  for (char *y = init_word_iter(&itr, x);
       y; y = next(&itr)) {
    printf("%s ", y);
  }
  for (char *y = init_int_iter(&itr, x);
       y; y = next(&itr)) {
    printf("%s ", y);
  }
  return 0;
}

The control flow in this example is different from the high-order functions. We do not use a function pointer to parameterize the behavior within one function call. Instead, we parameterize the behavior of an object, the iterator, through function pointers. When we need to get the next item from the iterator, we use the saved function pointers to get there. Other than that, the use of function pointers to parameterize behavior is similar.

Function Pointers for Abstract Data Structures

Abstract data structures are data structures defined by their operations, but not their specific implementation. A stack, for example, is something you can push to and pop from, and you can implement it in various ways. When we develop algorithms, abstract data structures are a conceptual tool we use, but when we implement them, of course, we need a concrete representation. Using a concrete type, however, makes it difficult to change the choice later, if a better implementation comes along. It also makes it hard to experiment with different implementation choices because each change requires that we update all the code that uses the data structure. There are times where we wish to keep the interface to a data structure abstract, and with function pointers, we can do so.

Let us, as an example, take a stack. With some appropriate definition of the type stack, and the elements we put in it, elem, the interface to a stack could look like this:
typedef ??? stack;
typedef ??? elem;
stack new_stack  (void);
bool empty_stack (stack);
elem statck_top  (stack);
bool stack_push  (stack, elem);
elem stack_pop   (stack);
void free_stack  (stack);
An easy way to hide the underlying implementation is to make stack some opaque type, like a void *, or a pointer to a struct we don’t reveal to the user. A stack of integers, for example, could define the types as
typedef struct stack *stack;
typedef int            elem;

Then, we would be free to implement the struct stack however we want, and the user could only access it through the functions declared through the preceding prototypes. That would work flawlessly if our application only ever needed one implementation of a stack, but it would fail if we used different implementations in different parts of the program. We might have hidden the implementation details, but the stack functions are linker objects, and we can only have one function with any given name at a time.

There are different ways to resolve the problem, but since the chapter is about function pointers, we are going for a solution using pointers . We can wrap the stack operations up in a structure that holds pointers to the different operations. They will operate on an implementation-specific stack, which we might as well represent as a void pointer. Then we are going to wrap everything up in a stack struct that ties the interface and implementation together. The structure of operations, defining a concrete implementation of the stack interface, looks like this:
typedef void * impl_stack;
typedef int    elem;
typedef struct {
  impl_stack  (*new_stack)   (void);
  bool        (*empty_stack) (impl_stack);
  elem        (*top)         (impl_stack);
  bool        (*push)        (impl_stack, elem);
  elem        (*pop)         (impl_stack);
  void        (*free_stack)  (impl_stack);
} stack_type;
It is function pointers with the interface from the original prototypes, except that we use the impl_stack, a void *, as the stack type. A stack is going to be a pointer to a struct that holds a pointer to the implementation stack type and a pointer to the functions that implement the stack:
struct stack {
  impl_stack  impl_stack;
  stack_type *type;
};
typedef struct stack * stack;
Now, we can implement the original stack interface, operating on a stack type, but where each operation delegates to the function pointed to in the stack_type structure. The only difference to the preceding prototypes is that the new_stack() function takes an argument that is the stack_type:
stack new_stack(stack_type *type)
{
  void *impl_stack = type->new_stack();
  stack stack = malloc(sizeof *stack);
  if (!impl_stack || !stack) goto error;
  stack->impl_stack = impl_stack;
  stack->type = type;
  return stack;
error:
  free(stack);
  if (impl_stack)
    type->free_stack(impl_stack);
  return 0;
}

When we create a stack, we create the underlying stack representation using the function from the type, then we wrap up that implementation stack and the type in the stack object.

The remaining functions get the implementation function from the stack_type pointer and call them. It is a simple forwarding call.
bool empty_stack(stack stack)
{
  return stack->type->empty_stack(stack->impl_stack);
}
elem stack_top(stack stack)
{
  return stack->type->top(stack->impl_stack);
}
bool stack_push(stack stack, elem elem)
{
  return stack->type->push(stack->impl_stack, elem);
}
elem stack_pop(stack stack)
{
  return stack->type->pop(stack->impl_stack);
}
void free_stack(stack stack)
{
  stack->type->free_stack(stack->impl_stack);
  free(stack);
}
If you want to implement a concrete stack, you must provide each of the functions for the stack_type structure. An implementation using the linked lists from Chapter 11 could look like this:
impl_stack list_stack_new(void)
{
  return new_list();
}
bool list_stack_empty(impl_stack stack)
{
  return is_empty((list)stack);
}
elem list_stack_top(impl_stack stack)
{
  return front((list)stack)->value;
}
bool list_stack_push(impl_stack stack, elem elem)
{
  return prepend(stack, elem);
}
elem list_stack_pop(impl_stack stack)
{
  elem elem = front((list)stack)->value;
  delete_link(front((list)stack));
  return elem;
}
void list_stack_free(impl_stack stack)
{
  free_list(stack);
}
stack_type list_stack = {
  .new_stack   = list_stack_new,
  .empty_stack = list_stack_empty,
  .top         = list_stack_top,
  .push        = list_stack_push,
  .pop         = list_stack_pop,
  .free_stack  = list_stack_free
};
If you want to use the dynamic arrays from Chapter 9, you could implement a stack like this:
impl_stack da_stack_new(void)
{
  struct dynarray *da = malloc(sizeof *da);
  if (!da) return 0;
  if (!da_init(da, 1, 0)) {
    free(da);
    return 0;
  }
  return da;
}
bool da_stack_empty(impl_stack stack)
{
  return ((struct dynarray *)stack)->used == 0;
}
elem da_stack_top(impl_stack stack)
{
  struct dynarray *da = stack;
  assert(da->used > 0);
  return da->data[da->used - 1];
}
bool da_stack_push(impl_stack stack, elem elem)
{
  return da_append(stack, elem);
}
elem da_stack_pop(impl_stack stack)
{
  struct dynarray *da = stack;
  assert(da->used > 0);
  return da->data[--(da->used)];
}
void da_stack_free(impl_stack stack)
{
  da_dealloc(stack);
  free(stack);
}
stack_type da_stack = {
  .new_stack   = da_stack_new,
  .empty_stack = da_stack_empty,
  .top         = da_stack_top,
  .push        = da_stack_push,
  .pop         = da_stack_pop,
  .free_stack  = da_stack_free
};
You can use both types of stacks, plus any other implementation you might write, in the same code. Whenever you have an implementation of all the operations, you can collect them in an instance of stack_type and create a new stack from them. After that, any operation you apply to the stack will call the correct function for the concrete implementation. After the stack creation, there isn’t any difference in how you use different concrete stacks:
int main(void)
{
  // Try with list stack
  stack stack = new_stack(&list_stack);
  for (int i = 0; i < 5; i++) {
    stack_push(stack, i);
  }
  while (!empty_stack(stack)) {
    int x = stack_pop(stack);
    printf("%d ", x);
  }
  free_stack(stack);
  // Try with dynamic array
  stack = new_stack(&da_stack);
  for (int i = 0; i < 5; i++) {
    stack_push(stack, i);
  }
  while (!empty_stack(stack)) {
    int x = stack_pop(stack);
    printf("%d ", x);
  }
  free_stack(stack);
  return 0;
}

Function pointers used this way are excellent tools for separating an interface from an implementation. However, if you use them for writing data structures, where runtime performance is important, then you must be careful. It is slower to call a function through a pointer than to call the function directly. There is the obvious overhead of having to load the value of the pointer variable before you can call the function, but that is a small overhead. More critical is that the computer’s cache and branch prediction doesn’t function well when calling functions at addresses you first need to compute. This can slow down a function call dramatically. For functions you call often, and where performance is critical, you don’t want to call indirectly if you can avoid it.

In many cases, calling one operation on an abstract data structure is not time critical. It is important that the operation is efficient, but that relies on the code after we have dispatched the operation to the correct implementation function. The indirect call overhead is not an issue. But in a tight inner loop of an algorithm, you are likely better served with a tighter coupling between the concrete implementation of the data structure and the algorithm, and the corresponding performance gain, than with keeping the data structure abstract. The correct choice, if there is such a beast, is application dependent.

Function Pointers for Polymorphic Data Structures

We can take the idea of combining data and function pointers one step further and use it to implement rudimentary object-oriented programming with dynamic dispatch. It is the same idea as for abstract data structures, but we will need to allow for derived objects to carry more data than the types they inherit from, and we must allow derived classes to have more functions than their base classes. Meanwhile, any object of a derived class must have a form where we can cast it to a base class and use it as such.

To meet these requirements , we can exploit that the C standard guarantees that the first object you put in a struct goes at the first memory address of instances of that struct. If you have a pointer to an instance of the struct, then you can safely cast it to a pointer to the first element. This means that if you nest structs, you can cast your way into the nesting. For example, with
struct A {
    int a;
};
struct B {
  struct A a;
  int b;
};
struct C {
  struct B b;
  int c;
};
you can access members of a struct C as if they were members of the nested struct B or the nested struct A in the struct B.
struct C *x = /* some allocation */;
assert(((B*)x)->b == x->b.b);
assert(((A*)x)->a == x->b.a.a);

Anywhere you have a function that works with pointers to struct A or struct B, you can call the function with a pointer to an instance of struct C. (They have to be pointers, of course, because otherwise you copy members, and you will only copy members of the type the function expects).

The C standard promises a little more about the memory layout of structs, and you wouldn’t have to nest them here. If the structs share a prefix of members, it also works.
struct A {
  int a;
};
struct B {
  int a;
  int b;
};
struct C {
  int a;
  int b;
  int c;
};

If you want to use one struct as another, though, it is easier to nest them.

Single Inheritance Objects and Classes

We can use this to create classes and objects (or instances) in an object-oriented programming sense. It is close to how C++ was initially implemented as a preprocessor to C. Use nested structs for objects, so derived objects contain the data their base cases have. For classes, have a struct for virtual functions (or functions with dynamic dispatch, or whatever you want to call them), and use nested structs for derived classes.

Since we need to be able to both extend instances, so objects of derived classes can carry more data than the base classes, and extend classes, so derived classes can have more virtual functions than base classes, we need two parallel hierarchies of nested structs. Obviously, we cannot put both of these at the top of a struct, so the casting trick cannot work that way. Instead, we make objects and classes separate structs. Each object will need to know its class, so objects will have a pointer to their class struct. This also saves some memory, because each object doesn’t have to carry with it all the function pointers; they just need a single pointer through which they can find them. When we need to apply a polymorphic function on an object, we can get the struct of function pointers from the object and call the appropriate function there.

We can define a class pointer as void *, so it can point to any structure, and define the most basic object as something that has such a pointer. I have also defined a macro, basetype(), for the casting, just to make it explicit what we are doing. Then I have a macro, vtbl, that gets the virtual function table, cast to a class type.
typedef void * cls;
typedef struct obj { cls *cls; } obj;
#define basetype(x, base) ((base *)x)
#define vtbl(inst, base)
           basetype(((obj *)inst)->cls, base)

You can make the basetype() more type-safe by going into the nested classes rather than casting, but it puts constraints on how the structs must be nested, and if you modify the type hierarchy above a class, you would need to update the code. The cast does what it is supposed to do if you are careful with it.

To call a polymorphic function , f, defined in class A_cls, you need to look it up in an object’s vtbl as vtbl(x,A_cls)->f. You will probably want to do that by wrapping the call in a function, for example:
int f(A *x, double d) { return vtbl(x,A_cls)->f(d); }
Classes must be allocated and initialized before we can use them. There’s a function and macro for that:
void *alloc_cls(size_t cls_size)
{
  cls *cls = malloc(cls_size);
  if (!cls) abort(); // error handling
  return cls;
}
#define INIT_CLS(p, init)       
  do {                          
    if (!p) {                   
      p = alloc_cls(sizeof *p);
      init(p);                  
    }                           
} while(0)

The INIT_CLS() gets a pointer to the class, which I expect is a global variable, initially NULL. If the class hasn’t been initialized yet, we allocate it and use the init function provided to initialize it.

For objects, we can use
void *alloc_obj(size_t obj_size, cls cls)
{
  obj *obj = malloc(obj_size);
  if (!obj) abort(); // error handling
  obj->cls = cls;
  return obj;
}
#define NEW_OBJ(p, cls) alloc_obj(sizeof *p, cls)
The NEW_OBJ() macro creates an object and sets its class. There is not an initialization function here because I expect that initializers will need arguments, so we cannot handle that generically. The same might be true for classes, but if that day arises, we can deal with it then.
void print_expr(EXP e)  { vtbl(e, base_expr_cls)->print(e); }
double eval_expr(EXP e) { return vtbl(e, base_expr_cls)->eval(e); }

A Hierarchy of Expression Classes

For a concrete example, we can have generic arithmetic expressions. We can define their main interface as having a print() and an eval() function.
// Generic expression type
typedef struct base_expr *EXP;
// Base class, class definition
typedef struct base_expr_cls {
  void    (*print)(EXP);
  double  (*eval) (EXP);
} base_expr_cls;
The functions are generic, so the implementation dispatches to the table in the class:
void   print(EXP e) { vtbl(e, base_expr_cls)->print(e); }
double eval (EXP e) { return vtbl(e, base_expr_cls)->eval(e); }
When we initialize the class, we don’t put any methods in there. They are abstract.
void init_base_expr_cls(base_expr_cls *cls)
{
  cls->print = 0; // abstract method
  cls->eval  = 0; // abstract method
}
There is nothing in instances of the base class (except the nested obj needed for the class pointer).
// Base class, object definition
typedef struct base_expr { obj obj; } base_expr;
// Base class, methods (init does nothing)
void init_base_expr(base_expr *inst) {}
A concrete expression type is one that merely holds a value. It can look like this:
// Value expressions
typedef struct value_expr_cls {
  base_expr_cls base_expr_cls;
} value_expr_cls;
typedef struct value_expr {
  base_expr base_expr;
  double value;
} value_expr;

The class struct has the base class struct as its first (and only) member, and the object struct has the base expression object struct as its first member and the value the class should hold.

This is not an abstract class, but one we can instantiate, so we need a place to put the class struct, and we define a pointer for it:
// Concrete class, so must have a struct
value_expr_cls *VALUE_EXPR_CLS = 0; // must be initialised

We will initialize it later.

The class should define the print() and eval() functions, so we write functions for that, and in the function that initializes the class, we insert them in the nested/base class struct.
void value_expr_print(EXP val)
{
  printf("%.3f", ((value_expr *)val)->value);
}
double value_expr_eval(EXP val)
{
  return ((value_expr *)val)->value;
}
void init_value_expr_cls(value_expr_cls *cls)
{
  init_base_expr_cls(basetype(cls, base_expr_cls));
  // override virtual functions
  base_expr_cls *base_expr = basetype(cls, base_expr_cls);
  base_expr->print = value_expr_print;
  base_expr->eval  = value_expr_eval;
}
For initializing objects of the type, we need a function that calls the base initializer and sets the value:
void init_value_expr(value_expr *val, double value)
{
  init_base_expr(basetype(val, base_expr));
  val->value = value;
}
We want a function that creates objects as well, a so-called constructor, and it can look like this:
EXP value(double value)
{
  INIT_CLS(VALUE_EXPR_CLS, init_value_expr_cls);
  value_expr *val = NEW_OBJ(val, VALUE_EXPR_CLS);
  init_value_expr(val, value);
  return (EXP)val;
}

It initializes the class, if it isn’t already created, then it allocates a new object, initializes it, and returns it.

We want binary operators, and we can define a class for that. I will write one that implements the print() virtual method, but not the eval() method; we will add eval() in subclasses . The implementation can look like this:
typedef struct binexpr_cls {
  base_expr_cls base_expr_cls;
} binexpr_cls;
typedef struct binexpr {
  base_expr base_expr;
  char symb; EXP left, right;
} binexpr;
void print_binexpr(EXP exp)
{
  binexpr *binop = (binexpr *)exp;
  putchar('('); print(binop->left); putchar(')');
  putchar(binop->symb);
  putchar('('); print(binop->right); putchar(')');
}
void init_binexpr_cls(binexpr_cls *cls)
{
  init_base_expr_cls(basetype(cls, base_expr_cls));
  base_expr_cls *base_expr = basetype(cls, base_expr_cls);
  base_expr->print = print_binexpr;
}
void init_binexpr(binexpr *binop, char symb,
                  EXP left, EXP right)
{
  init_base_expr(basetype(binop, base_expr));
  binop->symb = symb;
  binop->left = left;
  binop->right = right;
}
It follows the pattern we saw for values, except that we do not have a pointer for the class or a constructor because we are not supposed to create instances of this class. It doesn’t define eval(), so our program would crash if we did and then tried to evaluate an expression. In the following, you can read the implementation of an addition and substitution class:
// Addition
typedef struct add_expr_cls {
  binexpr_cls binexpr_cls;
} add_expr_cls;
typedef struct add_expr {
  binexpr binexpr;
} add_expr;
add_expr_cls *ADD_EXPR_CLS = 0;
double eval_add_expr(EXP expr)
{
  binexpr *base = basetype(expr, binexpr);
  return eval(base->left) + eval(base->right);
}
void init_add_expr_cls(add_expr_cls *cls)
{
  init_binexpr_cls(basetype(cls, binexpr_cls));
  base_expr_cls *base_expr = basetype(cls, base_expr_cls);
  base_expr->eval = eval_add_expr;
}
void init_add_expr(add_expr *expr, EXP left, EXP right)
{
  init_binexpr(basetype(expr, binexpr), '+', left, right);
}
// Constructor
EXP add(EXP left, EXP right)
{
  INIT_CLS(ADD_EXPR_CLS, init_add_expr_cls);
  add_expr *expr = NEW_OBJ(expr, ADD_EXPR_CLS);
  init_add_expr(expr, left, right);
  return (EXP)expr;
}
// Subtraction
typedef struct sub_expr_cls {
  binexpr_cls binexpr_cls;
} sub_expr_cls;
typedef struct sub_expr {
  binexpr binexpr;
} sub_expr;
sub_expr_cls *SUB_EXPR_CLS = 0;
double eval_sub_expr(EXP expr)
{
  binexpr *base = basetype(expr, binexpr);
  return eval(base->left) - eval(base->right);
}
void init_sub_expr_cls(sub_expr_cls *cls)
{
  init_binexpr_cls(basetype(cls, binexpr_cls));
  base_expr_cls *base_expr = basetype(cls, base_expr_cls);
  base_expr->eval = eval_sub_expr;
}
void init_sub_expr(add_expr *expr, EXP left, EXP right)
{
  init_binexpr(basetype(expr, binexpr), '-', left, right);
}
// Constructor
EXP sub(EXP left, EXP right)
{
  INIT_CLS(SUB_EXPR_CLS, init_sub_expr_cls);
  add_expr *expr = NEW_OBJ(expr, SUB_EXPR_CLS);
  init_sub_expr(expr, left, right);
  return (EXP)expr;
}

The code, again, follows the same pattern as we saw for the value_expr class.

The last example I will give is an expression that represents a variable. It is an expression with a named variable, but where we can bind values to the variable and unbind them again. We will make these two operations virtual functions (i.e., functions with a dynamic dispatch through the class) to see how that is done. I cannot think of a situation where we wouldn’t use plain old functions for that, but it is an example, so go with it. It is what you are going to get.

The class will have a bind() and an unbind() virtual function, and instances will have the name of the variable and the value we have bound to it.
// Variables
typedef struct var_expr *VAR;
typedef struct var_expr_cls {
  base_expr_cls base_expr_cls;
  void (*bind)  (VAR var, EXP val);
  void (*unbind)(VAR var);
} var_expr_cls;
typedef struct var_expr {
  base_expr base_expr;
  char const *name;
  double value;
} var_expr;
// new virtual functions
void bind(VAR var, EXP e) { vtbl(var, var_expr_cls)->bind(var, e); }
void unbind(VAR var)      { vtbl(var, var_expr_cls)->unbind(var); }
The implementation we give the bind() function will evaluate the second argument and put it in the value of the first. For unbind(), we set value to NAN (not a number, defined in <math.h>).
// implementations of new virtual functions
void var_expr_bind  (VAR var, EXP e) { var->value = eval(e); }
void var_expr_unbind(VAR var)        { var->value = NAN; }
From here on, we follow the pattern we have seen already. It is a concrete class, so we have a pointer for it, we have code for initializing the class and the instances, and we have a constructor for the type.
var_expr_cls *VAR_EXPR_CLS = 0;
// overriding virtual functions
void var_expr_print(EXP expr)
{
  VAR var = (VAR)expr;
  if (isnan(var->value)) { // isnan from <math.h>
    printf("%s", var->name);
  } else {
    printf("%f", var->value);
  }
}
double var_expr_eval(EXP expr)
{
  VAR var = (VAR)expr;
  return var->value;
}
void init_var_expr_cls(var_expr_cls *cls)
{
  init_base_expr_cls(basetype(cls, base_expr_cls));
  // override virtual functions
  base_expr_cls *base_expr = basetype(cls, base_expr_cls);
  base_expr->print = var_expr_print;
  base_expr->eval  = var_expr_eval;
  // new virtual functions
  cls->bind        = var_expr_bind;
  cls->unbind      = var_expr_unbind;
}
void init_var_expr(var_expr *var, char const *name)
{
  init_base_expr(basetype(var, base_expr));
  var->name = name;
  var->value = NAN; // NAN from <math.h>
}
// constructor
VAR var(char const *name)
{
  INIT_CLS(VAR_EXPR_CLS, init_var_expr_cls);
  VAR var = NEW_OBJ(var, VAR_EXPR_CLS);
  init_var_expr(var, name);
  return var;
}
You can try it out in action with code like this:
int main(void)
{
  VAR x = var("x");
  EXP expr = add(value(1.0), sub((EXP)x, value(2.0)));
  // prints 'x' for x and evaluates to nan
  print(expr); putchar(' ');
  printf("evaluates to %f ", eval(expr));
  // set x to 42
  bind(x, add(value(40.0), value(2.0)));
  // now prints 42 for x ane evaluates to 41
  print(expr); putchar(' ');
  printf("evaluates to %f ", eval(expr));
  return 0;
}

We create one variable, and we keep track of it, so we can bind it. Then we have the expression 1.0 + (x - 2.0), written as expressions. If we print it, the output shows x for the variable; if we evaluate it, we get nan because we use a nan from x in the computation. Give x a value, and this changes.

Generating Functions

If we have function pointers, could we point one at a random memory address and start executing the code there? Yes and no. In the good old days, before anyone worried about security, you could write machine code to a buffer, assign the buffer address to a function pointer, and call the function. These days, things are more complicated. That being said, you can usually still do it, although not in any portable way. In this last section of the chapter, I will give you an idea about how you can create and run code, but it will only be a taste of it. There is much more to it, but as we consider architecture and platform-dependent code, I will not dig too deep into the topic.

Typical memory protection in a modern operating system will have protection bits on memory locations, which allows you any combination of reading, writing, and executing. The code your compiler generated for you sits in memory that can be read and executed. The memory on our stack and that we allocate from the heap can be read and written to. But a rule of thumb is that you should not have memory that you can both write to and execute (and some operating systems enforce this rule). To generate code, we need memory that we can write to, and to execute it, we must change the protection bits so we can read and execute instead. There will be a system call in your operating system that lets you change the protection bits, but the resolution of protection is not individual bytes. Instead, it is so-called pages, of whatever size the hardware and/or operating system specifies. If you have a memory address that falls on the border of a page, a page’s memory alignment, you can change the memory protection bit; otherwise, you cannot. With malloc(), we do not necessarily get correctly aligned memory, but we could use aligned_alloc() and implement a function such as this for getting a block of memory we can change permissions for:
#include <unistd.h>
void *alloc_code_block(size_t size)
{
  long pagesize = sysconf(_SC_PAGE_SIZE);
  // with aligned alloc, the size must be a
  // multiple of align size
  size_t pages = (size + pagesize - 1) / pagesize;
  return aligned_alloc(pagesize, pages * pagesize);
}

We get the size of a page using the sysconf() function (specified in the POSIX standard). Then we round up the memory we need because aligned_alloc() requires that we allocate multiples of the alignment value and get an aligned chunk of memory.

Alignment alone might not be enough, though. Various other constraints vary from system to system, so more typical is using the (POSIX) mmap() function. It is a POSIX standard function, but it is not so standard that you can get what you want, unfortunately. Some systems have further requirements to what we can change protection bits on, but on macOS and the Linux systems I am familiar with, this function will work for allocation:
#include <sys/mman.h>
// Allocate page-aligned memory with mmap()
void *alloc_code_block(size_t size)
{
  // We want a read/write block
  int protection = PROT_READ | PROT_WRITE;
  // MAP_ANONYMOUS not POSIX but necessary on some systems
  // e.g. SELinux
  int flags = MAP_ANONYMOUS | MAP_PRIVATE;
  char *buf = mmap(0, size, protection, flags, -1, 0);
  return (buf != MAP_FAILED) ? buf : 0;
}

On a Windows machine, you want to look at the VirtualAlloc() function instead.

This gives us a correctly allocated memory block that we can write to. Once we have generated our code, we need to change the protection bits, and we can do that with mprotect(), another POSIX function:
void *set_exec_prot(void *buf, size_t size)
{
  // Change to a read/exec block
  int res = mprotect(buf, size, PROT_READ | PROT_EXEC);
  if (res == -1) {
    // munmap can fail, but there is nothing we
    // can do about it here...
    munmap(buf, size);
    return 0;
  }
  return buf;
}

The munmap() function is the call we have to use to deallocate memory. We didn’t allocate with malloc() or its cousins, so free() is not an option. In any case, we set the protection bits to PROT_READ and PROT_EXEC, so we can now execute code in the buffer.

To free a code buffer, once we are done, we use
void free_code_block(void *buf, size_t size)
{
  // munmap can fail, but there is nothing we
  // can do about it here...
  munmap(buf, size);
}

On Windows, you want VirtualProtect() to change permissions and VirtualFree() to free the buffer.

With these three functions, we can allocate memory for code, put code in the buffer and make it executable, and free the buffer once we are done with it. You need to generate raw machine code into the buffer, of course, which is tedious, but you can always write a library to help you. A simple example of code generation could look like this:
int main(void)
{
  // Adds two to its input and returns
  unsigned char code[] = {
    0x8d, 0x47, 0x02,        // lea eax,[rdi+0x2]
    0xc3                     // ret
  };
  /*
  Solaris, Linux, FreeBSD and macOS uses the System V AMD64 ABI
  where the first integer/pointer argument comes in register rdi.
  On windows, it would come in rcx.
  If you are there, change "rdi+0x2" to "rcx+0x2".
  */
  // Raw memory...
  void *code_block = alloc_code_block(sizeof code);
  if (!code_block) abort();
  memcpy(code_block, code, sizeof code);
  code_block = set_exec_prot(code_block, sizeof code);
  if (!code_block) abort();
  int (*f)(int) = (int (*)(int))code_block;
  printf("%d ", f(3));
  free_code_block(code_block, sizeof code);
  return 0;
}

The code is for x64 chips and adds 2 to the function’s first parameter and returns the result, and it will run on Solaris, Linux, FreeBSD, and macOS (and any system that uses the System V AMD64 ABI calling convention). It will not run on Windows because although Windows run on the same hardware, the convention for where functions get their input differs. On Windows, the input is in register rcx instead of rdi. Sorry, it is hard to write portable code when you write directly to the machine.

There might be another issue between writing your code to memory and executing it, although not on my architecture. The instruction and the data cache/bus might not be the same. So, you could be writing code to memory as data and then executing code at the same addresses, but getting old cached data. If that is the case, you need to flush the caches, and your compiler or system will have ways of doing that—not portable ways, though.

Tagged Pointers

Since we have left any attempts at writing portable code behind us now, I feel that I can show you a trick that isn’t exactly portable, but that you can often use anyway. We are writing code for a machine where pointers are simple 64-bit integers, and we can treat them as such. The C standard does not guarantee that, but practically all architectures will let us manipulate pointers as integers, and we can exploit that if we are brave enough.

There is a trick used by virtual machines that exploit this. If we have data that we know has stricter alignment rules than char, that is, that cannot lie on all possible addresses, then we have bits in pointers to them that will always be zero. If integers, for example, align at offset 4, then the two lower bits must always be zero. That means that we can use those bits for something else, as long as we remember to set them back to zero before we use them as pointers. Imagine that we have a virtual machine that represents integers as some structure that can have arbitrary size, perhaps encoded as arrays of int. That means that general integers are int *. But smaller integers can fit into a pointer, so we could put them there if we don’t need to put them on the heap. If bit 0 is always 0 for pointers, we could set it to 1 to indicate that we have put the actual integer in the pointer instead. For a small integer, we shift it one bit up and set the lowest bit to one, and that is their representation . For general integers, we have a pointer. To extract an integer, check the lowest bit. If it is 0, dereference; if it is 1, shift the remaining bits down. You get one bit less to represent small integers, but you can save all integers in the same (integer pointer) structure:
#include <stdio.h>
#define smallint(i)     (int *)((i) << 1 | 1)
#define get_smallint(p) ((int)(p) >> 1)
#define is_smallint(p)  ((int)(p) & 1)
#define get_int(p)
  (is_smallint(p) ? get_smallint(p) : (*(p)))
int main(void)
{
  int *p1 = smallint(1);
  int *p2 = smallint(2);
  printf("%d + %d = %d ",
          get_int(p1), get_int(p2),
          get_int(p1) + get_int(p2));
  int i3 = 3;
  p2 = &i3;
  printf("%d + %d = %d ",
         get_int(p1), get_int(p2),
         get_int(p1) + get_int(p2));
  return 0;
}
We could use that to encode the size of the blocks of memory we allocated when we generate code. We can allocate whole pages at a time, which we would have to anyway because the granularity of protection bits is whole pages, and encode how many pages we allocated in the lower bits of pointers. Then we pack those into what I will call a JIT pointer (for just-in-time compilation, the term usually used when we generate code on the fly). If page sizes are 4k (they are on my machine), then we have 12 bits for the size. With appropriate masking, we can pack both the pointer and the size into one pointer. The following jit_ptr() macro packs the size and pointer together, using simple binary or. The jit_pages() macro masks out the lower 12 bits to give us the size, and the jit_func() macro masks away the size bits. The macro uses a compiler extension, __typeof__(), for type-casting. We are no longer writing portable code, so I will use what my compiler provides. If you don’t have __typeof__() or something similar, you have to cast where you use the macro.
// You can get PAGESIZE from POSIX sysconf(_SC_PAGESIZE);
// from sysconf() from <unistd.h>
#define PAGESIZE 4P96
// You can get the number of free bits from
// POSIX ffs() from <strings.h>  as ffs(page_size) - 1;
#define CODE_SIZE_BITS  12
#define CODE_SIZE_MASK  ((1ull << CODE_SIZE_BITS) - 1)
#define MAX_CODE_PAGES  CODE_SIZE_MASK
#define CODE_PTR_MASK   (~CODE_SIZE_MASK)
#include <stdint.h>
#define jit_ptr(f,s)    (void *)((uint64_t)f | s)
#define jit_pages(p)    ((uint64_t)p & CODE_SIZE_MASK)
// using compiler extension __typeof__ for cast
#define jit_func(p)     ((__typeof__(p))((uint64_t)p & CODE_PTR_MASK))
void jit_free_void(void *p)
{
  free_code_block((void *)jit_func(p), jit_pages(p));
}
// to avoid function/void pointer warnings
#define jit_free(p) jit_free_void((void*)(p))
Here is a function that generates code and returns a pointer, with size encoded:
void *create_exec_buf(unsigned char *code, size_t size)
{
  size_t pages = (size + PAGESIZE - 1) / PAGESIZE;
  if (pages > MAX_CODE_PAGES) {
    // Too large for us to store the size
    return 0;
  }
  size_t alloc_size = PAGESIZE * pages;
  char *buf = alloc_code_bock(alloc_size);
  if (!buf) return 0;
  memcpy(buf, code, size);
  buf = set_exec_prot(buf, alloc_size);
  if (!buf) return 0;
  return jit_ptr(buf, pages);
}
And you can use it to generate new functions, where you don’t need to worry about remembering the allocated size:
typedef int (*ii_fn)(int);
ii_fn adder(int j)
{
  unsigned char code[] = {
    // lea eax,[rdi+<j>] (0x87 because we use 32-bit int now)
    0x8d, 0x87, // j starts at index 2...
                0x00, 0x00, 0x00, 0x00,
    // ret
    0xc3
  };
  // the int starts at index 2 and goes in the next four
  // bytes, little endian...
  unsigned char *j_code = code + 2;
  for (int offset = 0; offset < 4; offset++) {
    j_code[offset] = (j >> 8 * offset) & 0xff;
  }
  return (ii_fn)create_exec_buf(code, sizeof code);
}

Here, the functions we generated add a number to an integer. Not too exciting, but I don’t want too long machine code listings.

You use generated functions like this, where you must remember to jit_free() them once you are done:
ii_fn add2 = adder(2);
ii_fn add5 = adder(5);
for (int i = 0; i < 5; i++) {
  printf("%d, %d, %d ", i,
  jit_func(add2)(i), jit_func(add5)(i));
}
jit_free(add2);
jit_free(add5);
If you didn’t get 12 bits from the page alignment, but something substantially smaller, you could also encode the size in the high bits. On an x64 architecture, pointers are 64 bits, but only 48 of the bits are used. The rest are left for future extensions. Therefore, you have the 16 high bits to play with! The people who specified the machine architecture knew that we would do something like that, so they made rules to prevent it. We are not allowed to set the bits arbitrarily, relying on the system ignoring them. Instead, all the high bits must be the same as bit 47. If it is set, all the high bits must be set. If it is zero, all the high bits must be zero. It is not exactly hard crypto , so we can easily circumvent the rule they made to prevent us from doing exactly what we are doing.2 We can put the size in the high 16 bits by shifting up 48 bits. When we want the size back, we shift the size back. When we want to use the pointer as an actual pointer, we remove the size and set the top bits to their canonical form:
#define CODE_SIZE_BITS  16
#define MAX_CODE_PAGES  ((1ull << CODE_SIZE_BITS) - 1)
#define CODE_PTR_MASK   ((1ull << 48) - 1)
#define CODE_SIZE_MASK  (~CODE_PTR_MASK)
#include <stdint.h>
#define jit_ptr(f,s)
  (void *)(((uint64_t)f & CODE_PTR_MASK) | (s << 48))
#define jit_pages(p)
  (((uint64_t)p & CODE_SIZE_MASK) >> 48)
// upper 16 set if 47 set
#define upper_bits(p)
  ~(((uint64_t)p & (1ull << 47)) - 1)
#define lower_bits(p)
  ((uint64_t)p & CODE_PTR_MASK)
#define canonical_ptr(p)
   (lower_bits(p) | upper_bits(p))
// using compiler extension __typeof__ for cast
#define jit_func(p)
    ((__typeof__(p))(canonical_ptr(p)))

I don’t recommend that you use the high bits this way, but you can if you want to. You are in danger when machines are eventually updated to use more bits, but with code this low level, and with code generation, it is a price you might be willing to pay.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.126.241