C arrays and pointers

A C array is a collection of items of the same type, stored contiguously in memory. Before digging into the details, it is helpful to understand (or review) how memory is managed in C.

Variables in C are like containers. When a variable is created, a space in memory is reserved to store its value. For example, if we create a variable containing a 64-bit floating point number (double), the program will allocate 64 bits (8 bytes) of memory. This portion of memory can be accessed through the address of that memory location.
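
The byte count can be checked in plain C with the sizeof operator. This is a minimal sketch rather than part of the Cython workflow; the helper name bits_in_double is hypothetical, and the result assumes a platform with 8-bit bytes and IEEE 754 doubles, which is the case on mainstream desktop and server hardware.

```c
#include <limits.h>  /* CHAR_BIT: number of bits in one byte */
#include <stddef.h>  /* size_t */

/* Hypothetical helper: how many bits one double occupies in memory. */
size_t bits_in_double(void)
{
    /* sizeof counts bytes, so multiply by the bits per byte. */
    return sizeof(double) * CHAR_BIT;
}
```

On such a platform, sizeof(double) is 8 and bits_in_double() returns 64.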

To obtain the address of a variable, we use the address operator, denoted by the & symbol. To print that address, we can use the printf function, available in the libc.stdio Cython module, as follows:

    %%cython
    from libc.stdio cimport printf
    cdef double a
    printf("%p", &a)
    # Output:
    # 0x7fc8bb611210

Memory addresses can be stored in special variables called pointers, which are declared by putting a * prefix in front of the variable name, as follows:

    from libc.stdio cimport printf
    cdef double a
    cdef double *a_pointer
    a_pointer = &a # a_pointer and &a are of the same type

If we have a pointer and want to grab the value stored at the address it points to, we need to dereference it. In C, this is done with the * operator. Cython, however, has no unary * operator (in Python syntax, a * inside a call unpacks arguments), so in Cython we index the pointer with 0 instead. Be careful: the * used in the variable declaration has a different meaning, as it only marks the variable as a pointer:

    cdef double a
    cdef double *a_pointer
    a_pointer = &a

    a = 3.0
    print(a_pointer[0]) # prints 3.0
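
Because a pointer holds an address rather than a copy of the value, writing through the pointer changes the original variable. The plain C sketch below illustrates this with C's unary * dereference operator; the function names are hypothetical and chosen only for illustration.

```c
/* Hypothetical helper: store value at the address held by target. */
void write_through(double *target, double value)
{
    *target = value;   /* dereference on the left: write to that address */
}

/* Hypothetical helper: read the value stored at the address in source. */
double read_through(const double *source)
{
    return *source;    /* dereference on the right: read from that address */
}
```

Given double a = 0.0; and double *p = &a;, calling write_through(p, 3.0) leaves a equal to 3.0, and read_through(p) returns that same value.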

When declaring a C array, the program allocates enough space to accommodate all the elements requested. For instance, to create an array that has 10 double values (8 bytes each), the program will reserve 8 * 10 = 80 bytes of contiguous space in memory. In Cython, we can declare such arrays using the following syntax:

    cdef double arr[10]
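
In plain C, sizeof applied to an array reports the size of the whole block, which gives a quick check of the arithmetic and a way to recover the element count. The following is a sketch with hypothetical function names, not a required idiom:

```c
#include <stddef.h>  /* size_t */

#define N_ITEMS 10

/* One contiguous block holding N_ITEMS doubles. */
static double arr[N_ITEMS];

/* Total number of bytes reserved for the array. */
size_t array_bytes(void)
{
    return sizeof(arr);
}

/* Number of elements: total size divided by the size of one element. */
size_t array_length(void)
{
    return sizeof(arr) / sizeof(arr[0]);
}
```

This only works where arr is still visible as an array. Once it has been passed to a function as a pointer, sizeof reports the size of the pointer instead.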

We can also declare a multidimensional array, such as an array with 5 rows and 2 columns, using the following syntax:

    cdef double arr[5][2] 

The memory will be allocated in a single block of memory, row after row. This order is commonly referred to as row-major and is depicted in the following figure. Arrays can also be ordered column-major, as is the case for the FORTRAN programming language:

Array ordering has important consequences. When we iterate over the last dimension of a C array, we access contiguous memory locations (in our example, 0, 1, 2, 3 ...), whereas when we iterate over the first dimension, we skip positions (0, 2, 4, 6, 8, 1 ...). You should try to access memory sequentially whenever possible, as this makes better use of the CPU cache.
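
The row-major rule can be written as a formula: element [i][j] of an array with n_cols columns sits i * n_cols + j elements after the start of the block. The plain C sketch below compares that formula with offsets measured from real addresses for the 5 x 2 example; all names are hypothetical.

```c
#include <stddef.h>  /* size_t */

#define N_ROWS 5
#define N_COLS 2

static double grid[N_ROWS][N_COLS];

/* Predicted position of grid[i][j], counted in elements from the start. */
size_t row_major_offset(size_t i, size_t j)
{
    return i * N_COLS + j;
}

/* Measured position of grid[i][j], derived from byte addresses. */
size_t measured_offset(size_t i, size_t j)
{
    const char *start = (const char *)grid;
    const char *item = (const char *)&grid[i][j];
    return (size_t)(item - start) / sizeof(double);
}
```

Walking down the first column (j = 0, i from 0 to 4) gives offsets 0, 2, 4, 6, 8, while walking along a row gives consecutive offsets.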

We can store and retrieve elements from the array using standard indexing; C arrays don't support fancy indexing or slices:

    arr[0] = 1.0 

C arrays have many of the same behaviors as pointers. The arr variable, in fact, points to the memory location of the first element of the array. We can verify that the address of the first element is the same as the address contained in the arr variable by using the address operator, as follows:

    %%cython
    from libc.stdio cimport printf
    cdef double arr[10]
    printf("%p\n", arr)
    printf("%p\n", &arr[0])

    # Output
    # 0x7ff6de204220
    # 0x7ff6de204220
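
One consequence of this relationship is that indexing and pointer arithmetic are interchangeable in C: arr[i] is defined as *(arr + i). A plain C sketch, with hypothetical function names:

```c
#include <stddef.h>  /* size_t */

/* Sum n doubles using pointer arithmetic instead of indexing. */
double sum_values(const double *values, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += *(values + i);   /* same element as values[i] */
    }
    return total;
}

/* Returns 1 if the array name and &arr[0] hold the same address. */
int name_matches_first_element(void)
{
    double arr[10] = {0};
    const double *from_name = arr;       /* array name converts to a pointer */
    const double *from_index = &arr[0];  /* explicit address of element 0 */
    return from_name == from_index;
}
```

An array can be passed directly to sum_values, for example sum_values(arr, 10), because the array name is converted to a pointer to its first element.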

You should use C arrays and pointers when interfacing with existing C libraries or when you need fine control over memory; they are also very fast. This level of control is error-prone, however, because nothing prevents you from accessing the wrong memory locations. For more common use cases and improved safety, you can use NumPy arrays or typed memoryviews.
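
Since C performs no bounds checking on arrays or pointers, one defensive pattern is to route accesses through a helper that validates the index first. The plain C sketch below shows one hypothetical way to do that; checked_get is not a standard library function.

```c
#include <stdbool.h>
#include <stddef.h>  /* size_t, NULL */

/* Hypothetical helper: copy values[index] into *out if index is in range. */
bool checked_get(const double *values, size_t length, size_t index, double *out)
{
    if (values == NULL || out == NULL || index >= length) {
        return false;  /* refuse rather than read an invalid location */
    }
    *out = values[index];
    return true;
}
```

The caller has to supply the length explicitly, because a bare pointer carries no information about how many elements it refers to.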
