The overall goal of this book is to give a picture of how computers work on many levels, from the transistors by which they are constructed all the way up to the software they run. The first five chapters of this book work up through the lower levels of abstraction, from transistors to gates to logic design. Chapters 6 through 8 jump up to architecture and work back down to microarchitecture to connect the hardware with the software. This Appendix on C programming fits logically between Chapters 5 and 6, covering C programming as the highest level of abstraction in the text. It motivates the architecture material and links this book to programming experience that may already be familiar to the reader. This material is placed in the Appendix so that readers may easily cover or skip it depending on previous experience.
Programmers use many different languages to tell a computer what to do. Fundamentally, computers process instructions in machine language consisting of 1’s and 0’s, as will be explored in Chapter 6. But programming in machine language is tedious and slow, leading programmers to use more abstract languages to get their meaning across more efficiently. Table C.1 lists some examples of languages at various levels of abstraction.
Table C.1 Languages at roughly decreasing levels of abstraction
Language | Description |
Matlab | Designed to facilitate heavy use of math functions |
Perl | Designed for scripting |
Python | Designed to emphasize code readability |
Java | Designed to run securely on any computer |
C | Designed for flexibility and overall system access, including device drivers |
Assembly Language | Human-readable machine language |
Machine Language | Binary representation of a program |
One of the most popular programming languages ever developed is called C. It was created by a group including Dennis Ritchie and Brian Kernighan at Bell Laboratories between 1969 and 1973 to rewrite the UNIX operating system from its original assembly language. By many measures, C (including a family of closely related languages such as C++, C#, and Objective-C) is the most widely used language in existence. Its popularity stems from a number of factors including:
Availability on a tremendous variety of platforms, from supercomputers down to embedded microcontrollers
Relative ease of use, with a huge user base
Moderate level of abstraction providing higher productivity than assembly language, yet giving the programmer a good understanding of how the code will be executed
Dennis Ritchie, 1941–2011
Brian Kernighan, 1942–
C was formally introduced in 1978 by Brian Kernighan and Dennis Ritchie’s classic book, The C Programming Language. In 1989, the American National Standards Institute (ANSI) expanded and standardized the language, which became known as ANSI C, Standard C, or C89. Shortly thereafter, in 1990, this standard was adopted by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). ISO/IEC updated the standard in 1999 to what is called C99, which we will be discussing in this text.
This chapter is devoted to C programming for a variety of reasons. Most importantly, C allows the programmer to directly access addresses in memory, illustrating the connection between hardware and software emphasized in this book. C is a practical language that all engineers and computer scientists should know. Its uses in many aspects of implementation and design – e.g., software development, embedded systems programming, and simulation – make proficiency in C a vital and marketable skill.
The following sections describe the overall syntax of a C program, discussing each part of the program including the header, function and variable declarations, data types, and commonly used functions provided in libraries. Section 8.6 describes a hands-on application by using C to program a PIC32 microcontroller.
High-level programming: High-level programming is useful at many levels of design, from writing analysis or simulation software to programming microcontrollers that interact with hardware.
Low-level access: C code is powerful because, in addition to high-level constructs, it provides access to low-level hardware and memory.
C is the language used to program such ubiquitous systems as Linux, Windows, and iOS. C is a powerful language because of its direct access to hardware. Compared with other high-level languages such as Perl and Matlab, C has less built-in support for specialized operations such as file manipulation, pattern matching, matrix manipulation, and graphical user interfaces. It also lacks features to protect the programmer from common mistakes, such as writing data past the end of an array. Its power combined with its lack of protection has assisted hackers who exploit poorly written software to break into computer systems.
A C program is a text file that describes operations for the computer to perform. The text file is compiled, converted into a machine-readable format, and run or executed on a computer. C Code Example C.1 is a simple C program that prints the phrase “Hello world!” to the console, the computer screen. C programs are generally contained in one or more text files that end in “.c”. Good programming style requires a file name that indicates the contents of the program – for example, this file could be called hello.c.
C Code Example C.1 Simple C Program
// Write "Hello world!" to the console
#include <stdio.h>
int main(void){
printf("Hello world!\n");
}
Hello world!
In general, a C program is organized into one or more functions. Every program must include the main function, which is where the program starts executing. Most programs use other functions defined elsewhere in the C code and/or in a library. The overall sections of the hello.c program are the header, the main function, and the body.
While this chapter provides a fundamental understanding of C programming, entire texts are written that describe C in depth. One of our favorites is the classic text The C Programming Language by Brian Kernighan and Dennis Ritchie, the developers of C. This text gives a concise description of the nuts and bolts of C. Another good text is A Book on C by Al Kelley and Ira Pohl.
The header includes the library functions needed by the program. In this case, the program uses the printf function, which is part of the standard I/O library, stdio.h. See Section C.9 for further details on C’s built-in libraries.
All C programs must include exactly one main function. Execution of the program occurs by running the code inside main, called the body of main. Function syntax is described in Section C.6. The body of a function contains a sequence of statements. Each statement ends with a semicolon. int denotes that the main function outputs, or returns, an integer result that indicates whether the program ran successfully.
The body of this main function contains one statement, a call to the printf function, which prints the phrase “Hello world!” followed by a newline character indicated by the special sequence “\n”. Further details about I/O functions are described in Section C.9.1.
All programs follow the general format of the simple hello.c program. Of course, very complex programs may contain millions of lines of code and span hundreds of files.
C programs can be run on many different machines. This portability is another advantage of C. The program is first compiled on the desired machine using the C compiler. Slightly different versions of the C compiler exist, including cc (C compiler), or gcc (GNU C compiler). Here we show how to compile and run a C program using gcc, which is freely available for download. It runs directly on Linux machines and is accessible under the Cygwin environment on Windows machines. It is also available for many embedded systems such as Microchip PIC32 microcontrollers. The general process described below of C file creation, compilation, and execution is the same for any C program.
1. Create the text file, for example hello.c.
2. In a terminal window, change to the directory that contains the file hello.c and type gcc hello.c at the command prompt.
3. The compiler creates an executable file. By default, the executable is called a.out (or a.exe on Windows machines).
4. At the command prompt, type ./a.out (or ./a.exe on Windows) and press return.
filename.c: C program files are typically named with a .c extension.
main: Each C program must have exactly one main function.
#include: Most C programs use functions provided by built-in libraries. These functions are used by writing #include <library.h> at the top of the C file.
gcc filename.c: C files are converted into an executable using a compiler such as the GNU compiler (gcc) or the C compiler (cc).
Execution: After compilation, C programs are executed by typing ./a.out (or ./a.exe) at the command line prompt.
A compiler is a piece of software that reads a program in a high-level language and converts it into a file of machine code called an executable. Entire textbooks are written on compilers, but we describe them here briefly. The overall operation of the compiler is to (1) preprocess the file by including referenced libraries and expanding macro definitions, (2) ignore all unnecessary information such as comments, (3) translate the high-level code into simple instructions native to the processor that are represented in binary, called machine language, and (4) compile all the instructions into a single binary executable that can be read and executed by the computer. Each machine language is specific to a given processor, so a program must be compiled specifically for the system on which it will run. For example, the MIPS machine language is covered in Chapter 6 in detail.
Programmers use comments to describe code at a high level and clarify code function. Anyone who has read uncommented code can attest to their importance. C programs use two types of comments: single-line comments begin with // and terminate at the end of the line; multiple-line comments begin with /* and end with */. While comments are critical to the organization and clarity of a program, they are ignored by the compiler.
// This is an example of a one-line comment.
/* This is an example
of a multi-line comment. */
A comment at the top of each C file is useful to describe the file’s author, creation and modification dates, and purpose. The comment below could be included at the top of the hello.c file.
// hello.c
// 1 June 2012 [email protected], [email protected]
//
// This program prints "Hello world!" to the screen
Constants are named using the #define directive and then used by name throughout the program. These globally defined constants are also called macros. For example, if you write a program that allows at most 5 user guesses, you can use #define to give that number a name.
#define MAXGUESSES 5
The # indicates that this line in the program will be handled by the preprocessor. Before compilation, the preprocessor replaces each occurrence of the identifier MAXGUESSES in the program with 5. By convention, #define lines are located at the top of the file and identifiers are written in all capital letters. By defining constants in one location and then using the identifier in the program, the program remains consistent, and the value is easily modified – it need only be changed at the #define line instead of at each line in the code where the value is needed.
Number constants in C default to decimal but can also be hexadecimal (prefix "0x") or octal (prefix "0"). Binary constants are not defined in C99 but are supported by some compilers (prefix "0b"). For example, the following assignments are equivalent:
char x = 37;
char x = 0x25;
char x = 045;
C Code Example C.2 shows how to use the #define directive to convert inches to centimeters. The variables inch and cm are declared to be float to indicate they represent single-precision floating point numbers. If the conversion factor (INCH2CM) were used throughout a large program, having it declared using #define obviates errors due to typos (for example, typing 2.53 instead of 2.54) and makes it easy to find and change (for example, if more significant digits were required).
C Code Example C.2 Using #define to Declare Constants
// Convert inches to centimeters
#include <stdio.h>
#define INCH2CM 2.54
int main(void) {
float inch = 5.5; // 5.5 inches
float cm;
cm = inch * INCH2CM;
printf("%f inches = %f cm\n", inch, cm);
}
5.500000 inches = 13.970000 cm
Globally defined constants eradicate magic numbers from a program. A magic number is a constant that shows up in a program without a name. The presence of magic numbers in a program often introduces tricky bugs – for example, when the number is changed in one location but not another.
Modularity encourages us to split programs across separate files and functions. Commonly used functions can be grouped together for easy reuse. Variable declarations, defined values, and function definitions located in a header file can be used by another file by adding the #include preprocessor directive. Standard libraries that provide commonly used functions are accessed in this way. For example, the following line is required to use the functions defined in the standard input/output (I/O) library, such as printf.
#include <stdio.h>
The “.h” postfix of the include file indicates it is a header file. While #include directives can be placed anywhere in the file before the included functions, variables, or identifiers are needed, they are conventionally located at the top of a C file.
Programmer-created header files can also be included by using quotation marks (" ") around the file name instead of brackets (< >). For example, a user-created header file called myfunctions.h would be included using the following line.
#include "myfunctions.h"
At compile time, files specified in brackets are searched for in system directories. Files specified in quotes are searched for in the same local directory where the C file is found. If the user-created header file is located in a different directory, the path of the file relative to the current directory must be included.
Comments: C provides single-line comments (//) and multi-line comments (/* */).
#define NAME val: the #define directive allows an identifier (NAME) to be used throughout the program. Before compilation, all instances of NAME are replaced with val.
#include: #include allows common functions to be used in a program. For built-in libraries, include the following line at the top of the code: #include <library.h> To include a user-defined header file, the name must be in quotes, listing the path relative to the current directory as needed: i.e., #include "other/myFuncs.h".
Variables in C programs have a type, name, value, and memory location. A variable declaration states the type and name of the variable. For example, the following declaration states that the variable is of type char (which is a 1-byte type), and that the variable name is x. The compiler decides where to place this 1-byte variable in memory.
char x;
Variable names are case sensitive and can be of your choosing. However, the name may not be any of C’s reserved words (e.g., int or while), may not start with a number (e.g., int 1x; is not a valid declaration), and may not include special characters such as \, *, ?, or -. Underscores (_) are allowed.
C views memory as a group of consecutive bytes, where each byte of memory is assigned a unique number indicating its location or address, as shown in Figure C.1. A variable occupies one or more bytes of memory, and the address of multiple-byte variables is indicated by the lowest numbered byte. The type of a variable indicates whether to interpret the byte(s) as an integer, floating point number, or other type. The rest of this section describes C’s primitive data types, the declaration of global and local variables, and the initialization of variables.
Figure C.1 C’s view of memory
C has a number of primitive, or built-in, data types available. They can be broadly characterized as integers, floating-point variables, and characters. An integer represents a 2’s complement or unsigned number within a finite range. A floating-point variable uses IEEE floating point representation to describe real numbers with a finite range and precision. A character can be viewed as either an ASCII value or an 8-bit integer. Table C.2 lists the size and range of each primitive type. Integers may be 16, 32, or 64 bits. They use 2’s complement unless qualified as unsigned. The size of the int type is machine dependent and is generally the native word size of the machine. For example, on a 32-bit MIPS processor, the size of int or unsigned int is 32 bits. Floating point numbers may be 32- or 64-bit single or double precision. Characters are 8 bits.
Table C.2 Primitive data types and sizes
The machine-dependent nature of the int data type is a blessing and a curse. On the bright side, it matches the native word size of the processor so it can be fetched and manipulated efficiently. On the down side, programs using ints may behave differently on different computers. For example, a banking program might store the number of cents in your bank account as an int. When compiled on a 64-bit PC, it will have plenty of range for even the wealthiest entrepreneur. But if it is ported to a 16-bit microcontroller, it will overflow for accounts exceeding $327.67, resulting in unhappy and poverty-stricken customers.
C Code Example C.3 shows the declaration of variables of different types. As shown in Figure C.2, x requires one byte of data, y requires two, and z requires four. The program decides where these bytes are stored in memory, but each type always requires the same amount of data. For illustration, the addresses of x, y, and z in this example are 1, 2, and 4. Variable names are case-sensitive, so, for example, the variable x and the variable X are two different variables. (But it would be very confusing to use both in the same program!)
C Code Example C.3 Example Data Types
// Examples of several data types and their binary representations
unsigned char x = 42; // x = 00101010
short y = -10; // y = 11111111 11110110
unsigned long z = 0; // z = 00000000 00000000 00000000 00000000
Figure C.2 Variable storage in memory for C Code Example C.3
Global and local variables differ in where they are declared and where they are visible. A global variable is declared outside of all functions, typically at the top of a program, and can be accessed by all functions. Global variables should be used sparingly because they violate the principle of modularity, making large programs more difficult to read. However, a variable accessed by many functions can be made global.
The scope of a variable is the context in which it can be used. For example, for a local variable, its scope is the function in which it is declared. It is out of scope everywhere else.
A local variable is declared inside a function and can only be used by that function. Therefore, two functions can have local variables with the same names without interfering with each other. Local variables are declared at the beginning of a function. They cease to exist when the function ends and are recreated when the function is called again. They do not retain their value from one invocation of a function to the next.
C Code Examples C.4 and C.5 compare programs using global versus local variables. In C Code Example C.4, the global variable max can be accessed by any function. Using a local variable, as shown in C Code Example C.5, is the preferred style because it preserves the well-defined interface of modularity.
C Code Example C.4 Global Variables
// Use a global variable to find and print the maximum of 3 numbers
#include <stdio.h>
int max; // global variable holding the maximum value
void findMax(int a, int b, int c) {
max = a;
if (b > max) {
if (c > b) max = c;
else max = b;
} else if (c > max) max = c;
}
void printMax(void) {
printf("The maximum number is: %d\n", max);
}
int main(void) {
findMax(4, 3, 7);
printMax();
}
C Code Example C.5 Local Variables
// Use local variables to find and print the maximum of 3 numbers
#include <stdio.h>
int getMax(int a, int b, int c) {
int result = a; // local variable holding the maximum value
if (b > result) {
if (c > b) result = c;
else result = b;
} else if (c > result) result = c;
return result;
}
void printMax(int m) {
printf("The maximum number is: %d\n", m);
}
int main(void) {
int max;
max = getMax(4, 3, 7);
printMax(max);
}
A variable needs to be initialized – assigned a value – before it is read. When a variable is declared, the correct number of bytes is reserved for that variable in memory. However, for a local variable, the memory at those locations retains whatever value it had the last time it was used, essentially a random value. (A global variable that is not explicitly initialized is set to zero when the program starts.) Global and local variables can be initialized either when they are declared or within the body of the program. C Code Example C.3 shows variables initialized at the same time they are declared. C Code Example C.4 shows how variables are initialized before their use, but after declaration; the global variable max is initialized by the findMax function before it is read by the printMax function. Reading from uninitialized variables is a common programming error and can be tricky to debug.
Variables: Each variable is defined by its data type, name, and memory location. A variable is declared as datatype name.
Data types: A data type describes the size (number of bytes) and representation (interpretation of the bytes) of a variable. Table C.2 lists C’s built-in data types.
Memory: C views memory as a list of bytes. Memory stores variables and associates each variable with an address (byte number).
Global variables: Global variables are declared outside of all functions and can be accessed anywhere in the program.
Local variables: Local variables are declared within a function and can be accessed only within that function.
Variable initialization: Each variable must be initialized before it is read. Initialization can happen either at declaration or afterward.
The most common type of statement in a C program is an expression, such as
y = a + 3;
An expression involves operators (such as + or *) acting on one or more operands, such as variables or constants. C supports the operators shown in Table C.3, listed by category and in order of decreasing precedence. For example, multiplicative operators take precedence over additive operators. Within the same category, operators are evaluated in the order that they appear in the program.
Table C.3 Operators listed by decreasing precedence
Unary operators, also called monadic operators, have a single operand. Ternary operators have three operands, and all others have two. The ternary operator (from the Latin ternarius meaning consisting of three) chooses the second or third operand depending on whether the first value is TRUE (nonzero) or FALSE (zero), respectively. C Code Example C.6 shows how to compute y = max(a,b) using the ternary operator, along with an equivalent but more verbose if/else statement.
The Truth, the Whole Truth, and Nothing But the Truth
C considers a variable to be TRUE if it is nonzero and FALSE if it is zero. Logical and ternary operators, as well as control-flow statements such as if and while, depend on the truth of a variable. Relational and logical operators produce a result that is 1 when TRUE or 0 when FALSE.
Simple assignment uses the = operator. C code also allows for compound assignment, that is, assignment after a simple operation such as addition (+=) or multiplication (*=). In compound assignments, the variable on the left side is both operated on and assigned the result. C Code Example C.7 shows these and other C operations. Binary values in the comments are indicated with the prefix “0b”.
C Code Example C.7 Operator Examples
Expression | Result | Notes |
44 / 14 | 3 | Integer division truncates |
44 % 14 | 2 | 44 mod 14 |
0x2C && 0xE //0b101100 && 0b1110 | 1 | Logical AND |
0x2C || 0xE //0b101100 || 0b1110 | 1 | Logical OR |
0x2C & 0xE //0b101100 & 0b1110 | 0xC (0b001100) | Bitwise AND |
0x2C | 0xE //0b101100 | 0b1110 | 0x2E (0b101110) | Bitwise OR |
0x2C ^ 0xE //0b101100 ^ 0b1110 | 0x22 (0b100010) | Bitwise XOR |
0xE << 2 //0b1110 << 2 | 0x38 (0b111000) | Left shift by 2 |
0x2C >> 3 //0b101100 >> 3 | 0x5 (0b101) | Right shift by 3 |
x = 14; x += 2; | x=16 | |
y = 0x2C; // y = 0b101100 y &= 0xF; // y &= 0b1111 | y=0xC (0b001100) | |
x = 14; y = 44; y = y + x++; | x=15, y=58 | Increment x after using it |
x = 14; y = 44; y = y + ++x; | x=15, y=59 | Increment x before using it |
Modularity is key to good programming. A large program is divided into smaller parts called functions that, similar to hardware modules, have well-defined inputs, outputs, and behavior. C Code Example C.8 shows the sum3 function. The function declaration begins with the return type, int, followed by the name, sum3, and the inputs enclosed within parentheses (int a, int b, int c). Curly braces {} enclose the body of the function, which may contain zero or more statements. The return statement indicates the value that the function should return to its caller; this can be viewed as the output of the function. A function can only return a single value.
C Code Example C.8 sum3 Function
// Return the sum of the three input variables
int sum3(int a, int b, int c) {
int result = a + b + c;
return result;
}
After the following call to sum3, y holds the value 42.
int y = sum3(10, 15, 17);
Although a function may have inputs and outputs, neither is required. C Code Example C.9 shows a function with no inputs or outputs. The keyword void before the function name indicates that nothing is returned. void between the parentheses indicates that the function has no input arguments.
Nothing between the parentheses also indicates no input arguments. So, in this case we could have written:
void printPrompt()
C Code Example C.9 Function printPrompt with no Inputs or Outputs
// Print a prompt to the console
void printPrompt(void)
{
printf("Please enter a number from 1-3:\n");
}
A function must be declared in the code before it is called. This may be done by placing the called function earlier in the file. For this reason, main is often placed at the end of the C file after all the functions it calls. Alternatively, a function prototype can be placed in the program before the function is defined. The function prototype is the first line of the function, declaring the return type, function name, and function inputs. For example, the function prototypes for the functions in C Code Examples C.8 and C.9 are:
int sum3(int a, int b, int c);
void printPrompt(void);
With careful ordering of functions, prototypes may be unnecessary. However, they are unavoidable in certain cases, such as when function f1 calls f2 and f2 calls f1. It is good style to place prototypes for all of a program’s functions near the beginning of the C file or in a header file.
C Code Example C.10 shows how function prototypes are used. Even though the functions themselves are after main, the function prototypes at the top of the file allow them to be used in main.
C Code Example C.10 Function Prototypes
#include <stdio.h>
// function prototypes
int sum3(int a, int b, int c);
void printPrompt(void);
int main(void)
{
int y = sum3(10, 15, 20);
printf("sum3 result: %d\n", y);
printPrompt();
}
int sum3(int a, int b, int c) {
int result = a+b+c;
return result;
}
void printPrompt(void) {
printf("Please enter a number from 1-3:\n");
}
sum3 result: 45
Please enter a number from 1-3:
As with variable names, function names are case sensitive, cannot be any of C’s reserved words, may not contain special characters (except underscore _), and cannot start with a number. Typically function names include a verb to indicate what they do.
Be consistent in how you capitalize your function and variable names so you don’t have to constantly look up the correct capitalization. Two common styles are to use camelCase, in which the initial letter of each word after the first is capitalized like the humps of a camel (e.g., printPrompt), or to use underscores between words (e.g., print_prompt). We have unscientifically observed that reaching for the underscore key exacerbates carpal tunnel syndrome (my pinky finger twinges just thinking about the underscore!) and hence prefer camelCase. But the most important thing is to be consistent in style within your organization.
The main function is always declared to return an int, which conveys to the operating system the reason for program termination. A zero indicates normal completion, while a nonzero value signals an error condition. If main reaches the end without encountering a return statement, it will automatically return 0. Most operating systems do not automatically inform the user of the value returned by the program.
C provides control-flow statements for conditionals and loops. Conditionals execute a statement only if a condition is met. A loop repeatedly executes a statement as long as a condition is met.
if, if/else, and switch/case statements are conditional statements commonly used in high-level languages including C.
An if statement executes the statement immediately following it when the expression in parentheses is TRUE (i.e., nonzero). The general format is:
if (expression)
statement
C Code Example C.11 shows how to use an if statement in C. When the variable aintBroke is equal to 1, the variable dontFix is set to 1. A block of multiple statements can be executed by placing curly braces {} around the statements, as shown in C Code Example C.12.
Curly braces, {}, are used to group one or more statements into a compound statement or block.
C Code Example C.11 if Statement
int dontFix = 0;
if (aintBroke == 1)
dontFix = 1;
C Code Example C.12 if Statement with A Block of Code
// If amt >= $2, prompt user and dispense candy
if (amt >= 2) {
printf("Select candy.\n");
dispenseCandy = 1;
}
if/else statements execute one of two statements depending on a condition, as shown below. When the expression in the if statement is TRUE, statement1 is executed. Otherwise, statement2 is executed.
if (expression)
statement1
else
statement2
C Code Example C.6(b) gives an example if/else statement in C. The code sets max equal to a if a is greater than b; otherwise max = b.
switch/case statements execute one of several statements depending on the conditions, as shown in the general format below.
switch (variable) {
case (expression1): statement1 break;
case (expression2): statement2 break;
case (expression3): statement3 break;
default: statement4
}
For example, if variable is equal to expression2, execution continues at statement2 until the keyword break is reached, at which point it exits the switch/case statement. If no conditions are met, the default executes.
If the keyword break is omitted, execution begins at the point where the condition is TRUE and then falls through to execute the remaining cases below it. This is usually not what you want and is a common error among beginning C programmers.
C Code Example C.13 shows a switch/case statement that, depending on the variable option, determines the amount of money amt to be disbursed. A switch/case statement is equivalent to a series of nested if/else statements, as shown by the equivalent code in C Code Example C.14.
C Code Example C.13 switch/case Statement
// Assign amt depending on the value of option
switch (option) {
case 1: amt = 100; break;
case 2: amt = 50; break;
case 3: amt = 20; break;
case 4: amt = 10; break;
default: printf("Error: unknown option.\n");
}
C Code Example C.14 Nested if/else Statement
// Assign amt depending on the value of option
if (option == 1) amt = 100;
else if (option == 2) amt = 50;
else if (option == 3) amt = 20;
else if (option == 4) amt = 10;
else printf("Error: unknown option.\n");
while, do/while, and for loops are common loop constructs used in many high-level languages including C. These loops repeatedly execute a statement as long as a condition is satisfied.
while loops repeatedly execute a statement until a condition is not met, as shown in the general format below.
while (condition)
    statement
The while loop in C Code Example C.15 computes the factorial of 9 = 9 × 8 × 7 × … × 1. Note that the condition is checked before executing the statement. In this example, the statement is a compound statement or block, so curly braces are required.
C Code Example C.15 while Loop
// Compute 9! (the factorial of 9)
int i = 1, fact = 1;
// multiply the numbers from 1 to 9
while (i < 10) { // while loops check the condition first
    fact *= i;
    i++;
}
do/while loops are like while loops, but the condition is checked only after the statement has executed once. The general format is shown below. Note that the condition is followed by a semicolon.
do
    statement
while (condition);
The do/while loop in C Code Example C.16 queries a user to guess a number. The program checks the condition (if the user’s number is equal to the correct number) only after the body of the do/while loop executes once. This construct is useful when, as in this case, something must be done (for example, the guess retrieved from the user) before the condition is checked.
C Code Example C.16 do/while Loop
// Query user to guess a number and check it against the correct number.
#define MAXGUESSES 3
#define CORRECTNUM 7
int guess, numGuesses = 0;
do {
    printf("Guess a number between 0 and 9. You have %d more guesses.\n",
           (MAXGUESSES-numGuesses));
    scanf("%d", &guess); // read user input
    numGuesses++;
} while ( (numGuesses < MAXGUESSES) && (guess != CORRECTNUM) );
// do loop checks the condition after the first iteration
if (guess == CORRECTNUM)
    printf("You guessed the correct number!\n");
for loops, like while and do/while loops, repeatedly execute a statement until a condition is not satisfied. However, for loops add support for a loop variable, which typically keeps track of the number of loop executions. The general format of the for loop is
for (initialization; condition; loop operation)
    statement
The initialization code executes only once, before the for loop begins. The condition is tested at the beginning of each iteration of the loop. If the condition is not TRUE, the loop exits. The loop operation executes at the end of each iteration. C Code Example C.17 shows the factorial of 9 computed using a for loop.
C Code Example C.17 for Loop
// Compute 9!
int i; // loop variable
int fact = 1;
for (i=1; i<10; i++)
    fact *= i;
Whereas the while and do/while loops in C Code Examples C.15 and C.16 include code for incrementing and checking the loop variable i and numGuesses, respectively, the for loop incorporates those statements into its format. A for loop could be expressed equivalently, but less conveniently, as
initialization;
while (condition) {
    statement
    loop operation;
}
Control-flow statements: C provides control-flow statements for conditional statements and loops.
Conditional statements: Conditional statements execute a statement when a condition is TRUE. C includes the following conditional statements: if, if/else, and switch/case.
Loops: Loops repeatedly execute a statement until a condition is FALSE. C provides while, do/while, and for loops.
Beyond various sizes of integers and floating-point numbers, C includes other special data types including pointers, arrays, strings, and structures. These data types are introduced in this section along with dynamic memory allocation.
A pointer is the address of a variable. C Code Example C.18 shows how to use pointers. salary1 and salary2 are variables that can contain integers, and ptr is a variable that can hold the address of an integer. The compiler will assign arbitrary locations in RAM for these variables depending on the runtime environment. For the sake of concreteness, suppose this program is compiled on a 32-bit system with salary1 at addresses 0x70-73, salary2 at addresses 0x74-77, and ptr at 0x78-7B. Figure C.3 shows memory and its contents after the program is executed.
Figure C.3 Contents of memory after C Code Example C.18 executes shown (a) by value and (b) by byte using little-endian memory
In a variable declaration, a star (*) before a variable name indicates that the variable is a pointer to the declared type. In using a pointer variable, the * operator dereferences a pointer, returning the value stored at the indicated memory address contained in the pointer. The & operator is pronounced “address of,” and it produces the memory address of the variable being referenced.
Dereferencing a pointer to a non-existent memory location or an address outside of the range accessible by the program will usually cause a program to crash. The crash is often called a segmentation fault.
Pointers are particularly useful when a function needs to modify a variable, instead of just returning a value. Because functions can’t modify their inputs directly, a function can make the input a pointer to the variable. This is called passing an input variable by reference, instead of by value as in the prior examples. C Code Example C.19 gives an example of passing x by reference so that quadruple can modify the variable directly.
C Code Example C.18 Pointers
// Example pointer manipulations
int salary1, salary2; // 32-bit numbers
int *ptr; // a pointer specifying the address of an int variable
salary1 = 67500; // salary1 = $67,500 = 0x000107AC
ptr = &salary1; // ptr = 0x0070, the address of salary1
salary2 = *ptr + 1000; /* dereference ptr to give the contents of address 70 = $67,500,
then add $1,000 and set salary2 to $68,500 */
C Code Example C.19 Passing an Input Variable by Reference
// Quadruple the value pointed to by a
#include <stdio.h>
void quadruple(int *a)
{
    *a = *a * 4;
}
int main(void)
{
    int x = 5;
    printf("x before: %d\n", x);
    quadruple(&x);
    printf("x after: %d\n", x);
    return 0;
}
x before: 5
x after: 20
A pointer to address 0 is called a null pointer and indicates that the pointer is not actually pointing to meaningful data. It is written as NULL in a program.
An array is a group of similar variables stored in consecutive addresses in memory. The elements are numbered from 0 to N−1, where N is the size of the array. C Code Example C.20 declares an array variable called scores that holds the final exam scores for three students. Memory space is reserved for three longs, that is, 3 × 4 = 12 bytes. Suppose the scores array starts at address 0x40. The address of the 1st element (i.e., scores[0]) is 0x40, the 2nd element is 0x44, and the 3rd element is 0x48, as shown in Figure C.4. In C, the array variable, in this case scores, is a pointer to the 1st element. It is the programmer’s responsibility not to access elements beyond the end of the array. C has no internal bounds checking, so a program that writes beyond the end of an array will compile fine but may stomp on other parts of memory when it runs.
Figure C.4 scores array stored in memory
C Code Example C.20 Array Declaration
long scores[3]; // array of three 4-byte numbers
The elements of an array can be initialized either at declaration using curly braces {}, as shown in C Code Example C.21, or individually in the body of the code, as shown in C Code Example C.22. Each element of an array is accessed using brackets []. The contents of memory containing the array are shown in Figure C.4. Array initialization using curly braces {} can only be performed at declaration, and not afterward. for loops are commonly used to assign and read array data, as shown in C Code Example C.23.
C Code Example C.21 Array Initialization at Declaration Using { }
long scores[3]={93, 81, 97}; // scores[0]=93; scores[1]=81; scores[2]=97;
C Code Example C.22 Array Initialization Using Assignment
long scores[3];
scores[0] = 93;
scores[1] = 81;
scores[2] = 97;
C Code Example C.23 Array Initialization Using a for Loop
// User enters 3 student scores into an array
long scores[3];
int i, entered;
printf("Please enter the student's 3 scores.\n");
for (i=0; i<3; i++) {
    printf("Enter a score and press enter.\n");
    scanf("%d", &entered);
    scores[i] = entered;
}
printf("Scores: %ld %ld %ld\n", scores[0], scores[1], scores[2]); // %ld because scores holds longs
When an array is declared, the length must be constant so that the compiler can allocate the proper amount of memory. However, when the array is passed to a function as an input argument, the length need not be defined because the function only needs to know the address of the beginning of the array. C Code Example C.24 shows how an array is passed to a function. The input argument arr is simply the address of the 1st element of an array. Often the number of elements in an array is also passed as an input argument. In a function, an input argument of type int[] indicates that it is an array of integers. Arrays of any type may be passed to a function.
C Code Example C.24 Passing an Array as an Input Argument
// Initialize a 4-element array, compute the mean, and print the result.
#include <stdio.h>
// Returns the mean value of an array (arr) of length len
float getMean(int arr[], int len) {
    int i;
    float mean, total = 0;
    for (i=0; i < len; i++)
        total += arr[i];
    mean = total / len;
    return mean;
}
int main(void) {
    int data[4] = {78, 14, 99, 27};
    float avg;
    avg = getMean(data, 4);
    printf("The average value is: %f.\n", avg);
    return 0;
}
The average value is: 54.500000.
An array argument is equivalent to a pointer to the beginning of the array. Thus, getMean could also have been declared as
float getMean(int *arr, int len);
Although functionally equivalent, datatype[] is the preferred method for passing arrays as input arguments because it more clearly indicates that the argument is an array.
A function is limited to a single output, i.e., return variable. However, by receiving an array as an input argument, a function can essentially output more than a single value by changing the array itself. C Code Example C.25 sorts an array from lowest to highest and leaves the result in the same array. The three function prototypes below are equivalent. The length of an array in a function declaration is ignored.
void sort(int *vals, int len);
void sort(int vals[], int len);
void sort(int vals[100], int len);
C Code Example C.25 Passing an Array and its Size as Inputs
// Sort the elements of the array vals of length len from lowest to highest
void sort(int vals[], int len)
{
    int i, j, temp;
    for (i=0; i<len; i++) {
        for (j=i+1; j<len; j++) {
            if (vals[i] > vals[j]) {
                temp = vals[i];
                vals[i] = vals[j];
                vals[j] = temp;
            }
        }
    }
}
Arrays may have multiple dimensions. C Code Example C.26 uses a two-dimensional array to store the grades across eight problem sets for ten students. Recall that initialization of array values using {} is only allowed at declaration.
C Code Example C.26 Two-Dimensional Array Initialization
// Initialize 2-D array at declaration
int grades[10][8] = { {100, 107, 99, 101, 100, 104, 109, 117},
                      {103, 101, 94, 101, 102, 106, 105, 110},
                      {101, 102, 92, 101, 100, 107, 109, 110},
                      {114, 106, 95, 101, 100, 102, 102, 100},
                      {98, 105, 97, 101, 103, 104, 109, 109},
                      {105, 103, 99, 101, 105, 104, 101, 105},
                      {103, 101, 100, 101, 108, 105, 109, 100},
                      {100, 102, 102, 101, 102, 101, 105, 102},
                      {102, 106, 110, 101, 100, 102, 120, 103},
                      {99, 107, 98, 101, 109, 104, 110, 108} };
C Code Example C.27 shows some functions that operate on the 2-D grades array from C Code Example C.26. Multi-dimensional arrays used as input arguments to a function must define all but the first dimension. Thus, the following two function prototypes are acceptable:
void print2dArray(int arr[10][8]);
void print2dArray(int arr[][8]);
C Code Example C.27 Operating on Multi-Dimensional Arrays
#include <stdio.h>
// Print the contents of a 10 × 8 array
void print2dArray(int arr[10][8])
{
    int i, j;
    for (i=0; i<10; i++) { // for each of the 10 students
        printf("Row %d ", i);
        for (j=0; j<8; j++) {
            printf("%d ", arr[i][j]); // print scores for all 8 problem sets
        }
        printf("\n"); // end the row
    }
}
// Calculate the mean score of a 10 × 8 array
float getMean(int arr[10][8])
{
    int i, j;
    float mean, total = 0;
    // get the mean value across a 2D array
    for (i=0; i<10; i++) {
        for (j=0; j<8; j++) {
            total += arr[i][j]; // sum array values
        }
    }
    mean = total/(10*8);
    printf("Mean is: %f\n", mean);
    return mean;
}
Note that because an array is represented by a pointer to the initial element, C cannot copy or compare arrays using the = or == operators. Instead, you must use a loop to copy or compare each element one at a time.
A character (char) is an 8-bit variable. It can be viewed either as a 2’s complement number between −128 and 127 or as an ASCII code for a letter, digit, or symbol. ASCII characters can be specified as a numeric value (in decimal, hexadecimal, etc.) or as a printable character enclosed in single quotes. For example, the letter A has the ASCII code 0x41, B=0x42, etc. Thus 'A' + 3 is 0x44, or 'D'. Table 6.2 on page 323 lists the ASCII character encodings, and Table C.4 lists characters used to indicate formatting or special characters. Formatting codes include carriage return (\r), newline (\n), horizontal tab (\t), and the end of a string (