You’ll often need to store many data values of a particular kind in your programs. In a program to track the performance of a basketball team, you might want to store the scores for a season of games and the scores for individual players. You could then output the scores for a particular player over the season or work out an ongoing average as the season progresses. Armed with what you’ve learned so far, you could write a program that does this using a different variable for each score. However, if there are a lot of games in the season, this will be rather tedious because you’ll need as many variables for each player as there are games. All your basketball scores are really the same kind of thing. The values are different, but they’re all basketball scores. Ideally, you want to be able to group these values together under a single name—perhaps the name of the player—so that you wouldn’t need separate variables for each item of data.
This chapter will show you how to do just that using arrays. I’ll then show you how powerful referencing a set of values through a single name can be when you process arrays.
In this chapter, you’ll learn
What arrays are
How to use arrays in your programs
How memory is used by an array
What a multidimensional array is
How to write a program to work out your hat size
How to write a game of tic-tac-toe
An Introduction to Arrays
The best way to show you what an array is and how powerful it can be is to go through an example. This will demonstrate how much easier a program becomes when you use an array. For this example, you’ll look at ways in which you can find the average grade score for the students in a class.
Programming Without Arrays
If you’re interested only in the average, then you don’t have to remember what the previous grades were. You accumulate the sum of all the values, which you then divide by count, which has the value 10. This program uses a single variable, grade, to store each grade as it is entered within the loop. The loop repeats for values of i from 0 to 9, so there are ten iterations.
Let’s assume you want to develop this into a more sophisticated program in which you need to store the values entered. Perhaps you want to output each person’s grade, with the average grade next to it. In the previous program, you had only one variable. Each time you add a grade, the old value is overwritten, and you can’t get it back.
This is more or less okay for ten students, but what if your class has 30 students or 100 or 1,000? How can you do it then? This approach would become wholly impractical, and an alternative mechanism is essential.
What Is an Array?
The number between square brackets defines how many elements the array contains and is called the array dimension. An array has a type, which is a combination of the element type and the number of elements in the array. Thus, two arrays are of the same type if they have the same number of elements of the same type.
Don’t forget, index values start from zero, not one. It’s a common mistake to assume that they start from one when you’re working with arrays for the first time, and this is sometimes referred to as the off-by-one error. In a ten-element array, the index value for the last element is 9. To access the fourth value in your array, you use the expression numbers[3]. You can think of the index value for an array element as the offset from the first element. The first element is at the start of the array, so its offset is 0. The second element is offset by 1 from the first element, the third element is offset by 2 from the first element, and so on.
You can specify an index for an array element by an expression in the square brackets following the array name. The expression must result in an integer value that corresponds to one of the possible index values. For example, you could write numbers[i-2]. If i is 3, this accesses numbers[1], the second element in the array. Thus, you can use a simple integer to explicitly reference the element that you want to access, or you can use an integer expression that’s evaluated during the execution of the program. When you use an expression, the only constraints are that it must produce an integer result and the result must be a legal index value for the array.
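To make the indexing rules concrete, here is a minimal sketch. The function name element_two_back and the idea of wrapping the access in a function (functions are covered later in the book) are assumptions made purely for illustration, so that the expression can be tried with different values of i.

```c
// Hypothetical helper: returns the element two positions before index i.
// The caller must make sure that i - 2 is a legal index for the array.
long element_two_back(const long numbers[], int i)
{
  return numbers[i - 2];          // With i equal to 3, this is numbers[1]
}
```

With a ten-element array, calling the function with i equal to 3 accesses the second element, and calling it with i equal to 11 accesses the last element, numbers[9].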
Note that if you use an expression for an index value that’s outside the legal range for the array, the program won’t work properly. The compiler generally can’t check for this, so your program will still compile, but execution is likely to be less than satisfactory. You may pick up a junk value from somewhere so that the results are incorrect and may vary from one run to the next. It’s possible that the program may overwrite something important and lock up your computer, so a reboot becomes necessary. It is also possible that the effect will be much more subtle, with the program sometimes working and sometimes not, or the program may appear to work while producing results that are wrong but not obviously so. It is therefore most important to ensure that your array indexes are always within bounds.
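Because the language does not check indexes for you, one defensive habit is to test an index before using it. The following is a minimal sketch; the helper name index_in_bounds is hypothetical and not part of the chapter's programs.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: true when index is a legal index for an array
// with count elements, that is, 0 <= index < count.
bool index_in_bounds(long index, size_t count)
{
  return index >= 0 && (size_t)index < count;
}
```

For a ten-element array, the helper accepts the values 0 through 9 and rejects everything else, including negative values.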
Using an Array
Let’s put what you’ve just learned about arrays into practice in calculating average grades.
How It Works
The count variable is type unsigned int because it must be nonnegative.
The for loop is in the standard form with the loop continuing as long as i is less than the limit, count. Because the loop counts from 0 to 9, rather than from 1 to 10, you can use the loop variable i directly to reference each of the members of the array. The printf() call outputs the current value of i + 1 followed by >, so it has the effect you see in the output. By using %2u as the format specifier, you ensure that each value is output in a two-character field, so the numbers are aligned. If you used %u instead, the output for the tenth value would be out of alignment.
You read each grade into element i of the array using the scanf() function; the first value will be stored in grades[0], the second number entered will be stored in grades[1], and so on up to the tenth value entered, which will be stored in grades[9]. You add each grade value to sum on each loop iteration.
You’ve calculated the average by dividing sum by count, the number of grades. Notice how you convert sum (which is type long) to type float in the call to printf(). This conversion ensures that the division is done using floating-point values, so you don’t discard any fractional part of the result. The format specification, %.2f, limits the output value for the average to two decimal places.
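The summing loop and the cast described above can be sketched as follows. This is not the chapter's program, which reads the grades with scanf(); here the grades are assumed to be in an array already, and the function name average_grade is hypothetical.

```c
// Hypothetical helper: averages count grades held in an array.
// count must be greater than zero.
float average_grade(const int grades[], unsigned int count)
{
  long sum = 0L;
  for(unsigned int i = 0 ; i < count ; ++i)
    sum += grades[i];             // Accumulate each element in turn

  return (float)sum/count;        // The cast keeps the fractional part
}
```

Without the cast, sum/count would be an integer division, and an average such as 87.5 would be truncated to 87.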
How It Works
This for loop steps through the elements in the array and outputs each value. You use the loop control variable both to produce the sequence number of each element and to access the corresponding array element. The values obviously correspond to the numbers you typed in. Because i runs from 0 to 9, you use the expression i + 1 in the output statement so that the grades are numbered from 1 to 10.
Before I go any further with arrays, I’ll explain a bit more about the address of operator and how arrays are stored in memory.
The Address of Operator
The address of operator, &, produces the address in memory of its operand. You have been using the address of operator extensively with the scanf() function. You’ve been using it as a prefix to the name of the variable where the input is to be stored. This makes the address that the variable occupies available to scanf(), which allows the function to store the data entered from the keyboard in the variable. When you use the variable name by itself as an argument to a function, only the value stored in the variable is available to the function. Prefixing the variable name with the address of operator makes the address of the variable available to the function. This enables the function to modify the value that’s stored in the variable. You will learn why this is so in Chapter 8. Let’s see what some addresses look like.
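A minimal sketch of outputting addresses is shown here. The function name show_addresses and the choice of variables are assumptions for illustration; the %p specifier expects an argument of type void*, which is why each address is cast.

```c
#include <stdio.h>

// Prints the addresses of three local variables. Returns the number of
// characters written, or a negative value if the output failed.
int show_addresses(void)
{
  long a = 1L;
  long b = 2L;
  double d = 4.0;

  return printf("a is at %p, b is at %p, d is at %p\n",
                (void*)&a, (void*)&b, (void*)&d);
}
```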
The addresses that you get will almost certainly be different from these. The addresses depend on the operating system you’re using and how your compiler allocates memory.
How It Works
You use %u for the value produced by sizeof because it will be an unsigned integer value. You use a new format specifier, %p, to output the addresses of the variables. This format specifier is for outputting a memory address, and the value is presented in hexadecimal format. A memory address is typically 32 or 64 bits, and the size of the address will determine the maximum amount of memory that can be referenced. A memory address on my computer is 64 bits and is presented as 16 hexadecimal digits; on your machine, it may be different.
There’s a gap between the locations of the variables d and c in Figure 5-2. Why is this? Many compilers allocate space for variables at addresses that are a multiple of their size, so 4-byte variables are at addresses that are a multiple of 4, and 8-byte variables are at addresses that are a multiple of 8. This ensures that accessing the memory is done most efficiently. My compiler left the 4-byte gap between d and c to make the address of d a multiple of 8. If the program defined another variable of type long following c, it would occupy the 4-byte gap, and no gap would be apparent.
Caution If the output shows that the addresses for the variables of the same type are separated by greater amounts than the size for the type, it is most likely because you compiled the program as a debug version. In debug mode, your compiler allocates extra space to store additional information about the variable that will be used when you’re executing the program in debug mode.
Arrays and Addresses
The array name, number, identifies the address in memory where the array elements are stored. The specific location of an element is found by combining the address corresponding to the array name with the index value, because the index value represents the offset of a number of elements from the beginning of the array.
The value of i is displayed between the square brackets following the array name. You can see that the address of each element is 4 greater than the previous element, so each element occupies 4 bytes.
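The relationship between an index and an address can also be checked in code. The sketch below measures the distance in bytes between element i and the first element; it relies on pointer casts that are explained later in the book, and the function name byte_offset is hypothetical.

```c
#include <stddef.h>

// Hypothetical helper: number of bytes between element i and element 0.
size_t byte_offset(const int number[], size_t i)
{
  return (size_t)((const char*)&number[i] - (const char*)&number[0]);
}
```

The result is always i multiplied by sizeof(int), so on a machine where int occupies 4 bytes, successive elements are 4 bytes apart.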
Initializing an Array
This declares the values array with five elements. The elements are initialized with values[0] having the value 1.5, values[1] having the initial value 2.5, and so on.
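If you supply fewer initial values than there are elements, for example three values for a five-element array, the first three elements will be initialized with the values between braces, and the last two elements will be initialized with 0.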
The entire array will then be initialized with 0.0.
The size of the array is determined by the number of initial values in the list, so the primes array will have ten elements.
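The initialization rules in this section can be sketched as follows. The function names are hypothetical, and the ten values in primes are an assumption (the first ten prime numbers). The second function uses the sizeof technique that is explained in the next section.

```c
#include <stddef.h>

// A five-element array with only three initial values: the two
// elements without an initial value are set to 0.
double last_of_partial(void)
{
  double values[5] = {1.5, 2.5, 3.5};
  return values[4];
}

// No dimension is given, so the compiler counts the initial values.
size_t deduced_count(void)
{
  int primes[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29};
  return sizeof primes/sizeof primes[0];
}
```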
Finding the Size of an Array
The parentheses around the type name following the sizeof operator are required. If you leave them out, the code won’t compile. As you know, you can also apply the sizeof operator to a variable, and it will compute the number of bytes occupied by the variable.
The sizeof operator produces a value of type size_t, which is an implementation-defined unsigned integer type. If you use the %u specifier for output and your compiler happens to define size_t as unsigned long or unsigned long long, you may get warning messages from the compiler that the specifier %u does not match the value being output by the printf() function. Using %zu will eliminate the warning messages.
After executing this statement, element_count will contain the number of elements in the values array. I declared element_count to be type size_t because that is the type that the sizeof operator produces.
The result is the same as before because the array is of type double, so sizeof(double) produces the number of bytes occupied by an element. There’s the risk that you might accidentally use the wrong type, so it’s better to use the former statement in practice.
This loop totals the values of all the array elements. Using sizeof to compute the number of elements in the array ensures that the upper limit for the loop variable, i, is always correct, whatever the size of the array.
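Putting the pieces of this section together, a minimal sketch might look like this. The function name sum_values and the initial values are assumptions for illustration.

```c
#include <stddef.h>

// Totals a local array; the loop limit is computed from the array itself.
double sum_values(void)
{
  double values[] = {1.5, 2.5, 3.5, 4.5, 5.5};
  size_t element_count = sizeof values/sizeof values[0];

  double sum = 0.0;
  for(size_t i = 0 ; i < element_count ; ++i)
    sum += values[i];

  return sum;
}
```

Adding or removing initial values changes element_count automatically. Note that this technique only works where the array itself is defined; inside a function that receives an array as a parameter, sizeof no longer produces the size of the whole array.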
Multidimensional Arrays
This declares the carrots array with 25 sets of 50 floating-point elements. Note how each dimension is between its own pair of square brackets.
There’s another way of looking at a two-dimensional array. Figure 5-4 also illustrates how you can envision a two-dimensional array as a one-dimensional array of elements, where each element is itself a one-dimensional array. You can view the numbers array as a one-dimensional array of three elements, where each element is an array containing five elements of type float. The first row of five is located at the address labeled numbers[0], the next row at numbers[1], and the last row of five elements at numbers[2].
Because the array elements are of type float, which on my machine occupies 4 bytes, the total memory occupied by this array on my computer will be 3 × 5 × 4 bytes, which amounts to a total of 60 bytes.
This declares an array with 800 elements. You can visualize it as storing yields from bean plants where there are four fields, each containing 10 rows of 20 plants. I’m sure you can see that the idea can be extended to define arrays with as many dimensions as you require.
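You can confirm the element counts and sizes of multidimensional arrays with sizeof. In this sketch, the array name beans and both function names are assumptions for illustration.

```c
#include <stddef.h>

// Total number of elements in a 4 x 10 x 20 array of double.
size_t beans_element_count(void)
{
  double beans[4][10][20];        // 4 x 10 x 20 = 800 elements
  return sizeof beans/sizeof beans[0][0][0];
}

// Number of bytes occupied by a 3 x 5 array of float.
size_t numbers_bytes(void)
{
  float numbers[3][5];            // 15 elements
  return sizeof numbers;
}
```

On a machine where float occupies 4 bytes, numbers_bytes() returns 60.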
Initializing Multidimensional Arrays
Each set of values that initializes the elements in a row is between braces, and the whole lot goes between another pair of braces. The values for a row are separated by commas, and each set of row values is separated from the next set by a comma.
As you can see, the initializing values are between an outer pair of braces that enclose two blocks of three rows, each between braces. Each row is also between braces, so you have three levels of nested braces for a three-dimensional array. This is true generally; for instance, a six-dimensional array will have six levels of nested braces enclosing the initial values for the elements. You can omit the braces around the list for each row, and the initialization will still work; but including the braces for the row values is much safer because you are much less likely to make a mistake. Of course, if you want to supply fewer initial values than there are elements in a row, you must include the braces around the row values.
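The nesting of braces is easier to see in code. This is a sketch with made-up initial values and hypothetical function names; the second function shows why the braces around a row matter when you supply fewer values than the row has elements.

```c
// Three levels of braces for a three-dimensional array: the outer pair,
// one pair per block of rows, and one pair per row.
int element_at(int block, int row, int column)
{
  int numbers[2][3][4] = {
                           {                        // First block of 3 rows
                             { 10, 20, 30, 40 },
                             { 15, 25, 35, 45 },
                             { 47, 48, 49, 50 }
                           },
                           {                        // Second block of 3 rows
                             { 11, 21, 31, 41 },
                             { 16, 26, 36, 46 },
                             { 51, 52, 53, 54 }
                           }
                         };
  return numbers[block][row][column];
}

// Each row has fewer values than elements, so the row braces are required.
// Elements without an initial value are set to 0.
int partial_element(int row, int column)
{
  int partial[2][3] = { { 1 }, { 4, 5 } };
  return partial[row][column];
}
```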
Each loop iterates over one array dimension. For each value of i, the loop controlled by j will execute completely, and for each value of j, the loop controlled by k will execute completely.
You can visualize the numbers array as an array of two-dimensional arrays. The expression sizeof(numbers) results in the number of bytes that the entire numbers array occupies, and sizeof(numbers[0]) results in the number of bytes occupied by one of the two-dimensional subarrays. Thus, the expression sizeof(numbers)/sizeof(numbers[0]) is going to result in the number of elements in the first array dimension. Similarly, you can visualize each two-dimensional subarray as a one-dimensional array of one-dimensional arrays. Dividing the size of a two-dimensional array by the size of one of its subarrays results in the number of subarrays, which is the second numbers dimension. Finally, dividing the size of the one-dimensional sub-subarray by the size of one element results in the third dimension value.
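Here is a minimal sketch of nested loops whose limits are all computed with sizeof, as just described. The initial values and the function name sum_all are assumptions for illustration.

```c
#include <stddef.h>

// Sums every element of a 2 x 3 x 4 array. Each loop limit is derived
// from the array with sizeof, so no dimension is written twice.
int sum_all(void)
{
  int numbers[2][3][4] = {
    { { 1,  2,  3,  4}, { 5,  6,  7,  8}, { 9, 10, 11, 12} },
    { {13, 14, 15, 16}, {17, 18, 19, 20}, {21, 22, 23, 24} }
  };

  int sum = 0;
  for(size_t i = 0 ; i < sizeof numbers/sizeof numbers[0] ; ++i)
    for(size_t j = 0 ; j < sizeof numbers[0]/sizeof numbers[0][0] ; ++j)
      for(size_t k = 0 ; k < sizeof numbers[0][0]/sizeof numbers[0][0][0] ; ++k)
        sum += numbers[i][j][k];

  return sum;
}
```

If you change any of the three dimensions in the declaration, the loops adapt without further edits.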
How It Works
Before I start discussing this example, I should give you a word of caution. Do not allow large football players to use it to determine their hat size unless they’re known for their sense of humor.
The example looks a bit complicated because of the nature of the problem, but it does illustrate using arrays. Let’s go through what’s happening.
Apart from hats that are designated as “one size fits all” or as small, medium, and large, hats are typically available in sizes from 6 1/2 to 7 7/8 in increments of 1/8 inch. The size array shows one way in which you could store such sizes in the program. This array corresponds to 12 possible hat sizes, each of which comprises three values. For each hat size, you store three characters, making it more convenient to output the fractional sizes. The smallest hat size is 6 1/2, so the first three characters corresponding to the first size are in size[0][0], size[1][0], and size[2][0]. They contain the characters '6', '1', and '2', representing the size 6 1/2. The biggest hat size is 7 7/8, and it’s stored in the elements size[0][11], size[1][11], size[2][11].
The values in the array are all whole eighths of an inch. They correspond to the values in the size array containing the hat sizes. This means that a head size of 164 eighths of an inch (20.5 inches) will give a hat size of 6 1/2, and at the other end of the scale, 197 eighths corresponds to a hat size of 7 7/8.
Notice that the head sizes don’t run consecutively. You could get a head size of 171, for example, which doesn’t fall into a definite hat size. You need to be aware of this later in the program so that you can decide which is the closest hat size for the head size.
Notice that cranium is declared as type float, but your_head is type int. This becomes important later. You declare the variable hat_found as type bool so you use the symbol false to initialize this. The hat_found variable will record when you have found a size that fits.
Because cranium contains the circumference of a head in inches, multiplying by 8.0f results in the number of eighths of an inch that that represents. Thus, the value stored in your_head will then be in the same units as the values stored in the array headsize. Note that you need to cast the result of the multiplication to type int here to avoid a warning message from the compiler. The code will still work if you omit the cast, but the compiler must then insert the cast to type int. The warning is because this cast potentially loses information. The parentheses around the expression (8.0f*cranium) are also necessary; without them, you would only cast the value 8.0f to type int, not the whole expression.
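The conversion just described can be isolated in a small sketch. The function name to_eighths is hypothetical; the expression is the one discussed above.

```c
// Converts a head circumference in inches to whole eighths of an inch.
// Any fraction of an eighth is discarded by the cast.
int to_eighths(float cranium)
{
  return (int)(8.0f*cranium);     // The cast applies to the whole product
}
```

A circumference of 20.5 inches produces 164, and 24.625 inches produces 197. Writing (int)8.0f*cranium instead would convert only 8.0f to an integer, and the multiplication would still produce a value of type float.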
You declare the loop index, i, before the loop because you want to use the value outside the loop. The if-else statement first checks for the head size matching the first headsize array element, in which case you have found it. When this is not the case, the for loop is executed. The loop index runs from the second element in the array to the last element. This is necessary because you use i - 1 to index the headsize array in the if expression. On each loop iteration, you compare your head size with a pair of successive values stored in the headsize array to find the element value that is greater than or equal to your input size with the preceding value less than your input size. The index found will correspond to the hat size that fits.
As I said earlier, the hat sizes are stored in the array size as characters to simplify the outputting of fractions. The printf() uses the conditional operator to decide when to output a space and when to output a slash (/) for the fractional output value. For example, the fifth element of the headsize array corresponds to a hat size of exactly 7. You want to output 7 rather than 7/. Therefore, you customize the printf() output depending on whether or not the size[1][i] element contains a space character. In this way, you omit the slash for any size where the numerator of the fractional part is a space, so this will still work even if you add new sizes to the array.
If the value in your_head is less than the first headsize element, the head is too small for the available hats; otherwise, it must be too large.
Remember that if you lie about the size of your head when you use this program, your hat won’t fit. The more mathematically astute, and any hatters reading this book, will appreciate that the hat size is simply the diameter of a notionally circular head. Therefore, if you have the circumference of your head in inches, you can produce your hat size by dividing this value by π.
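As a quick check of that relationship, here is a minimal sketch. The function name hat_size and the truncated value used for π are assumptions for illustration.

```c
// Hat size as the diameter, in inches, of a circle with the given
// circumference, also in inches.
double hat_size(double circumference)
{
  const double pi = 3.14159265358979;
  return circumference/pi;
}
```

A circumference of 20.5 inches gives a diameter of a little over 6.5, which agrees with the smallest hat size of 6 1/2.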
Constant Arrays
So far, the only constants you have seen are individual values such as integers. Immutable data can be a beneficial and safe way to code (it is popular in functional programming). C is not a functional language, but you can achieve the same behavior for an array by defining it with the const keyword.
The const keyword can also be used to define a constant string (for more details on strings, see the next chapter).
This means that if you try to modify a value after the definition, the compiler reports an error (see Program 5.6a).
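A minimal sketch of a constant array follows. It is not Program 5.6a; the array name limits, its values, and the function name are assumptions for illustration.

```c
// Sums the elements of a read-only array.
int sum_of_limits(void)
{
  const int limits[3] = {10, 20, 30};
  // limits[0] = 15;   // If uncommented, this line will not compile

  int sum = 0;
  for(int i = 0 ; i < 3 ; ++i)
    sum += limits[i];             // Reading the elements is always allowed

  return sum;
}
```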
Another practical example is to give relative values to chess pieces and add them up, retrieving a final score to choose which is the best move (a technique for pruning in a search tree algorithm).
The value can vary depending on the state of the game (opening, middle game, ending). We will use the most traditional values:
Then we can iterate the chessboard and check the final score.
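The idea can be sketched as follows, using the traditional values of 1 for a pawn, 3 for a knight or bishop, 5 for a rook, and 9 for a queen. The piece codes, the names piece_value and material_score, and the board representation (an 8 × 8 array holding the codes of one side's pieces) are all assumptions for illustration.

```c
#include <stddef.h>

// Hypothetical piece codes, used as indexes into the piece_value array.
enum { EMPTY, PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING, PIECE_KINDS };

// Traditional relative values. The king is given 0 here because it is
// never exchanged.
const int piece_value[PIECE_KINDS] = {0, 1, 3, 3, 5, 9, 0};

// Adds up the values of the pieces on the board. Every element of board
// must hold one of the piece codes above.
int material_score(const int board[8][8])
{
  int score = 0;
  for(size_t row = 0 ; row < 8 ; ++row)
    for(size_t column = 0 ; column < 8 ; ++column)
      score += piece_value[board[row][column]];

  return score;
}
```

For a side that still has all its pieces, the score is 8 for the pawns plus 31 for the other pieces, giving 39.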
Variable-Length Arrays
In this fragment, you read a value from the keyboard into size. The value of size is then used to specify the dimension for the values array. Because size_t is an implementation-defined integer type, you may get compiler warnings, or incorrect input, if you try to read such a value using %d. The z in %zd is a length modifier that tells scanf() that the argument has the same width as size_t, so the value is read correctly whatever integer type size_t is. Strictly speaking, %zu is the exact match for the unsigned type size_t, and %zd corresponds to its signed counterpart.
Here you read both dimensions for a two-dimensional array from the keyboard. Both array dimensions are determined at execution time.
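A minimal sketch of a variable-length array is shown here. To keep it independent of keyboard input, the size arrives as a function argument; the function name sum_first is hypothetical, and the code assumes a compiler that supports variable-length arrays.

```c
#include <stddef.h>

// Stores 1.0, 2.0, ... in an array whose dimension is known only at
// execution time and returns the total. size must be at least 1.
double sum_first(size_t size)
{
  double values[size];            // Variable-length array
  for(size_t i = 0 ; i < size ; ++i)
    values[i] = (double)(i + 1);

  double sum = 0.0;
  for(size_t i = 0 ; i < size ; ++i)
    sum += values[i];

  return sum;
}
```

A variable-length array cannot have an initializer list, which is why the elements are set in a loop.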
This uses preprocessor directives that you’ll learn about in Chapter 13. The printf() statement and the following exit() statement will be included in the program if the symbol __STDC_NO_VLA__ is defined. If you place this fragment at the beginning of main() and variable-length arrays are not supported, you’ll see a message from the printf() function call, and the program will end immediately. Visual Studio 2019 is an example of a compiler that does not support variable-length arrays; there, the constant __STDC_NO_VLA__ is 1.
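The same test can be written so that it reports the result instead of ending the program. This is a sketch only; the function name vla_supported is hypothetical.

```c
#include <stdbool.h>

// Reports whether this compiler claims to support variable-length arrays.
bool vla_supported(void)
{
#ifdef __STDC_NO_VLA__
  return false;                   // The symbol is defined: no VLA support
#else
  return true;
#endif
}
```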
You can see a variable-length, one-dimensional array working in a revised version of Program 5.3 as shown in Program 5.7.
How It Works
Obviously, the value for the array dimension must be defined prior to this statement.
The rest of the code is what you have seen before, except that input and output of the size_t values use the %zd specifier. Note how the remainder operator is used in the loop that outputs the grades to start a new line after every fifth output value.
The Microsoft Windows command line may be too narrow to display five grades. If so, you can output fewer per line by changing the code, or you can change the default size of the window by clicking the icon at the left of the title bar and selecting Properties from the menu.
Designing a Program
Now that you’ve learned about arrays, let’s see how you can apply them in a bigger problem. Let’s try writing another game—tic-tac-toe—also known as noughts and crosses.
The Problem
Implementing the game with the computer as the adversary is beyond what I have covered up to now, so you are just going to write a program that allows two people to play tic-tac-toe.
The Analysis
A 3 × 3 grid in which to store the turns of the two players: That’s easy. You can use a two-dimensional array with three rows of three elements.
A simple way for a player to select a square on his or her turn: You can label the nine squares with digits from 1 to 9. A player will just need to enter the number of the square to select it.
A way to get the two players to take alternate turns: You can identify the two players as 1 and 2, with player 1 going first. You can then determine the player number by the number of the turn. On odd-numbered turns, it’s player 1. On even-numbered turns, it’s player 2.
Some way of specifying where to place the player symbol on the grid and checking to see if it’s a valid selection: A valid selection is a digit from 1 to 9. If you label the first row of squares with 1, 2, and 3, the second row with 4, 5, and 6, and the third row with 7, 8, and 9, you can calculate a row index and a column index from the square number. Let’s assume the player’s choice is stored in a variable, choice.
If you subtract 1 from the player’s chosen square number in choice, the square numbers are effectively 0 through 8:
0 | 1 | 2
3 | 4 | 5
6 | 7 | 8
Then the expression choice/3 gives the row number for each square:
0 | 0 | 0
1 | 1 | 1
2 | 2 | 2
The expression choice%3 gives the column number:
0 | 1 | 2
0 | 1 | 2
0 | 1 | 2
A method of finding out if one of the players has won: After each turn, you must check to see if any row, column, or diagonal in the board grid contains identical symbols. If it does, the last player has won.
A way to detect the end of the game: Because the board has nine squares, a game consists of up to nine turns. The game ends when a winner is discovered or after nine turns.
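The arithmetic in this analysis can be sketched with three small functions. The function names are hypothetical, and the turn counter is assumed to start at 0, as a loop variable typically does, so player 1 moves when the counter is even.

```c
// Row index (0 to 2) for a square numbered from 1 to 9.
int row_of(int choice)
{
  return (choice - 1)/3;
}

// Column index (0 to 2) for a square numbered from 1 to 9.
int column_of(int choice)
{
  return (choice - 1)%3;
}

// Player number for a turn counter that starts at 0: the first, third,
// fifth, ... turns belong to player 1, and the others to player 2.
int player_for_turn(int turn)
{
  return turn%2 + 1;
}
```

Square 6, for example, is in the second row (index 1) and the third column (index 2).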
The Solution
This section outlines the steps you’ll take to solve the problem.
Step 1
Here, you’ve declared the following variables: i, for the loop variable; player, which stores the identifier for the current player, 1 or 2; winner, which contains the identifier for the winning player; and the array board, which is of type char, because you want to place the symbol 'X' or 'O' in the squares. You initialize the array with the characters for the digits that identify the squares. The main game loop continues for as long as the loop condition is true. It will be false if winner contains a value other than 0 (which indicates that a winner has been found) or the loop counter is equal to or greater than 9 (which will be the case when all nine squares on the board have been filled).
When you display the grid in the loop, you use vertical bars and dash characters to delineate the squares. When a player selects a square, the symbol for that player will replace the digit character.
Step 2
The square number is less than the minimum, 0.
The square number is greater than the maximum, 8.
The square number selects a square that already contains 'X' or 'O'.
In the last case, the contents of the square will have a value greater than the character '9', because the character codes for 'X' and 'O' are greater than the character code for '9'. If the choice meets any of these conditions, you just repeat the request to select a valid square.
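The three checks can be combined in one test, sketched here. The function name is_valid_choice is hypothetical, and choice is assumed to be the square number after 1 has been subtracted, so it runs from 0 to 8.

```c
#include <stdbool.h>

// Returns true when choice (0 to 8) identifies a square that exists and
// still contains its digit rather than 'X' or 'O'.
bool is_valid_choice(char board[3][3], int choice)
{
  if(choice < 0 || choice > 8)
    return false;                 // No such square

  // 'X' and 'O' both have character codes greater than that of '9'
  return board[choice/3][choice%3] <= '9';
}
```

The range check comes first so that the array is never accessed with an index that is out of bounds.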
Step 3
To check for a winning line, you compare one element in a line with the other two to test for equality. If all three are identical, then you have a winning line. You check both diagonals in the board array with the if expression, and if either diagonal has identical symbols in all three elements, you set winner to the current player. The current player must be the winner because he or she was the last to place a symbol on a square. If neither diagonal has identical symbols, you check the rows and the columns in the else clause using a for loop. The for loop body consists of one if statement that checks both a row and a column for identical elements on each iteration. If either is found, winner is set to the current player. Of course, if winner is set to a value here, the main loop condition will be false, so the loop ends and execution continues with the code following the main loop.
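The checks described above can be sketched as a single function. The name has_winning_line is hypothetical; in the program itself, a true result would correspond to setting winner to the current player.

```c
#include <stdbool.h>

// Returns true when any row, column, or diagonal holds three identical
// characters. Unused squares hold distinct digits, so three identical
// characters can only be three 'X' or three 'O' symbols.
bool has_winning_line(char board[3][3])
{
  // Check the two diagonals first
  if((board[0][0] == board[1][1] && board[0][0] == board[2][2]) ||
     (board[0][2] == board[1][1] && board[0][2] == board[2][0]))
    return true;

  // Then check row i and column i on each iteration
  for(int i = 0 ; i < 3 ; ++i)
  {
    if((board[i][0] == board[i][1] && board[i][0] == board[i][2]) ||
       (board[0][i] == board[1][i] && board[0][i] == board[2][i]))
      return true;
  }
  return false;
}
```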
Step 4
Summary
This chapter explored the ideas behind arrays. An array is a fixed number of elements of the same type, and you access any element within the array using the array name and one or more index values. Index values for an array are unsigned integer values starting from zero, and there is one index for each array dimension.
Processing an array with a loop provides a powerful programming capability. The amount of program code you need for the operation on array elements within a loop is essentially the same, regardless of how many elements there are. You have also seen how you can organize your data using multidimensional arrays. You can structure an array such that each array dimension selects a set of elements with a particular characteristic, such as the data pertaining to a particular time or location. By applying nested loops to multidimensional arrays, you can process all the array elements with a very small amount of code.
Up until now, you’ve mainly concentrated on processing numbers. The examples haven’t really dealt with text to any great extent. In the next chapter, you’ll learn how you can process and analyze strings of characters, but first some exercises to establish what you have learned in this chapter.
The following exercises enable you to try out what you’ve learned about arrays. If you get stuck, look back over the chapter for help. If you’re still stuck, you can download the solutions from the Source Code/Download area of the Apress website (www.apress.com), but that really should be a last resort.
Exercise 5-1. Write a program that will read five values of type double from the keyboard and store them in an array. Calculate the reciprocal of each value (the reciprocal of value x is 1.0/x) and store it in a separate array. Output the values of the reciprocals and calculate and output the sum of the reciprocals.
Multiply the result of this by 4.0, add 3.0, and output the final result. Do you recognize the value you get?
Exercise 5-3. Write a program that will read five values from the keyboard and store them in an array of type float with the name amounts. Create two arrays of five elements of type long with the names dollars and cents. Store the whole number part of each value in the amounts array in the corresponding element of dollars and the fractional part of the amount as a two-digit integer in cents (e.g., 2.75 in amounts[1] would result in 2 being stored in dollars[1] and 75 being stored in cents[1]). Output the values from the two arrays of type long as monetary amounts (e.g., $2.75).
Exercise 5-4. Define a two-dimensional array, data[11][5], of type double. Initialize the elements in the first column with values from 2.0 to 3.0 inclusive in steps of 0.1. If the first element in a row has value x, populate the remaining elements in each row with the values 1/x, x², x³, and x⁴. Output the values in the array with each row on a separate line and with a heading for each column.
Exercise 5-5. Write a program that will calculate the average grade for the students in each of an arbitrary number of classes. The program should read in all the grades for students in all classes before calculating the averages. Output the student grades for each class followed by the average for that class.