We now consider a larger example. Computers are commonly used for survey data analysis to compile and analyze the results of surveys and opinion polls. Figure 6.16 uses array response
initialized with 99 responses to a survey. Each response is a number from 1 to 9. The program computes the mean, median and mode of the 99 values. Figure 6.17 contains a sample run of this program. This example includes most of the common manipulations usually required in array problems, including passing arrays to functions.
1 // Fig. 6.16: fig06_16.c
2 // Survey data analysis with arrays:
3 // computing the mean, median and mode of the data.
4 #include <stdio.h>
5 #define SIZE 99
6
7 // function prototypes
8 void mean( const unsigned int answer[] );
9 void median( unsigned int answer[] );
10 void mode( unsigned int freq[], unsigned const int answer[] ) ;
11 void bubbleSort( int a[] );
12 void printArray( unsigned const int a[] );
13
14 // function main begins program execution
15 int main( void )
16 {
17 unsigned int frequency[ 10 ] = { 0 }; // initialize array frequency
18
19 // initialize array response
20 unsigned int response[ SIZE ] =
21 { 6, 7, 8, 9, 8, 7, 8, 9, 8, 9,
22 7, 8, 9, 5, 9, 8, 7, 8, 7, 8,
23 6, 7, 8, 9, 3, 9, 8, 7, 8, 7,
24 7, 8, 9, 8, 9, 8, 9, 7, 8, 9,
25 6, 7, 8, 7, 8, 7, 9, 8, 9, 2,
26 7, 8, 9, 8, 9, 8, 9, 7, 5, 3,
27 5, 6, 7, 2, 5, 3, 9, 4, 6, 4,
28 7, 8, 9, 6, 8, 7, 8, 9, 7, 8,
29 7, 4, 4, 2, 5, 3, 8, 7, 5, 6,
30 4, 5, 6, 1, 6, 5, 7, 8, 7 };
31
32 // process responses
33 mean( response );
34 median( response );
35 mode( frequency, response );
36 } // end main
37
38 // calculate average of all response values
39 void mean( const unsigned int answer[] )
40 {
41 size_t j; // counter for totaling array elements
42 unsigned int total = 0; // variable to hold sum of array elements
43
44 printf( "%s
%s
%s
", "********", " Mean", "********" );
45
46 // total response values
47 for ( j = 0; j < SIZE; ++j ) {
48 total += answer[ j ];
49 } // end for
50
51 printf( "The mean is the average value of the data
"
52 "items. The mean is equal to the total of
"
53 "all the data items divided by the number
"
54 "of data items ( %u ). The mean value for
"
55 "this run is: %u / %u = %.4f
",
56 SIZE, total, SIZE, ( double ) total / SIZE );
57 } // end function mean
58
59 // sort array and determine median element's value
60 void median( unsigned int answer[] )
61 {
62 printf( "
%s
%s
%s
%s",
63 "********", " Median", "********",
64 "The unsorted array of responses is" );
65
66 printArray( answer ); // output unsorted array
67
68 bubbleSort( answer ); // sort array
69
70 printf( "%s", "
The sorted array is" );
71 printArray( answer ); // output sorted array
72
73 // display median element
74 printf( "
The median is element %u of
"
75 "the sorted %u element array.
"
76 "For this run the median is %u
",
77 SIZE / 2, SIZE, answer[ SIZE / 2 ] );
78 } // end function median
79
80 // determine most frequent response
81 void mode( unsigned int freq[], const unsigned int answer[] )
82 {
83 size_t rating; // counter for accessing elements 1-9 of array freq
84 size_t j; // counter for summarizing elements 0-98 of array answer
85 unsigned int h; // counter for diplaying histograms freq array values
86 unsigned int largest = 0; // represents largest frequency
87 unsigned int modeValue = 0; // represents most frequent response
88
89 printf( "
%s
%s
%s
",
90 "********", " Mode", "********" );
91
92 // initialize frequencies to 0
93 for ( rating = 1; rating <= 9; ++rating ) {
94 freq[ rating ] = 0;
95 } // end for
96
97 // summarize frequencies
98 for ( j = 0; j < SIZE; ++j ) {
99 ++freq[ answer[ j ] ];
100 } // end for
101
102 // output headers for result columns
103 printf( "%s%11s%19s
%54s
%54s
",
104 "Response", "Frequency", "Histogram",
105 "1 1 2 2", "5 0 5 0 5" );
106
107 // output results
108 for ( rating = 1; rating <= 9; ++rating ) {
109 printf( "%8u%11u ", rating, freq[ rating ] );
110
111 // keep track of mode value and largest frequency value
112 if ( freq[ rating ] > largest ) {
113 largest = freq[ rating ];
114 modeValue = rating;
115 } // end if
116
117 // output histogram bar representing frequency value
118 for ( h = 1; h <= freq[ rating ]; ++h ) {
119 printf( "%s", "*" );
120 } // end inner for
121
122 puts( "" ); // being new line of output
123 } // end outer for
124
125 // display the mode value
126 printf( "
The mode is the most frequent value.
"
127 "For this run the mode is %u which occurred"
128 " %u times.
", modeValue, largest );
129 } // end function mode
130
131 // function that sorts an array with bubble sort algorithm
132 void bubbleSort( unsigned int a[] )
133 {
134 unsigned int pass; // pass counter
135 size_t j; // comparison counter
136 unsigned int hold; // temporary location used to swap elements
137
138 // loop to control number of passes
139 for ( pass = 1; pass < SIZE; ++pass ) {
140
141 // loop to control number of comparisons per pass
142 for ( j = 0; j < SIZE - 1; ++j ) {
143
144 // swap elements if out of order
145 if ( a[ j ] > a[ j + 1 ] ) {
146 hold = a[ j ];
147 a[ j ] = a[ j + 1 ];
148 a[ j + 1 ] = hold;
149 } // end if
150 } // end inner for
151 } // end outer for
152 } // end function bubbleSort
153
154 // output array contents (20 values per row)
155 void printArray( const unsigned int a[] )
156 {
157 size_t j; // counter
158
159 // output array contents
160 for ( j = 0; j < SIZE; ++j ) {
161
162 if ( j % 20 == 0 ) { // begin new line every 20 values
163 puts( "" );
164 } // end if
165
166 printf( "%2u", a[ j ] );
167 } // end for
168 } // end function printArray
********
Mean
********
The mean is the average value of the data
items. The mean is equal to the total of
all the data items divided by the number
of data items ( 99 ). The mean value for
this run is: 681 / 99 = 6.8788
********
Median
********
The unsorted array of responses is
6 7 8 9 8 7 8 9 8 9 7 8 9 5 9 8 7 8 7 8
6 7 8 9 3 9 8 7 8 7 7 8 9 8 9 8 9 7 8 9
6 7 8 7 8 7 9 8 9 2 7 8 9 8 9 8 9 7 5 3
5 6 7 2 5 3 9 4 6 4 7 8 9 6 8 7 8 9 7 8
7 4 4 2 5 3 8 7 5 6 4 5 6 1 6 5 7 8 7
The sorted array is
1 2 2 2 3 3 3 3 4 4 4 4 4 5 5 5 5 5 5 5
5 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7
7 7 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8
8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9
The median is element 49 of
the sorted 99 element array.
For this run the median is 7
********
Mode
********
Response Frequency Histogram
1 1 2 2
5 0 5 0 5
1 1 *
2 3 ***
3 4 ****
4 5 *****
5 8 ********
6 9 *********
7 23 ***********************
8 27 ***************************
9 19 *******************
The mode is the most frequent value.
For this run the mode is 8 which occurred 27 times.
The mean is the arithmetic average of the 99 values. Function mean
(Fig. 6.16, lines 39–57) computes the mean by totaling the 99 elements and dividing the result by 99.
The median is the middle value. Function median
(lines 60–78) determines the median by calling function bubbleSort
(defined in lines 132–152) to sort the array of responses into ascending order, then picking answer[SIZE / 2]
(the middle element) of the sorted array. When the number of elements is even, the median should be calculated as the mean of the two middle elements. Function median
does not currently provide this capability. Function printArray
(lines 155–168) is called to output the response
array.
The mode is the value that occurs most frequently among the 99 responses. Function mode
(lines 81–129) determines the mode by counting the number of responses of each type, then selecting the value with the greatest count. This version of function mode
does not handle a tie in this example. Function mode
also produces a histogram to aid in determining the mode graphically.
You’ll often work with large amounts of data stored in arrays. It may be necessary to determine whether an array contains a value that matches a certain key value. The process of finding a particular element of an array is called searching. In this section we discuss two searching techniques—the simple linear search technique and the more efficient (but more complex) binary search technique.
The linear search (Fig. 6.18) compares each element of the array with the search key. Because the array is not in any particular order, it’s just as likely that the value will be found in the first element as in the last. On average, therefore, the program will have to compare the search key with half the elements of the array.
1 // Fig. 6.18: fig06_18.c
2 // Linear search of an array.
3 #include <stdio.h>
4 #define SIZE 100
5
6 // function prototype
7 size_t linearSearch( const int array[], int key, size_t size );
8
9 // function main begins program execution
10 int main( void )
11 {
12 int a[ SIZE ]; // create array a
13 size_t x; // counter for initializing elements 0-99 of array a
14 int searchKey; // value to locate in array a
15 size_t element; // variable to hold location of searchKey or -1
16
17 // create some data
18 for ( x = 0; x < SIZE; ++x ) {
19 a[ x ] = 2 * x;
20 } // end for
21
22 puts( "Enter integer search key:" );
23 scanf( "%d", &searchKey );
24
25 // attempt to locate searchKey in array a
26 element = linearSearch( a, searchKey, SIZE );
27
28 // display results
29 if ( element != -1 ) {
30 printf( "Found value in element %d
", element );
31 } // end if
32 else {
33 puts( "Value not found" );
34 } // end else
35 } // end main
36
37 // compare key to every element of array until the location is found
38 // or until the end of array is reached; return subscript of element
39 // if key is found or -1 if key is not found
40 size_t linearSearch( const int array[], int key, size_t size )
41 {
42 size_t n; // counter
43
44 // loop through array
45 for ( n = 0; n < size; ++n ) {
46
47 if ( array[ n ] == key ) {
48 return n; // return location of key
49 } // end if
50 } // end for
51
52 return -1; // key not found
53 } // end function linearSearch
Enter integer search key:
36
Found value in element 18
Enter integer search key:
37
Value not found
The linear searching method works well for small or unsorted arrays. However, for large arrays linear searching is inefficient. If the array is sorted, the high-speed binary search technique can be used.
The binary search algorithm eliminates from consideration one-half of the elements in a sorted array after each comparison. The algorithm locates the middle element of the array and compares it to the search key. If they’re equal, the search key is found and the array subscript of that element is returned. If they’re not equal, the problem is reduced to searching one-half of the array. If the search key is less than the middle element of the array, the first half of the array is searched, otherwise the second half is searched. If the search key is not found in the specified subarray (piece of the original array), the algorithm is repeated on one-quarter of the original array. The search continues until the search key is equal to the middle element of a subarray, or until the subarray consists of one element that’s not equal to the search key (i.e., the search key is not found).
In a worst case-scenario, searching an array of 1023 elements takes only 10 comparisons using a binary search. Repeatedly dividing 1,024 by 2 yields the values 512, 256, 128, 64, 32, 16, 8, 4, 2 and 1. The number 1,024 (210) is divided by 2 only 10 times to get the value 1. Dividing by 2 is equivalent to one comparison in the binary search algorithm. An array of 1,048,576 (220) elements takes a maximum of only 20 comparisons to find the search key. An array of one billion elements takes a maximum of only 30 comparisons to find the search key. This is a tremendous increase in performance over the linear search that required comparing the search key to an average of half of the array elements. For a one-billion-element array, this is a difference between an average of 500 million comparisons and a maximum of 30 comparisons! The maximum comparisons for any array can be determined by finding the first power of 2 greater than the number of array elements.
Figure 6.19 presents the iterative version of function binarySearch
(lines 42–72). The function receives four arguments—an integer array b
to be searched, an integer searchKey
, the low
array subscript and the high
array subscript (these define the portion of the array to be searched). If the search key does not match the middle element of a subarray, the low
subscript or high
subscript is modified so that a smaller subarray can be searched. If the search key is less than the middle element, the high
subscript is set to middle - 1
and the search is continued on the elements from low
to middle - 1
. If the search key is greater than the middle element, the low
subscript is set to middle + 1
and the search is continued on the elements from middle + 1
to high
. The program uses an array of 15 elements. The first power of 2 greater than the number of elements in this array is 16 (24), so no more than 4 comparisons are required to find the search key. The program uses function printHeader
(lines 75–94) to output the array subscripts and function printRow
(lines 98–118) to output each subarray during the binary search process. The middle element in each subarray is marked with an asterisk (*
) to indicate the element to which the search key is compared.
1 // Fig. 6.19: fig06_19.c
2 // Binary search of a sorted array.
3 #include <stdio.h>
4 #define SIZE 15
5
6 // function prototypes
7 size_t binarySearch(const int b[], int searchKey, size_t low, size_t high);
8 void printHeader( void );
9 void printRow( const int b[], size_t low, size_t mid, size_t high );
10
11 // function main begins program execution
12 int main( void )
13 {
14 int a[ SIZE ]; // create array a
15 size_t i; // counter for initializing elements of array a
16 int key; // value to locate in array a
17 size_t result; // variable to hold location of key or -1
18
19 // create data
20 for ( i = 0; i < SIZE; ++i ) {
21 a[ i ] = 2 * i;
22 } // end for
23
24 printf( "%s", "Enter a number between 0 and 28: " );
25 scanf( "%d", &key );
26
27 printHeader();
28
29 // search for key in array a
30 result = binarySearch( a, key, 0, SIZE - 1 );
31
32 // display results
33 if ( result != -1 ) {
34 printf( "
%d found in array element %d
", key, result );
35 } // end if
36 else {
37 printf( "
%d not found
", key );
38 } // end else
39 } // end main
40
41 // function to perform binary search of an array
42 size_t binarySearch(const int b[], int searchKey, size_t low, size_t high)
43 {
44 int middle; // variable to hold middle element of array
45
46 // loop until low subscript is greater than high subscript
47 while ( low <= high ) {
48
49 // determine middle element of subarray being searched
50 middle = ( low + high ) / 2;
51
52 // display subarray used in this loop iteration
53 printRow( b, low, middle, high );
54
55 // if searchKey matched middle element, return middle
56 if ( searchKey == b[ middle ] ) {
57 return middle;
58 } // end if
59
60 // if searchKey less than middle element, set new high
61 else if ( searchKey < b[ middle ] ) {
62 high = middle - 1; // search low end of array
63 } // end else if
64
65 // if searchKey greater than middle element, set new low
66 else {
67 low = middle + 1; // search high end of array
68 } // end else
69 } // end while
70
71 return -1; // searchKey not found
72 } // end function binarySearch
73
74 // Print a header for the output
75 void printHeader( void )
76 {
77 unsigned int i; // counter
78
79 puts( "
Subscripts:" );
80
81 // output column head
82 for ( i = 0; i < SIZE; ++i ) {
83 printf( "%3u ", i );
84 } // end for
85
86 puts( "" ); // start new line of output
87
88 // output line of - characters
89 for ( i = 1; i <= 4 * SIZE; ++i ) {
90 printf( "%s", "-" );
91 } // end for
92
93 puts( "" ); // start new line of output
94 } // end function printHeader
95
96 // Print one row of output showing the current
97 // part of the array being processed.
98 void printRow( const int b[], size_t low, size_t mid, size_t high )
99 {
100 size_t i; // counter for iterating through array b
101
102 // loop through entire array
103 for ( i = 0; i < SIZE; ++i ) {
104
105 // display spaces if outside current subarray range
106 if ( i < low || i > high ) {
107 printf( "%s", " " );
108 } // end if
109 else if ( i == mid ) { // display middle element
110 printf( "%3d*", b[ i ] ); // mark middle value
111 } // end else if
112 else { // display other elements in subarray
113 printf( "%3d ", b[ i ] );
114 } // end else
115 } // end for
116
117 puts( "" ); // start new line of output
118 } // end function printRow
Enter a number between 0 and 28: 25
Subscripts:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
------------------------------------------------------------
0 2 4 6 8 10 12 14* 16 18 20 22 24 26 28
16 18 20 22* 24 26 28
24 26* 28
24*
25 not found
Enter a number between 0 and 28: 8
Subscripts:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
------------------------------------------------------------
0 2 4 6 8 10 12 14* 16 18 20 22 24 26 28
0 2 4 6* 8 10 12
8 10* 12
8*
8 found in array element 4
Arrays in C can have multiple subscripts. A common use of multiple-subscripted arrays, which the C standard refers to as multidimensional arrays, is to represent tables of values consisting of information arranged in rows and columns. To identify a particular table element, we must specify two subscripts: The first (by convention) identifies the element’s row and the second (by convention) identifies the element’s column. Tables or arrays that require two subscripts to identify a particular element are called double-subscripted arrays. Multidimensional arrays can have more than two subscripts.
Figure 6.20 illustrates a double-subscripted array, a
. The array contains three rows and four columns, so it’s said to be a 3-by-4 array. In general, an array with m rows and n columns is called an m-by-n array.
Every element in array a
is identified in Fig. 6.20 by an element name of the form a[i][j]
; a
is the name of the array, and i
and j
are the subscripts that uniquely identify each element in a
. The names of the elements in row 0 all have a first subscript of 0
; the names of the elements in column 3 all have a second subscript of 3
.
Common Programming Error 6.7
Referencing a double-subscripted array element as a[ x, y ]
instead of a[ x ][ y ]
is a logic error. C interprets a[ x, y ]
as a[ y ]
(because the comma in this context is treated as a comma operator), so this programmer error is not a syntax error.
A multidimensional array can be initialized when it’s defined, much like a single-subscripted array. For example, a double-subscripted array int b[2][2]
could be defined and initialized with
int b[ 2 ][ 2 ] = { { 1, 2 }, { 3, 4 } };
The values are grouped by row in braces. The values in the first set of braces initialize row 0 and the values in the second set of braces initialize row 1. So, the values 1
and 2
initialize elements b[0][0]
and b[0][1]
, respectively, and the values 3
and 4
initialize elements b[1][0]
and b[1][1]
, respectively. If there are not enough initializers for a given row, the remaining elements of that row are initialized to 0. Thus,
int b[ 2 ][ 2 ] = { { 1 }, { 3, 4 } };
would initialize b[0][0]
to 1
, b[0][1]
to 0
, b[1][0]
to 3
and b[1][1]
to 4
. Figure 6.21 demonstrates defining and initializing double-subscripted arrays.
1 // Fig. 6.21: fig06_21.c
2 // Initializing multidimensional arrays.
3 #include <stdio.h>
4
5 void printArray( int a[][ 3 ] ); // function prototype
6
7 // function main begins program execution
8 int main( void )
9 {
10 // initialize array1, array2, array3
11 int array1[ 2 ][ 3 ] = { { 1, 2, 3 }, { 4, 5, 6 } };
12 int array2[ 2 ][ 3 ] = { 1, 2, 3, 4, 5 };
13 int array3[ 2 ][ 3 ] = { { 1, 2 }, { 4 } };
14
15 puts( "Values in array1 by row are:" );
16 printArray( array1 );
17
18 puts( "Values in array2 by row are:" );
19 printArray( array2 );
20
21 puts( "Values in array3 by row are:" );
22 printArray( array3 );
23 } // end main
24
25 // function to output array with two rows and three columns
26 void printArray( int a[][ 3 ] )
27 {
28 size_t i; // row counter
29 size_t j; // column counter
30
31 // loop through rows
32 for ( i = 0; i <= 1; ++i ) {
33
34 // output column values
35 for ( j = 0; j <= 2; ++j ) {
36 printf( "%d ", a[ i ][ j ] );
37 } // end inner for
38
39 printf( "
" ); // start new line of output
40 } // end outer for
41 } // end function printArray
Values in array1 by row are:
1 2 3
4 5 6
Values in array2 by row are:
1 2 3
4 5 0
Values in array3 by row are:
1 2 0
4 0 0
The program defines three arrays of two rows and three columns (six elements each). The definition of array1
(line 11) provides six initializers in two sublists. The first sublist initializes row 0 of the array to the values 1, 2 and 3; and the second sublist initializes row 1 of the array to the values 4, 5 and 6.
If the braces around each sublist are removed from the array1
initializer list, the compiler initializes the elements of the first row followed by the elements of the second row. The definition of array2
(line 12) provides five initializers. The initializers are assigned to the first row, then the second row. Any elements that do not have an explicit initializer are initialized to zero automatically, so array2[1][2]
is initialized to 0.
The definition of array3
(line 13) provides three initializers in two sublists. The sublist for the first row explicitly initializes the first two elements of the first row to 1 and 2. The third element is initialized to zero. The sublist for the second row explicitly initializes the first element to 4. The last two elements are initialized to zero.
The program calls printArray
(lines 26–41) to output each array’s elements. The function definition specifies the array parameter as const int a[][3]
. When we receive a single-subscripted array as a parameter, the array brackets are empty in the function’s parameter list. The first subscript of a multidimensional array is not required either, but all subsequent subscripts are required. The compiler uses these subscripts to determine the locations in memory of elements in multidimensional arrays. All array elements are stored consecutively in memory regardless of the number of subscripts. In a double-subscripted array, the first row is stored in memory followed by the second row.
Providing the subscript values in a parameter declaration enables the compiler to tell the function how to locate an element in the array. In a double-subscripted array, each row is basically a single-subscripted array. To locate an element in a particular row, the compiler must know how many elements are in each row so that it can skip the proper number of memory locations when accessing the array. Thus, when accessing a[1][2]
in our example, the compiler knows to skip the three elements of the first row to get to the second row (row 1). Then, the compiler accesses element 2 of that row.
Many common array manipulations use for
repetition statements. For example, the following statement sets all the elements in row 2 of array a
in Fig. 6.20 to zero:
for ( column = 0; column <= 3; ++column ) {
a[ 2 ][ column ] = 0;
}
We specified row 2, so the first subscript is always 2
. The loop
varies only the second (column) subscript. The preceding for
statement is equivalent to the assignment statements:
a[ 2 ][ 0 ] = 0;
a[ 2 ][ 1 ] = 0;
a[ 2 ][ 2 ] = 0;
a[ 2 ][ 3 ] = 0;
The following nested for
statement determines the total of all the elements in array a
.
total = 0;
for ( row = 0; row <= 2; ++row ) {
for ( column = 0; column <= 3; ++column ) {
total += a[ row ][ column ];
}
}
The for
statement totals the elements of the array one row at a time. The outer for
statement begins by setting row
(i.e., the row subscript) to 0
so that the elements of that row may be totaled by the inner for
statement. The outer for
statement then increments row
to 1
, so the elements of that row can be totaled. Then, the outer for
statement increments row
to 2
, so the elements of the third row can be totaled. When the nested for
statement terminates, total
contains the sum of all the elements in the array a
.
Figure 6.22 performs several other common array manipulations on 3-by-4 array studentGrades
using for
statements. Each row of the array represents a student and each column represents a grade on one of the four exams the students took during the semester. The array manipulations are performed by four functions. Function minimum
(lines 41–60) determines the lowest grade of any student for the semester. Function maximum
(lines 63–82) determines the highest grade of any student for the semester. Function average
(lines 85–96) determines a particular student’s semester average. Function printArray
(lines 99–118) outputs the double-subscripted array in a neat, tabular format.
1 // Fig. 6.22: fig06_22.c
2 // Double-subscripted array manipulations.
3 #include <stdio.h>
4 #define STUDENTS 3
5 #define EXAMS 4
6
7 // function prototypes
8 int minimum( int grades[][ EXAMS ], size_t pupils, size_t tests );
9 int maximum( int grades[][ EXAMS ], size_t pupils, size_t tests );
10 double average( const int setOfGrades[], size_t tests );
11 void printArray( int grades[][ EXAMS ], size_t pupils, size_t tests );
12
13 // function main begins program execution
14 int main( void )
15 {
16 size_t student; // student counter
17
18 // initialize student grades for three students (rows)
19 int studentGrades[ STUDENTS ][ EXAMS ] =
20 { { 77, 68, 86, 73 },
21 { 96, 87, 89, 78 },
22 { 70, 90, 86, 81 } };
23
24 // output array studentGrades
25 puts( "The array is:" );
26 printArray( studentGrades, STUDENTS, EXAMS );
27
28 // determine smallest and largest grade values
29 printf( "
Lowest grade: %d
Highest grade: %d
",
30 minimum( studentGrades, STUDENTS, EXAMS ),
31 maximum( studentGrades, STUDENTS, EXAMS ) );
32
33 // calculate average grade for each student
34 for ( student = 0; student < STUDENTS; ++student ) {
35 printf( "The average grade for student %u is %.2f
",
36 student, average( studentGrades[ student ], EXAMS ) );
37 } // end for
38 } // end main
39
40 // Find the minimum grade
41 int minimum( int grades[][ EXAMS ], size_t pupils, size_t tests )
42 {
43 size_t i; // student counter
44 size_t j; // exam counter
45 int lowGrade = 100; // initialize to highest possible grade
46
47 // loop through rows of grades
48 for ( i = 0; i < pupils; ++i ) {
49
50 // loop through columns of grades
51 for ( j = 0; j < tests; ++j ) {
52
53 if ( grades[ i ][ j ] < lowGrade ) {
54 lowGrade = grades[ i ][ j ];
55 } // end if
56 } // end inner for
57 } // end outer for
58
59 return lowGrade; // return minimum grade
60 } // end function minimum
61
62 // Find the maximum grade
63 int maximum( int grades[][ EXAMS ], size_t pupils, size_t tests )
64 {
65 size_t i; // student counter
66 size_t j; // exam counter
67 int highGrade = 0; // initialize to lowest possible grade
68
69 // loop through rows of grades
70 for ( i = 0; i < pupils; ++i ) {
71
72 // loop through columns of grades
73 for ( j = 0; j < tests; ++j ) {
74
75 if ( grades[ i ][ j ] > highGrade ) {
76 highGrade = grades[ i ][ j ];
77 } // end if
78 } // end inner for
79 } // end outer for
80
81 return highGrade; // return maximum grade
82 } // end function maximum
83
84 // Determine the average grade for a particular student
85 double average( const int setOfGrades[], size_t tests )
86 {
87 size_t i; // exam counter
88 int total = 0; // sum of test grades
89
90 // total all grades for one student
91 for ( i = 0; i < tests; ++i ) {
92 total += setOfGrades[ i ];
93 } // end for
94
95 return ( double ) total / tests; // average
96 } // end function average
97
98 // Print the array
99 void printArray( int grades[][ EXAMS ], size_t pupils, size_t tests )
100 {
101 size_t i; // student counter
102 size_t j; // exam counter
103
104 // output column heads
105 printf( "%s", " [0] [1] [2] [3]" );
106
107 // output grades in tabular format
108 for ( i = 0; i < pupils; ++i ) {
109
110 // output label for row
111 printf( "
studentGrades[%d] ", i );
112
113 // output grades for one student
114 for ( j = 0; j < tests; ++j ) {
115 printf( "%-5d", grades[ i ][ j ] );
116 } // end inner for
117 } // end outer for
118 } // end function printArray
The array is:
[0] [1] [2] [3]
studentGrades[0] 77 68 86 73
studentGrades[1] 96 87 89 78
studentGrades[2] 70 90 86 81
Lowest grade: 68
Highest grade: 96
The average grade for student 0 is 76.00
The average grade for student 1 is 87.50
The average grade for student 2 is 81.75
Functions minimum
, maximum
and printArray
each receive three arguments—the studentGrades
array (called grades
in each function), the number of students (rows of the array) and the number of exams (columns of the array). Each function loops through array grades
using nested for
statements. The following nested for
statement is from the function minimum
definition:
// loop through rows of grades
for ( i = 0; i < pupils; ++i ) {
// loop through columns of grades
for ( j = 0; j < tests; ++j ) {
if ( grades[ i ][ j ] < lowGrade ) {
lowGrade = grades[ i ][ j ];
} // end if
} // end inner for
} // end outer for
The outer for
statement begins by setting i
(i.e., the row subscript) to 0
so the elements of that row (i.e., the grades of the first student) can be compared to variable lowGrade
in the body of the inner for
statement. The inner for
statement loops through the four grades of a particular row and compares each grade to lowGrade
. If a grade is less than lowGrade
, lowGrade
is set to that grade. The outer for
statement then increments the row subscript to 1
. The elements of that row are compared to variable lowGrade
. The outer for
statement then increments the row subscript to 2
. The elements of that row are compared to variable lowGrade
. When execution of the nested statement is complete, lowGrade
contains the smallest grade in the double-subscripted array. Function maximum
works similarly to function minimum
.
Function average
(lines 85–96) takes two arguments—a single-subscripted array of test results for a particular student called setOfGrades
and the number of test results in the array. When average
is called, the first argument studentGrades[student]
is passed. This causes the address of one row of the double-subscripted array to be passed to average
. The argument studentGrades[1]
is the starting address of row 1 of the array. Remember that a double-subscripted array is basically an array of single-subscripted arrays and that the name of a single-subscripted array is the address of the array in memory. Function average
calculates the sum of the array elements, divides the total by the number of test results and returns the floating-point result.
In early versions of C, all arrays had constant size. But what if you don’t know an array’s size at compilation time? To handle this, you’d have to use dynamic memory allocation with malloc
and related functions. The C standard allows you to handle arrays of unknown size using variable-length arrays (VLAs). These are not arrays whose size can change—that would compromise the integrity of nearby locations in memory. A variable-length array is an array whose length, or size, is defined in terms of an expression evaluated at execution time. The program of Fig. 6.23 declares and prints several VLAs. [Note: This feature is not supported in Microsoft Visual C++.]
1 // Fig. 6.23: fig06_14.c
2 // Using variable-length arrays
3 #include <stdio.h>
4
5 // function prototypes
6 void print1DArray( int size, int arr[ size ] );
7 void print2DArray( int row, int col, int arr[ row ][ col ] );
8
9 int main( void )
10 {
11 int arraySize; // size of 1-D array
12 int row1, col1, row2, col2; // number of rows and columns in 2-D arrays
13
14 printf( "%s", "Enter size of a one-dimensional array: " );
15 scanf( "%d", &arraySize );
16
17 printf( "%s", "Enter number of rows and columns in a 2-D array: " );
18 scanf( "%d %d", &row1, &col1 );
19
20 printf( "%s",
21 "Enter number of rows and columns in another 2-D array: " );
22 scanf( "%d %d", &row2, &col2 );
23
24 int array[ arraySize ]; // declare 1-D variable-length array
25 int array2D1[ row1 ][ col1 ]; // declare 2-D variable-length array
26 int array2D2[ row2 ][ col2 ]; // declare 2-D variable-length array
27
28 // test sizeof operator on VLA
29 printf( "
sizeof(array) yields array size of %d bytes
",
30 sizeof( array ) );
31
32 // assign elements of 1-D VLA
33 for ( int i = 0; i < arraySize; ++i ) {
34 array[ i ] = i * i;
35 } // end for
36
37 // assign elements of first 2-D VLA
38 for ( int i = 0; i < row1; ++i ) {
39 for ( int j = 0; j < col1; ++j ) {
40 array2D1[ i ][ j ] = i + j;
41 } // end for
42 } // end for
43
44 // assign elements of second 2-D VLA
45 for ( int i = 0; i < row2; ++i ) {
46 for ( int j = 0; j < col2; ++j ) {
47 array2D2[ i ][ j ] = i + j;
48 } // end for
49 } // end for
50
51 puts( "
One-dimensional array:" );
52 print1DArray( arraySize, array ); // pass 1-D VLA to function
53
54 puts( "
First two-dimensional array:" );
55 print2DArray( row1, col1, array2D1 ); // pass 2-D VLA to function
56
57 puts( "
Second two-dimensional array:" );
58 print2DArray( row2, col2, array2D2 ); // pass other 2-D VLA to function
59 } // end main
60
61 void print1DArray( int size, int array[ size ] )
62 {
63 // output contents of array
64 for ( int i = 0; i < size; i++ ) {
65 printf( "array[%d] = %d
", i, array[ i ] );
66 } // end for
67 } // end function print1DArray
68
69 void print2DArray( int row, int col, int arr[ row ][ col ] )
70 {
71 // output contents of array
72 for ( int i = 0; i < row; ++i ) {
73 for ( int j = 0; j < col; ++j ) {
74 printf( "%5d", arr[ i ][ j ] );
75 } // end for
76
77 puts( "" );
78 } // end for
79 } // end function print2DArray
Enter size of a one-dimensional array: 6
Enter number of rows and columns in a 2-D array: 2 5
Enter number of rows and columns in another 2-D array: 4 3
First, we prompt the user for the desired sizes for a one-dimensional array and two two-dimensional arrays (lines 14–22). Lines 24–26 then declare VLAs of the appropriate size. This is valid as long as the variables representing the array sizes are of an integral type.
After declaring the arrays, we use the sizeof
operator in lines 29–30 to make sure that our VLA is of the proper length. In early versions of C sizeof
was always a compile-time operation, but when applied to a VLA, sizeof
operates at runtime. The output window shows that the sizeof
operator returns a size of 24 bytes—four times that of the number we entered because the size of an int
on our machine is 4 bytes.
Next we assign values to our VLAs’ elements (lines 33–49). We use i < arraySize
as our loop-continuation condition when filling the one-dimensional array. As with fixed-length arrays, there is no protection against stepping outside the array bounds.
Lines 61–67 define function print1DArray
that takes a one-dimensional VLA. The syntax for passing VLAs as parameters to functions is the same as with a normal, fixed-length array. We use the variable size
in the declaration of the array
parameter, but no checking is performed other than the variable being defined and of integral type—it’s purely documentation for the programmer.
Function print2DArray
(lines 69–79) takes a variable-length two-dimensional array and displays it to the screen. Recall from Section 6.9 that all but the first subscript of a multidimensional array must be specified when declaring a function parameter. The same restriction holds true for VLAs, except that the sizes can be specified by variables. The initial value of col
passed to the function is used to convert from two-dimensional indices to offsets into the contiguous memory the array is stored in, just as with a fixed-size array. Changing the value of col
inside the function will not cause any changes to the indexing, but passing an incorrect value to the function will.
It’s important to ensure that every subscript you use to access an array element is within the array’s bounds—that is, greater than or equal to 0 and less than the number of array elements. A two-dimensional array’s row and column subscripts must be greater than or equal to 0 and less than the numbers of rows and columns, respectively. This extends to arrays with additional dimensions as well.
Allowing programs to read from or write to array elements outside the bounds of arrays are common security flaws. Reading from out-of-bounds array elements can cause a program to crash or even appear to execute correctly while using bad data. Writing to an out-of-bounds element (known as a buffer overflow) can corrupt a program’s data in memory, crash a program and allow attackers to exploit the system and execute their own code.
As we stated in Section 6.4, C provides no automatic bounds checking for arrays, so you must provide your own. For techniques that help you prevent such problems, see CERT guideline ARR30-C at www.securecoding.cert.org
.
Bounds checking is also important in string processing. When reading a string into a char
array, scanf
does not prevent buffer overflows. If the number of characters input is greater than or equal to the array’s length, scanf
will write characters—including the string’s terminating null character (' '
)—beyond the end of the array. This might overwrite other variables’ values, and eventually the program might overwrite the string’s ' '
if it writes to those other variables.
Functions determine where strings end by looking for their terminating ' '
character. For example, function printf
outputs a string by reading characters from the beginning of the string in memory and continuing until the string’s ' '
is encountered. If the ' '
is missing, printf
might read far beyond the end of the string until it encounters some other ' '
in memory.
The C standard’s optional Annex K provides new, more secure, versions of many string-processing and input/output functions, including scanf_s
—a version of scanf
that performs additional checks to ensure that it does not write beyond the end of a character array used to store a string. Assuming that myString
is a 20-character array, the statement
scanf_s( "%19s", myString, 20 );
reads a string into myString
. Function scanf_s
requires two arguments for each %s
in the format string—a character array in which to place the input string and the number of array elements. The second of these arguments is used by scanf_s
to prevent buffer overflows. For example, it’s possible to supply a field width for %s
that’s too long for the underlying character array, or to simply omit the field width entirely. If the number of characters input plus the terminating null character is larger than the number of array elements, the %s
conversion would fail. Because the preceding statement contains only one conversion specifier, scanf_s
would return 0
indicating that no conversions were performed, and myString
would be unaltered.
In general, if your compiler supports the functions from the C standard’s optional Annex K, you should use them. We discuss additional Annex K functions in later Secure C Programming sections.
You might have noticed that throughout this book, we never use single-argument printf
s. Instead we use one of the following forms:
• When we need to output a '
'
after the string, we use function puts
(which automatically outputs a '
'
after its single string argument), as in
puts( "Welcome to C!" );
• When we need the cursor to remain on the same line as the string, we use function printf
, as in
printf( "%s", "Enter first integer: " );
Because we were displaying string literals, we certainly could have used the one-argument form of printf
, as in
printf( "Welcome to C!
" );
printf( "Enter first integer: " );
When printf
evaluates the format-control string in its first (and possibly its only) argument, the function performs tasks based on the conversion specifier(s) in that string. If the format-control string were obtained from the user, an attacker could supply malicious conversion specifiers that would be “executed” by the formatted output function. Now that you know how to read strings into character arrays, it’s important to note that you should never use as a printf
’s format-control string a character array that might contain user input. For more information, see CERT guideline FIO30-C at www.securecoding.cert.org
.
18.225.234.24