6.7. Case Study: Computing Mean, Median and Mode Using Arrays

We now consider a larger example. Computers are commonly used for survey data analysis to compile and analyze the results of surveys and opinion polls. Figure 6.16 uses array response initialized with 99 responses to a survey. Each response is a number from 1 to 9. The program computes the mean, median and mode of the 99 values. Figure 6.17 contains a sample run of this program. This example includes most of the common manipulations usually required in array problems, including passing arrays to functions.


 1   // Fig. 6.16: fig06_16.c
 2   // Survey data analysis with arrays:
 3   // computing the mean, median and mode of the data.
 4   #include <stdio.h>
 5   #define SIZE 99
 6
 7   // function prototypes
 8   void mean( const unsigned int answer[] );
 9   void median( unsigned int answer[] );
10   void mode( unsigned int freq[], unsigned const int answer[] ) ;
11   void bubbleSort( int a[] );
12   void printArray( unsigned const int a[] );
13
14   // function main begins program execution
15   int main( void )
16   {
17      unsigned int frequency[ 10 ] = { 0 }; // initialize array frequency
18
19      // initialize array response
20      unsigned int response[ SIZE ] =
21         { 6, 7, 8, 9, 8, 7, 8, 9, 8, 9,
22           7, 8, 9, 5, 9, 8, 7, 8, 7, 8,
23           6, 7, 8, 9, 3, 9, 8, 7, 8, 7,
24           7, 8, 9, 8, 9, 8, 9, 7, 8, 9,
25           6, 7, 8, 7, 8, 7, 9, 8, 9, 2,
26           7, 8, 9, 8, 9, 8, 9, 7, 5, 3,
27           5, 6, 7, 2, 5, 3, 9, 4, 6, 4,
28           7, 8, 9, 6, 8, 7, 8, 9, 7, 8,
29           7, 4, 4, 2, 5, 3, 8, 7, 5, 6,
30           4, 5, 6, 1, 6, 5, 7, 8, 7 };
31
32      // process responses
33      mean( response );
34      median( response );
35      mode( frequency, response );
36   } // end main
37
38   // calculate average of all response values
39   void mean( const unsigned int answer[] )
40   {
41      size_t j; // counter for totaling array elements
42      unsigned int total = 0; // variable to hold sum of array elements
43
44      printf( "%s %s %s ", "********", "  Mean", "********" );
45
46      // total response values
47      for ( j = 0; j < SIZE; ++j ) {
48         total += answer[ j ];
49      } // end for
50
51      printf( "The mean is the average value of the data "
52              "items. The mean is equal to the total of "
53              "all the data items divided by the number "
54              "of data items ( %u ). The mean value for "
55              "this run is: %u / %u = %.4f ",
56              SIZE, total, SIZE, (  double  ) total / SIZE );
57   } // end function mean
58
59   // sort array and determine median element's value
60   void median( unsigned int answer[] )
61   {
62      printf( " %s %s %s %s",
63              "********", " Median", "********",
64              "The unsorted array of responses is" );
65
66      printArray( answer ); // output unsorted array
67
68      bubbleSort( answer ); // sort array   
69
70      printf( "%s", " The sorted array is" );
71      printArray( answer ); // output sorted array
72
73      // display median element
74      printf( " The median is element %u of "
75              "the sorted %u element array. "
76              "For this run the median is %u ",
77              SIZE / 2, SIZE, answer[ SIZE / 2 ] );
78   } // end function median
79
80   // determine most frequent response
81   void mode( unsigned int freq[], const unsigned int answer[] )
82   {
83      size_t rating; // counter for accessing elements 1-9 of array freq
84      size_t j; // counter for summarizing elements 0-98 of array answer
85      unsigned int h; // counter for diplaying histograms freq array values
86      unsigned int largest = 0; // represents largest frequency
87      unsigned int modeValue = 0; // represents most frequent response
88
89      printf( " %s %s %s ",
90              "********", "  Mode", "********" );
91
92      // initialize frequencies to 0
93      for ( rating = 1; rating <= 9; ++rating ) {
94         freq[ rating ] = 0;
95      } // end for
96
97      // summarize frequencies
98      for ( j = 0; j < SIZE; ++j ) {
99         ++freq[ answer[ j ] ];
100     } // end for
101
102     // output headers for result columns
103     printf( "%s%11s%19s %54s %54s ",
104             "Response", "Frequency", "Histogram",
105             "1    1    2    2", "5    0    5    0    5" );
106
107     // output results
108     for ( rating = 1; rating <= 9; ++rating ) {
109        printf( "%8u%11u          ", rating, freq[ rating ] );
110
111        // keep track of mode value and largest frequency value   
112        if ( freq[ rating ] > largest ) {                         
113           largest = freq[ rating ];                              
114           modeValue = rating;                                    
115        } // end if                                               
116
117        // output histogram bar representing frequency value
118        for ( h = 1; h <= freq[ rating ]; ++h ) {
119           printf( "%s", "*" );
120        } // end inner for
121
122        puts( "" ); // being new line of output
123     } // end outer for
124
125     // display the mode value
126     printf( " The mode is the most frequent value. "
127             "For this run the mode is %u which occurred"
128             " %u times. ", modeValue, largest );
129  } // end function mode
130
131  // function that sorts an array with bubble sort algorithm
132  void bubbleSort( unsigned int a[] )
133  {
134     unsigned int pass; // pass counter
135     size_t j; // comparison counter
136     unsigned int hold; // temporary location used to swap elements
137
138     // loop to control number of passes
139     for ( pass = 1; pass < SIZE; ++pass ) {
140
141        // loop to control number of comparisons per pass
142        for ( j = 0; j < SIZE - 1; ++j ) {
143
144           // swap elements if out of order
145           if ( a[ j ] > a[ j + 1 ] ) {
146              hold = a[ j ];
147              a[ j ] = a[ j + 1 ];
148              a[ j + 1 ] = hold;
149           } // end if
150        } // end inner for
151     } // end outer for
152  } // end function bubbleSort
153
154  // output array contents (20 values per row)
155  void printArray( const unsigned int a[] )
156  {
157     size_t j; // counter
158
159     // output array contents
160     for ( j = 0; j < SIZE; ++j ) {
161
162        if ( j % 20 == 0 ) { // begin new line every 20 values
163           puts( "" );
164        } // end if
165
166        printf( "%2u", a[ j ] );
167     } // end for
168  } // end function printArray


Fig. 6.16 Survey data analysis with arrays: computing the mean, median and mode of the data.


********
  Mean
********
The mean is the average value of the data
items. The mean is equal to the total of
all the data items divided by the number
of data items ( 99 ). The mean value for
this run is: 681 / 99 = 6.8788


********
 Median
********
The unsorted array of responses is
 6 7 8 9 8 7 8 9 8 9 7 8 9 5 9 8 7 8 7 8
 6 7 8 9 3 9 8 7 8 7 7 8 9 8 9 8 9 7 8 9
 6 7 8 7 8 7 9 8 9 2 7 8 9 8 9 8 9 7 5 3
 5 6 7 2 5 3 9 4 6 4 7 8 9 6 8 7 8 9 7 8
 7 4 4 2 5 3 8 7 5 6 4 5 6 1 6 5 7 8 7

The sorted array is
 1 2 2 2 3 3 3 3 4 4 4 4 4 5 5 5 5 5 5 5
 5 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7
 7 7 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8
 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9

The median is element 49 of
the sorted 99 element array.
For this run the median is 7
********
  Mode
********
Response  Frequency          Histogram

                                      1    1    2    2
                                 5    0    5    0    5

       1          1          *
       2          3          ***
       3          4          ****
       4          5          *****
       5          8          ********
       6          9          *********
       7         23          ***********************
       8         27          ***************************
       9         19          *******************

The mode is the most frequent value.
For this run the mode is 8 which occurred 27 times.


Fig. 6.17 Sample run for the survey data analysis program.

Mean

The mean is the arithmetic average of the 99 values. Function mean (Fig. 6.16, lines 39–57) computes the mean by totaling the 99 elements and dividing the result by 99.

Median

The median is the middle value. Function median (lines 60–78) determines the median by calling function bubbleSort (defined in lines 132–152) to sort the array of responses into ascending order, then picking answer[SIZE / 2] (the middle element) of the sorted array. When the number of elements is even, the median should be calculated as the mean of the two middle elements. Function median does not currently provide this capability. Function printArray (lines 155–168) is called to output the response array.

Mode

The mode is the value that occurs most frequently among the 99 responses. Function mode (lines 81–129) determines the mode by counting the number of responses of each type, then selecting the value with the greatest count. This version of function mode does not handle a tie in this example. Function mode also produces a histogram to aid in determining the mode graphically.

6.8. Searching Arrays

You’ll often work with large amounts of data stored in arrays. It may be necessary to determine whether an array contains a value that matches a certain key value. The process of finding a particular element of an array is called searching. In this section we discuss two searching techniques—the simple linear search technique and the more efficient (but more complex) binary search technique.

Searching an Array with Linear Search

The linear search (Fig. 6.18) compares each element of the array with the search key. Because the array is not in any particular order, it’s just as likely that the value will be found in the first element as in the last. On average, therefore, the program will have to compare the search key with half the elements of the array.


 1   // Fig. 6.18: fig06_18.c
 2   // Linear search of an array.
 3   #include <stdio.h>
 4   #define SIZE 100
 5
 6   // function prototype
 7   size_t linearSearch( const int array[], int key, size_t size );
 8
 9   // function main begins program execution
10   int main( void )
11   {
12      int a[ SIZE ]; // create array a
13      size_t x; // counter for initializing elements 0-99 of array a
14      int searchKey; // value to locate in array a
15      size_t element; // variable to hold location of searchKey or -1
16
17      // create some data
18      for ( x = 0; x < SIZE; ++x ) {
19         a[ x ] = 2 * x;
20      } // end for
21
22      puts( "Enter integer search key:" );
23      scanf( "%d", &searchKey );
24
25      // attempt to locate searchKey in array a    
26      element = linearSearch( a, searchKey, SIZE );
27
28      // display results
29      if ( element != -1 ) {
30         printf( "Found value in element %d ", element );
31      } // end if
32      else {
33         puts( "Value not found" );
34      } // end else
35   } // end main
36
37   // compare key to every element of array until the location is found
38   // or until the end of array is reached; return subscript of element
39   // if key is found or -1 if key is not found                        
40   size_t linearSearch( const int array[], int key, size_t size )      
41   {                                                                   
42      size_t n; // counter                                             
43                                                                       
44      // loop through array                                            
45      for ( n = 0; n < size; ++n ) {                                   
46                                                                       
47         if ( array[ n ] == key ) {                                    
48            return n; // return location of key                        
49         } // end if                                                   
50      } // end for                                                     
51                                                                       
52      return -1; // key not found                                      
53   } // end function linearSearch                                      


Enter integer search key:
36
Found value in element 18



Enter integer search key:
37
Value not found


Fig. 6.18 Linear search of an array.

Searching an Array with Binary Search

The linear searching method works well for small or unsorted arrays. However, for large arrays linear searching is inefficient. If the array is sorted, the high-speed binary search technique can be used.

The binary search algorithm eliminates from consideration one-half of the elements in a sorted array after each comparison. The algorithm locates the middle element of the array and compares it to the search key. If they’re equal, the search key is found and the array subscript of that element is returned. If they’re not equal, the problem is reduced to searching one-half of the array. If the search key is less than the middle element of the array, the first half of the array is searched, otherwise the second half is searched. If the search key is not found in the specified subarray (piece of the original array), the algorithm is repeated on one-quarter of the original array. The search continues until the search key is equal to the middle element of a subarray, or until the subarray consists of one element that’s not equal to the search key (i.e., the search key is not found).

In a worst case-scenario, searching an array of 1023 elements takes only 10 comparisons using a binary search. Repeatedly dividing 1,024 by 2 yields the values 512, 256, 128, 64, 32, 16, 8, 4, 2 and 1. The number 1,024 (210) is divided by 2 only 10 times to get the value 1. Dividing by 2 is equivalent to one comparison in the binary search algorithm. An array of 1,048,576 (220) elements takes a maximum of only 20 comparisons to find the search key. An array of one billion elements takes a maximum of only 30 comparisons to find the search key. This is a tremendous increase in performance over the linear search that required comparing the search key to an average of half of the array elements. For a one-billion-element array, this is a difference between an average of 500 million comparisons and a maximum of 30 comparisons! The maximum comparisons for any array can be determined by finding the first power of 2 greater than the number of array elements.

Figure 6.19 presents the iterative version of function binarySearch (lines 42–72). The function receives four arguments—an integer array b to be searched, an integer searchKey, the low array subscript and the high array subscript (these define the portion of the array to be searched). If the search key does not match the middle element of a subarray, the low subscript or high subscript is modified so that a smaller subarray can be searched. If the search key is less than the middle element, the high subscript is set to middle - 1 and the search is continued on the elements from low to middle - 1. If the search key is greater than the middle element, the low subscript is set to middle + 1 and the search is continued on the elements from middle + 1 to high. The program uses an array of 15 elements. The first power of 2 greater than the number of elements in this array is 16 (24), so no more than 4 comparisons are required to find the search key. The program uses function printHeader (lines 75–94) to output the array subscripts and function printRow (lines 98–118) to output each subarray during the binary search process. The middle element in each subarray is marked with an asterisk (*) to indicate the element to which the search key is compared.


 1   // Fig. 6.19: fig06_19.c
 2   // Binary search of a sorted array.
 3   #include <stdio.h>
 4   #define SIZE 15
 5
 6   // function prototypes
 7   size_t binarySearch(const int b[], int searchKey, size_t low, size_t high);
 8   void printHeader( void );
 9   void printRow( const int b[], size_t low, size_t mid, size_t high );
10
11   // function main begins program execution
12   int main( void )
13   {
14      int a[ SIZE ]; // create array a
15      size_t i; // counter for initializing elements of array a
16      int key; // value to locate in array a
17      size_t result; // variable to hold location of key or -1
18
19      // create data
20      for ( i = 0; i < SIZE; ++i ) {
21         a[ i ] = 2 * i;
22      } // end for
23
24      printf( "%s", "Enter a number between 0 and 28: " );
25      scanf( "%d", &key );
26
27      printHeader();
28
29      // search for key in array a
30      result = binarySearch( a, key, 0, SIZE - 1 );
31
32      // display results
33      if ( result != -1 ) {
34         printf( " %d found in array element %d ", key, result );
35      } // end if
36      else {
37         printf( " %d not found ", key );
38      } // end else
39   } // end main
40
41   // function to perform binary search of an array
42   size_t binarySearch(const int b[], int searchKey, size_t low, size_t high)
43   {
44      int middle; // variable to hold middle element of array
45
46      // loop until low subscript is greater than high subscript
47      while ( low <= high ) {
48
49         // determine middle element of subarray being searched
50         middle = ( low + high ) / 2;
51
52         // display subarray used in this loop iteration
53         printRow( b, low, middle, high );
54
55         // if searchKey matched middle element, return middle   
56         if ( searchKey == b[ middle ] ) {                       
57            return middle;                                       
58         } // end if                                             
59
60         // if searchKey less than middle element, set new high   
61         else if ( searchKey < b[ middle ] ) {                    
62            high = middle - 1; // search low end of array         
63         } // end else if                                         
64
65         // if searchKey greater than middle element, set new low   
66         else {                                                     
67               low = middle + 1; // search high end of array        
68         } // end else                                              
69      } // end while   
70
71      return -1; // searchKey not found
72   } // end function binarySearch
73
74   // Print a header for the output
75   void printHeader( void )
76   {
77      unsigned int i; // counter
78
79      puts( " Subscripts:" );
80
81      // output column head
82      for ( i = 0; i < SIZE; ++i ) {
83         printf( "%3u ", i );
84      } // end for
85
86      puts( "" ); // start new line of output
87
88      // output line of - characters
89      for ( i = 1; i <= 4 * SIZE; ++i ) {
90         printf( "%s", "-" );
91      } // end for
92
93      puts( "" ); // start new line of output
94   } // end function printHeader
95
96   // Print one row of output showing the current
97   // part of the array being processed.
98   void printRow( const int b[], size_t low, size_t mid, size_t high )
99   {
100     size_t i; // counter for iterating through array b
101
102     // loop through entire array
103     for ( i = 0; i < SIZE; ++i ) {
104
105        // display spaces if outside current subarray range
106        if ( i < low || i > high ) {
107           printf( "%s", "    " );
108        } // end if
109        else if ( i == mid ) { // display middle element
110           printf( "%3d*", b[ i ] ); // mark middle value
111        } // end else if
112        else { // display other elements in subarray
113           printf( "%3d ", b[ i ] );
114        } // end else
115     } // end for
116
117     puts( "" ); // start new line of output
118  } // end function printRow


Enter a number between 0 and 28: 25

Subscripts:
  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
------------------------------------------------------------
  0   2   4   6   8  10  12  14* 16  18  20  22  24  26  28
                                 16  18  20  22* 24  26  28
                                                 24  26* 28
                                                 24*

25 not found



Enter a number between 0 and 28: 8

Subscripts:
  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
------------------------------------------------------------
  0   2   4   6   8  10  12  14* 16  18  20  22  24  26  28
  0   2   4   6*  8  10  12
                  8  10* 12
                  8*

8 found in array element 4



Enter a number between 0 and 28: 6

Subscripts:
  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
------------------------------------------------------------
  0   2   4   6   8  10  12  14* 16  18  20  22  24  26  28
  0   2   4   6*  8  10  12

6 found in array element 3


Fig. 6.19 Binary search of a sorted array.

6.9. Multidimensional Arrays

Arrays in C can have multiple subscripts. A common use of multiple-subscripted arrays, which the C standard refers to as multidimensional arrays, is to represent tables of values consisting of information arranged in rows and columns. To identify a particular table element, we must specify two subscripts: The first (by convention) identifies the element’s row and the second (by convention) identifies the element’s column. Tables or arrays that require two subscripts to identify a particular element are called double-subscripted arrays. Multidimensional arrays can have more than two subscripts.

Figure 6.20 illustrates a double-subscripted array, a. The array contains three rows and four columns, so it’s said to be a 3-by-4 array. In general, an array with m rows and n columns is called an m-by-n array.

Image

Fig. 6.20 Double-subscripted array with three rows and four columns.

Every element in array a is identified in Fig. 6.20 by an element name of the form a[i][j]; a is the name of the array, and i and j are the subscripts that uniquely identify each element in a. The names of the elements in row 0 all have a first subscript of 0; the names of the elements in column 3 all have a second subscript of 3.


Image Common Programming Error 6.7

Referencing a double-subscripted array element as a[ x, y ] instead of a[ x ][ y ] is a logic error. C interprets a[ x, y ] as a[ y ] (because the comma in this context is treated as a comma operator), so this programmer error is not a syntax error.


A multidimensional array can be initialized when it’s defined, much like a single-subscripted array. For example, a double-subscripted array int b[2][2] could be defined and initialized with

int b[ 2 ][ 2 ] = { { 1, 2 }, { 3, 4 } };

The values are grouped by row in braces. The values in the first set of braces initialize row 0 and the values in the second set of braces initialize row 1. So, the values 1 and 2 initialize elements b[0][0] and b[0][1], respectively, and the values 3 and 4 initialize elements b[1][0] and b[1][1], respectively. If there are not enough initializers for a given row, the remaining elements of that row are initialized to 0. Thus,

int b[ 2 ][ 2 ] = { { 1 }, { 3, 4 } };

would initialize b[0][0] to 1, b[0][1] to 0, b[1][0] to 3 and b[1][1] to 4. Figure 6.21 demonstrates defining and initializing double-subscripted arrays.


 1   // Fig. 6.21: fig06_21.c
 2   // Initializing multidimensional arrays.
 3   #include <stdio.h>
 4
 5   void printArray( int a[][ 3 ] ); // function prototype
 6
 7   // function main begins program execution
 8   int main( void )
 9   {
10      // initialize array1, array2, array3
11      int array1[ 2 ][ 3 ] = { { 1, 2, 3 }, { 4, 5, 6 } };
12      int array2[ 2 ][ 3 ] = { 1, 2, 3, 4, 5 };           
13      int array3[ 2 ][ 3 ] = { { 1, 2 }, { 4 } };         
14
15      puts( "Values in array1 by row are:" );
16      printArray( array1 );
17
18      puts( "Values in array2 by row are:" );
19      printArray( array2 );
20
21      puts( "Values in array3 by row are:" );
22      printArray( array3 );
23   } // end main
24
25   // function to output array with two rows and three columns   
26   void printArray( int a[][ 3 ] )                               
27   {                                                             
28      size_t i; // row counter                                   
29      size_t j; // column counter                                
30                                                                 
31      // loop through rows                                       
32      for ( i = 0; i <= 1; ++i ) {                               
33                                                                 
34         // output column values                                 
35         for ( j = 0; j <= 2; ++j ) {                            
36            printf( "%d ", a[ i ][ j ] );                        
37         } // end inner for                                      
38                                                                 
39         printf( " " ); // start new line of output             
40      } // end outer for                                         
41   } // end function printArray                                  


Values in array1 by row are:
1 2 3
4 5 6
Values in array2 by row are:
1 2 3
4 5 0
Values in array3 by row are:
1 2 0
4 0 0
 


Fig. 6.21 Initializing multidimensional arrays.

The program defines three arrays of two rows and three columns (six elements each). The definition of array1 (line 11) provides six initializers in two sublists. The first sublist initializes row 0 of the array to the values 1, 2 and 3; and the second sublist initializes row 1 of the array to the values 4, 5 and 6.

If the braces around each sublist are removed from the array1 initializer list, the compiler initializes the elements of the first row followed by the elements of the second row. The definition of array2 (line 12) provides five initializers. The initializers are assigned to the first row, then the second row. Any elements that do not have an explicit initializer are initialized to zero automatically, so array2[1][2] is initialized to 0.

The definition of array3 (line 13) provides three initializers in two sublists. The sublist for the first row explicitly initializes the first two elements of the first row to 1 and 2. The third element is initialized to zero. The sublist for the second row explicitly initializes the first element to 4. The last two elements are initialized to zero.

The program calls printArray (lines 26–41) to output each array’s elements. The function definition specifies the array parameter as const int a[][3]. When we receive a single-subscripted array as a parameter, the array brackets are empty in the function’s parameter list. The first subscript of a multidimensional array is not required either, but all subsequent subscripts are required. The compiler uses these subscripts to determine the locations in memory of elements in multidimensional arrays. All array elements are stored consecutively in memory regardless of the number of subscripts. In a double-subscripted array, the first row is stored in memory followed by the second row.

Providing the subscript values in a parameter declaration enables the compiler to tell the function how to locate an element in the array. In a double-subscripted array, each row is basically a single-subscripted array. To locate an element in a particular row, the compiler must know how many elements are in each row so that it can skip the proper number of memory locations when accessing the array. Thus, when accessing a[1][2] in our example, the compiler knows to skip the three elements of the first row to get to the second row (row 1). Then, the compiler accesses element 2 of that row.

Many common array manipulations use for repetition statements. For example, the following statement sets all the elements in row 2 of array a in Fig. 6.20 to zero:

for ( column = 0; column <= 3; ++column ) {
   a[ 2 ][ column ] = 0;
}

We specified row 2, so the first subscript is always 2. The loop varies only the second (column) subscript. The preceding for statement is equivalent to the assignment statements:

a[ 2 ][ 0 ] = 0;
a[ 2 ][ 1 ] = 0;
a[ 2 ][ 2 ] = 0;
a[ 2 ][ 3 ] = 0;

The following nested for statement determines the total of all the elements in array a.

total = 0;
for ( row = 0; row <= 2; ++row ) {
   for ( column = 0; column <= 3; ++column ) {
      total += a[ row ][ column ];
   }
}

The for statement totals the elements of the array one row at a time. The outer for statement begins by setting row (i.e., the row subscript) to 0 so that the elements of that row may be totaled by the inner for statement. The outer for statement then increments row to 1, so the elements of that row can be totaled. Then, the outer for statement increments row to 2, so the elements of the third row can be totaled. When the nested for statement terminates, total contains the sum of all the elements in the array a.

Two-Dimensonal Array Manipulations

Figure 6.22 performs several other common array manipulations on 3-by-4 array studentGrades using for statements. Each row of the array represents a student and each column represents a grade on one of the four exams the students took during the semester. The array manipulations are performed by four functions. Function minimum (lines 41–60) determines the lowest grade of any student for the semester. Function maximum (lines 63–82) determines the highest grade of any student for the semester. Function average (lines 85–96) determines a particular student’s semester average. Function printArray (lines 99–118) outputs the double-subscripted array in a neat, tabular format.


 1   // Fig. 6.22: fig06_22.c
 2   // Double-subscripted array manipulations.
 3   #include <stdio.h>
 4   #define STUDENTS 3
 5   #define EXAMS 4
 6
 7   // function prototypes
 8   int minimum( int grades[][ EXAMS ], size_t pupils, size_t tests );
 9   int maximum( int grades[][ EXAMS ], size_t pupils, size_t tests );
10   double average( const int setOfGrades[], size_t tests );
11   void printArray( int grades[][ EXAMS ], size_t pupils, size_t tests );
12
13   // function main begins program execution
14   int main( void )
15   {
16      size_t student; // student counter
17
18      // initialize student grades for three students (rows)
19      int studentGrades[ STUDENTS ][ EXAMS ] =
20         { { 77, 68, 86, 73 },
21           { 96, 87, 89, 78 },
22           { 70, 90, 86, 81 } };
23
24      // output array studentGrades
25      puts( "The array is:" );
26      printArray( studentGrades, STUDENTS, EXAMS );
27
28      // determine smallest and largest grade values
29      printf( " Lowest grade: %d Highest grade: %d ",
30         minimum( studentGrades, STUDENTS, EXAMS ),
31         maximum( studentGrades, STUDENTS, EXAMS ) );
32
33      // calculate average grade for each student
34      for ( student = 0; student < STUDENTS; ++student ) {
35         printf( "The average grade for student %u is %.2f ",
36            student, average( studentGrades[ student ], EXAMS ) );
37      } // end for
38   } // end main
39
40   // Find the minimum grade
41   int minimum( int grades[][ EXAMS ], size_t pupils, size_t tests )
42   {
43      size_t i; // student counter
44      size_t j; // exam counter
45      int lowGrade = 100; // initialize to highest possible grade
46
47      // loop through rows of grades
48      for ( i = 0; i < pupils; ++i ) {
49
50         // loop through columns of grades
51         for ( j = 0; j < tests; ++j ) {
52
53            if ( grades[ i ][ j ] < lowGrade ) {
54               lowGrade = grades[ i ][ j ];
55            } // end if
56         } // end inner for
57      } // end outer for
58
59      return lowGrade; // return minimum grade
60   } // end function minimum
61
62   // Find the maximum grade
63   int maximum( int grades[][ EXAMS ], size_t pupils, size_t tests )
64   {
65      size_t i; // student counter
66      size_t j; // exam counter
67      int highGrade = 0; // initialize to lowest possible grade
68
69      // loop through rows of grades
70      for ( i = 0; i < pupils; ++i ) {
71
72         // loop through columns of grades
73         for ( j = 0; j < tests; ++j ) {
74
75            if ( grades[ i ][ j ] > highGrade ) {
76               highGrade = grades[ i ][ j ];
77            } // end if
78         } // end inner for
79      } // end outer for
80
81      return highGrade; // return maximum grade
82   } // end function maximum
83
84   // Determine the average grade for a particular student   
85   double average( const int setOfGrades[], size_t tests )   
86   {                                                         
87      size_t i; // exam counter                              
88      int total = 0; // sum of test grades                   
89                                                             
90      // total all grades for one student                    
91      for ( i = 0; i < tests; ++i ) {                        
92         total += setOfGrades[ i ];                          
93      } // end for                                           
94                                                             
95      return ( double ) total / tests; // average            
96   } // end function average                                 
97
98   // Print the array
99   void printArray( int grades[][ EXAMS ], size_t pupils, size_t tests )
100  {
101     size_t i; // student counter
102     size_t j; // exam counter
103
104     // output column heads
105     printf( "%s", "                 [0]  [1]  [2]  [3]" );
106
107     // output grades in tabular format
108     for ( i = 0; i < pupils; ++i ) {
109
110        // output label for row
111        printf( " studentGrades[%d] ", i );
112
113        // output grades for one student
114        for ( j = 0; j < tests; ++j ) {
115           printf( "%-5d", grades[ i ][ j ] );
116        } // end inner for
117     } // end outer for
118  } // end function printArray


The array is:
                 [0]  [1]  [2]  [3]
studentGrades[0] 77   68   86   73
studentGrades[1] 96   87   89   78
studentGrades[2] 70   90   86   81

Lowest grade: 68
Highest grade: 96
The average grade for student 0 is 76.00
The average grade for student 1 is 87.50
The average grade for student 2 is 81.75


Fig. 6.22 Double-subscripted array manipulations.

Functions minimum, maximum and printArray each receive three arguments—the studentGrades array (called grades in each function), the number of students (rows of the array) and the number of exams (columns of the array). Each function loops through array grades using nested for statements. The following nested for statement is from the function minimum definition:

// loop through rows of grades
for ( i = 0; i < pupils; ++i ) {
   // loop through columns of grades
   for ( j = 0; j < tests; ++j ) {
      if ( grades[ i ][ j ] < lowGrade ) {
         lowGrade = grades[ i ][ j ];
      } // end if
   } // end inner for
} // end outer for

The outer for statement begins by setting i (i.e., the row subscript) to 0 so the elements of that row (i.e., the grades of the first student) can be compared to variable lowGrade in the body of the inner for statement. The inner for statement loops through the four grades of a particular row and compares each grade to lowGrade. If a grade is less than lowGrade, lowGrade is set to that grade. The outer for statement then increments the row subscript to 1. The elements of that row are compared to variable lowGrade. The outer for statement then increments the row subscript to 2. The elements of that row are compared to variable lowGrade. When execution of the nested statement is complete, lowGrade contains the smallest grade in the double-subscripted array. Function maximum works similarly to function minimum.

Function average (lines 85–96) takes two arguments—a single-subscripted array of test results for a particular student called setOfGrades and the number of test results in the array. When average is called, the first argument studentGrades[student] is passed. This causes the address of one row of the double-subscripted array to be passed to average. The argument studentGrades[1] is the starting address of row 1 of the array. Remember that a double-subscripted array is basically an array of single-subscripted arrays and that the name of a single-subscripted array is the address of the array in memory. Function average calculates the sum of the array elements, divides the total by the number of test results and returns the floating-point result.

6.10. Variable-Length Arrays

In early versions of C, all arrays had constant size. But what if you don’t know an array’s size at compilation time? To handle this, you’d have to use dynamic memory allocation with malloc and related functions. The C standard allows you to handle arrays of unknown size using variable-length arrays (VLAs). These are not arrays whose size can change—that would compromise the integrity of nearby locations in memory. A variable-length array is an array whose length, or size, is defined in terms of an expression evaluated at execution time. The program of Fig. 6.23 declares and prints several VLAs. [Note: This feature is not supported in Microsoft Visual C++.]


 1   // Fig. 6.23: fig06_14.c
 2   // Using variable-length arrays
 3   #include <stdio.h>
 4
 5   // function prototypes
 6   void print1DArray( int size, int arr[ size ] );
 7   void print2DArray( int row, int col, int arr[ row ][ col ] );
 8
 9   int main( void )
10   {
11      int arraySize; // size of 1-D array
12      int row1, col1, row2, col2; // number of rows and columns in 2-D arrays
13
14      printf( "%s", "Enter size of a one-dimensional array: " );
15      scanf( "%d", &arraySize );
16
17      printf( "%s", "Enter number of rows and columns in a 2-D array: " );
18      scanf( "%d %d", &row1, &col1 );
19
20      printf( "%s",
21         "Enter number of rows and columns in another 2-D array: " );
22      scanf( "%d %d", &row2, &col2 );
23
24      int array[ arraySize ]; // declare 1-D variable-length array      
25      int array2D1[ row1 ][ col1 ]; // declare 2-D variable-length array
26      int array2D2[ row2 ][ col2 ]; // declare 2-D variable-length array
27
28      // test sizeof operator on VLA
29      printf( " sizeof(array) yields array size of %d bytes ",
30         sizeof( array ) );                                     
31
32      // assign elements of 1-D VLA
33      for ( int i = 0; i < arraySize; ++i ) {
34         array[ i ] = i * i;
35      } // end for
36
37      // assign elements of first 2-D VLA
38      for ( int i = 0; i < row1; ++i ) {
39         for ( int j = 0; j < col1; ++j ) {
40            array2D1[ i ][ j ] = i + j;
41         } // end for
42      } // end for
43
44      // assign elements of second 2-D VLA
45      for ( int i = 0; i < row2; ++i ) {
46         for ( int j = 0; j < col2; ++j ) {
47            array2D2[ i ][ j ] = i + j;
48         } // end for
49      } // end for
50
51      puts( " One-dimensional array:" );
52      print1DArray( arraySize, array ); // pass 1-D VLA to function
53
54      puts( " First two-dimensional array:" );
55      print2DArray( row1, col1, array2D1 ); // pass 2-D VLA to function
56
57      puts( " Second two-dimensional array:" );
58      print2DArray( row2, col2, array2D2 ); // pass other 2-D VLA to function
59   } // end main
60
61   void print1DArray( int size, int array[ size ] )
62   {
63      // output contents of array
64      for ( int i = 0; i < size; i++ ) {
65         printf( "array[%d] = %d ", i, array[ i ] );
66      } // end for
67   } // end function print1DArray
68
69   void print2DArray( int row, int col, int arr[ row ][ col ] )
70   {
71      // output contents of array
72      for ( int i = 0; i < row; ++i ) {
73         for ( int j = 0; j < col; ++j ) {
74            printf( "%5d", arr[ i ][ j ] );
75         } // end for
76
77         puts( "" );
78      } // end for
79   } // end function print2DArray


Enter size of a one-dimensional array: 6
Enter number of rows and columns in a 2-D array: 2 5
Enter number of rows and columns in another 2-D array: 4 3



sizeof(array) yields array size of 24 bytes

One-dimensional array:
array[0] = 0
array[1] = 1
array[2] = 4
array[3] = 9
array[4] = 16
array[5] = 25

First two-dimensional array:
    0    1    2    3    4
    1    2    3    4    5

Second two-dimensional array:
    0    1    2
    1    2    3
    2    3    4
    3    4    5


Fig. 6.23 Using variable-length arrays.

First, we prompt the user for the desired sizes for a one-dimensional array and two two-dimensional arrays (lines 14–22). Lines 24–26 then declare VLAs of the appropriate size. This is valid as long as the variables representing the array sizes are of an integral type.

After declaring the arrays, we use the sizeof operator in lines 29–30 to make sure that our VLA is of the proper length. In early versions of C sizeof was always a compile-time operation, but when applied to a VLA, sizeof operates at runtime. The output window shows that the sizeof operator returns a size of 24 bytes—four times that of the number we entered because the size of an int on our machine is 4 bytes.

Next we assign values to our VLAs’ elements (lines 33–49). We use i < arraySize as our loop-continuation condition when filling the one-dimensional array. As with fixed-length arrays, there is no protection against stepping outside the array bounds.

Lines 61–67 define function print1DArray that takes a one-dimensional VLA. The syntax for passing VLAs as parameters to functions is the same as with a normal, fixed-length array. We use the variable size in the declaration of the array parameter, but no checking is performed other than the variable being defined and of integral type—it’s purely documentation for the programmer.

Function print2DArray (lines 69–79) takes a variable-length two-dimensional array and displays it to the screen. Recall from Section 6.9 that all but the first subscript of a multidimensional array must be specified when declaring a function parameter. The same restriction holds true for VLAs, except that the sizes can be specified by variables. The initial value of col passed to the function is used to convert from two-dimensional indices to offsets into the contiguous memory the array is stored in, just as with a fixed-size array. Changing the value of col inside the function will not cause any changes to the indexing, but passing an incorrect value to the function will.

6.11. Secure C Programming

Bounds Checking for Array Subscripts

It’s important to ensure that every subscript you use to access an array element is within the array’s bounds—that is, greater than or equal to 0 and less than the number of array elements. A two-dimensional array’s row and column subscripts must be greater than or equal to 0 and less than the numbers of rows and columns, respectively. This extends to arrays with additional dimensions as well.

Allowing programs to read from or write to array elements outside the bounds of arrays are common security flaws. Reading from out-of-bounds array elements can cause a program to crash or even appear to execute correctly while using bad data. Writing to an out-of-bounds element (known as a buffer overflow) can corrupt a program’s data in memory, crash a program and allow attackers to exploit the system and execute their own code.

As we stated in Section 6.4, C provides no automatic bounds checking for arrays, so you must provide your own. For techniques that help you prevent such problems, see CERT guideline ARR30-C at www.securecoding.cert.org.

scanf_s

Bounds checking is also important in string processing. When reading a string into a char array, scanf does not prevent buffer overflows. If the number of characters input is greater than or equal to the array’s length, scanf will write characters—including the string’s terminating null character ('')—beyond the end of the array. This might overwrite other variables’ values, and eventually the program might overwrite the string’s '' if it writes to those other variables.

Functions determine where strings end by looking for their terminating '' character. For example, function printf outputs a string by reading characters from the beginning of the string in memory and continuing until the string’s '' is encountered. If the '' is missing, printf might read far beyond the end of the string until it encounters some other '' in memory.

The C standard’s optional Annex K provides new, more secure, versions of many string-processing and input/output functions, including scanf_s—a version of scanf that performs additional checks to ensure that it does not write beyond the end of a character array used to store a string. Assuming that myString is a 20-character array, the statement

scanf_s( "%19s", myString, 20 );

reads a string into myString. Function scanf_s requires two arguments for each %s in the format string—a character array in which to place the input string and the number of array elements. The second of these arguments is used by scanf_s to prevent buffer overflows. For example, it’s possible to supply a field width for %s that’s too long for the underlying character array, or to simply omit the field width entirely. If the number of characters input plus the terminating null character is larger than the number of array elements, the %s conversion would fail. Because the preceding statement contains only one conversion specifier, scanf_s would return 0 indicating that no conversions were performed, and myString would be unaltered.

In general, if your compiler supports the functions from the C standard’s optional Annex K, you should use them. We discuss additional Annex K functions in later Secure C Programming sections.

Don’t Use Strings Read from the User as Format-Control Strings

You might have noticed that throughout this book, we never use single-argument printfs. Instead we use one of the following forms:

• When we need to output a ' ' after the string, we use function puts (which automatically outputs a ' ' after its single string argument), as in

     puts( "Welcome to C!" );

• When we need the cursor to remain on the same line as the string, we use function printf, as in

     printf( "%s", "Enter first integer: " );

Because we were displaying string literals, we certainly could have used the one-argument form of printf, as in

printf( "Welcome to C! " );
printf( "Enter first integer: " );

When printf evaluates the format-control string in its first (and possibly its only) argument, the function performs tasks based on the conversion specifier(s) in that string. If the format-control string were obtained from the user, an attacker could supply malicious conversion specifiers that would be “executed” by the formatted output function. Now that you know how to read strings into character arrays, it’s important to note that you should never use as a printf’s format-control string a character array that might contain user input. For more information, see CERT guideline FIO30-C at www.securecoding.cert.org.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.225.234.24