Hour 7. Storing Information in Arrays and Strings

What Is an Array?

An array is a collection of related data that all have the same data type. An array can be envisioned as a series of data storage locations. Each storage location is called an element of the array.

An array is declared by writing the data type and the array name followed by the number of elements the array holds inside square brackets. Here’s an example:

long peaks[25];

The peaks array holds 25 long integers. This declaration causes the compiler to set aside enough memory to hold all 25 elements. On a system where each long integer requires 4 bytes, this declaration sets aside 100 contiguous bytes of memory.
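The size of a long integer is not the same on every system; many 64-bit compilers use 8 bytes rather than 4. As a minimal sketch, the sizeof operator can report the real numbers on your machine. The function names peaksBytes() and peakElementBytes() are hypothetical helpers made up for this illustration.

```cpp
#include <cstddef>

// Total bytes the compiler sets aside for a 25-element array of long.
std::size_t peaksBytes()
{
    long peaks[25];
    return sizeof(peaks);   // 25 * sizeof(long); 100 only when long is 4 bytes
}

// Bytes used by a single long element on this system.
std::size_t peakElementBytes()
{
    return sizeof(long);
}
```

Displaying both values with std::cout shows whether the 100-byte figure applies to your compiler.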

Array elements are numbered from 0 up to one less than the number of elements, so the peaks array holds elements 0 through 24. Each element is accessed by using its number in square brackets. This statement assigns a value to the first peaks element:

peaks[0] = 29029;

This statement assigns a value to the last:

peaks[24] = 7804;

The number of an array element also is called its subscript.

The zero-based numbering of array elements can be confusing—an array with three elements has elements numbered 0, 1, and 2 (not 1, 2, and 3).
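A short sketch of the three-element case may help. The function sumThree() and the array named values are assumptions made up for this example; the point is the loop condition, which stops before the subscript reaches the array size.

```cpp
// Adds up every element of a three-element array.
int sumThree()
{
    const int size = 3;
    int values[size];
    values[0] = 10;   // first element
    values[1] = 20;
    values[2] = 30;   // last element: subscript is size minus 1

    int total = 0;
    // Valid subscripts are 0, 1, and 2, so the loop ends when i reaches size.
    for (int i = 0; i < size; i++)
    {
        total += values[i];
    }
    return total;
}
```

Writing the condition as i < size rather than i <= size is the habit that keeps the loop inside the array.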

The WeightGoals program in Listing 7.1 uses an array to calculate weight-loss milestones for a dieting person. The array holds floating-point values that represent progress of 10%, 25%, 50%, and 75% toward the dieter’s goal weight.

Listing 7.1 The Full Text of WeightGoals.cpp


 1: #include <iostream>
 2:
 3: int main()
 4: {
 5:     float goal[4];
 6:     goal[0] = 0.1;
 7:     goal[1] = 0.25;
 8:     goal[2] = 0.5;
 9:     goal[3] = 0.75;
10:     float weight, target;
11:
12:     std::cout << "Enter current weight: ";
13:     std::cin >> weight;
14:     std::cout << "\nEnter goal weight: ";
15:     std::cin >> target;
16:     std::cout << "\n";
17:
18:     for (int i = 0; i < 4; i++)
19:     {
20:         float loss = (weight - target) * goal[i];
21:         std::cout << "Goal " << i << ": ";
22:         std::cout << weight - loss << "\n";
23:     }
24:
25:     return 0;
26: }


This program asks for a user’s current weight and goal weight, and then displays four intermediate weight milestones:

Enter current weight: 289

Enter goal weight: 225

Goal 0: 282.6
Goal 1: 273
Goal 2: 257
Goal 3: 241

The program stores the user’s current weight in the variable weight and the user’s target in the variable target. Both are floating-point variables.

The goal array holds four values that will be used to calculate the weight milestones. The four-element array is created (line 5) and values of 0.1, 0.25, 0.5, and 0.75 are assigned to those elements (lines 6–9).

A for loop iterates through the elements of the array. The amount to lose to reach a milestone is stored in the loss variable (line 20). This variable is the total amount of weight to lose multiplied by the percentage.

The loss total is subtracted from weight and displayed as a milestone (line 22).


By the Way

The fact that arrays count up from 0 rather than 1 is a common cause of bugs in programs written by C++ novices. When you use an array, remember that an array with 10 elements counts from array[0] to array[9].


Writing Past the End of Arrays

When you assign a value to an array element, the compiler computes where to store the value in memory based on the size of each element and its subscript. If you store a new value in goal[3], the compiler multiplies the offset of 3 by the size of each element, which for float values is typically 4 bytes. The compiler then moves that many bytes, 12, from the beginning of the array and stores the new value at that location.

The goal array in the WeightGoals program only has four elements. If you try to store something in goal[4], the compiler ignores the fact that there is no such element. Instead, it stores it in memory 16 bytes past the beginning of the first element, replacing whatever data is at that location. This can be almost any data, so writing the new out-of-bounds value might have unpredictable results, such as crashing immediately or running with strange results.

These errors can be difficult to spot as a program runs, so it’s important to pay attention to the size of arrays when they are accessed.
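One way to pay attention to the size is to test a subscript before using it. The following is a minimal sketch, not part of the WeightGoals listing; the function storeGoal() and the constant GOAL_COUNT are hypothetical names chosen for this illustration.

```cpp
#include <cstddef>

const std::size_t GOAL_COUNT = 4;   // number of elements in the goal array

// Stores value in goal[index] only when index is a valid subscript.
// Returns true on success, false if the write would land past the array.
bool storeGoal(float goal[], std::size_t index, float value)
{
    if (index >= GOAL_COUNT)
    {
        return false;   // refuse to write outside elements 0 through 3
    }
    goal[index] = value;
    return true;
}
```

A call such as storeGoal(goal, 4, 1.0) returns false and leaves memory untouched, whereas a direct assignment to goal[4] would overwrite whatever follows the array.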

It is so common to write data one element past the end of an array that the bug has its own name: a fence post error. The name refers to the problem of counting how many posts you need for a 10-foot fence if you need one post for every foot. Some people answer 10, but you need 11, as shown in Figure 7.1.

Figure 7.1 Counting fence posts.


This sort of “off by one” mistake can be the bane of any programmer’s life. Over time, however, you’ll get used to the idea that a 25-element array counts only to element 24 and that everything counts from zero.
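The fence post arithmetic can be written down directly. This sketch assumes the fence length divides evenly by the spacing; fencePosts() and lastIndex() are hypothetical helper names used only here.

```cpp
// Posts needed for a fence when one post stands at each end of every section.
// Assumes lengthInFeet is an exact multiple of feetBetweenPosts.
int fencePosts(int lengthInFeet, int feetBetweenPosts)
{
    int sections = lengthInFeet / feetBetweenPosts;
    return sections + 1;   // always one more post than sections
}

// The array version of the same idea: the largest valid subscript.
int lastIndex(int elementCount)
{
    return elementCount - 1;   // a 25-element array ends at element 24
}
```

Both functions differ from the obvious answer by exactly one, which is where the bug gets its name.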

Initializing Arrays

You can initialize a simple array of built-in types, such as integers and characters, when you first declare the array. After the array name, put an equal sign and a list of comma-separated values enclosed in braces:

int post[10] = { 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 };

This declares post to be an array of 10 integers. It assigns post[0] the value 0, post[1] the value 10, and so forth up to post[9] equaling 90.

If you omit the size of the array, an array just big enough to hold the initialization is created. Consider this statement:

int post[] = { 10, 20, 30, 40, 50 };

An integer array with five elements is created with post[0] equal to 10, post[1] equal to 20, and so on.

The built-in C++ operator sizeof can be used to count the number of elements in an array:

const int size = sizeof(post) / sizeof(post[0]);

This example obtains the size of the post array by dividing the size of the entire array by the size of an individual element in the array. The result is the number of elements in the array.
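As a sketch of how that count might be used, the following functions compute the size of a five-element array and then use it as a loop limit. The names postCount() and sumPost() are hypothetical and exist only for this example.

```cpp
// Returns the number of elements in a five-element array using sizeof.
int postCount()
{
    int post[] = { 10, 20, 30, 40, 50 };
    // Whole-array bytes divided by one-element bytes gives the element count.
    const int size = sizeof(post) / sizeof(post[0]);
    return size;
}

// Uses the computed size as the loop limit so the loop cannot run past the end.
int sumPost()
{
    int post[] = { 10, 20, 30, 40, 50 };
    const int size = sizeof(post) / sizeof(post[0]);
    int total = 0;
    for (int i = 0; i < size; i++)
    {
        total += post[i];
    }
    return total;
}
```

Because size is computed from the array itself, adding a sixth value to the initializer updates the loop limit automatically.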

You cannot initialize more elements than you’ve declared for the array. This statement generates a compiler error:

int post[5] = { 10, 20, 30, 40, 50, 60};

The error occurs because a five-element array has been initialized with six values. It is permitted to initialize an array with fewer values than it holds, as in this statement. The elements that are not given a value are set to 0:

int post[5] = { 10, 20 };
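When an initializer list is shorter than the array, C++ sets the remaining elements to zero. The sketch below checks this for a five-element array named post that is given two values; countZeros() is a hypothetical function written for this illustration.

```cpp
// Counts how many elements of a partially initialized array hold zero.
int countZeros()
{
    int post[5] = { 10, 20 };   // post[2], post[3], and post[4] become 0
    int zeros = 0;
    for (int i = 0; i < 5; i++)
    {
        if (post[i] == 0)
        {
            zeros++;
        }
    }
    return zeros;
}
```

This zero fill happens only when an initializer list is present; an array declared with no initializer inside a function holds unpredictable values.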

Multidimensional Arrays

An array can be thought of as a single row of data. A second dimension could be conceptualized as a grid of data consisting of rows and columns. This is a two-dimensional array of data, with one dimension representing each row and the second dimension representing each column. A three-dimensional array could be a cube, with one dimension representing width, a second dimension representing height, and a third dimension representing depth. You can even have arrays of more than three dimensions, although they are harder to imagine as objects in space.

When you declare arrays, each dimension is represented as a subscript in the array. A two-dimensional array has two subscripts:

int grid[5][13];

A three-dimensional array has three subscripts:

int cube[5][13][8];

Arrays can have any number of dimensions, although it is likely that most of the arrays you create will have one or two dimensions.

A good example of a two-dimensional array is a chessboard. One dimension represents the eight rows; the other dimension represents the eight columns. Figure 7.2 illustrates this idea.

Figure 7.2 A chessboard is a two-dimensional array of squares.


Suppose that you have an array of char values named board that represents the board. Each element could equal 'w' if a white piece occupies the square, 'b' if a black piece does, and another value, such as a space (' '), otherwise. The following statement creates the array:

char board[8][8];

You also could represent the same data with a one-dimensional, 64-square array:

char board[64];

This doesn’t correspond as closely to the real-world object as the two-dimensional array, however. When the game begins, the king is located in the fourth position in the first row. Counting from zero, that position corresponds to board[0][3], assuming that the first subscript corresponds to row and the second to column. The layout of positions for the entire board is illustrated in Figure 7.2.
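If you did choose the one-dimensional layout, you would need a rule for converting a row and column into a single subscript. One common convention, assumed here, stores the rows one after another. The names BOARD_SIZE, flatIndex(), rowOf(), and columnOf() are hypothetical.

```cpp
const int BOARD_SIZE = 8;   // squares per row and per column

// Maps a row and column on the 8-by-8 board to a subscript from 0 to 63.
int flatIndex(int row, int column)
{
    return row * BOARD_SIZE + column;
}

// Recovers the row from a subscript in the 64-element array.
int rowOf(int index)
{
    return index / BOARD_SIZE;
}

// Recovers the column from a subscript in the 64-element array.
int columnOf(int index)
{
    return index % BOARD_SIZE;
}
```

With this rule, the square at board[0][3] in the two-dimensional array corresponds to element 3 of the flat array, and the last square, board[7][7], corresponds to element 63.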


Watch Out!

Multidimensional arrays can rapidly grow to exceed available memory, so keep that in mind when creating large arrays with multiple dimensions.


Initializing Multidimensional Arrays

You can initialize multidimensional arrays with values just like single-dimension arrays. Values are assigned to array elements in order, with the last array subscript changing and each of the former ones holding steady like a car’s mileage odometer. Here’s an example:

int box[5][3] = { 8, 6, 7, 5, 3, 0, 9, 2, 1, 7, 8,
    9, 0, 5, 2 };

The first value is assigned to box[0][0], the second to box[0][1], and the third to box[0][2]. The next value is assigned to box[1][0], then box[1][1] and box[1][2].

This is demonstrated in the Box program in Listing 7.2.

Listing 7.2 The Full Text of Box.cpp


 1: #include <iostream>
 2:
 3: int main()
 4: {
 5:     int box[5][3] = { 8, 6, 7, 5, 3, 0, 9, 2, 1, 7, 8,
 6:         9, 0, 5, 2 };
 7:     for (int i = 0; i < 5; i++)
 8:     {
 9:         for (int j = 0; j < 3; j++)
10:         {
11:             std::cout << "box[" << i << "]";
12:             std::cout << "[" << j << "] = ";
13:             std::cout << box[i][j] << "\n";
14:         }
15:     }
16: }


The program’s output displays the contents of each array element, which can be compared to the assignment statement in lines 5–6:

box[0][0] = 8
box[0][1] = 6
box[0][2] = 7
box[1][0] = 5
box[1][1] = 3
box[1][2] = 0
box[2][0] = 9
box[2][1] = 2
box[2][2] = 1
box[3][0] = 7
box[3][1] = 8
box[3][2] = 9
box[4][0] = 0
box[4][1] = 5
box[4][2] = 2

The box variable holds a two-dimensional array that has five elements in the first dimension and three in the second. This creates a 5-by-3 grid of elements.

Two for loops are used to cycle through the array, displaying each array element and its value.

For the sake of clarity, you could group the initializations with braces, organizing each row on its own line:

int box[5][3] = {
    {8, 6, 7},
    {5, 3, 0},
    {9, 2, 1},
    {7, 8, 9},
    {0, 5, 2} };

The inner braces do not change how this full set of values is stored; they make it easier to see how the numbers are distributed.

Each value must be separated by a comma without regard to the braces. The entire initialization set must be within braces, and it must end with a semicolon.
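For a complete set of 15 values, the flat form and the grouped form should fill the array identically. The sketch below compares the two element by element; sameContents() is a hypothetical function written only to check that claim.

```cpp
// Returns true when the flat and grouped initializers produce the same 5-by-3 array.
bool sameContents()
{
    int flat[5][3] = { 8, 6, 7, 5, 3, 0, 9, 2, 1, 7, 8,
        9, 0, 5, 2 };
    int grouped[5][3] = {
        {8, 6, 7},
        {5, 3, 0},
        {9, 2, 1},
        {7, 8, 9},
        {0, 5, 2} };

    for (int i = 0; i < 5; i++)
    {
        for (int j = 0; j < 3; j++)
        {
            if (flat[i][j] != grouped[i][j])
            {
                return false;   // a mismatch means the layouts differ
            }
        }
    }
    return true;
}
```

Some compilers warn about missing braces on the flat form, which is another reason to prefer the grouped layout.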

A Word About Memory

When you declare an array, you tell the compiler exactly how many elements you expect to store in it. The compiler sets aside the proper amount of memory for an array given the size of the data type and the number of elements it contains. Arrays are suitable for data that consists of a known number of elements, such as squares on a chessboard (64) or years in a century (100).

When you have no idea how many elements are needed, you must use more advanced data structures.

Future hours of this book cover arrays of pointers, arrays built on the heap, and other structures. In Hour 19, “Storing Information in Linked Lists,” we look at an advanced data structure known as a linked list.

Character Arrays

Familiarity with arrays makes it possible to work with longer text than the single characters represented by the char data type. A string is a series of characters. The only strings you’ve worked with up to this point have been string literals used in std::cout statements:

std::cout << "Solidum petit in profundis!\n";

In C++, a string is an array of characters ending with a null character, a special character coded as '\0'. You can declare and initialize a string like any other array:

char yum[] = { 'Z', 'o', 'm', 'b', 'i', 'e',
    ' ','E','a','t',' ', 'B', 'r', 'a', 'i', 'n',
    's', '\0' };

The last character, '\0', is the null character that terminates the string.

Because this character-by-character approach is difficult to type and admits too many opportunities for error, C++ enables a shorthand form of string initialization using a literal:

char yum[] = "Zombie Eat Brains";

This form of initialization doesn’t require the null character; the compiler adds it automatically.

The string “Zombie Eat Brains” is 18 bytes, including null.
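The difference between the 17 visible characters and the 18 bytes of storage can be checked in code. This sketch uses sizeof for the storage and strlen(), a standard C library function declared in <cstring>, for the visible length. The function names yumBytes() and yumLength() are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Bytes occupied by the array, including the terminating null character.
std::size_t yumBytes()
{
    char yum[] = "Zombie Eat Brains";
    return sizeof(yum);        // 17 characters plus 1 null
}

// Characters in the string, not counting the null character.
std::size_t yumLength()
{
    char yum[] = "Zombie Eat Brains";
    return std::strlen(yum);   // stops counting at the null
}
```

A buffer meant to hold a copy of this string therefore needs at least yumLength() + 1 bytes.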

You also can create uninitialized character arrays, which are called buffers. As with all arrays, it is important to ensure that you don’t put more into the buffer than there is room for.

Buffers can be used to store input typed by a user. Several programs created in past hours used the std::cin object to collect user input and store it in a variable:

std::cin >> yum;

Although this approach works, two major problems arise. First, if the user enters more characters than the size of the buffer, cin writes past the end of the buffer, making the program run improperly and causing security concerns. Second, if the user enters a space, cin treats it as the end of the string and stops writing to the buffer.

To solve these problems, you must call a method of the cin object called getline() with two arguments:

• The buffer to fill

• The maximum number of characters to get

The following statement reads user input of up to 18 characters (including null) and stores it in the yum character array:

std::cin.getline(yum, 18);

The method also can be called with a third argument, the delimiter that terminates input:

std::cin.getline(yum, 18, ' ');

This statement terminates input at the first space. When the third argument is omitted, the newline character ('\n') is the delimiter.
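To try the delimiter without typing anything, a string stream can stand in for std::cin, because both provide the same getline() method. This is a sketch under that assumption; firstWord() is a hypothetical function, and std::istringstream and std::string come from the standard <sstream> and <string> headers.

```cpp
#include <sstream>
#include <string>

// Reads from text as if it had been typed, stopping at the first space
// or after 17 characters, whichever comes first.
std::string firstWord(const std::string& text)
{
    std::istringstream input(text);   // stands in for std::cin
    char yum[18];
    input.getline(yum, 18, ' ');      // the space is read but not stored
    return std::string(yum);
}
```

Calling firstWord("Zombie Eat Brains") returns "Zombie", because input ends at the first space.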

The BridgeKeeper program in Listing 7.3 asks three famous questions from film, storing them in buffers.

Listing 7.3 The Full Text of BridgeKeeper.cpp


 1: #include <iostream>
 2:
 3: int main()
 4: {
 5:     char name[50];
 6:     char quest[80];
 7:     char velocity[80];
 8:
 9:     std::cout << "\nWhat is your name? ";
10:     std::cin.getline(name, 49);
11:
12:     std::cout << "\nWhat is your quest? ";
13:     std::cin.getline(quest, 79);
14:
15:     std::cout << "\nWhat is the airspeed velocity of an unladen swallow? ";
16:     std::cin.getline(velocity, 79);
17:
18:     std::cout << "\nName: " << name << "\n";
19:     std::cout << "Quest: " << quest << "\n";
20:     std::cout << "Velocity: " << velocity << "\n";
21:     return 0;
22: }


This program produces output like the following:

What is your name? Rogers Cadenhead

What is your quest? Time-based C++ tutelage

What is the airspeed velocity of an unladen swallow? I don't know–– aagh!

Name: Rogers Cadenhead
Quest: Time-based C++ tutelage
Velocity: I don't know–– aagh!

Line 10 calls the method getline() of cin. The buffer declared in line 5 is passed in as the first argument. The second argument is the maximum number of characters to store, counting the terminating null. The listing passes 49 for the 50-character name buffer, which leaves one byte to spare; passing 50 would also be safe because getline() reads at most one character fewer than the count. There is no need to provide a terminating character as a third argument because the default value of newline is sufficient.
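It is worth seeing what happens when a user types more than the buffer can hold. In this sketch a string stream stands in for std::cin, and the buffer is deliberately small. It relies on the standard behavior of getline(): at most count - 1 characters are stored, and the stream is put in a failed state when the line does not fit. The function readShortName() and its tooLong parameter are hypothetical.

```cpp
#include <sstream>
#include <string>

// Reads one line into an 8-byte buffer and returns what was stored.
// tooLong is set to true when the typed line did not fit in the buffer.
std::string readShortName(const std::string& typed, bool& tooLong)
{
    std::istringstream input(typed);     // stands in for std::cin
    char name[8];
    input.getline(name, sizeof(name));   // stores at most 7 characters plus null
    tooLong = input.fail();              // true when input was cut short
    return std::string(name);
}
```

Given "Cadenhead", the function stores only "Cadenhe" and reports that the line was too long. With std::cin, the remaining characters stay in the input and later reads fail until the program calls std::cin.clear().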

The film in question, if you haven’t recognized it already (or even if you did), is Monty Python and the Holy Grail. The Bridge of Death is guarded by a bridgekeeper who demands that three questions be answered correctly on penalty of being thrown off to your doom.

The correct answers, in case you run into this problem:

• It is Arthur, King of the Britons

• To seek the Holy Grail

• What do you mean? An African or European swallow?

Copying Strings

C++ inherits from C a library of functions for dealing with strings. This library can be incorporated in a program by including the header file string.h:

#include <string.h>

Among the many functions provided are two for copying one string into another: strcpy() and strncpy().

The strcpy() function copies the entire contents of one string into a designated buffer, as demonstrated by the StringCopier program in Listing 7.4.

Listing 7.4 The Full Text of StringCopier.cpp


 1: #include <iostream>
 2: #include <string.h>
 3:
 4: int main()
 5: {
 6:     char string1[] = "Free the bound periodicals!";
 7:     char string2[80];
 8:
 9:     strcpy(string2, string1);
10:
11:     std::cout << "String1: " << string1 << std::endl;
12:     std::cout << "String2: " << string2 << std::endl;
13:     return 0;
14: }


Run this program to eyeball the following output:

String1: Free the bound periodicals!
String2: Free the bound periodicals!

A character array is created on line 6 and initialized with the value of a string literal. The strcpy() function on line 9 takes two character arrays: a destination that will receive the copy and a source that is copied. If the source string is larger than the destination, strcpy() writes data past the end of the buffer.

To protect against this, the standard library also includes the function strncpy(). This version takes a third argument that specifies the maximum number of characters to copy:

strncpy(string2, string1, 80);
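One detail to watch: when the source holds as many characters as the limit or more, strncpy() fills the destination without adding a terminating null. A common defensive pattern, sketched here, copies one character fewer than the buffer size and writes the null explicitly. The function copyBounded() is a hypothetical wrapper, not part of the standard library.

```cpp
#include <cstddef>
#include <cstring>

// Copies source into destination without writing more than destinationSize
// bytes, and always leaves destination null-terminated.
void copyBounded(char destination[], std::size_t destinationSize,
    const char source[])
{
    if (destinationSize == 0)
    {
        return;   // no room for even the null character
    }
    std::strncpy(destination, source, destinationSize - 1);
    destination[destinationSize - 1] = '\0';   // strncpy() may not add this
}
```

Copying "Free the bound periodicals!" into a five-byte buffer with this function leaves "Free" in the buffer instead of writing past its end.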

Summary

One thing that makes software so useful is the ability to process large amounts of similar data. Arrays are collections of data that share the same data type. This hour demonstrated them with only the simple data types, but you learn in upcoming hours that arrays can be put to use on more complex forms of data.

Although strings are just character arrays in C++, they’re commonly referred to as strings because they serve so many useful purposes. Strings can collect user input, present text, and store textual data from files, web documents, and other sources.

There are many other ways to represent data in C++ more sophisticated than simple data types and arrays.

Q&A

Q. What happens if I write to element 25 in a 24-member array?

A. You will write to other memory, with potentially disastrous effects on your program. Memory used by the program could be overwritten, and the software could run improperly. According to security experts, the most common software exploit used by malicious programmers is to write data past a buffer and use this error to execute new code. The new code often can do anything, such as altering or deleting files, granting system privileges to untrusted users, and replicating viruses.

Q. What is in an uninitialized array element?

A. An array element that has not been assigned a value holds whatever happens to be in memory at a given time. The results of using this element without assigning a value are unpredictable.

Q. Can I combine arrays?

A. Yes. With simple arrays you can use pointers to combine them into a new, larger array. With strings you can use some of the built-in functions, such as strcat, to combine strings.

Q. Why did the number 13 become associated with bad luck?

A. Thirteen has been getting bad press since the Middle Ages, partially from the observation that the presence of Judas made the Last Supper a table for 13.

Additionally, the Norse god Loki arrived at a party attended by 12 of his colleagues and ruined the proceedings, according to one myth.

Also, when ancient calendars in Sumer and Babylon ventured too far off the mark, a 13th month was added to bring things into line with the seasons, Ellis writes. This error affected planting schedules and crop yield, so month 13 was not welcome.

Thirteen is also one past a dozen, and 12 is considered to be a suitable number for a wide variety of things. So, 13 becomes one too many and represents transgression and discord.

The number 13 is so terrifying there’s a word to describe the fear of it: triskaidekaphobia.

Workshop

You just spent the past hour learning about arrays. Now is the time to answer a few questions and perform a couple of exercises to firm up your knowledge of them.

Quiz

1. What is the minimum array subscript for a particular array?

A. 0

B. 1

C. There is no minimum.

2. What happens if you try to store data beyond the maximum allowed array subscript?

A. The compiler reports an error.

B. The data is ignored.

C. The data is written in memory past the array.

3. What is another name for a character array that does not have an initial value?

A. A string

B. A buffer

C. A null character

Answers

1. A. All arrays start with zero. The last subscript is the size of the array minus 1, so the last element of an array declared as brains[50] is brains[49].

2. C. The data is written to the address right after the end of the array. It is difficult to tell what will happen. If you are lucky, the data is stored in an area of memory the computer doesn’t want you to access, and an error results. If you are unlucky, another variable is changed in some strange way that is difficult to debug.

3. B. A buffer can be used to store user input or any other character data.

Activities

1. Write a program that asks for a user’s first and last name and displays it as part of a sentence greeting the user.

2. Modify the WeightGoals program and add two new intermediate milestones for 90% and 95%.

To see solutions to these activities, visit this book’s website at http://cplusplus.cadenhead.org.
