Day 13. Managing Arrays and Strings

In lessons on previous days, you declared a single int, char, or other object. You often want to declare a collection of objects, such as 20 ints or a litter of Cats.

Today, you will learn

• What arrays are and how to declare them

• What strings are and how to use character arrays to make them

• The relationship between arrays and pointers

• How to use pointer arithmetic

• What linked lists are

What Is an Array?

An array is a sequential collection of data storage locations, each of which holds the same type of data. Each storage location is called an element of the array.

You declare an array by writing the type, followed by the array name and the subscript. The subscript is the number of elements in the array, surrounded by square brackets. For example,


long LongArray[25];

declares an array of 25 long integers, named LongArray. When the compiler sees this declaration, it sets aside enough memory to hold all 25 elements. If each long integer requires four bytes, this declaration sets aside 100 contiguous bytes of memory, as illustrated in Figure 13.1.

Figure 13.1. Declaring an array.

Image

Accessing Array Elements

You access an array element by referring to its offset from the beginning of the array. Array element offsets are counted from zero. Therefore, the first array element is referred to as arrayName[0]. In the LongArray example, LongArray[0] is the first array element, LongArray[1] the second, and so forth.

This can be somewhat confusing. The array SomeArray[3] has three elements. They are SomeArray[0], SomeArray[1], and SomeArray[2]. More generally, SomeArray[n] has n elements that are numbered SomeArray[0] through SomeArray[n-1]. Again, remember that this is because the index is an offset, so the first array element is 0 storage locations from the beginning of the array, the second is 1 storage location, and so on.

Therefore, LongArray[25] is numbered from LongArray[0] through LongArray[24]. Listing 13.1 shows how to declare an array of five integers and fill each with a value.

Note

Starting with today’s listings, the line numbers will start with zero. This is to help you remember that arrays in C++ start from zero!

Listing 13.1. Using an Integer Array

Image

Image


Value for myArray[0]:  3
Value for myArray[1]:  6
Value for myArray[2]:  9
Value for myArray[3]:  12
Value for myArray[4]:  15
0: 3
1: 6
2: 9
3: 12
4: 15

Image

Listing 13.1 creates an array, has you enter values for each element, and then prints the values to the console. In line 5, the array, called myArray, is declared and is of type integer. You can see that it is declared with five in the square brackets. This means that myArray can hold five integers. Each of these elements can be treated like an integer variable.

In line 7, a for loop is started that counts from zero through four. This is the proper set of offsets for a five-element array. The user is prompted for a value on line 9, and on line 10 the value is saved at the correct offset into the array.

Looking closer at line 10, you see that each element is accessed using the name of the array followed by square brackets with the offset in between. Each of these elements can then be treated like a variable of the array’s type.

The first value is saved at myArray[0], the second at myArray[1], and so forth. On lines 12 and 13, a second for loop prints each value to the console.

Note

Arrays count from zero, not from one. This is the cause of many bugs in programs written by C++ novices. Think of the index as the offset. The first element, such as ArrayName[0], is at the beginning of the array, so the offset is zero. Thus, whenever you use an array, remember that an array with 10 elements counts from ArrayName[0] to ArrayName[9]. ArrayName[10] is an error.

Writing Past the End of an Array

When you write a value to an element in an array, the compiler computes where to store the value based on the size of each element and the subscript. Suppose that you ask to write over the value at LongArray[5], which is the sixth element. The compiler multiplies the offset (5) by the size of each element—in this case, 4 bytes. It then moves that many bytes (20) from the beginning of the array and writes the new value at that location.

If you ask to write at LongArray[50], most compilers ignore the fact that no such element exists. Rather, the compiler computes how far past the first element it should look (200 bytes) and then writes over whatever is at that location. This can be virtually any data, and writing your new value there might have unpredictable results. If you’re lucky, your program will crash immediately. If you’re unlucky, you’ll get strange results much later in your program, and you’ll have a difficult time figuring out what went wrong.

The compiler is like a blind man pacing off the distance from a house. He starts out at the first house, MainStreet[0]. When you ask him to go to the sixth house on Main Street, he says to himself, “I must go five more houses. Each house is four big paces. I must go an additional 20 steps.” If you ask him to go to MainStreet[100] and Main Street is only 25 houses long, he paces off 400 steps. Long before he gets there, he will, no doubt, step in front of a truck. So be careful where you send him.

Listing 13.2 writes past the end of an array. You should compile this listing to see what error and warning messages you get. If you don’t get any, you should be extra careful when working with arrays!

Caution

Do not run this program; it might crash your system!

Listing 13.2. Writing Past the End of an Array

Image

Image


Test 1:
TargetArray[0]: 10
TargetArray[24]: 10

SentinelOne[0]: 0
SentinelTwo[0]: 0
SentinelOne[1]: 0
SentinelTwo[1]: 0
SentinelOne[2]: 0
SentinelTwo[2]: 0

Assigning...
Test 2:
TargetArray[0]: 20
TargetArray[24]: 20
TargetArray[25]: 20

SentinelOne[0]: 20
SentinelTwo[0]: 0
SentinelOne[1]: 0
SentinelTwo[1]: 0
SentinelOne[2]: 0
SentinelTwo[2]: 0

Image

Lines 8 and 10 declare two arrays of three long integers that act as sentinels around TargetArray. These sentinel arrays are initialized with the value 0 on lines 12–16. Because these are declared before and after TargetArray, there is a good chance that they will be placed in memory just before and just after it. If memory is written to beyond the end of TargetArray, it is the sentinels that are likely to be changed rather than some unknown area of data. Some compilers count down in memory; others count up. For this reason, the sentinels are placed both before and after TargetArray.

Lines 20–30 confirm the sentinel values are okay by printing them as well as the first and last elements of TargetArray. On line 34, TargetArray’s members are all reassigned from the initial value of 10 to the new value of 20. Line 34, however, counts to TargetArray offset 25, which doesn’t exist in TargetArray.

Lines 37–39 print TargetArray’s values again as a second test to see what the values are. Note that on line 39 TargetArray[25] is perfectly happy to print the value 20. However, when SentinelOne and SentinelTwo are printed, SentinelOne[0] reveals that its value has changed. This is because the memory that is 25 elements after TargetArray[0] is the same memory that is at SentinelOne[0]. When the nonexistent TargetArray[25] was accessed, what was actually accessed was SentinelOne[0].

Note

Note that because all compilers use memory differently, your results might vary. You might find that the sentinels did not get overwritten. If this is the case, try changing line 33 to assign yet another value—change the 25 to 26. This increases the likelihood that you’ll overwrite a sentinel. Of course, you might overwrite something else or crash your system instead.

This nasty bug can be very hard to find, because SentinelOne[0]’s value was changed in a part of the code that was not writing to SentinelOne at all.

Fence Post Errors

It is so common to write to one past the end of an array that this bug has its own name. It is called a fence post error. This refers to the problem in counting how many fence posts you need for a 10-foot fence if you need one post for every foot. Most people answer 10, but of course you need 11. Figure 13.2 makes this clear.

Figure 13.2. Fence post errors.

Image

This type of “off by one” counting can be the bane of any C++ programmer’s life. Over time, however, you’ll get used to the idea that a 25-element array counts only to element 24, and that everything counts from 0.

Note

Some programmers refer to ArrayName[0] as the zeroth element. Getting into this habit is a mistake. If ArrayName[0] is the zeroth element, what is ArrayName[1]? The oneth? If so, when you see ArrayName[24], will you realize that it is not the 24th element in the array, but rather the 25th? It is far less confusing to say that ArrayName[0] is at offset zero and is the first element.

Initializing Arrays

You can initialize a simple array of built-in types, such as integers and characters, when you first declare the array. After the array name, you put an equal sign (=) and a list of comma-separated values enclosed in braces. For example,


int IntegerArray[5] = { 10, 20, 30, 40, 50 };

declares IntegerArray to be an array of five integers. It assigns IntegerArray[0] the value 10, IntegerArray[1] the value 20, and so forth.

If you omit the size of the array, an array just big enough to hold the initialization is created. Therefore, if you write


int IntegerArray[] = { 10, 20, 30, 40, 50 };

you will create the same array as you did in the previous example, an array that holds five elements.

You cannot initialize more elements than you’ve declared for the array. Therefore,


int IntegerArray[5] = { 10, 20, 30, 40, 50, 60};

generates a compiler error because you’ve declared a five-member array and initialized six values. It is legal, however, to write


int IntegerArray[5] = {10, 20};

In this case, you have declared a five-element array and only initialized the first two elements, IntegerArray[0] and IntegerArray[1].

Image

Declaring Arrays

The code in Listing 13.2 uses “magic numbers” such as 3 for the size of the sentinel arrays and 25 for the size of TargetArray. It is safer to use constants so that you can change all these values in one place.

Arrays can have any legal variable name, but they cannot have the same name as another variable or array within their scope. Therefore, you cannot have an array named myCats[5] and a variable named myCats at the same time.

When declaring the number of elements, you are not limited to literals; you can also use a constant or an enumeration. It is actually better to use these rather than a literal number because doing so gives you a single location to control the number of elements. In Listing 13.2, literal numbers were used. If you want to change TargetArray so that it holds only 20 elements instead of 25, you have to change several lines of code. If you had used a constant, you would have to change only the value of that constant.

Creating the number of elements, or dimension size, with an enumeration is a little different. Listing 13.3 illustrates this by creating an array that holds values—one for each day of the week.

Listing 13.3. Using consts and enums in Arrays


0:    // Listing 13.3
1:    // Dimensioning arrays with consts and enumerations
2:
3:    #include <iostream>
4:    int main()
5:    {
6:        enum WeekDays { Sun, Mon, Tue,
7:                        Wed, Thu, Fri, Sat, DaysInWeek };
8:        int ArrayWeek[DaysInWeek] = { 10, 20, 30, 40, 50, 60, 70 };
9:
10:       std::cout << "The value at Tuesday is: " << ArrayWeek[Tue];
11:       return 0;
12:   }



The value at Tuesday is: 30


Line 6 creates an enumeration called WeekDays. It has eight members. Sun is equal to 0, and DaysInWeek is equal to 7. On line 8, an array called ArrayWeek is declared to have DaysInWeek elements, which is 7.

Line 10 uses the enumerated constant Tue as an offset into the array. Because Tue evaluates to 2, the third element of the array, ArrayWeek[2], is returned and printed on line 10.

Arrays

To declare an array, write the type of object stored, followed by the name of the array and a subscript with the number of objects to be held in the array.

Example 1


int MyIntegerArray[90];

Example 2


long * ArrayOfPointersToLongs[100];

To access members of the array, use the subscript operator.

Example 1


// assign ninth member of MyIntegerArray to theNinthInteger
int theNinthInteger = MyIntegerArray[8];

Example 2


// assign ninth member of ArrayOfPointersToLongs to pLong.
long * pLong = ArrayOfPointersToLongs[8];

Arrays count from zero. An array of n items is numbered from 0 to n–1.

Using Arrays of Objects

Any object, whether built-in or user defined, can be stored in an array. When you declare the array to hold objects, you tell the compiler the type of object to store and the number for which to allocate room. The compiler knows how much room is needed for each object based on the class declaration. The class must have a default constructor that takes no arguments so that the objects can be created when the array is defined.

Accessing member data in an array of objects is a two-step process. You identify the member of the array by using the index operator ([ ]), and then you add the member operator (.) to access the particular member variable. Listing 13.4 demonstrates how you would create and access an array of five Cats.

Listing 13.4. Creating an Array of Objects

Image

Image


cat #1: 1
cat #2: 3
cat #3: 5
cat #4: 7
cat #5: 9

Image

Lines 5–17 declare the Cat class. The Cat class must have a default constructor so that Cat objects can be created in an array. In this case, the default constructor is declared and defined on line 8. For each Cat, a default age of 1 is set as well as a default weight of 5. Remember that if you create any other constructor, the compiler supplied default constructor is not created; you must create your own.

The first for loop (lines 23 and 24) sets values for the age of each of the five Cat objects in the array. The second for loop (lines 26–30) accesses each member of the array and calls GetAge() to display the age of each Cat object.

Each individual Cat’s GetAge() method is called by accessing the member in the array, Litter, followed by the dot operator (.), and the member function. You can access other members and methods in the exact same way.

Declaring Multidimensional Arrays

It is possible to have arrays of more than one dimension. Each dimension is represented as a subscript in the array. Therefore, a two-dimensional array has two subscripts; a three-dimensional array has three subscripts; and so on. Arrays can have any number of dimensions, although it is likely that most of the arrays you create will be of one or two dimensions.

A good example of a two-dimensional array is a chess board. One dimension represents the eight rows; the other dimension represents the eight columns. Figure 13.3 illustrates this idea.

Figure 13.3. A chess board and a two-dimensional array.

Image

Suppose that you have a class named SQUARE. The declaration of an array named Board that represents it would be


SQUARE Board[8][8];

You could also represent the same data with a one-dimensional, 64-square array. For example:


SQUARE Board[64];

This, however, doesn’t correspond as closely to the real-world object as the two-dimensional array does. When the game begins, the king is located in the fourth position in the first row; that position corresponds to


Board[0][3];

assuming that the first subscript corresponds to row and the second to column.

Initializing Multidimensional Arrays

You can initialize multidimensional arrays. You assign the list of values to array elements in order, with the last array subscript (the one farthest to the right) changing while each of the former holds steady. Therefore, if you have an array


int theArray[5][3];

the first three elements go into theArray[0]; the next three into theArray[1]; and so forth.

You initialize this array by writing


int theArray[5][3] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15 };

For the sake of clarity, you could group the initializations with braces. For example:


int theArray[5][3] = {  {1,2,3},
    {4,5,6},
    {7,8,9},
    {10,11,12},
    {13,14,15} };

For a complete list of values like this one, the inner braces do not change the result, but they do make it easier to understand how the numbers are distributed.

When initializing elements of an array, each value must be separated by a comma, without regard to the braces. The entire initialization set must be within braces, and it must end with a semicolon.

Listing 13.5 creates a two-dimensional array. The first row holds the numbers from zero to four. The second row holds the double of each value in the first row.

Listing 13.5. Creating a Multidimensional Array


0:    // Listing 13.5 - Creating a Multidimensional Array
1:    #include <iostream>
2:    using namespace std;
3:
4:    int main()
5:    {
6:        int SomeArray[2][5] = { {0,1,2,3,4}, {0,2,4,6,8} };
7:        for (int i = 0; i < 2; i++)
8:        {
9:            for (int j = 0; j < 5; j++)
10:           {
11:               cout << "SomeArray[" << i << "][" << j << "]: ";
12:               cout << SomeArray[i][j] << endl;
13:           }
14:       }
15:       return 0;
16:   }



SomeArray[0][0]: 0
SomeArray[0][1]: 1
SomeArray[0][2]: 2
SomeArray[0][3]: 3
SomeArray[0][4]: 4
SomeArray[1][0]: 0
SomeArray[1][1]: 2
SomeArray[1][2]: 4
SomeArray[1][3]: 6
SomeArray[1][4]: 8


Line 6 declares SomeArray to be a two-dimensional array. The first dimension indicates that there will be two sets; the second dimension consists of five integers. This creates a 2×5 grid, as Figure 13.4 shows.

Figure 13.4. A 2×5 array.

Image

The values are based on the two sets of numbers. The first set is the original numbers; the second set is the doubled numbers. In this listing, the values are simply set, although they could be computed as well. Lines 7 and 9 create a nested for loop. The outer for loop (starting on line 7) ticks through each member of the first dimension, which is each of the two sets of integers. For every member in that dimension, the inner for loop (starting on line 9) ticks through each member of the second dimension. This is consistent with the printout. SomeArray[0][0] is followed by SomeArray[0][1]. The first dimension is incremented only after the second dimension has gone through all of its increments. Then counting for the second dimension starts over.

A Word About Memory

When you declare an array, you tell the compiler exactly how many objects you expect to store in it. The compiler sets aside memory for all the objects, even if you never use it. This isn’t a problem with arrays for which you have a good idea of how many objects you’ll need. For example, a chessboard has 64 squares, and cats have between 1 and 10 kittens. When you have no idea of how many objects you’ll need, however, you must use more advanced data structures.

This book looks at arrays of pointers, arrays built on the free store, and various other collections. You’ll see a few advanced data structures, but you can learn more in the book C++ Unleashed from Sams Publishing. You can also check out Appendix E, “A Look at Linked Lists.”

Two of the great things about programming are that there are always more things to learn and that there are always more books from which to learn them.

Building Arrays of Pointers

The arrays discussed so far store all their members on the stack. Usually, stack memory is more limited, whereas free store memory is much larger. It is possible to declare each object on the free store and then to store only a pointer to the object in the array. This dramatically reduces the amount of stack memory used. Listing 13.6 rewrites the array from Listing 13.4, but it stores all the objects on the free store. As an indication of the greater memory that this enables, the array is expanded from 5 to 500, and the name is changed from Litter to Family.

Listing 13.6. Storing an Array on the Free Store

Image

Image


Cat #1: 1
Cat #2: 3
Cat #3: 5
...
Cat #499: 997
Cat #500: 999

Image

The Cat object declared on lines 5–17 is identical to the Cat object declared in Listing 13.4. This time, however, the array declared on line 21 is named Family, and it is declared to hold 500 elements. More importantly, these 500 elements are pointers to Cat objects.

In the initial loop (lines 24–29), 500 new Cat objects are created on the free store, and each one has its age set to twice the index plus one. Therefore, the first Cat is set to 1, the second Cat to 3, the third Cat to 5, and so on. After the pointer is created, line 28 assigns the pointer to the array. Because the array has been declared to hold pointers, the pointer—rather than the dereferenced value in the pointer—is added to the array.

The second loop in lines 31–35 prints each of the values. On line 33, a number is printed to show which object is being printed. Because index offsets start at zero, line 33 adds 1 to display a count starting at 1 instead. On line 34, the pointer is accessed by using the index, Family[i]. That address is then used to access the GetAge() method.

In this example, the array Family and all its pointers are stored on the stack, but the 500 Cat objects that are created are stored on the free store.

A Look at Pointer Arithmetic—An Advanced Topic

On Day 8, “Understanding Pointers,” you initially learned about pointers. Before continuing with arrays, it is worth coming back to pointers to cover an advanced topic—pointer arithmetic.

There are a few things that can be done mathematically with pointers. Pointers can be subtracted, one from another. One powerful technique is to point two pointers at different elements in an array and to take their difference to see how many elements separate the two members. This can be very useful when parsing arrays of characters, as illustrated in Listing 13.7.

Listing 13.7. Illustrates How to Parse Out Words from a Character String

Image

Image

Image


Enter a string: this code first appeared in C++ Report
Got this word: this
Got this word: code
Got this word: first
Got this word: appeared
Got this word: in
Got this word: C
Got this word: Report

Image

This program allows the user to enter in a sentence. The program then breaks out each word (each set of alphanumeric characters) of the sentence. On line 15 is the prompt asking the user to enter a string—basically a sentence. This is fed to a method called GetWord() on line 18, along with a buffer to hold the first word and an integer variable called WordOffset, which is initialized on line 13 to zero.

GetWord() returns each word from the string until the end of the string is reached. As words are returned from GetWord(), they are printed on line 20 until GetWord() returns false.

Each call to GetWord() causes a jump to line 26. On line 28, a check is done to see if the value of string[wordOffset] is zero. This will be true if you are at or past the end of the string, at which time GetWord() will return false. cin.getline() makes sure the string entered is terminated with a null; that is, it ends with a zero-valued character, '\0'.

On line 31, two character pointers, p1 and p2, are declared, and on line 32, they are set to point into string offset by wordOffset. Initially, wordOffset is zero, so they point to the beginning of the string.

Lines 35 and 36 tick through the string, pushing p1 to the first alphanumeric character. Lines 39 and 40 ensure that an alphanumeric character is found. If not, false is returned.

p1 now points to the start of the next word, and line 44 sets p2 to point to the same position.

Lines 47 and 48 then cause p2 to march through the word, stopping at the first nonalphanumeric character. p2 is now pointing to the end of the word that p1 points to the beginning of. By subtracting p1 from p2 on line 53 and casting the result to an integer, you are able to establish the length of the word. That word is then copied into the buffer word using a string-copying method from the Standard Library, passing in as the starting point p1 and as the length the difference that you’ve established.

On line 59, a null value is appended to mark the end of the word. p2 is then incremented to point to the beginning of the next word, and the offset of that word is pushed into the integer reference wordOffset. Finally, true is returned to indicate that a word has been found.

This is a classic example of code that is best understood by putting it into a debugger and stepping through its execution.

Pointer arithmetic is used in a number of places in this listing. In this listing, you can see that by subtracting one pointer from another (as on line 53), you determine the number of elements between the two pointers. In addition, you saw on line 55 that incrementing a pointer shifts it to the next element within an array rather than just adding one. Using pointer arithmetic is very common when working with pointers and arrays, but it is also a dangerous activity and needs to be approached with respect.

Declaring Arrays on the Free Store

It is possible to put the entire array on the free store, also known as the heap. You do this by creating a pointer to the array. Create the pointer by calling new and using the subscript operator. The result is a pointer to an area on the free store that holds the array. For example,


Cat *Family = new Cat[500];

declares Family to be a pointer to the first element in an array of 500 Cats. In other words, Family points to—or has the address of—Family[0].

The advantage of using Family in this way is that you can use pointer arithmetic to access each member of Family. For example, you can write

Image

This declares a new array of 500 Cats and a pointer to point to the start of the array. Using that pointer, the first Cat’s SetAge() function is called with a value of 10. The pointer is then incremented. This causes the pointer to be incremented to point to the next Cat object in the array. The second Cat’s SetAge() method is then called with a value of 20.

A Pointer to an Array Versus an Array of Pointers

Examine the following three declarations:


Cat   FamilyOne[500];
Cat * FamilyTwo[500];
Cat * FamilyThree = new Cat[500];

FamilyOne is an array of 500 Cat objects. FamilyTwo is an array of 500 pointers to Cat objects. FamilyThree is a pointer to an array of 500 Cat objects.

The differences among these three code lines dramatically affect how these arrays operate. What is perhaps even more surprising is that FamilyThree is a variant of FamilyOne, but it is very different from FamilyTwo.

This raises the thorny issue of how pointers relate to arrays. In the third case, FamilyThree is a pointer to an array. That is, the address in FamilyThree is the address of the first item in that array. This is exactly the case for FamilyOne.

Pointers and Array Names

In C++, an array name used in an expression is treated, in most contexts, as a constant pointer to the first element of the array. Therefore, in the declaration


Cat Family[500];

Family is a pointer to Family[0]; that is, its value is &Family[0], the address of the first element of the array Family.

It is legal to use array names as constant pointers, and vice versa. Therefore, *(Family + 4) is a legitimate way of accessing the data at Family[4].

The compiler does all the arithmetic when you add to, increment, and decrement pointers. The address accessed when you write Family + 4 isn’t four bytes past the address of Family—it is four objects. If each object is four bytes long, Family + 4 is 16 bytes past the start of the array. If each object is a Cat that has four long member variables of four bytes each and two short member variables of two bytes each, each Cat is 20 bytes, and Family + 4 is 80 bytes past the start of the array.

Listing 13.8 illustrates declaring and using an array on the free store.

Listing 13.8. Creating an Array by Using new

Image

Image


Cat #1: 1
Cat #2: 3
Cat #3: 5
...
Cat #499: 997
Cat #500: 999

Image

Line 25 declares Family, which is a pointer to an array of 500 Cat objects. The entire array is created on the free store with the call to new Cat[500].

On line 30, you can see that the pointer you declared can be used with the index operator [], and thus be treated just like a regular array. On line 36, you see that it is once again used to call the GetAge() method. For all practical purposes, you can treat this pointer to the Family array as an array name. The one thing you will need to do, however, is to free the memory you allocated in setting up the array. This is done on line 39 with a call to delete.

Deleting Arrays on the Free Store

What happens to the memory allocated for these Cat objects when the array is destroyed? Is there a chance of a memory leak?

Deleting Family automatically returns all the memory set aside for the array if you use delete with the [] operator. When you include the square brackets, the compiler knows to destroy each object in the array and to return its memory to the free store.

To see this, change the size of the array from 500 to 10 on lines 25, 28, and 33. Then uncomment the cout statement on line 20. When line 39 is reached and the array is destroyed, each Cat object destructor is called.

When you create an item on the heap by using new, you always delete that item and free its memory with delete. Similarly, when you create an array by using new <class>[size], you delete that array and free all its memory with delete[]. The brackets signal the compiler that this array is being deleted.

If you leave the brackets off, the behavior is officially undefined; with many compilers, only the first object in the array is destroyed. You can try this yourself by removing the brackets on line 39. If you edited line 20 so that the destructor prints, you will probably see only one Cat object destroyed. Congratulations! You just created a memory leak.

Resizing Arrays at Runtime

The biggest advantage of being able to allocate arrays on the heap is that you determine the size of the array at runtime and then allocate it. For instance, if you asked the user to enter the size of a family into a variable called SizeOfFamily, you could then declare a Cat array as follows:


Cat *pFamily = new Cat[SizeOfFamily];

With that, you now have a pointer to an array of Cat objects. You can then create a pointer to the first element and loop through this array using a pointer and pointer arithmetic:


Cat *pCurrentCat = pFamily;   // same address as &pFamily[0]
for (int Index = 0; Index < SizeOfFamily; Index++, pCurrentCat++)
{
    pCurrentCat->SetAge(Index);
}

Because C++ views arrays as no more than special cases of pointers, you can skip the second pointer and simply use standard array indexing:


for (int Index = 0; Index < SizeOfFamily; Index++)
{
    pFamily[Index].SetAge(Index);
}

The use of the subscript brackets automatically dereferences the resulting pointer and the compiler causes the appropriate pointer arithmetic to be performed.

A further advantage is that you can use a similar technique to resize an array at runtime when you run out of room. Listing 13.9 illustrates this reallocation.

Listing 13.9. Reallocating an Array at Runtime

Image

Image


Next number = 10

Next number = 20

Next number = 30

Next number = 40

Next number = 50

Next number = 60

Next number = 70

Next number = 0
10
20
30
40
50
60
70

Image

In this example, numbers are entered one after the other and stored in an array. When a number less than or equal to 0 is entered, the array of numbers that has been gathered is printed.

Looking closer, you can see on lines 6–9 that a number of variables are declared. More specifically, the initial size of the array is set at 5 on line 6 and then the array is allocated on line 7 and its address is assigned to pArrayOfNumbers.

Lines 12–13 get the first number from the user and place it into the variable, InputNumber. On line 15, if the number entered is greater than zero, processing occurs. If not, the program jumps to line 38.

On line 17, InputNumber is put into the array. This is safe the first time in because you know you have room at this point. On line 19, a check is done to see if this is the last element that the array has room for. If there is room, control passes to line 35; otherwise, the body of the if statement is processed in order to increase the size of the array (lines 20–34).

A new array is created on line 21. This array is created to hold five more elements (AllocationSize) than the current array. Lines 24–29 then copy the old array to the new array using array notation (you could also use pointer arithmetic).

Line 31 deletes the old array and line 32 then replaces the old pointer with the pointer to the larger array. Line 33 increases the MaximumElementsAllowed to match the new larger size.

Lines 39–42 display the resulting array.
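The grow-and-copy technique described above can be sketched as follows. This is not the text of Listing 13.9: the helper names Append() and SumAfterAppending() are hypothetical, and this version grows the array just before a new element would not fit, rather than right after the last free slot is filled.

```cpp
// Hypothetical helper: appends value to a heap array, first growing the
// array by allocationSize elements if it is already full.
void Append(int *&pArray, int &count, int &capacity, int value,
            int allocationSize = 5)
{
    if (count == capacity)
    {
        // Allocate a larger array and copy the old elements across.
        int *pLarger = new int[capacity + allocationSize];
        for (int i = 0; i < count; i++)
            pLarger[i] = pArray[i];

        delete [] pArray;          // release the old array
        pArray = pLarger;          // the caller's pointer now refers to the new one
        capacity += allocationSize;
    }
    pArray[count++] = value;
}

// Stores 10, 20, ... (howMany values) starting from room for only five,
// then returns their sum.
int SumAfterAppending(int howMany)
{
    int capacity = 5;
    int count = 0;
    int *pArrayOfNumbers = new int[capacity];

    for (int i = 1; i <= howMany; i++)
        Append(pArrayOfNumbers, count, capacity, i * 10);

    int sum = 0;
    for (int i = 0; i < count; i++)
        sum += pArrayOfNumbers[i];

    delete [] pArrayOfNumbers;
    return sum;
}
```

Passing the pointer by reference (int *&) lets the helper replace the caller's pointer after the larger array has been allocated. Growing in steps of five mirrors the AllocationSize idea from the analysis; a larger step means fewer copies at the cost of more unused space.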

char Arrays and Strings

There is a type of array that gets special attention. This is an array of characters that is terminated by a null. This array is considered a “C-style string.” The only C-style strings you’ve seen until now have been unnamed C-style string constants used in cout statements, such as


cout << "hello world";

You can declare and initialize a C-style string the same as you would any other array. For example:


char Greeting[] =
{ 'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', '\0' };

In this case, Greeting is declared as an array of characters and it is initialized with a number of characters. The last character, '\0', is the null character, which many C++ functions recognize as the terminator for a C-style string. Although this character-by-character approach works, it is difficult to type and admits too many opportunities for error. C++ enables you to use a shorthand form of the previous line of code. It is


char Greeting[] = "Hello World";

You should note two things about this syntax:

• Instead of single-quoted characters separated by commas and surrounded by braces, you have a double-quoted C-style string, no commas, and no braces.

• You don’t need to add the null character because the compiler adds it for you.

When you declare a string, you need to ensure that you make it as large as you will need. The storage a C-style string requires is the number of characters in it plus one for the null character. For example, Hello World needs 12 bytes: Hello is 5 bytes, the space is 1 byte, World is 5 bytes, and the null character is 1 byte.
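The byte count can be checked with a short sketch that uses strlen() and sizeof. The three function names are hypothetical and exist only for illustration.

```cpp
#include <cstring>

// Number of visible characters: strlen() stops at the null and does not count it.
std::size_t VisibleLength(const char *text)
{
    return std::strlen(text);
}

// Bytes needed to store the string, including the terminating null.
std::size_t BytesNeeded(const char *text)
{
    return std::strlen(text) + 1;
}

// sizeof applied to the array itself reports all the storage the compiler
// set aside, which includes the null it added.
std::size_t GreetingStorage()
{
    char Greeting[] = "Hello World";
    return sizeof(Greeting);
}
```

For "Hello World", VisibleLength() returns 11, while BytesNeeded() and GreetingStorage() both return 12. Note that sizeof gives the array size only when it is applied to the array itself, not to a pointer to it.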

You can also create uninitialized character arrays. As with all arrays, it is important to ensure that you don’t put more into it than there is room for. Listing 13.10 demonstrates the use of an uninitialized buffer.

Listing 13.10. Filling an Array


0:    //Listing 13.10 char array buffers
1:  
2:    #include <iostream>
3:  
4:    int main()
5:    {
6:        char buffer[80];
7:        std::cout << "Enter the string: ";
8:        std::cin >> buffer;
9:        std::cout << "Here's the buffer: " << buffer << std::endl;
10:        return 0;
11:    }


Enter the string: Hello World
Here's the buffer: Hello

On line 6, a character array is created to act as a buffer to hold 80 characters. This is large enough to hold a 79-character C-style string and a terminating null character.

On line 7, the user is prompted to enter a C-style string, which is entered into the buffer on line 8. cin writes a terminating null to the buffer after it writes the string.

Two problems occur with the program in Listing 13.10. First, if the user enters more than 79 characters, cin writes past the end of the buffer. Second, if the user enters a space, cin thinks that it is the end of the string, and it stops writing to the buffer.

To solve these problems, you must call a special method on cin called get(). cin.get() takes three parameters:

• The buffer to fill

• The maximum number of characters to get, counting the terminating null

• The delimiter that terminates input

The delimiter defaults to a newline character. Listing 13.11 illustrates the use of get().

Listing 13.11. Filling an Array With a Maximum Number of Characters.


0:    //Listing 13.11 using cin.get()
1:
2:    #include <iostream>
3:    using namespace std;
4:
5:    int main()
6:    {
7:        char buffer[80];
8:        cout << "Enter the string: ";
9:        cin.get(buffer, 79);       // stores up to 78 characters, or stops at newline
10:        cout << "Here's the buffer:  " << buffer << endl;
11:        return 0;
12:    }


Enter the string: Hello World
Here's the buffer:  Hello World

Line 9 calls the method get() of cin. The buffer declared on line 7 is passed in as the first argument. The second argument is the size limit: get() stores at most one fewer than this number of characters and then appends the terminating null. The limit must therefore be no greater than 80, the size of the buffer; with the value 79 used here, at most 78 characters are stored. No need exists to provide a terminating character because the default value of newline is sufficient.

If you enter spaces, tabs, or other whitespace characters, they are assigned to the string. A newline character ends the input. Reaching the size limit also ends the input. You can verify this by rerunning the listing and trying to enter a string longer than 79 characters.
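The way get() treats its size limit can be tried without typing long lines by reading from a string stream instead of cin. In this sketch the function ReadWithGet() is hypothetical, and std::istringstream stands in for keyboard input. The limit passed in is assumed to be no greater than 80, the size of the buffer.

```cpp
#include <sstream>
#include <string>

// Reads up to the first newline from text into an 80-byte buffer using get()
// and returns what was stored. limit must not exceed the buffer size of 80.
std::string ReadWithGet(const std::string &text, int limit)
{
    char buffer[80];
    buffer[0] = '\0';                  // start with an empty string

    std::istringstream input(text);    // stands in for cin
    input.get(buffer, limit);          // stores at most limit-1 characters, then a null

    return std::string(buffer);
}
```

With the input "Hello World" followed by a newline and more text, the function returns Hello World, spaces included. With a 200-character line it returns 79 characters when the limit is 80 and 78 characters when the limit is 79, because one position is always reserved for the terminating null.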

Using the strcpy() and strncpy() Functions

A number of existing functions are available in the C++ library for dealing with strings. C++ inherits many of these functions for dealing with C-style strings from the C language. Among the many functions provided are two for copying one string into another: strcpy() and strncpy(). strcpy() copies the entire contents of one string into a designated buffer. The other, strncpy(), copies up to a specified number of characters from one string to another. Listing 13.12 demonstrates the use of strcpy().

Listing 13.12. Using strcpy()


0:    //Listing 13.12 Using strcpy()
1:  
2:    #include <iostream>
3:    #include <cstring>
4:    using namespace std;
5:  
6:    int main()
7:    {
8:       char String1[] = "No man is an island";
9:       char String2[80];
10:  
11:       strcpy(String2,String1);
12:  
13:       cout << "String1: " << String1 << endl;
14:       cout << "String2: " << String2 << endl;
15:       return 0;
16:    }


String1: No man is an island
String2: No man is an island

This listing is relatively simple. It copies data from one string into another. The header file cstring is included on line 3. This file contains the prototype of the strcpy() function. strcpy() takes two character arrays—a destination followed by a source. On line 11, this function is used to copy String1 into String2.

You have to be careful using the strcpy() function. If the source is larger than the destination, strcpy() overwrites past the end of the buffer. To protect against this, the Standard Library also includes strncpy(). This variation takes a maximum number of characters to copy. strncpy() copies up to the first null character or the maximum number of characters specified into the destination buffer. Listing 13.13 illustrates the use of strncpy().

Listing 13.13. Using strncpy()


0:    //Listing 13.13 Using strncpy()
1:  
2:    #include <iostream>
3:    #include <cstring>
4:  
5:    int main()
6:    {
7:        const int MaxLength = 80;
8:        char String1[] = "No man is an island";
9:        char String2[MaxLength+1];
10:  
11:        strncpy(String2,String1,MaxLength);
12:  
13:        std::cout << "String1: " << String1 << std::endl;
14:        std::cout << "String2: " << String2 << std::endl;
15:        return 0;
16:    }


String1: No man is an island
String2: No man is an island

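Once again, a simple listing is presented. Like the preceding listing, this one simply copies data from one string into another. On line 11, the call to strcpy() has been changed to a call to strncpy(), which takes a third parameter: the maximum number of characters to copy. The buffer String2 is declared to take MaxLength+1 characters. The extra character leaves room for the terminating null. strcpy() always writes that null, but strncpy() writes it only when the source is shorter than the maximum count, in which case it pads the rest of the count with nulls. Here String1 is shorter than MaxLength, so String2 is terminated; if the source could be longer, you would have to set String2[MaxLength] to the null character yourself.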
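The case in which the source is longer than the destination is worth seeing once. The sketch below uses a deliberately small buffer; the function CopyTruncated() and the size 8 are hypothetical choices made for illustration.

```cpp
#include <cstring>
#include <string>

// Copies source into a small fixed buffer with strncpy() and returns the result.
std::string CopyTruncated(const char *source)
{
    const int MaxLength = 8;               // illustrative size, kept small on purpose
    char destination[MaxLength + 1];       // one extra element for the null

    std::strncpy(destination, source, MaxLength);
    destination[MaxLength] = '\0';         // strncpy() adds no null when source is too long

    return std::string(destination);
}
```

Given "No man is an island", the function returns the first eight characters, No man i. Given "Hi", it returns Hi unchanged, because strncpy() stops at the source's null and fills the rest of the count with nulls. Writing the null at destination[MaxLength] is harmless in the short case and essential in the long one.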

Note

As with the integer array shown in Listing 13.9, character arrays can be resized using heap allocation techniques and element-by-element copying. Most flexible string classes provided to C++ programmers use some variation on that technique to allow strings to grow and shrink or to insert or delete elements from the middle of the string.

String Classes

C++ inherited the null-terminated C-style string and the library of functions that includes strcpy() from C, but these functions aren’t integrated into an object-oriented framework. The Standard Library includes a string class, std::string, that provides an encapsulated set of data and functions for manipulating that data, as well as accessor functions so that the data itself is hidden from the clients of the class.

Before using this class, you will create a custom String class as an exercise in understanding the issues involved. At a minimum, your String class should overcome the basic limitations of character arrays.

Like all arrays, character arrays are fixed in size. You define how large they are, and they always take up that much room in memory, even if you don’t need it all. Writing past the end of the array is disastrous.

A good String class allocates only as much memory as it needs and always enough to hold whatever it is given. If it can’t allocate enough memory, it should fail gracefully.

Listing 13.14 provides a first approximation of a String class.

Note

This custom String class is quite limited and is by no means complete, robust, or ready for commercial use. That is fine, however, as the Standard Library does provide a complete and robust String class.

Listing 13.14. Using a String class

[Listing images not reproduced]


S1:     initial test
S1:     Hello World
tempTwo:        ; nice to be here!
S1:     Hello World; nice to be here!
S1[4]:  o
S1:     Hellx World; nice to be here!
S1[999]:        !
S3:     Hellx World; nice to be here! Another string
S4:     Why does this work?

Your String class’s declaration is on lines 7–31. To add flexibility to the class, there are three constructors in lines 11–13: the default constructor, the copy constructor, and a constructor that takes an existing null-terminated (C-style) string.

To allow your users to manipulate strings easily, this String class overloads several operators including the offset operator ([ ]), operator plus (+), and operator plus-equals (+=). The offset operator is overloaded twice: once as a constant function returning a char and again as a nonconstant function returning a reference to a char.

The nonconstant version is used in statements such as


S1[4]='x';

as seen on line 161. This enables direct access to each of the characters in the string. A reference to the character is returned so that the calling function can manipulate it.

The constant version is used when a constant String object is being accessed, such as in the implementation of the copy constructor starting on line 63. Note that rhs[i] is accessed, yet rhs is declared as a const String &. It isn’t legal to access this object by using a nonconstant member function. Therefore, the offset operator must be overloaded with a constant accessor. If the object being returned were large, you might want to declare the return value to be a constant reference. However, because a char is only one byte, there would be no point in doing that.

The default constructor is implemented on lines 34–39. It creates a string whose length is 0. It is the convention of this String class to report its length not counting the terminating null. This default string contains only a terminating null.

The copy constructor is implemented on lines 63–70. This constructor sets the new string’s length to that of the existing string and allocates one extra character for the terminating null. It copies each character from the existing string to the new string, and it null-terminates the new string. Remember that, unlike assignment operators, copy constructors do not need to test whether the string being copied is the new object itself, because that can never happen.

Stepping back, you see in lines 53–60 the implementation of the constructor that takes an existing C-style string. This constructor is similar to the copy constructor. The length of the existing string is established by a call to the standard C string library function strlen().

On line 28, another constructor, String(unsigned short), is declared to be a private member function. It is the intent of the designer of this class that no client class ever create a String of arbitrary length. This constructor exists only to help in the internal creation of Strings as required, for example, by operator+=, on line 131. This is discussed in depth when operator+= is described later.

On lines 44–50, you can see that the String(unsigned short) constructor fills every member of its array with a null character ('\0'). Therefore, the for loop checks for i<=len rather than i<len.

The destructor, implemented on lines 73–77, deletes the character string maintained by the class. Be certain to include the brackets in the call to the delete operator so that every member of the array is deleted, instead of only the first.

The assignment operator is overloaded on lines 81–92. This method first checks to see whether the right-hand side of the assignment is the same as the left-hand side. If it isn’t, the current string is deleted, and the new string is created and copied into place. A reference is returned to facilitate stacked assignments such as


String1 = String2 = String3;

Another overloaded operator is the offset operator. This operator is overloaded twice, first on lines 97–103 and again on lines 107–113. Rudimentary bounds checking is performed both times. If the user attempts to access a character at a location beyond the end of the array, the last character—that is, len-1—is returned.
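The copy constructor, assignment operator, and the pair of offset operators described above can be sketched in a compact class. This is not the text of Listing 13.14: the class name MiniString and the private helper Clamp() are hypothetical. One small difference from the description is that any offset at or past the length returns the last character here.

```cpp
#include <cstring>

class MiniString
{
public:
    MiniString(const char *cString)                  // from a C-style string
        : itsLen(std::strlen(cString)), itsString(new char[itsLen + 1])
    {
        for (std::size_t i = 0; i < itsLen; i++)
            itsString[i] = cString[i];
        itsString[itsLen] = '\0';
    }

    MiniString(const MiniString &rhs)                // copy constructor
        : itsLen(rhs.itsLen), itsString(new char[itsLen + 1])
    {
        for (std::size_t i = 0; i < itsLen; i++)
            itsString[i] = rhs[i];                   // rhs is const: uses the const operator[]
        itsString[itsLen] = '\0';
    }

    ~MiniString() { delete [] itsString; }           // brackets: an array is deleted

    MiniString &operator=(const MiniString &rhs)
    {
        if (this == &rhs)                            // guard against self-assignment
            return *this;
        delete [] itsString;
        itsLen = rhs.itsLen;
        itsString = new char[itsLen + 1];
        for (std::size_t i = 0; i < itsLen; i++)
            itsString[i] = rhs[i];
        itsString[itsLen] = '\0';
        return *this;                                // allows a = b = c
    }

    // Constant version returns a copy; nonconstant version returns a reference.
    char operator[](std::size_t offset) const { return itsString[Clamp(offset)]; }
    char &operator[](std::size_t offset) { return itsString[Clamp(offset)]; }

    std::size_t GetLen() const { return itsLen; }    // length without the null
    const char *GetString() const { return itsString; }

private:
    // Rudimentary bounds check: out-of-range offsets map to the last character.
    std::size_t Clamp(std::size_t offset) const
    {
        if (offset < itsLen)
            return offset;
        return itsLen > 0 ? itsLen - 1 : 0;
    }

    std::size_t itsLen;       // declared before itsString so it is initialized first
    char *itsString;
};
```

With a MiniString holding Hello World, assigning 'x' through offset 4 yields Hellx World, and reading offset 999 returns the final d. Returning a reference from the nonconstant operator is what makes the assignment possible; the constant version is the only one the compiler will accept on a const MiniString.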

Lines 117–127 implement the overloading of the operator plus (+) as a concatenation operator. It is convenient to be able to write


String3 = String1 + String2;

and have String3 be the concatenation of the other two strings. To accomplish this, the operator plus function computes the combined length of the two strings and creates a temporary string temp. This invokes the private constructor, which takes an integer, and creates a string filled with nulls. The nulls are then replaced by the contents of the two strings. The left-hand side string (*this) is copied first, followed by the right-hand side string (rhs). The first for loop counts through the string on the left-hand side and adds each character to the new string. The second for loop counts through the right-hand side. Note that i continues to count the place for the new string, even as j counts into the rhs string.
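The two-loop copy at the heart of the concatenation can be shown on plain C-style strings. In this sketch the function Concatenate() is hypothetical, and it returns a std::string only so that the temporary heap buffer can be released before returning.

```cpp
#include <cstring>
#include <string>

// Builds a temporary heap buffer holding lhs followed by rhs and returns its contents.
std::string Concatenate(const char *lhs, const char *rhs)
{
    std::size_t lhsLen = std::strlen(lhs);
    std::size_t rhsLen = std::strlen(rhs);
    std::size_t totalLen = lhsLen + rhsLen;

    char *temp = new char[totalLen + 1];     // room for both strings and the null

    std::size_t i = 0;
    std::size_t j = 0;
    for (i = 0; i < lhsLen; i++)             // copy the left-hand side first
        temp[i] = lhs[i];
    for (j = 0; j < rhsLen; j++, i++)        // i keeps counting places in temp
        temp[i] = rhs[j];
    temp[totalLen] = '\0';

    std::string result(temp);
    delete [] temp;
    return result;
}
```

Concatenating "Hello World" and "; nice to be here!" produces Hello World; nice to be here!, matching the S1 line in the output. The second loop advances i and j together, which is the detail the analysis points out.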

On line 127, operator plus returns the string, temp, by value, which is assigned to the string on the left-hand side of the assignment (string1). On lines 131–143, operator += operates on the existing string—that is, the left-hand side of the statement string1 += string2. It works the same as operator plus, except that the temporary value, temp, is assigned to the current string (*this = temp) on line 142.

The main() function (lines 145–175) acts as a test driver program for this class. Line 147 creates a String object by using the constructor that takes a null-terminated C-style string. Line 148 prints its contents by using the accessor function GetString(). Line 150 creates a second C-style string, which is assigned on line 151 to the original string, s1. Line 152 prints the result of this assignment, thus showing that the overloading of the assignment operator truly does work.

Line 154 creates a third C-style string called tempTwo. Line 155 invokes strcpy() to fill the buffer with the characters ; nice to be here! Line 156 invokes the overloaded operator += in order to concatenate tempTwo onto the existing string s1. Line 158 prints the results.

On line 160, the fifth character in s1 is accessed using the overloaded offset operator. This value is printed. On line 161, a new value of 'x' is assigned to this character within the string. This invokes the nonconstant offset operator ([ ]). Line 162 prints the result, which shows that the actual value has, in fact, been changed.

Line 164 attempts to access a character beyond the end of the array. From the information printed, you can see that the last character of the array is returned, as designed.

Lines 166 and 167 create two more String objects, and line 168 calls the addition operator. Line 169 prints the results.

Line 171 creates a new String object, s4. Line 172 uses the overloaded assignment operator to assign a literal C-style string to s4. Line 173 prints the results. You might be thinking, “The assignment operator is defined to take a constant String reference on line 21, but here the program passes in a C-style string. Why is this legal?”

The answer is that the compiler expects a String, but it is given a character array. Therefore, it checks whether it can create a String from what it is given. On line 12, you declared a constructor that creates Strings from character arrays. The compiler creates a temporary String from the character array and passes it to the assignment operator. This is known as an implicit conversion (sometimes loosely called implicit casting). If you had not declared, and provided the implementation for, the constructor that takes a character array, this assignment would have generated a compiler error.
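The same conversion can be seen in a much smaller setting. In this sketch the class Label and the function LengthOf() are hypothetical; Label stores its text in a std::string only to keep the example short.

```cpp
#include <string>

class Label
{
public:
    Label(const char *text) : itsText(text) {}   // constructor the compiler can use to convert
    const std::string &GetText() const { return itsText; }
private:
    std::string itsText;
};

// Expects a Label, not a character array.
std::size_t LengthOf(const Label &label)
{
    return label.GetText().size();
}
```

Calling LengthOf("Why does this work?") compiles because the compiler builds a temporary Label from the character array and passes that; the call returns 19. Assigning a string literal to an existing Label works the same way, through a temporary and the compiler-generated assignment operator. Declaring the constructor with the explicit keyword would turn both into compile errors.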

In looking through Listing 13.14, you see that the String class that you’ve built is beginning to become pretty robust. You’ll also realize that it is a longer listing than what you’ve seen. Fortunately, the Standard C++ Library provides an even more robust string class, std::string, that you’ll be able to use by including the <string> header.

Linked Lists and Other Structures

Arrays are much like Tupperware. They are great containers, but they are of a fixed size. If you pick a container that is too large, you waste space in your storage area. If you pick one that is too small, its contents spill all over and you have a big mess.

One way to solve this problem is shown in Listing 13.9. However, when you start using large arrays or when you want to move, delete, or insert entries in the array, the number of allocations and deallocations can be expensive.

One way to solve such a problem is with a linked list. A linked list is a data structure that consists of small containers that are designed to link together as needed. The idea is to write a class that holds one object of your data—such as one Cat or one Rectangle—and that can point at the next container. You create one container for each object that you need to store, and you chain them together as needed.
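A minimal sketch of the idea follows, with one container per value and each container pointing at the next. The names Node, PushFront(), Count(), and DeleteAll() are hypothetical, and an int stands in for the Cat or Rectangle mentioned above.

```cpp
// Hypothetical container: holds one value and points at the next container.
struct Node
{
    int value;
    Node *pNext;
};

// Creates a new node in front of the current head and returns the new head.
Node *PushFront(Node *pHead, int value)
{
    Node *pNode = new Node;
    pNode->value = value;
    pNode->pNext = pHead;
    return pNode;
}

// Follows the chain of pNext pointers and counts the nodes.
int Count(const Node *pHead)
{
    int count = 0;
    for (const Node *p = pHead; p != nullptr; p = p->pNext)
        count++;
    return count;
}

// Deletes every node in the list.
void DeleteAll(Node *pHead)
{
    while (pHead != nullptr)
    {
        Node *pNext = pHead->pNext;   // remember the rest before deleting this node
        delete pHead;
        pHead = pNext;
    }
}
```

Starting from an empty list (a null head pointer), each call to PushFront() allocates exactly one node, so nothing is copied or reallocated as the list grows. The price is that reaching the nth element means following n links rather than computing an offset.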

Linked lists are considered an advanced level topic. More information can be found on them in Appendix E, “A Look at Linked Lists.”

Creating Array Classes

Writing your own array class has many advantages over using the built-in arrays. For starters, you can prevent array overruns. You might also consider making your array class dynamically sized: At creation, it might have only one member, growing as needed during the course of the program.
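As a rough illustration of preventing overruns, the sketch below wraps a heap array of ints and checks every index. The class SafeIntArray is hypothetical. It reports failure through a bool return value rather than overloading the index operator, and it does not support copying.

```cpp
class SafeIntArray
{
public:
    explicit SafeIntArray(int size)
        : itsSize(size > 0 ? size : 1), pData(new int[itsSize])
    {
        for (int i = 0; i < itsSize; i++)
            pData[i] = 0;                      // start with every element at zero
    }

    ~SafeIntArray() { delete [] pData; }

    int GetSize() const { return itsSize; }

    // Stores value at index; returns false and changes nothing if index is out of range.
    bool Set(int index, int value)
    {
        if (!InRange(index))
            return false;
        pData[index] = value;
        return true;
    }

    // Copies the element at index into value; returns false if index is out of range.
    bool Get(int index, int &value) const
    {
        if (!InRange(index))
            return false;
        value = pData[index];
        return true;
    }

private:
    // Declared but not implemented: copying is not supported in this sketch.
    SafeIntArray(const SafeIntArray &);
    SafeIntArray &operator=(const SafeIntArray &);

    bool InRange(int index) const { return index >= 0 && index < itsSize; }

    int itsSize;      // declared before pData so it is initialized first
    int *pData;
};
```

For an array created with SafeIntArray numbers(3), the call numbers.Set(2, 42) succeeds and numbers.Set(3, 7) returns false instead of writing past the end. A fuller class might overload the index operator and grow on demand, along the lines discussed in this section.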

You might also want to sort or otherwise order the members of the array. You might have a need for one or more of these powerful array variants:

Ordered collection—Each member is in sorted order.

Set—No member appears more than once.

Dictionary—This uses matched pairs in which one value acts as a key to retrieve the other value.

Sparse array—Indices are permitted for a large set, but only those values actually added to the array consume memory. Thus, you can ask for SparseArray[5] or SparseArray[200], but it is possible that memory is allocated only for a small number of entries.

Bag—An unordered collection that is added to and retrieved in indeterminate order.

By overloading the index operator ([ ]), you can turn a linked list into an ordered collection. By excluding duplicates, you can turn a collection into a set. If each object in the list has a pair of matched values, you can use a linked list to build a dictionary or a sparse array.

Note

Writing your own array class has many advantages over using the built-in arrays. Using the Standard Library implementations of similar classes usually has advantages over writing your own classes.

Summary

Today, you learned how to create arrays in C++. An array is a fixed-size collection of objects that are all the same type.

Arrays don’t do bounds checking. Therefore, it is legal—even if disastrous—to read or write past the end of an array. Arrays count from 0. A common mistake is to write to offset n of an array of n members.

Arrays can be one dimensional or multidimensional. In either case, the members of the array can be initialized, as long as the array contains either built-in types, such as int, or objects of a class that has a default constructor.

Arrays and their contents can be on the free store or on the stack. If you delete an array on the free store, remember to use the brackets in the call to delete.

Array names act as constant pointers to the first element of the array. Pointers and arrays use pointer arithmetic to find the next element of an array.

Strings are arrays of characters, or chars. C++ provides special features for managing char arrays, including the capability to initialize them with quoted strings.

Q&A

Q   What is in an uninitialized array element?

A   Whatever happens to be in memory at a given time. The results of using an uninitialized array member without assigning a value can be unpredictable. If the compiler is following the C++ standards, array elements that are static, nonlocal objects will be zero initialized.

Q   Can I combine arrays?

A   Yes. With simple arrays, you can use pointers to combine them into a new, larger array. With strings, you can use some of the built-in functions, such as strcat, to combine strings.

Q   Why should I create a linked list if an array will work?

A   An array must have a fixed size, whereas a linked list can be sized dynamically at runtime. Appendix E provides more information on creating linked lists.

Q   Why would I ever use built-in arrays if I can make a better array class?

A   Built-in arrays are quick and easy to use, and you generally need them to build your better array class.

Q   Is there a better construct to use than arrays?

A   On Day 19, “Templates,” you learn about templates as well as the Standard Template Library. This library contains templates for arrays that contain all the functionality you will generally need. Using these templates is a safer alternative to building your own.

Q   Must a string class use a char * to hold the contents of the string?

A   No. It can use any memory storage the designer thinks is best.

Workshop

The Workshop provides quiz questions to help you solidify your understanding of the material covered and exercises to provide you with experience in using what you’ve learned. Try to answer the quiz and exercise questions before checking the answers in Appendix D, and be certain you understand the answers before continuing to tomorrow’s lesson.

Quiz

1. What are the first and last elements in SomeArray[25]?

2. How do you declare a multidimensional array?

3. Initialize the members of an array declared as SomeArray[2][3][2].

4. How many elements are in the array SomeArray[10][5][20]?

5. How does a linked list differ from an array?

6. How many characters are stored in the string “Jesse knows C++”?

7. What is the last character in the string “Brad is a nice guy”?

Exercises

1. Declare a two-dimensional array that represents a tic-tac-toe game board.

2. Write the code that initializes all the elements in the array you created in Exercise 1 to the value 0.

3. Write a program that contains four arrays. Three of the arrays should contain your first name, middle initial, and last name. Use the string-copying function presented in today’s lesson to copy these strings together into the fourth array, full name.

4. BUG BUSTERS: What is wrong with this code fragment?


unsigned short SomeArray[5][4];
for (int i = 0; i<4; i++)
       for (int j = 0; j<5; j++)
            SomeArray[i][j] = i+j;

5. BUG BUSTERS: What is wrong with this code fragment?


unsigned short SomeArray[5][4];
for (int i = 0; i<=5; i++)
       for (int j = 0; j<=4; j++)
            SomeArray[i][j] = 0;
