Chapter 9. Taking a Second Look at C++ Pointers

In This Chapter

  • Performing arithmetic operations on character pointers

  • Examining the relationship between pointers and arrays

  • Increasing program performance

  • Extending pointer operations to different pointer types

  • Explaining the arguments to main() in our C++ program template

C++ allows the programmer to operate on pointer variables much as she would on simple types of variables. (The concept of pointer variables is introduced in Chapter 8.) How and why this is done along with its implications are the subjects of this chapter.

Defining Operations on Pointer Variables

Some of the same arithmetic operators I cover in Chapter 3 can be applied to pointer types. This section examines the implications of applying these operators both to pointers and to array types (I discuss arrays in Chapter 7). Table 9-1 lists the three fundamental operations that are defined on pointers. In Table 9-1, pointer, pointer1, and pointer2 are all of some pointer type, say char*; and offset is an integer, for example, long. C++ also supports the other operators related to addition and subtraction, such as ++ and +=, although they are not listed in Table 9-1.

Table 9-1. The Three Basic Operations Defined on Pointer Types

Operation             Result    Meaning
pointer + offset      pointer   Calculate the address of the object offset entries from pointer
pointer - offset      pointer   The opposite of addition
pointer2 - pointer1   offset    Calculate the number of entries between pointer2 and pointer1

The neighborhood memory model is useful to explain how pointer arithmetic works. Consider a city block in which all houses are numbered sequentially. The house at 123 Main Street has 122 Main Street on one side and 124 Main Street on the other.

Now it's pretty clear that the house four houses down from 123 Main Street must be 127 Main Street; thus, you can say 123 Main + 4 = 127 Main. Similarly, if I were to ask how many houses are there from 123 Main to 127 Main, the answer would be four — 127 Main - 123 Main = 4. (Just as an aside, a house is zero houses from itself: 123 Main - 123 Main = 0.)

But it makes no sense to ask how far away from 123 Main Street is 4 or what the sum of 123 Main and 127 Main is. In similar fashion, you can't add two addresses. Nor can you multiply an address, divide an address, square an address, or take the square root — you get the idea. You can perform any operation that can be converted to addition or subtraction. For example, if you increment a pointer to 123 Main Street, it now points to the house next door (at 124 Main, of course!).
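The three operations from Table 9-1 can be sketched in a few lines. The function names below are hypothetical, borrowed from the house analogy purely for illustration, and each function assumes that the addresses involved all lie within the same array.

```cpp
#include <cstddef>

// houseDown - pointer + offset yields the address of the
//             entry that lies offset elements further along
const char* houseDown(const char* pStart, long offset)
{
    return pStart + offset;
}

// houseUp - pointer - offset is the opposite of addition
const char* houseUp(const char* pStart, long offset)
{
    return pStart - offset;
}

// housesBetween - pointer2 - pointer1 yields the number of
//                 entries between the two addresses
std::ptrdiff_t housesBetween(const char* pFrom, const char* pTo)
{
    return pTo - pFrom;
}
```

Given a character array called street, housesBetween(street, houseDown(street, 4)) comes back as 4, just as 127 Main - 123 Main = 4.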

Reexamining arrays in light of pointer variables

Now return to the wonderful array for just a moment. Consider the case of an array of 32 1-byte characters called charArray. If the first byte of this array is stored at address 0x100, the array will extend over the range 0x100 through 0x11f. charArray[0] is located at address 0x100, charArray[1] is at 0x101, charArray[2] at 0x102, and so on.

After executing the expression

ptr = &charArray[0];

the pointer ptr contains the address 0x100. The addition of an integer offset to a pointer is defined such that the relationships shown in Table 9-2 are true. Table 9-2 also demonstrates why adding an offset n to ptr calculates the address of the nth element in charArray.

Table 9-2. Adding Offsets

Offset   Result      Is the Address of
+ 0      0x100       charArray[0]
+ 1      0x101       charArray[1]
+ 2      0x102       charArray[2]
...      ...         ...
+ n      0x100 + n   charArray[n]

The addition of an offset to a pointer is identical to applying an index to an array.

Thus, if

char* ptr = &charArray[0];

then

*(ptr + n) ← corresponds with → charArray[n]

Warning

Because * has higher precedence than addition, * ptr + n adds n to the character that ptr points to. The parentheses are needed to force the addition to occur before the indirection. The expression *(ptr + n) retrieves the character pointed at by the pointer ptr plus the offset n.

In fact, the correspondence between the two forms of expression is so strong that C++ considers array[n] nothing more than a simplified version of *(ptr + n), where ptr points to the first element in array.

array[n] -- C++ interprets as → *(&array[0] + n)

To complete the association, C++ takes a second shortcut. If given

char charArray[20];

charArray is defined as &charArray[0]. That is, the name of an array without a subscript present is the address of the first element of the array. Thus, you can further simplify the association to

array[n] -- C++ interprets as → *(array + n)

Tip

When used this way, charArray behaves like a value of type char* const; that is, "constant pointer to a character," since its address cannot be changed.
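The correspondence between indexing and pointer arithmetic, along with the precedence trap mentioned in the Warning, can be checked with a small sketch. The three function names are hypothetical and exist only to put the expressions side by side.

```cpp
// atIndex - fetch element n using an array index
char atIndex(const char charArray[], int n)
{
    return charArray[n];
}

// atOffset - fetch the same element using pointer arithmetic;
//            the parentheses force the addition to occur
//            before the indirection
char atOffset(const char* ptr, int n)
{
    return *(ptr + n);
}

// firstPlusN - without the parentheses, * is applied first, so
//              this adds n to the character that ptr points to
int firstPlusN(const char* ptr, int n)
{
    return *ptr + n;
}
```

For any valid index n, atIndex(charArray, n) and atOffset(charArray, n) return the same character, whereas firstPlusN(charArray, n) returns the value of charArray[0] plus n.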

Applying operators to the address of an array

The correspondence between indexing an array and pointer arithmetic is useful. For example, a displayArray() function used to display the contents of an array of integers can be written as follows:

// displayArray - display the members of an
//                array of length nSize
void displayArray(int intArray[], int nSize)
{
    cout << "The value of the array is:\n";

    for(int n = 0; n < nSize; n++)
    {
        cout << n << ": " << intArray[n] << "\n";
    }
    cout << endl;
}

This version uses the array operations with which you are familiar. A pointer version of the same appears as follows:

// displayArray - display the members of an
//                array of length nSize
void displayArray(int intArray[], int nSize)
{
    cout << "The value of the array is:\n";

    int* pArray = intArray;
    for(int n = 0; n < nSize; n++, pArray++)
    {
        cout << n << ": " << *pArray << "\n";
    }
    cout << endl;
}

The new displayArray() begins by creating a pointer to an integer pArray that points at the first element of intArray.

Note

The p in the variable name indicates that the variable is a pointer, but this is just a convention, not a part of the C++ language.

The function then loops through each element of the array. On each loop, displayArray() outputs the current integer (that is, the integer pointed at by pArray) before incrementing the pointer to the next entry in intArray. displayArray() can be tested using the following version of main():

int main(int nNumberofArgs, char* pszArgs[])
{
    int array[] = {4, 3, 2, 1};
    displayArray(array, 4);

    // wait until user is ready before terminating program
    // to allow the user to see the program results
    system("PAUSE");
    return 0;
}

The output from this program is

The value of the array is:
0: 4
1: 3
2: 2
3: 1

Press any key to continue...

You may think this pointer conversion is silly; however, the pointer version of displayArray() is actually more common than the array version among C++ programmers in the know. For some reason, C++ programmers don't seem to like arrays but they love pointer manipulation.

The use of pointers to access arrays is nowhere more common than in the accessing of character arrays.

Expanding pointer operations to a string

A null-terminated string is simply a character array whose last character is a null. C++ uses the null character at the end to serve as a terminator. This null-terminated array serves as a quasivariable type of its own. (See Chapter 7 for an explanation of null-terminated string arrays.) Often C++ programmers use character pointers to manipulate such strings. The following code examples compare this technique to the earlier technique of indexing in the array.

Character pointers enjoy the same relationship with a character array that any other pointer and array share. However, the fact that strings end in a terminating null makes them especially amenable to pointer-based manipulation, as shown in the following DisplayString program:

// DisplayString - display an array of characters using
//                 both a pointer and an array index
#include <cstdio>
#include <cstdlib>
#include <iostream>
using namespace std;

int main(int nNumberofArgs, char* pszArgs[])
{
    // declare a string
    const char* szString = "Randy";
    cout << "The array is '" << szString << "'" << endl;

    // display szString as an array
    cout << "Display the string as an array: ";
    for(int i = 0; i < 5; i++)
    {
      cout << szString[i];
    }
    cout << endl;

    // now using typical pointer arithmetic
    cout << "Display string using a pointer: ";
    const char* pszString = szString;
    while(*pszString)
    {
      cout << *pszString;
      pszString++;
    }
    cout << endl;

    // wait until user is ready before terminating program
    // to allow the user to see the program results
    system("PAUSE");
    return 0;
}

The program first makes its way through the array szString by indexing into the array of characters. The for loop chosen stops when the index reaches 5, the length of the string.

The second loop displays the same string using a pointer. The program sets the variable pszString equal to the address of the first character in the array. It then enters a loop that will continue until the char pointed at by pszString is equal to false — in other words, until the character is a null.

Note

The integer value 0 is interpreted as false — all other values are true.

The program outputs the character pointed at by pszString and then increments the pointer so that it points to the next character in the string before being returned to the top of the loop.

Tip

The dereference and increment can be (and usually are) combined into a single expression as follows:

cout << *pszString++;

The output of the program appears as follows:

The array is 'Randy'
Display the string as an array: Randy
Display string using a pointer: Randy
Press any key to continue...
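Because the terminating null marks the end of the string, a pointer-based loop does not need to know the length in advance the way the index-based loop above does. As a rough sketch, the hypothetical function countChars() below walks a pointer to the null and then uses pointer subtraction to report how far it traveled. (The standard library's strlen() does the same job; this version is only for illustration.)

```cpp
// countChars - count the characters in a null-terminated
//              string by walking a pointer to the null
int countChars(const char* pszString)
{
    const char* pszCurrent = pszString;
    while (*pszCurrent)     // stop at the terminating null
    {
        pszCurrent++;
    }

    // the difference between two pointers is the number of
    // characters between them
    return static_cast<int>(pszCurrent - pszString);
}
```

Calling countChars("Randy") returns 5, the same limit that was hard-coded into the for loop of DisplayString.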

Justifying pointer-based string manipulation

The sometimes-cryptic nature of pointer-based manipulation of character strings might lead the reader to wonder, "Why?" That is, what advantage does the char* pointer version have over the easier-to-read index version?

The answer is partially (pre-)historic and partially human nature. When C, the progenitor to C++, was invented, compilers were pretty simplistic. These compilers could not perform the complicated optimizations that modern compilers can. As complicated as it might appear to the human reader, a statement such as *pszString++ could be converted into an amazingly small number of machine-level instructions even by a stupid compiler.

Older computer processors were not very fast by today's standards. In the early days of C, saving a few computer instructions was a big deal. This gave C a big advantage over other languages of the day, notably Fortran, which did not offer pointer arithmetic.

In addition to the efficiency factor, programmers like to generate clever program statements. After C++ programmers learn how to write compact and cryptic but efficient statements, there is no getting them back to accessing arrays with indices.

Tip

Do not generate complex C++ expressions to create a more efficient program. There is no obvious relationship between the number of C++ statements and the number of machine instructions generated.

Applying operators to pointer types other than char

It is not too hard to convince yourself that szTarget + n points to szTarget[n] when szTarget is an array of chars. After all, a char occupies a single byte. If szTarget is stored at 0x100, szTarget[5] is located at 0x105.

It is not so obvious that pointer addition works in exactly the same way for an int array because an int takes 4 bytes for each char's 1 byte (at least it does on a 32-bit Intel processor). If the first element in intArray were located at 0x100, then intArray[5] would be located at 0x114 (0x100 + (5 * 4) = 0x114) and not 0x105.

Fortunately for us, array + n points at array[n] no matter how large a single element of array might be. C++ takes care of the element size for us — it's clever that way.

Once again, the dusty old house analogy works here as well. (I mean dusty analogy, not dusty house.) The third house down from 123 Main is 126 Main, no matter how large the building might be, even if it's a hotel.
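One way to watch C++ account for the element size is to measure the same distance twice: once in elements and once in bytes. The two function names below are hypothetical, and the sketch makes no assumption about how big an int is on your machine; it relies on sizeof instead.

```cpp
#include <cstddef>

// elementsBetween - distance between two int addresses,
//                   measured in int-sized elements
std::ptrdiff_t elementsBetween(const int* pFrom, const int* pTo)
{
    return pTo - pFrom;
}

// bytesBetween - distance between the same two addresses,
//                measured in bytes
std::ptrdiff_t bytesBetween(const int* pFrom, const int* pTo)
{
    // treating the addresses as char* makes the subtraction
    // count 1-byte chars rather than ints
    return reinterpret_cast<const char*>(pTo)
         - reinterpret_cast<const char*>(pFrom);
}
```

For an int array called intArray, elementsBetween(intArray, intArray + 5) is 5, while bytesBetween(intArray, intArray + 5) is 5 * sizeof(int), which works out to 20 when an int occupies 4 bytes.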

Contrasting a pointer with an array

There are some differences between an array and a pointer. For one, the array allocates space for the data, whereas the pointer does not, as shown here:

void arrayVsPointer()
{
    // allocate storage for 128 characters
    char charArray[128];

    // allocate space for a pointer but not for
    // the thing pointed at
    char* pArray;
}

Here charArray allocates room for 128 characters. pArray allocates only the storage required by a pointer: 4 bytes on a 32-bit system and typically 8 bytes on a 64-bit system.
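The sizeof operator makes the difference in storage visible. This is only a sketch with hypothetical function names; the size reported for the pointer depends on your compiler and processor.

```cpp
#include <cstddef>

// arrayStorage - storage set aside by a 128-character array
std::size_t arrayStorage()
{
    char charArray[128];
    return sizeof(charArray);   // room for all 128 elements
}

// pointerStorage - storage set aside by a pointer alone
std::size_t pointerStorage()
{
    char* pArray = nullptr;
    return sizeof(pArray);      // room for one address only,
                                // however much pArray may
                                // eventually point at
}
```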

Consider the following example:

char charArray[128];
charArray[10] = '0'; // this works fine

char* pArray;
pArray[10] = '0'; // this writes into random location

The expression pArray[10] is syntactically equivalent to charArray[10], but pArray has not been initialized so pArray[10] references some random (garbage) location in memory.

Tip

The mistake of referencing memory with an uninitialized pointer variable is generally caught by the CPU when the program executes, resulting in the dreaded segment violation error that from time to time issues from your favorite applications under your favorite, or not-so-favorite, operating system. This problem is not generally the fault of the processor or the operating system but of the application.

A second difference between a pointer and the address of an array is that charArray is a constant, whereas pArray is not. Thus, the following for loop used to initialize the array charArray does not work:

void arrayVsPointer()
{
    char charArray[10];
    for (int i = 0; i < 10; i++)
    {
        *charArray = '\0';     // this makes sense...
        charArray++;           // ...this does not
    }
}

The expression charArray++ makes no more sense than 10++. The following version is correct:

void arrayVsPointer()
{
    char charArray[10];
    char* pArray = charArray;
    for (int i = 0; i < 10; i++)
    {
        *pArray = '\0';    // this works great
        pArray++;
    }
}

When Is a Pointer Not?

C++ is completely quiet about what is and isn't a legal address, with one exception. C++ predefines the constant nullptr with the following properties:

  • It is a constant value.

  • It can be assigned to any pointer type.

  • It evaluates to false.

  • It is never a legal address.

The constant nullptr is used to indicate when a pointer has not been initialized. It is also often used to indicate the last element in an array of pointers in much the same way that a null character is used to terminate a character string.

Note

Actually the keyword nullptr was introduced in the 2011 standard (C++11). Before that, the constant 0 was used to indicate a null pointer.

It is a safe practice to initialize pointers to the nullptr (or 0 if your compiler doesn't support nullptr yet). You should also clear out the contents of a pointer to heap memory after you invoke delete to avoid deleting the same memory block twice:

delete pHeap;     // return memory to the heap
pHeap = nullptr;  // now clear out the pointer

Warning

Passing the same address to delete twice is undefined behavior and will usually cause your program to crash. Passing nullptr (or 0) to delete has no effect.
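The delete-then-clear habit can be wrapped in a small helper so that it is hard to forget. The following is a sketch, not a standard facility: releaseInt() and isInitialized() are hypothetical names, and the helper assumes that the pointer it is given either came from new or is nullptr.

```cpp
// releaseInt - return an int to the heap and clear out the
//              caller's pointer so that a second call is harmless
void releaseInt(int*& pHeap)
{
    delete pHeap;       // has no effect if pHeap is nullptr
    pHeap = nullptr;    // now clear out the pointer
}

// isInitialized - nullptr evaluates to false, so a pointer
//                 can be tested directly
bool isInitialized(const int* pInt)
{
    return pInt ? true : false;
}
```

Because releaseInt() leaves nullptr behind, calling it twice on the same pointer variable is safe: the second delete sees nullptr and does nothing.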

Declaring and Using Arrays of Pointers

If pointers can point to arrays, it seems only fitting that the reverse should be true. Arrays of pointers are a type of array of particular interest.

Just as arrays may contain other data types, an array may contain pointers. The following declares an array of pointers to ints:

int* pInts[10];

Given the preceding declaration, pInts[0] is a pointer to an int value. Thus, the following is true:

void fn()
{
    int n1;
    int* pInts[3];
    pInts[0] = &n1;
    *pInts[0] = 1;
}

or

void fn()
{
    int n1, n2, n3;
    int* pInts[3] = {&n1, &n2, &n3};
    for (int i = 0; i < 3; i++)
    {
        *pInts[i] = 0;
    }
}

or even

void fn()
{
     int* pInts[3] = {(new int),
                      (new int),
                      (new int)};
     for (int i = 0; i < 3; i++)
     {
        *pInts[i] = 0;
     }
}

The last example allocates three int objects off the heap. This type of declaration isn't used very often except in the case of an array of pointers to character strings. The following two examples show why arrays of character strings are useful.

Utilizing arrays of character strings

Suppose I need a function that returns the name of the month corresponding to an integer argument passed to it. For example, if the program is passed a 1, it returns a pointer to the string "January"; if 2, it reports "February", and so on. The month 0 and any numbers greater than 12 are assumed to be invalid. I could write the function as follows:

// int2month() - return the name of the month
const char* int2month(int nMonth)
{
    const char* pszReturnValue;

    switch(nMonth)
    {
        case 1: pszReturnValue = "January";
                break;
        case 2: pszReturnValue = "February";
                break;
        case 3: pszReturnValue = "March";
                break;
        // ...and so forth...
        default: pszReturnValue = "invalid";
     }
     return pszReturnValue;
}

Note

The switch() control command is like a sequence of if statements.

A more elegant solution uses the integer value for the month as an index into an array of pointers to the names of the months. In use, this appears as follows:

// define an array containing the names of the months
const char *const pszMonths[] = {"invalid",
                                 "January",
                                 "February",
                                 "March",
                                 "April",
                                 "May",
                                 "June",
                                 "July",
                                 "August",
                                 "September",
                                 "October",
                                 "November",
                                 "December"};

// int2month() - return the name of the month
const char* int2month(int nMonth)
{
    // first check for a value out of range
    if (nMonth < 1 || nMonth > 12)
    {
        return "invalid";
    }

    // nMonth is valid - return the name of the month
    return pszMonths[nMonth];
}

Here int2month() first checks to make sure that nMonth is a number between 1 and 12, inclusive (the default clause of the switch statement handled that in the previous example). If nMonth is valid, the function uses it as an offset into an array containing the names of the months.

Tip

This technique of referring to character strings by index is especially useful when writing your program to work in different languages. For example, a program may declare an array ptrMonths of pointers to the names of the Julian months in different languages. The program would initialize ptrMonths to the proper names, be they in English, French, or German (for example), at execution time. In that way, ptrMonths[1] points to the correct name of the first Julian month, irrespective of the language.

A program that demonstrates int2month() is included on the CD-ROM as DisplayMonths.

Accessing the arguments to main()

Now the truth can be told — what are all those funny argument declarations to main() in our program template? The second argument to main() is an array of pointers to null-terminated character strings. These strings contain the arguments to the program. The arguments to a program are the strings that appear with the program name when you launch it. These arguments are also known as parameters. The first argument to main() is the number of parameters passed to the program. For example, suppose that I entered the following command at the command prompt:

MyProgram file.txt /w

The operating system executes the program contained in the file MyProgram.exe, passing it the arguments file.txt and /w.

Consider the following simple program:

// PrintArgs - write the arguments to the program
//             to the standard output
#include <cstdio>
#include <cstdlib>
#include <iostream>
using namespace std;

int main(int nNumberofArgs, char* pszArgs[])
{
    // print a warning banner
    cout << "The arguments to "
         << pszArgs[0] << " are:\n";

    // now write out the remaining arguments
    for (int i = 1; i < nNumberofArgs; i++)
    {
        cout << i << ":" << pszArgs[i] << "\n";
    }

    // that's it
    cout << "That's it" << endl;

     // wait until user is ready before terminating program
     // to allow the user to see the program results
     system("PAUSE");
     return 0;
}

As always, the function main() accepts two arguments. The first argument is an int that I have been calling (quite descriptively, as it turns out) nNumberofArgs. This variable is the number of arguments passed to the program. The second argument is an array of pointers of type char* that I have been calling pszArgs.

Accessing program arguments DOS style

If I were to execute the PrintArgs program from the command prompt window as

PrintArgs arg1 arg2 arg3 /w

nNumberofArgs would be 5 (one for each argument, including the program name). The first argument is the name of the program itself. This could be anywhere from the simple "PrintArgs" to the slightly more complicated "PrintArgs.exe" to the full path; the C++ standard doesn't specify. The environment can even supply a null string "" if it doesn't have access to the name of the program.

The remaining elements in pszArgs point to the program arguments. For example, the element pszArgs[1] points to "arg1" and pszArgs[2] to "arg2". Because Windows does not place any significance on "/w", this string is also passed as an argument to be processed by the program.

Tip

Actually C++ includes one final value. The last value in the array, the one after the pointer to the last argument to the program, contains nullptr.
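That final nullptr means the argument array can be walked in the same way as a null-terminated string, without consulting the argument count at all. Here is a minimal sketch; countArgs() is a hypothetical name, and it assumes the array it receives really does end in nullptr, as the array passed to main() does.

```cpp
// countArgs - count the entries in an argument array by
//             walking it until the terminating nullptr
int countArgs(char* pszArgs[])
{
    int nCount = 0;
    for (char** ppszArg = pszArgs; *ppszArg != nullptr; ppszArg++)
    {
        nCount++;
    }
    return nCount;
}
```

Called from main() as countArgs(pszArgs), this should return the same value as nNumberofArgs.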

To demonstrate how argument passing works, you need to build the program from within Code::Blocks and then execute the program directly from a command prompt. First ensure that Code::Blocks has built an executable by opening the PrintArgs project and building it from the Build menu.

Next open a command prompt window. If you are running Unix or Linux, you're already there. If you are running Windows, open a command prompt from the Programs menu.

Now you need to use the CD command to navigate to the directory where Code::Blocks placed the PrintArgs program. If you used the default settings when installing Code::Blocks, that directory will be C:\CPP_Programs\Chap09\PrintArgs\bin\Debug.

You can now execute the program by typing its name followed by your arguments. The following shows what happened when I did it in Windows Vista:

Microsoft Windows [Version 6.0.6001]
Copyright (c) 2006 Microsoft Corporation.  All rights reserved.

C:\Users\Randy>cd \cpp_programs\chap09\printargs\bin\debug

C:\CPP_Programs\Chap09\PrintArgs\bin\Debug>PrintArgs arg1 arg2 arg3 /n
The arguments to PrintArgs are:
1:arg1
2:arg2
3:arg3
4:/n
That's it
Press any key to continue . . .

Wild cards such as *.* may or may not be expanded before being passed to the program — the standard is silent on this point. The Code::Blocks/gcc compiler included with this book does perform such expansion on Windows Vista, as the following example shows:

C:\CPP_Programs\Chap09\PrintArgs>bin\debug\PrintArgs *.*
The arguments to bin\debug\PrintArgs are:
1:bin
2:main.cpp
3:obj
4:PrintArgs.cbp
That's it
Press any key to continue . . .

Here you see the names of the files in the current directory in place of the *.* that I entered.

Tip

Wild-card expansion is performed under all forms of Unix and Linux as well. Wild-card expansion was specifically not performed under older versions of gcc and it isn't performed under Visual C++ Express.

Accessing program arguments Code::Blocks style

You can add arguments to your program when you execute it from Code::Blocks as well, using the Project menu.

Accessing program arguments Windows-style

Windows passes arguments as a means of communicating with your program as well. Try the following experiment. Build your program as you would normally. Find the executable file using Windows Explorer. As noted earlier, the default location for the PrintArgs program is C:\CPP_Programs\Chap09\PrintArgs\bin\Debug. Now grab a file and drop it onto the filename. (It doesn't matter what file you choose because the program won't hurt it anyway.) Bam! The PrintArgs program starts right up, and the name of the file that you dropped on the program appears.

Now try again, but drop several files at once. Select multiple filenames while pressing the Ctrl key or by using the Shift key to select a group. Now drag the lot of them onto PrintArgs.exe and let go. The name of each file appears as output.

I dropped a few of the files that appear in my Program Files\WinZip folder onto PrintArgs as an example:

The arguments to C:\CPP_Programs\Chap09\PrintArgs\bin\Debug\PrintArgs.exe are:
1:C:\Program Files\WinZip\VENDOR.TXT
2:C:\Program Files\WinZip\WHATSNEW.TXT
3:C:\Program Files\WinZip\WINZIP.CHM
4:C:\Program Files\WinZip\WINZIP.TXT
5:C:\Program Files\WinZip\WINZIP32.EXE
6:C:\Program Files\WinZip\WZ.COM
7:C:\Program Files\WinZip\WZ.PIF
8:C:\Program Files\WinZip\WZ32.DLL
9:C:\Program Files\WinZip\WZCAB.DLL
10:C:\Program Files\WinZip\WZCAB3.DLL
11:C:\Program Files\WinZip\FILE_ID.DIZ
12:C:\Program Files\WinZip\LICENSE.TXT
13:C:\Program Files\WinZip\ORDER.TXT
14:C:\Program Files\WinZip\README.TXT
That's it
Press any key to continue . . .

Notice that the name of each file appears as a single argument, even though the filename may include spaces. Also note that Windows passes the full path name of the file.
