17. Debugging and Optimization

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 17. Debugging and Optimization

Bloody instructions which, being learned, return to plague the inventor.

—Shakespeare, on debugging

The hardest part of a program is not the design and writing, but the debugging phase. It is here that you find out how your program really works (instead of how you think it works).

As programs grow larger and larger, finding bugs becomes more and more difficult. Also, the cost of errors is growing. Software bugs and poor design decisions have cost people billions of dollars the last few years alone. In this chapter we’ll go through techniques you can use to find and eliminate bugs.

Code Reviews

A code review is the process by which the programmer shows her code to her peers and they review it. Code reviews are the most effective way of making sure your code has a minimum number of bugs. Code reviews not only give you better code but also give you better programmers.

Producing an effective code review is an art. It requires good people-skills and management support. It takes time to get a good system in place.

Planning the Review

The first thing to consider when planning a code review is who will be on the team doing the review. Ideally you should try to have three to five people there in addition to the person who wrote the code. Fewer than three and you don’t have enough people to do an effective review. More than five and you can’t have an effective meeting. (In large meetings, people don’t talk, they give speeches.)

At least one senior software engineer should be part of the group. He’s there to make sure that the code being reviewed conforms to the overall design of the project it’s written for. He also has enough experience so that he probably can spot more problems than the less experienced coders.

Ideally, you don’t want management there. That’s because whenever a non-technical boss shows up, you spend more time explaining things to her than reviewing code. If a manager decides she must be there, make sure she knows that her role is strictly as an observer.

One problem with code reviews is getting the team to actually attend the meetings. Bribery is one solution to this problem. Free drinks, cookies, bagels, or other eats can be used to help attendance. Lunch should also be considered.

The Review Meeting

All right: you’ve chosen the team, you’ve got a conference room, and somehow managed to get everyone into the meeting. What now?

First, you need to designate someone to take notes. This person’s job is to capture all the important information generated in the meeting. This person should not be the writer of the code. The author should keep his own notes. That way, with two sets of notes, you can be sure that any changes recommended by the committee are performed.

The meeting starts with the author of the code explaining his work. How he does this is largely irrelevant. He can use the top down method, bottom up, or sideways with a left twist. As long as the other people in the room understand what he’s done, he can explain his code any way he wants to.

The other people in the room should feel free to break into the conversation at any time with questions or suggestions, or when they notice a problem.

Comments should be constructive. Everything said should help make the code better. Comments such as “I think you need to check for an error return from that system call” fall into this category.

Comments like “I could have done that in half as much code” are not appropriate. This is not a competition. This is also not a design review, so don’t criticize the design (unless it’s flawed and will absolutely not work.)

It is important to keep the meeting on track. The purpose of the review is to make sure the code works. The meeting should be focused only on the code in front of you. Topics like new programming techniques, the latest Microsoft product, and what movie you saw last night do not belong here. They should wait until your regular gossip sessions when you and the rest of the gang hang around the copy machine to discuss the latest rumors.

Code reviews should last about one to two hours, three at the most. Any longer than that and the committee will start to skim code and skip steps just to get out of the meeting. Programmers are like a can of soda: after about three hours, they go flat and lifeless. If you need more time, break it into multiple reviews, preferably with different people doing the review each time.

If you do a review right, the result will be that you’ve caught a number of errors. But what’s important is that those errors did not make it into the final product. Fixing something in the testing stage or, worse, after the product has shipped, is extremely costly.

There’s one key advantage of code reviews that we haven’t discussed yet: you not only get better programs, you get better programmers. When a mistake is caught at review time and pointed out to the programmer, she is probably not going to make that mistake again. The result is that the next program she generates will contain less errors.

One example of this occurred when I wrote the book Practical C Programming. I had a bad habit of using the word “that” in places where I shouldn’t. The copy editor reviewed the book and crossed out about three to five “that"s per page. I got the job of editing the files and removing the approximately 2,000 extraneous uses of the word. As a result, O’Reilly and I produced a better book. But another result was that in all my subsequent books, I didn’t make that mistake again.

Why People Don’t Do Code Reviews

One of the main reasons that people don’t do code reviews is that they are working on a tight schedule, and management has decided there isn’t enough time to do them. This is a false economy. After all, is the goal to get code out on time or to get working code out on time? (If you eliminate the requirement that the code must work, you can cut down on development time tremendously.)

Code that has not been reviewed will be buggy. Hopefully these bugs are found during the testing phase where they are merely expensive to fix. Sometimes they make it into a release product where they are very expensive to fix. One of the cheapest times to fix bugs is during the coding phase, and code reviews are an excellent way to see that bugs are eliminated early. (As someone once put it, “Why is there never enough time to do things right, but always enough time to do them over?”)

Metrics

Metrics are an important part of the code review process. They show how effective your review process is. With proper metrics, you can demonstrate to both the programmers and management that code reviews are effective and are saving the company money.

The metrics that should be collected for the code reviews are:

The number of lines reviewed
The time spent reviewing the code
The number of defects found
The defect density (defects per line of code)—computed
The review rate (# lines reviewed per hour)—computed

Most review processes show a remarkable ability to find defects before the code enters the testing process. (There it costs 10-500 times as much to find defects.) Also, as you progress, you’ll discover that the defect density goes down. This is because the review process not only makes for better code, but better programmers.

Serial Debugging

Before you start debugging, save the old, “working” copy of your program in a safe place. (If you are using a source control system such as SCCS, RCS, or PCVS, your last working version should be checked in.) Many times while you are searching for a problem, you may find it necessary to try out different solutions or to add temporary debugging code. Sometimes you will find you’ve been barking up the wrong tree and need to start over. That’s when the last working copy becomes invaluable.

Once you have reproduced the problem, you must determine what caused it to happen. There are several methods for doing this.

Divide and Conquer

The divide and conquer method has already been briefly discussed in Chapter 7. It consists of putting in cout statements where you know the data is good (to make sure it really is good), where the data is bad, and several points in between. This way you can start zeroing in on the section of code that contains the error. More cout statements can further reduce the scope of the error until the bug is finally located.

The Confessional Method of Debugging

The confessional method of debugging is one by which the programmer explains his program to someone: an interested party, an uninterested party, a wall—it doesn’t matter to whom he explains it as long as he talks about it.

A typical confessional session goes like this:

“Hey, Bill, could you take a look at this? My program has a bug in it. The output should be 8.0 and I’m getting -8.0. The output is computed using this formula—and I’ve checked out the payment value and rate and the date must be correct, unless there is something wrong with the leap-year code, which—thank you Bill, you’ve found my problem.”

Bill never said a word.

This type of debugging is also called a walkthrough . Getting other people involved brings a fresh point of view to the process, and frequently other people can spot problems you have overlooked.

Debug-Only Code

The divide-and-conquer method uses temporary cout statements. They are put in as needed and taken out after they are used. The preprocessor conditional-compilation directives can be used to put in and take out debugging code. For example:

#ifdef DEBUG 
    std::cout << "Width " << width << " Height " << height << '
';
#endif /* DEBUG */

The program can be compiled with DEBUG undefined for normal use, and you can define it when debugging is needed.

Debug Command-Line Switch

Rather than using a compile-time switch to create a special version of the program, you can permanently include the debugging code and add a special program switch that will turn on debugging output. For example:

if (debug)  
    std::cout << "Width " << width << " Height " << height << '
';

where debug is a variable set if -D is present on the command line when the program is run.

Using a command-line option has the advantage that only a single version of the program exists. One of the problems with “debug-only” code is that unless the code is frequently used, it can easily become stale and out of date. Frequently a programmer tries to find a bug only to discover that the debug-only code is out of date and needs fixing.

Another advantage of the debug command-line option is that the user can turn on this switch in the field, save the output, and send it to you for analysis. The runtime option should be used in all cases instead of conditional compilation, unless there is some reason you do not want the customer to be able to get at the debugging information.

Some programs use the concept of a debug level. Level 0 outputs only minimal debugging information, level 1 more information, and on up to level 9, which outputs everything.

Another variation of this debugging technique can be seen in the Ghostscript ^[1] program by Aladdin Enterprises. This program implements the idea of debugging letters. The command option -Z xxx sets the debugging flags for each type of diagnostic output wanted. For example, f is the code for the fill algorithm, and p is the code for the path tracer. If I wanted to trace both these sections, I would specify -Zfp.

The option is implemented by code similar to this:

/* 
 * Even though we only put 1 zero, C++ will fill in the 
 * rest of the arrays with zeros 
 */ 
char debug[128] = {0};    // The debugging flags

int main(int argc, char *argv[]) 
{ 
    while ((argc > 1) && (argv[1][0] == '-')) { 
        switch (argv[1][1]) { 
            /* .... normal switch .... */ 

            // Debug switch
            case 'Z': 
                debug_ptr = &argv[1][2]; 
                // Loop for each letter
                while (*debug_ptr != '') { 
                    debug[*debug_ptr] = 1; 
                    ++debug_ptr; 
                } 
                break; 
        } 
        --argc; 
        ++argv; 
    } 

    /* Rest of program */ 
}

Now that we’ve set the debug options, we can use them with code like the following:

    if (debug['p']) 
        std::cout << "Starting new path
";

Ghostscript is a large program (some 25,000 lines) and rather difficult to debug. This form of debugging allows the user to get a great deal of information easily.

Going Through the Output

Enabling debug printout is a nice way of getting information, but many times there is so much data that the information you want can easily get lost.

The shell or command-line interpreter allows you to redirect what would normally go to the screen to a file through the use of the >file option. For example:

buggy -D9 >tmp.out

will run the program buggy with a high level of debug set and send the output to the file tmp.out.

The text editor on your system also makes a good file browser. You can use its search capabilities to look through the file containing the debug output for the information you want to find.

Interactive Debuggers

Most compiler manufacturers provide an interactive debugger. They give you the ability to stop the program at any point, examine and change variables, and “single-step” through the program. Because each debugger is different, a detailed discussion of each tool is not possible.

Tip

Most compilers require that you enable debugging with a command-line option. If this option is not present, debugging information is not included in the program. If you used the compilations flags suggested in Chapter 2, your programs have been compiled with debugging enabled.

Basic Debugging Commands

However, we are going to discuss one debugger: GDB. This program is available for many Unix machines from the Free Software Foundation. Borland-C++ and Visual C++ have their own built-in debuggers. Although the exact syntax used by your debugger may be different, the principles shown here will work for all debuggers.

Basic GDB commands are the following:

run: Start execution of a program.
break line-number: Insert a breakpoint at the given line number. When a running program reaches a breakpoint, execution stops and control returns to the debugger.
break function-name: Insert a breakpoint at the first line of the named function. Commonly, the command break int main is used to stop execution at the beginning of the program.
cont: Continue execution after a breakpoint.
print expression: Display the value of an expression.
step: Execute a single line in the program. If the current statement calls a function, the function is single-stepped.
next: Execute a single line in the program, but treat function calls as a single line. This command is used to skip over function calls.
list: List the source program.
where: Print the list of currently active functions.
status: Print a list of breakpoints.
delete: Remove a breakpoint.

Debugging a Simple Program

We have a program that should count the number of threes and sevens in a series of numbers. The problem is that it keeps getting the wrong answer for the number of sevens. Our program is shown in Example 17-1.

Example 17-1. seven/count.cpp

 1: #include <iostream>
 2: 
 3: int seven_count;    /* number of seven's in the data */
 4: int data[5];        /* the data to count 3 and 7 in */
 5: int three_count;    /* the number of threes in the data */
 6: 
 7: int main(  ) {
 8:     int index;  /* index into the data */
 9:     void get_data(int data[]);
10: 
11:     seven_count = 0;
12:     three_count = 0;
13:     get_data(data);
14: 
15:     for (index = 1; index <= 5; ++index) {
16:         if (data[index] == 3)
17:             ++three_count;
18:         if (data[index] == 7)
19:             ++seven_count;
20:     }
21:     std::cout << "Three's " << three_count << 
22:             " Seven's " << seven_count << '
';
23:     return (0);
24: }
25: /********************************************************
26:  * get_data -- get 5 numbers from the command line      *
27:  ********************************************************/
28: void get_data(int data[])
29: {
30:     std::cout << "Enter 5 numbers
";
31:     std::cin >> data[1] >> data[2] >> data[3] >> data[4] >> data[5];
32: }

When we run this program with the data 3 7 3 0 2, the results are:

Threes 3 Sevens 3

We start by invoking the debugger (GDB) with the name of the program we are going to debug (count). The debugger initializes, outputs the prompt (gdb), and waits for a command.

% gdb count
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.12 (m68k-sun-sunos4.0.3), 
Copyright 1994 Free Software Foundation, Inc...
(gdb)

We don’t know where the variable is getting changed, so we’ll start at the beginning and work our way through until we get an error. At every step, we’ll display the variable seven_count just to make sure it’s okay.

We need to stop the program at the beginning so we can single-step through it. The command break main tells GDB to set a breakpoint at the first instruction of the function main:

(gdb) break main
Breakpoint 1 at 0x22c2: file count.cpp, line 12.
(gdb)

The number 1 is used by GDB to identify the breakpoint. Now we need to start the program. The command run tells GDB to start the program, which will run until it hits the first breakpoint:

(gdb) run
Starting program: /usr/sdo/count/count 

Breakpoint 1, main (  ) at count.cpp:12
11          seven_count = 0;
(gdb)

The message Breakpoint 1, main... indicates that the program encountered a breakpoint and has now turned control over to debug.

We have reached the point where seven_count is initialized. The command next will execute a single statement, treating function calls as one statement. (The name of the command for your debugger may be different.) We go past the initialization and check to see whether it worked:

(gdb) next
12          three_count = 0;
(gdb) print seven_count
$1 = 0
(gdb)

Initialization worked. We try the next few lines, checking all the time:

(gdb) next
13          get_data(data);
(gdb) print seven_count
$2 = 0
(gdb) next
Enter 5 numbers
3 7 3 0 2
15          for (index = 1; index <= 5; ++index) {
(gdb) print seven_count
$3 = 2
(gdb)

seven_count somehow changed the value to 2. The last statement we executed was get_data(data); so something is going on in that function. We add a breakpoint at the beginning of get_data, get rid of the one at main, and start the program over with the run command:

(gdb) break 
                  get_data
Breakpoint 2 at 0x23b2: file count.cpp, line 30.
(gdb) info breakpoints
Num Type           Disp Enb Address    What
1   breakpoint   keep y   0x000022c2 in main at count.cpp:11
2   breakpoint   keep y   0x000023b2 in get_data(int *) at count.cpp:30
(gdb) delete 1
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) Y
Starting program: /usr/sdo/count/count 
Breakpoint 2, get_data (data=0x208f8) at count.cpp:30
(gdb)

We now start single-stepping again until we find the error:

Breakpoint 2, get_data (data=0x208f8) at count.cpp:30
30          std::cout << "Enter 5 numbers
";
(gdb) print seven_count
$5 = 0
(gdb) next
31     std::cin >> data[1] >> data[2] >> data[3] >> data[4] >> data[5];
(gdb) print seven_count
$6 = 0
(gdb) next
Enter 5 numbers
3 7 3 0 2
32      }
(gdb) print seven_count
$7 = 2
(gdb) list 23
23          return (0);
24      }
25      /********************************************************
26       * get_data -- get 5 numbers from the command line      *
27       ********************************************************/
28      void get_data(int data[])
29      {
30          std::cout << "Enter 5 numbers
";
31          std::cin >> data[1] >> data[2] >> data[3] >> data[4] >> data[5];
32      }

At line 31 the data was good, but when we reached line 32, the data was bad, so the error is located at line 31 of the program, the std::cin. We’ve narrowed the problem down to one statement. By inspection, we can see that we are using data[5], an illegal member of the array data.

But why does seven_count go bad? Since data is only five elements long, there is no data[5]. However, the std::cin >> data[5] has to put the data someplace, so it decided to put it in a random memory location, in this case seven_count.

Debugging a Binary Search

The binary search algorithm is fairly simple. You want to see whether a given number is in an ordered list. Check your number against the one in the middle of the list. If it is the number, you were lucky—stop. If your number is bigger, you might find it in the top half of the list. Try the middle of the top half. If it is smaller, try the bottom half. Keep trying and dividing the list in half until you find the number or the list gets down to a single number.

The First Bug, a Segmentation Fault

Example 17-2 uses a binary search to see whether a number can be found in the file numbers.dat.

Example 17-2. search/search0.cpp

 1: /********************************************************
 2:  * search -- Search a set of numbers.                   *
 3:  *                                                      *
 4:  * Usage:                                               *
 5:  *      search                                          *
 6:  *              you will be asked numbers to lookup     *
 7:  *                                                      *
 8:  * Files:                                               *
 9:  *      numbers.dat -- numbers 1 per line to search     *
10:  *                      (Numbers must be ordered)       *
11:  ********************************************************/
12: #include <iostream>
13: #include <fstream>
14: #include <cstdlib>
15: #include <cstdio>
16: #include <cassert>
17: 
18: const int MAX_NUMBERS = 1000;   // Max numbers in file 
19: const char *const DATA_FILE = "numbers.dat";// File with numbers 
20: 
21: int data[MAX_NUMBERS];  // Array of numbers to search 
22: int max_count;          // Number of valid elements in data 
23: 
24: int main(  )
25: {
26:     std::ifstream in_file;      // Input file 
27:     int middle;         // Middle of our search range 
28:     int low, high;      // Upper/lower bound 
29:     int search;         // number to search for 
30: 
31:     in_file.open(DATA_FILE, std::ios::in);
32:     if (in_file.bad(  )) {
33:         std::cerr << "Error:Unable to open " << DATA_FILE << '
';
34:         exit (8);
35:     }
36: 
37:     /*
38:      * Read in data 
39:      */
40: 
41:     max_count = 0;
42:     while (true) {
43:         char line[30];  // Line from the input file
44: 
45:         if (in_file.eof(  ))
46:             break;
47: 
48:         in_file.getline(line, sizeof(line));
49: 
50:         assert(max_count >= 0);
51:         assert(max_count < sizeof(data)/sizeof(data[0]));
52:         std::sscanf(line, "%d", data[max_count]);
53:         if (data[max_count] == -1)
54:             break;
55: 
56:         ++max_count;
57:     }
58: 
59:     while (true) {
60:         std::cout << "Enter number to search for or -1 to quit:" ;
61:         std::cin >> search;
62: 
63:         if (search == -1)
64:             break;
65: 
66:         low = 0;
67:         high = max_count;
68: 
69:         while (true) {
70:             middle = (low + high) / 2;
71: 
72:             assert(middle >= 0);
73:             assert(middle < sizeof(data)/sizeof(data[0]));
74:             if (data[middle] == search) {
75:                 std::cout << "Found at index " << middle << '
';
76:             }
77: 
78:             if (low == high) {
79:                 std::cout << "Not found
";
80:                 break;
81:             }
82: 
83:             assert(middle >= 0);
84:             assert(middle < sizeof(data)/sizeof(data[0]));
85:             if (data[middle] < search)
86:                 low = middle;
87:             else
88:                 high = middle;
89:         }
90:    }
91:    return (0);
92: }

Here’s our data file:

File: numbers.dat

When we run this program in Unix, the results are:

% search 
Segmentation fault (core dumped)

When we run this program under Microsoft Windows, we get an application error (if we’re lucky).

Either way this is not good. It means something went wrong in our program and the program tried to read memory that wasn’t there. The debugger GDB can read this file and help us determine what happened:

% gdb search
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.12 (m68k-sun-sunos4.0.3), 
Copyright 1994 Free Software Foundation, Inc...
(gdb) run
Starting program: /usr/sdo/search/search 

Program received signal SIGSEGV, Segmentation fault.
0xec46320 in number (  )
(gdb)

The debugger tells us we have been killed by a segmentation fault generated from the procedure number. But we don’t have a procedure number! The routine must belong to the C++ library.

We now use the where command to find out which function called which function. This report is called a stack trace .

(gdb) where
#0  0xec46320 in number (  )
#1  0xec45cc2 in _doscan (  )
#2  0xec45b34 in sscanf (  )
#3  0x2400 in main (  ) at search.cpp:52
(gdb)

The current function is printed first, then the function that called it, and so on until we reach the outer function main. From this we see that number was called by _doscan, which was called by sscanf. We recognize sscanf as a library routine. The other functions must be subroutines called by sscanf. The last function that had control was the call of sscanf, which was made from line 52 of main.

Now we use the list command to take a look at the source for this line:

(gdb) list 52
45              if (in_file.eof(  ))
46                  break;
47      
48              in_file.getline(line, sizeof(line));
49      
50              assert(max_count >= 0);
51              assert(max_count < sizeof(data)/sizeof(data[0]));
52              sscanf(line, "%d", data[max_count]);
53              if (data[max_count] == -1)
54                  break;
55      
56              ++max_count;
(gdb) quit
The program is running.  Quit anyway (and kill it)? (y or n) Y

Line 52 caused the problem.

Another way of finding the problem is to single-step through the program until the error occurs. First list a section of the program to find a convenient place to put the breakpoint, and then start the execution and single-step process:

Script started on Mon Oct 31 10:07:19 1994
% gdb search
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.12 (m68k-sun-sunos4.0.3), 
Copyright 1994 Free Software Foundation, Inc...
(gdb) list 
                  main
20      const char *const DATA_FILE = "numbers.dat"; // File with nums
21      
22      int data[MAX_NUMBERS];  // Array of numbers to search 
23      int max_count;          // Number of valid elements in data 
24      int main(  )
25      {
26          std::ifstream in_file;   // Input file 
27          int middle;         // Middle of our search range 
28          int low, high;      // Upper/lower bound 
29          int search;         // Number to search for 
(gdb) break main
Breakpoint 1 at 0x2318: file search.cpp, line 25.
(gdb) run
Starting program: /usr/sdo/search/search

Breakpoint 1, main (  ) at search.cpp:25
26          ifstream in_file;   // Input file 
(gdb) step
31          in_file.open(DATA_FILE, ios::in);
(gdb) step
32          if (in_file.bad(  )) {
(gdb) step
41          max_count = 0;
(gdb) step
45              if (in_file.eof(  ))
(gdb) step
48              in_file.getline(line, sizeof(line));
(gdb) step
50              assert(max_count >= 0);
(gdb) step
51              assert(max_count < sizeof(data)/sizeof(data[0]));
(gdb) step
52              sscanf(line, "%d", data[max_count]);
(gdb) step

Program received signal SIGSEGV, Segmentation fault.
0xec46320 in number (  )
(gdb) quit
The program is running.  Quit anyway (and kill it)? (y or n) y

This method, too, points at line 52 as the culprit. On inspection we notice that we forgot to put an ampersand (&) in front of the third parameter for std::sscanf. So we change line 52 from:

           std::sscanf(line, "%d", data[max_count]);

to:

           std::sscanf(line, "%d", &data[max_count]);

and try again.

Note

You might wonder why we use the function sscanf when the line:

    in_file >> data[max_count];

performs the same function.

The answer is simple. We used sscanf to cause problems. Without the pointer error, we would have nothing to debug. The in_file statement is more reliable, and reliable code has no place in a chapter on debugging.

The Unintended Infinite Loop

Rather than fix the std::scanf call, we’ve decided to enter the 21st century and use C++ style I/O calls. The first number in our list is 4, so we try it. This time our output looks like:

Enter number to search for or -1 to quit: 4 
Found at index 0 
Found at index 0 
Not found 
Enter number to search for or -1 to quit: ^C

The program should find the number, let us know it’s at index 0, and then ask for another number. Instead we get two found messages and one not found message. We know that everything is running smoothly up to the time we get the first found message. After that things go downhill.

Getting back into the debugger, we use the list command to locate the found message and put a breakpoint there:

% gdb search
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.12 (m68k-sun-sunos4.0.3), 
Copyright 1994 Free Software Foundation, Inc...
(gdb) list 69,81
69              while (true) {
70                  middle = (low + high) / 2;
71
72                  assert(middle >= 0);
73                  assert(middle <sizeof(data)/sizeof(data[0]));      
74                  if (data[middle] == search) {
75                    std::cout << "Found at index " << middle << '
';
76                  }
77      
78                  if (low == high) {
79                      std::cout << "Not found
";
80                      break;
81                  }
(gdb) break 75
Breakpoint 1 at 0x249e: file search.cpp, line 71.
(gdb) run
Starting program: /usr/sdo/search/search 
Enter number to search for or -1 to quit: 4

Breakpoint 1, main (  ) at search.cpp:71
75                    std::cout << "Found at index " << middle << '
';
(gdb) step
Found at index 0
78                  if (low == high) {
(gdb) step
83                  assert(middle >= 0);
(gdb) step
84                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
85                  if (data[middle] < search)
(gdb) step
88                      high = middle;
(gdb) step
70                  middle = (low + high) / 2;
(gdb) step
72                  assert(middle >= 0);
(gdb) step
73                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
74                  if (data[middle] == search) {
(gdb) step
75                    std::cout << "Found at index " << middle << '
';
(gdb) step
Found at index 0
78                  if (low == high) {
(gdb) quit
The program is running.  Quit anyway (and kill it)? (y or n) y

The program doesn’t exit the loop. Instead it continues with the search. Because the number has already been found, this search results in strange behavior. We are missing a break after the cout.

We need to change:

    if (data[middle] == search) { 
        std::cout << "Found at index " << middle << '
'; 
    }

to:

    if (data[middle] == search) { 
        std::cout << "Found at index " << middle << '
'; 
        break; 
    }

Making this fix, we try the program again:

% search 
Enter number to search for or -1 to quit: 4 
Found at index 0 
Enter number to search for or -1 to quit: 6 
Found at index 1 
Enter number to search for or -1 to quit: 3 
Not found 
Enter number to search for or -1 to quit: 5 
                  program runs forever (or until we abort it)

We have a runaway program. This time, instead of setting a breakpoint, we just start running the program. After a few seconds pass and we believe that we are stuck in the infinite loop, we stop the program with a control-C (^C). Normally this would abort the program and return us to the shell prompt. Since we are running with the debugger, it returns control to GDB:

% gdb search
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.12 (m68k-sun-sunos4.0.3), 
Copyright 1994 Free Software Foundation, Inc...
(gdb) run
Starting program: /usr/sdo/search/search 
Enter number to search for or -1 to quit: 5
                  ^C
Program received signal SIGINT, Interrupt.
0x2500 in main (  ) at search.cpp:79
79                  if (data[middle] < search)

Now we can use the single-step command to step through the infinite loop, looking at key values along the way.

87                  if (data[middle] < search)
(gdb) print middle
$1 = 0
(gdb) print data[middle]
$2 = 4
(gdb) print search
$3 = 5
(gdb) step
88                      low = middle;
(gdb) step
71                  middle = (low + high) / 2;
(gdb) step
73                  assert(middle >= 0);
(gdb) step
74                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
75                  if (data[middle] == search) {
(gdb) step
80                  if (low == high) {
(gdb) step
85                  assert(middle >= 0);
(gdb) step
86                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
87                  if (data[middle] < search)
(gdb) step
88                      low = middle;
(gdb) step
71                  middle = (low + high) / 2;
(gdb) step
73                  assert(middle >= 0);
(gdb) step
74                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
75                  if (data[middle] == search) {
(gdb) step
80                  if (low == high) {
(gdb) step
85                  assert(middle >= 0);
(gdb) step
86                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
87                  if (data[middle] < search)
(gdb) step
88                      low = middle;
(gdb) step
71                  middle = (low + high) / 2;
(gdb) step
73                  assert(middle >= 0);
(gdb) step
74                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
75                  if (data[middle] == search) {
(gdb) step
80                  if (low == high) {
(gdb) step
85                  assert(middle >= 0);
(gdb) step
86                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
87                  if (data[middle] < search)
(gdb) step
88                      low = middle;
(gdb) step
71                  middle = (low + high) / 2;
(gdb) step
73                  assert(middle >= 0);
(gdb) step
74                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
75                  if (data[middle] == search) {
(gdb) print 
                  low
$5 = 0
(gdb) print 
                  middle
$6 = 0
(gdb) print 
                  high
$7 = 1
(gdb) print 
                  search
$8 = 5
(gdb) print data[0]
$9 = 4
(gdb) print data[1]
$10 = 6
(gdb) quit
The program is running.  Quit anyway (and kill it)? (y or n) y

The problem is that we have reached a point where the following is true:

low = 0  middle = 0  high = 1

The item we are searching for falls exactly between elements 0 and 1. Our algorithm has an off-by-one error. Obviously the middle element does not match. If it did, we’d exit with a found at message. So there is no point including the middle element in our new search range. Our code to adjust the interval is:

            if (data[middle] < search) 
                low = middle; 
            else 
                high = middle;

It should be:

            if (data[middle] < search) 
                low = middle + 1; 
            else 
                high = middle - 1;

The full version of the corrected program is shown in Example 17-3.

Example 17-3. search/search4.cpp

 1: /********************************************************
 2:  * search -- Search a set of numbers.                   *
 3:  *                                                      *
 4:  * Usage:                                               *
 5:  *      search                                          *
 6:  *              you will be asked numbers to lookup     *
 7:  *                                                      *
 8:  * Files:                                               *
 9:  *      numbers.dat -- numbers 1 per line to search     *
10:  *                      (Numbers must be ordered)       *
11:  ********************************************************/
12: #include <iostream>
13: #include <fstream>
14: #include <cstdlib>
15: #include <cstdio>
16: #include <cassert>
17: 
18: const int MAX_NUMBERS = 1000;   // Max numbers in file 
19: const char *const DATA_FILE = "numbers.dat";// File with numbers 
20: 
21: int data[MAX_NUMBERS];  // Array of numbers to search 
22: int max_count;          // Number of valid elements in data 
23: int main(  )
24: {
25:     std::ifstream in_file;      // Input file 
26:     int middle;         // Middle of our search range 
27:     int low, high;      // Upper/lower bound 
28:     int search;         // number to search for 
29: 
30:     in_file.open(DATA_FILE, std::ios::in);
31:     if (in_file.bad(  )) {
32:         std::cerr << "Error:Unable to open " << DATA_FILE << '
';
33:         exit (8);
34:     }
35: 
36:     /*
37:      * Read in data 
38:      */
39: 
40:     max_count = 0;
41: 
42:     while (true) {
43:         char line[30];  // Line from the input file
44: 
45:         if (in_file.eof(  ))
46:             break;
47: 
48:         in_file.getline(line, sizeof(line));
49: 
50:         assert(max_count >= 0);
51:         assert(max_count < sizeof(data)/sizeof(data[0]));
52:         std::sscanf(line, "0", &data[max_count]);
53:         if (data[max_count] == -1)
54:             break;
55: 
56:         ++max_count;
57:     }
58: 
59:     while (true) {
60:         std::cout << "Enter number to search for or -1 to quit:" ;
61:         std::cin >> search;
62: 
63:         if (search == -1)
64:             break;
65: 
66:         low = 0;
67:         high = max_count;
68: 
69:         while (true) {
70:             if (low >= high) {
71:                 std::cout << "Not found
";
72:                 break;
73:             }
74:             middle = (low + high) / 2;
75: 
76:             assert(middle >= 0);
77:             assert(middle < sizeof(data)/sizeof(data[0]));
78:             if (data[middle] == search) {
79:                 std::cout << "Found at index " << middle << '
';
80:                 break;
81:             }
82: 
83:             assert(middle >= 0);
84:             assert(middle < sizeof(data)/sizeof(data[0]));
85:             if (data[middle] < search)
86:                 low = middle +1;
87:             else
88:                 high = middle -1;
89:         }
90:    }
91:    return (0);
92: }

Interactive Debugging Tips and Tricks

Interactive debuggers work well for most programs, but sometimes they need a little help. Consider Example 17-4. We try to debug it and find it fails when point_number is 735. Why it fails on call 735 to lookup we don’t know, but the first 734 calls work and the next one doesn’t. All we know is that we call float_point_color 800 times, it calls lookup 800 times, and something goes wrong at 735.

We want to put a breakpoint before the calculation is made. When the debugger inserts a breakpoint into a program, the program will execute normally until it hits the breakpoint, then control will return to the debugger. This allows the user to examine and change variables, as well as perform other debugging commands. When a cont command is typed, the program will continue execution as though nothing happened. The problem is that there are 734 points before the one we want, and we don’t want to stop for each of them.

Example 17-4. debug/cstop.cpp

float point_color(int point_number)
{
    float correction;           // color correction factor 
    extern float red,green,blue;// current colors 

    // Lookup color correction
    extern lookup(int point_number);

    correction = lookup(point_number);
    return (red*correction * 100.0 + 
            blue*correction * 10.0 +
            green*correction);
}

How do we force the debugger to stop only when point_number == 735? We can do this by adding the following temporary code:

48:    if (point_number == 735)  /* ### Temp code ### */ 
49:        point_number = point_number;   /* ### Line to stop on ### */

Line 49 does nothing useful except serve as a line that the debugger can stop on. We can put a breakpoint on that line with the command break 49. The program will process the first 734 points, then execute line 49, hitting the breakpoint. (Some debuggers have a conditional breakpoint. The advanced GDB command break 49 if point_number == 735 would also work; however, your debugger may not have such advanced features.)

Runtime Errors

Runtime errors are usually the easiest to fix. Following are some types of runtime errors:

Segmentation violation: This error indicates that the program tried to dereference a pointer containing a bad value.
Stack overflow: The program tried to use too many temporary variables. Sometimes this means the program is too big or using too many big temporary arrays, but most of the time this is due to infinite recursion problems. Almost all Unix systems automatically check for this error. Borland-C++ will check for stack overflow only if the compile-time option -N is used.
Divide by zero: Divide by zero is an obvious error. Unix masks the problem by reporting an integer divide by zero with the error message Floating exception (core dumped).

In all cases, program execution will be stopped. In Unix, an image of the running program, called a core file, is written out. This file can be analyzed by the debugger to determine why the program died. Our first run of Example 17-4 resulted in a core dump. (One of the problems with core dumps is that the core files are very big and can fill up a disk quickly.)

One problem with runtime errors is that when they occur, program execution stops immediately. The buffers for buffered files are not flushed. This can lead to some unexpected surprises. Consider Example 17-5.

Example 17-5. debug/flush.cpp

#include <iostream>
int main(  )
{
    int i,j;    /* two random integers */

    i = 1;
    j = 0;
    std::cout << "Starting
";
    std::cout << "Before divide...";
    i = i / j;  // divide by zero error 
    std::cout << "After
";
    return(0);
}

When run, this program outputs:

Starting 
Floating exception (core dumped)

This might lead you to think the divide had never started, when in fact it had. What happened to the message “Before divide...”? The cout statement executed and put the message in a buffer; then the program died. The buffer never got a chance to be emptied.

By putting explicit flush-buffer commands inside the code, we get a truer picture of what is happening, as shown in Example 17-6.

Example 17-6. debug/flush2.cpp

#include <iostream>
int main(  )
{
    int i,j;    /* two random integers */

    i = 1;
    j = 0;
    std::cout << "Starting
";
    std::cout.flush(  );
    std::cout << "Before divide...";
    std::cout.flush(  );
    i = i / j;  // divide by zero error 
    std::cout << "After
";
    std::cout.flush(  );
    return(0);
}

The flush statement makes the I/O less efficient, but more current.

Optimization

And now a word on optimization: don’t. Most programs do not need to be optimized. They run fast enough. Who cares whether an interactive program takes 0.5 seconds to start up instead of 0.2?

To be fair, there are a lot of slow programs out there that can be sped up. This is usually done not by the simple optimization steps shown in this chapter, but by replacing poorly designed core algorithms with more efficient ones.

For a well-written program, the simplest way to get your program to run faster is to get a faster computer. Many times it is cheaper to buy a more powerful machine than it is to optimize a program, because you may introduce new errors into your code. Don’t expect miracles from optimization. Usually most programs can be sped up only 10 percent to 20 percent.

Profiling

In general you’ll find that your program spends 90% of its time in 10% of your code. That means that optimizing that code will give the greatest results. Most compilers come with a profile that lets you instrument your code and identify which sections are taking up the most time. Unfortunately each tool is different and a discussion of all of the is not possible in this book.

Analyzing and Optimizing code

Example 17-7 initializes a matrix (two-dimensional array). Let’s take a look at this code and see if there’s any way of making it faster

Example 17-7. matrix/matrix1.cpp

#include <cassert>
const int X_SIZE = 60;
const int Y_SIZE = 30;

int matrix[X_SIZE][Y_SIZE];

void init_matrix(  )
{
    int x,y;    // current element to initialize 

    for (x = 0; x < X_SIZE; ++x) {
        for (y = 0; y < Y_SIZE; ++y) {
            assert((x >= 0) && (x < X_SIZE));
            assert((y >= 0) && (y < Y_SIZE));
            matrix[x][y] = -1;
        }
    }
}

Register Declarations

How can this function be optimized? First we notice we are using two local variables. By using the qualifier register on these variables, we tell the compiler that they are frequently used and should be placed in fast registers instead of relatively slow main memory. The number of registers varies from computer to computer. Slow machines like the PC have 2, most Unix systems have about 11, and supercomputers can have as many as 128. It is possible to declare more register variables than you have registers. C++ will put the extra variables in main memory.

Tip

The register form of optimization has been overtaken by compiler technology. Most compilers do a better job of register allocation than you can by manually adding register hints, and they ignore any user register modifiers. However, this technique is still valid for older compilers.

The program now looks like Example 17-8.

Example 17-8. matrix/matrix2.cpp

#include <cassert>
const int X_SIZE = 60;
const int Y_SIZE = 30;

int matrix[X_SIZE][Y_SIZE];

void init_matrix(  )
{
    register int x,y;    // current element to initialize 

    for (x = 0; x < X_SIZE; ++x) {
        for (y = 0; y < Y_SIZE; ++y) {
            assert((x >= 0) && (x < X_SIZE));
            assert((y >= 0) && (y < Y_SIZE));
            matrix[x][y] = -1;
        }
    }
}

Loop ordering

The outer loop is executed 60 times. This means the overhead associated with starting the inner loop is executed 60 times. If we reverse the order of the loops, we will have to deal with the inner loop only 30 times.

In general, loops should be ordered so the innermost loop is the most complex and the outermost loop is the simplest. Example 17-9 contains the init_matrix function with the loops reordered.

Example 17-9. matrix/matrix3.cpp

#include <cassert>
const int X_SIZE = 60;
const int Y_SIZE = 30;

int matrix[X_SIZE][Y_SIZE];

void init_matrix(  )
{
    register int x,y;    // current element to initialize 

    for (y = 0; y < Y_SIZE; ++y) {
        for (x = 0; x < X_SIZE; ++x) {
            assert((x >= 0) && (x < X_SIZE));
            assert((y >= 0) && (y < Y_SIZE));
            matrix[x][y] = -1;
        }
    }
}

The power of powers of 2

Indexing an array requires a multiplication operation. For example, to execute the line:

        matrix[x][y] = -1;

the program must compute the location where we want to put the -1. To do this, the program must perform the following steps:

Get the address of the matrix.
Compute x * Y_SIZE.
Compute y.
Add up all three parts to form the address. In C++ this code looks like:

        *(matrix + (x * Y_SIZE) + y) = -1;

However, you typically won’t write matrix accesses this way because C++ handles the details. But being aware of the details can help you generate more efficient code.

Almost all C++ compilers will convert multiplication by a power of 2 (2, 4, 8, ...) into shifts, thus taking an expensive operation (multiply) and changing it into an inexpensive operation (shift).

For example:

i = 32 * j;

is compiled as:

i = j << 5; /* 2**5 == 32 */

In Example 17-9 we define Y_SIZE as 30, which is not a power of 2. By increasing Y_SIZE to 32, we waste some memory but get a faster program.

Example 17-10 shows how we can take advantage of a power of 2.

Example 17-10. matrix/matrix4.cpp

#include <cassert>
const int X_SIZE = 60;
const int Y_SIZE = 32;

int matrix[X_SIZE][Y_SIZE];

void init_matrix(  )
{
    register int x,y;    // current element to initialize 

    for (y = 0; y < Y_SIZE; ++y) {
        for (x = 0; x < X_SIZE; ++x) {
            assert((x >= 0) && (x < X_SIZE));
            assert((y >= 0) && (y < Y_SIZE));
            matrix[x][y] = -1;
        }
    }
}

Making use of pointers

Since we are initializing consecutive memory locations, we can optimize the program even further. We can initialize the matrix by starting at the first location and storing a -1 in the next X_SIZE * Y_SIZE elements. Using this method, we cut the number of loops down to one. The indexing of the matrix has changed from a standard index (matrix[x][y]), requiring a shift and add, into a pointer dereference (*matrix_ptr) and an increment (++matrix_ptr). In Example 17-11, we’ve turned our arrays into pointers.

Example 17-11. matrix/matrix5.cpp

const int X_SIZE = 60;
const int Y_SIZE = 30;

int matrix[X_SIZE][Y_SIZE];

void init_matrix(  )
{
    register int index;         // element counter 
    register int *matrix_ptr;   // Current element

    matrix_ptr = &matrix[0][0];
    for (index = 0; index < X_SIZE * Y_SIZE; ++index) {
        *matrix_ptr = -1;
        ++matrix_ptr;
    }
}

But why have both a loop counter and a matrix_ptr? Couldn’t we combine the two? In fact we can. In Example 17-12 we’ve successfully eliminated the loop counter by combining it with the array pointer.

Example 17-12. matrix/matrix6.cpp

const int X_SIZE = 60;
const int Y_SIZE = 30;

int matrix[X_SIZE][Y_SIZE];

void init_matrix(  )
{
    register int *matrix_ptr;   // Current element

    for (matrix_ptr = &matrix[0][0]; 
             matrix_ptr <= &matrix[X_SIZE-1][Y_SIZE-1];
             ++matrix_ptr) {

        *matrix_ptr = -1;
    }
}

The function is now well optimized. The only way we could make it better is to manually code it into assembly language. This might make it faster; however, assembly language is highly nonportable and very error-prone. But in this case, someone else has written a highly optimized assembly-language (usually) function that we can use to do the job, and it’s a part of the standard C++ library.

Using the System Library

The library routine memset can be used to fill a matrix or array with a single character value. We can use it to initialize the matrix in this program. Frequently used library subroutines such as memset are often coded into assembly language and may make use of special processor-dependent tricks to do the job faster than could be done in C++. In Example 17-13 we let the function memset do the work.

Example 17-13. matrix/matrix7.cpp

#include <cstring>

const int X_SIZE = 60;
const int Y_SIZE = 30;

int matrix[X_SIZE][Y_SIZE];

void init_matrix(  )
{
    std::memset(matrix, -1, sizeof(matrix));
}

Now our function consists of only a single function call. It seems a shame to have to call a function just to call another function. We have to pay for the overhead of two function calls. It would be better if we called memset from the main function. Why don’t we rewrite all the functions that call init_matrix so they use memset? Because it has several hundred init_matrix calls, and we don’t want to do all that editing.

So how do we get rid of the overhead of a function call? By making the function inline. Our final version of the function uses inline to eliminate all the call overhead. It can be seen in Example 17-14.

Example 17-14. matrix/matrix8.cpp

#include <cstring>

const int X_SIZE = 60;
const int Y_SIZE = 30;

int matrix[X_SIZE][Y_SIZE];

inline void init_matrix(  )
{
    std::memset(matrix, -1, sizeof(matrix));
}

Question 17-1: Why does memset successfully initialize the matrix to -1, but when we try to use it to set every element to 1, we fail?

#include <cstring>

const int X_SIZE = 60;
const int Y_SIZE = 30;

int matrix[X_SIZE][Y_SIZE];

inline void init_matrix(  ) {
    memset(matrix, 1, sizeof(matrix));
}

How to Optimize

Our matrix initialization function illustrates several optimizing strategies. These are:

Removing invariant code

Code that does not need to be put inside a loop should be put outside the loop. For example:

for (i = 0; i < 10; ++i) 
     matrix[i] = i + j * 10;

can be written as:

j_times_10 = j * 10;
for (i = 0; i < 10; ++i)
    matrix[i] = i + j_times_10;

Most good optimizing compilers will do this work for you if possible.

Loop ordering

Nested loops should be ordered with the simplest loop outermost and the most complex loops innermost.

Reference parameters

Use constant reference parameters (const type &) instead of constant parameters for structures, unions, and classes.

Powers of two

Use a power of two when doing integer multiply or divide. Most compilers will substitute a shift for the operation.

Pointers

Using pointers to go through an array is generally faster than using an index,, but pointers are more tricky to use.

Inline functions

Using inline functions eliminates the overhead associated with a function call. It also can make the code bigger and a little more difficult to debug. (See the case study below.)

Reduction in strength

This is a fancy way of saying use cheap operations instead of expensive ones. Table 17-1 lists the relative cost of common operations.

Table 17-1. Relative cost of operations

Operation	Relative cost
File input and output (`<<` and `>>`), including the C functions `printf` and `scanf`	1,000
`new` and `delete`	800
Trigonometric functions (sin, cos, ...)	500
Floating point (any operation)	100
Integer divide	30
Integer multiply	20
Function call	10
`assert` ^[2]	8
Simple array index	6
Shifts	5
Add/subtract	5
Pointer dereference	2
Bitwise AND, OR, NOT	1
Logical AND, OR, NOT	1
^[2]The `assert` statement can be removed by using the compile time option `-DNDEBUG`. However, these statements should be removed with care, as they provide insurance against bad things happening in your program. Sometimes the fastest way is not the best.

Tip

The C++ I/O system and the C formatting functions called using std::scanf, std::printf, and std::sscanf are extremely costly because they have to go through the format string one character at a time looking for a format conversion character (%). They then have to do a costly conversion between a character string and a number. These functions should be avoided in time-critical sections of code.

Case Study: Inline Functions Versus Normal Functions

I once worked on writing a word-processing program for a large computer manufacturer. We had a function next_char that was used to get the next character from the current file. It was used in thousands of places throughout the program. When we first tested the program with next_char written as a function, the program was unacceptably slow. Analyzing our program, we found that 90 percent of the time was spent in next_char. So we changed it to an inline function. The speed doubled; however, our code size went up 40 percent and required a memory expansion card to work. So the speed was all right, but the size was unacceptable. We finally had to write the routine as a function in hand-optimized assembly language to get both the size and the speed to acceptable levels.

Case Study: Optimizing a Color-Rendering Algorithm

I once was asked to optimize a program that did color rendering for a large picture. The problem was that the program took eight hours to process a single picture. This limited us to doing one picture a day.

The first thing I did was run the program on a machine with a floating-point accelerator. This brought the time down to about six hours. Next I got permission to use a high-speed RISC computer that belonged to another project but was currently sitting idle. That reduced the time to two hours.

I saved six hours solely by using faster machines. No code had changed yet.

Two fairly simple functions were being called only once from the innermost loop. Rewriting these functions as macros saved about 15 minutes.

Next I changed all the floating-point operations I could from floating-point to integer. The savings amounted to 30 minutes out of a 1:45 run.

Then I noticed the program was spending about 5 minutes reading an ASCII file containing a long list of floating-point numbers used in the conversion process. Knowing that scanf is an extremely expensive function, I cut the initialization process down to almost nothing by making the file binary. Total runtime was now down to 1:10.

By carefully inspecting the code and using every trick I knew, I saved another 5 minutes, leaving me 5 minutes short of my goal of an hour per run. At this point my project was refocused and the program put in mothballs for use at some future date.

Programming Exercises

Exercise 17-1: Take one of your previous programs and run it using the interactive debugger to examine several intermediate values.

Exercise 17-2: Write a matrix-multiply function. Create a test program that not only tests the function, but times it as well. Optimize the program using pointers and determine the time savings.

Exercise 17-3: Write a program to sum the elements in an array. Optimize it.

Exercise 17-4: Write a program that counts the number of bits in a character array. Optimize it through the use of register-integer variables. Time it on several different arrays of different sizes. How much time do you save?

Exercise 17-5: Write your own version of the library function memcpy. Optimize it. Most implementations of memcpy are written in assembly language and take advantage of all the quirks and tricks of the processor. How does your memcpy compare to theirs?

Answers to Chapter Questions

Answer 17-1: The problem is that memset is a character-fill routine. An integer consists of 2 or 4 bytes (characters). Each byte is assigned the value 1. So a 2-byte integer will receive the value:

    integer = 0x0101;

The 1-byte hex value for -1 is 0xFF. The 2-byte hex value of -1 is 0xFFFF. So we can take two single-byte -1 values, put them together, and come out with -1. This works for zero also. Any other number will produce the wrong answer. For example, 1 is 0x01. Two bytes of this is 0x0101, or 257.

^[1]Ghostscript is a PostScript-like interpreter available from http://www.ghostscript.com.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.