Chapter 7. High-Quality Routines


Chapter 6 described the details of creating classes. This chapter zooms in on routines, on the characteristics that make the difference between a good routine and a bad one. If you'd rather read about issues that affect the design of routines before wading into the nitty-gritty details, be sure to read Chapter 5 first and come back to this chapter later. Some important attributes of high-quality routines are also discussed in Chapter 8. If you're more interested in reading about steps to create routines and classes, Chapter 9 might be a better place to start.

Before jumping into the details of high-quality routines, it will be useful to nail down two basic terms. What is a "routine"? A routine is an individual method or procedure invocable for a single purpose. Examples include a function in C++, a method in Java, and a function or sub procedure in Microsoft Visual Basic. For some uses, macros in C and C++ can also be thought of as routines. You can apply many of the techniques for creating a high-quality routine to these variants.

What is a high-quality routine? That's a harder question. Perhaps the easiest answer is to show what a high-quality routine is not. Here's an example of a low-quality routine:

What's wrong with this routine? Here's a hint: you should be able to find at least 10 different problems with it. Once you've come up with your own list, look at the following list:

  • The routine has a bad name. HandleStuff() tells you nothing about what the routine does.

  • The routine isn't documented. (The subject of documentation extends beyond the boundaries of individual routines and is discussed in Chapter 32.)

  • The routine has a bad layout. The physical organization of the code on the page gives few hints about its logical organization. Layout strategies are used haphazardly, with different styles in different parts of the routine. Compare the styles where expenseType == 2 and expenseType == 3. (Layout is discussed in Chapter 31.)

  • The routine's input variable, inputRec, is changed. If it's an input variable, its value should not be modified (and in C++ it should be declared const). If the value of the variable is supposed to be modified, the variable should not be called inputRec.

  • The routine reads and writes global variables—it reads from corpExpense and writes to profit. It should communicate with other routines more directly than by reading and writing global variables.

  • The routine doesn't have a single purpose. It initializes some variables, writes to a database, does some calculations—none of which seem to be related to each other in any way. A routine should have a single, clearly defined purpose.

  • The routine doesn't defend itself against bad data. If crntQtr equals 0, the expression ytdRevenue * 4.0 / (double) crntQtr causes a divide-by-zero error.

  • The routine uses several magic numbers: 100, 4.0, 12, 2, and 3. (Magic numbers are discussed in "Numbers in General.")

  • Some of the routine's parameters are unused: screenX and screenY are not referenced within the routine.

  • One of the routine's parameters is passed incorrectly: prevColor is labeled as a reference parameter (&) even though it isn't assigned a value within the routine.

  • The routine has too many parameters. The upper limit for an understandable number of parameters is about 7; this routine has 11. The parameters are laid out in such an unreadable way that most people wouldn't try to examine them closely or even count them.

  • The routine's parameters are poorly ordered and are not documented. (Parameter ordering is discussed in this chapter. Documentation is discussed in Chapter 32.)
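Two of the problems in the list, the unguarded division by crntQtr and the magic number 4.0, can be illustrated in isolation. The following Python sketch is not a repair of HandleStuff(). The function name, the constant name, and the decision to raise an exception on bad input are all assumptions made for illustration.

```python
QUARTERS_PER_YEAR = 4  # named constant standing in for the magic number 4.0


def projected_annual_revenue(ytd_revenue: float, current_quarter: int) -> float:
    """Extrapolate year-to-date revenue to a full year.

    Hypothetical helper: it rejects a quarter outside 1..4 rather than
    letting a zero quarter cause a divide-by-zero error.
    """
    if not 1 <= current_quarter <= QUARTERS_PER_YEAR:
        raise ValueError(
            f"current_quarter must be 1..{QUARTERS_PER_YEAR}, got {current_quarter}"
        )
    return ytd_revenue * QUARTERS_PER_YEAR / current_quarter
```

Whether bad data should raise an error, return a default, or be logged is a design decision that this sketch does not settle.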

Aside from the computer itself, the routine is the single greatest invention in computer science. The routine makes programs easier to read and easier to understand than any other feature of any programming language, and it's a crime to abuse this senior statesman of computer science with code like that in the example just shown.


The class is also a good contender for the single greatest invention in computer science. For details on how to use classes effectively, see Chapter 6.

The routine is also the greatest technique ever invented for saving space and improving performance. Imagine how much larger your code would be if you had to repeat the code for every call to a routine instead of branching to the routine. Imagine how hard it would be to make performance improvements in the same code used in a dozen places instead of making them all in one routine. The routine makes modern programming possible.

"OK," you say, "I already know that routines are great, and I program with them all the time. This discussion seems kind of remedial, so what do you want me to do about it?"

I want you to understand that many valid reasons to create a routine exist and that there are right ways and wrong ways to go about it. As an undergraduate computer-science student, I thought that the main reason to create a routine was to avoid duplicate code. The introductory textbook I used said that routines were good because the avoidance of duplication made a program easier to develop, debug, document, and maintain. Period. Aside from syntactic details about how to use parameters and local variables, that was the extent of the textbook's coverage. It was not a good or complete explanation of the theory and practice of routines. The following sections contain a much better explanation.

Valid Reasons to Create a Routine

Here's a list of valid reasons to create a routine. The reasons overlap somewhat, and they're not intended to make an orthogonal set.

Reduce complexity. The single most important reason to create a routine is to reduce a program's complexity. Create a routine to hide information so that you won't need to think about it. Sure, you'll need to think about it when you write the routine. But after it's written, you should be able to forget the details and use the routine without any knowledge of its internal workings. Other reasons to create routines—minimizing code size, improving maintainability, and improving correctness—are also good reasons, but without the abstractive power of routines, complex programs would be impossible to manage intellectually.

One indication that a routine needs to be broken out of another routine is deep nesting of an inner loop or a conditional. Reduce the containing routine's complexity by pulling the nested part out and putting it into its own routine.
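As a rough illustration of pulling a nested part into its own routine, consider the Python sketch below. The invoice data, the field names, and both function names are hypothetical. The point is only that the outer routine no longer contains the inner loop and its conditional.

```python
def overdue_count(invoices):
    """Inner loop and conditional, extracted into a routine of their own."""
    return sum(1 for invoice in invoices if invoice["days_late"] > 0)


def total_overdue_invoices(customers):
    """Outer routine: one level of nesting instead of three."""
    total = 0
    for customer in customers:
        total += overdue_count(customer["invoices"])
    return total
```

The containing routine now reads as a single loop, and the extracted routine can be tested on its own.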

Introduce an intermediate, understandable abstraction. Putting a section of code into a well-named routine is one of the best ways to document its purpose. Instead of reading a series of statements like

if ( node <> NULL ) then
   while ( node.next <> NULL ) do
      node = node.next
      leafName = node.name
   end while
else
   leafName = ""
end if

you can read a statement like this:

leafName = GetLeafName( node )

The new routine is so short that nearly all it needs for documentation is a good name. The name introduces a higher level of abstraction than the original eight lines of code, which makes the code more readable and easier to understand, and it reduces complexity within the routine that originally contained the code.
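One way such a routine might look in Python is sketched below, assuming each node carries a name and a link to the next node. The Node class and its field names are assumptions, as is the choice to return the node's own name when the chain contains only one node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical singly linked node with a name."""
    name: str
    next: Optional["Node"] = None


def get_leaf_name(node: Optional[Node]) -> str:
    """Return the name of the last node in the chain, or "" if there is none."""
    if node is None:
        return ""
    while node.next is not None:
        node = node.next
    return node.name
```

Callers see only `leaf_name = get_leaf_name(node)`, and the traversal stays in one place.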

Avoid duplicate code. Undoubtedly the most popular reason for creating a routine is to avoid duplicate code. Indeed, creation of similar code in two routines implies an error in decomposition. Pull the duplicate code from both routines, put a generic version of the common code into a base class, and then move the two specialized routines into subclasses. Alternatively, you could migrate the common code into its own routine, and then let both call the part that was put into the new routine. With code in one place, you save the space that would have been used by duplicated code. Modifications will be easier because you'll need to modify the code in only one location. The code will be more reliable because you'll have to check only one place to ensure that the code is right. Modifications will be more reliable because you'll avoid making successive and slightly different modifications under the mistaken assumption that you've made identical ones.
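The second alternative, migrating the common code into its own routine, might be sketched in Python as follows. All three function names and the name-cleaning rule are hypothetical. The sketch assumes the normalization logic had previously been copied into both callers.

```python
def normalize_name(raw_name: str) -> str:
    """Common code, now in one place: collapse whitespace and capitalize."""
    return " ".join(raw_name.split()).title()


def format_mailing_label(name: str, city: str) -> str:
    """First caller of the shared routine."""
    return f"{normalize_name(name)}\n{city}"


def format_greeting(name: str) -> str:
    """Second caller of the shared routine."""
    return f"Dear {normalize_name(name)},"
```

A change to the normalization rule is now made once, and both callers pick it up.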

Support subclassing. You need less new code to override a short, well-factored routine than a long, poorly factored routine. You'll also reduce the chance of error in subclass implementations if you keep overrideable routines simple.

Hide sequences. It's a good idea to hide the order in which events happen to be processed. For example, if the program typically gets data from the user and then gets auxiliary data from a file, neither the routine that gets the user data nor the routine that gets the file data should depend on the other routine's being performed first. Another example of a sequence might be found when you have two lines of code that read the top of a stack and decrement a stackTop variable. Put those two lines of code into a PopStack() routine to hide the assumption about the order in which the two operations must be performed. Hiding that assumption will be better than baking it into code from one end of the system to the other.
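The PopStack() idea can be sketched in Python as shown below. The class name, the list-backed storage, and the empty-stack check are assumptions. What matters is that reading the top and decrementing the counter happen together in exactly one routine.

```python
class IntStack:
    """Hypothetical array-backed stack; _stack_top counts the stored items."""

    def __init__(self):
        self._items = []
        self._stack_top = 0

    def push(self, value):
        # Reuse a slot left behind by an earlier pop, or grow the list.
        if self._stack_top < len(self._items):
            self._items[self._stack_top] = value
        else:
            self._items.append(value)
        self._stack_top += 1

    def pop(self):
        if self._stack_top == 0:
            raise IndexError("pop from empty stack")
        # The two order-dependent steps live only here.
        value = self._items[self._stack_top - 1]  # read the top
        self._stack_top -= 1                      # then decrement
        return value
```

Code elsewhere calls `pop()` and never needs to know which of the two steps comes first.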

Hide pointer operations. Pointer operations tend to be hard to read and error prone. By isolating them in routines, you can concentrate on the intent of the operation rather than on the mechanics of pointer manipulation. Also, if the operations are done in only one place, you can be more certain that the code is correct. If you find a better data type than pointers, you can change the program without traumatizing the code that would have used the pointers.

Improve portability. Use of routines isolates nonportable capabilities, explicitly identifying and isolating future portability work. Nonportable capabilities include nonstandard language features, hardware dependencies, operating-system dependencies, and so on.

Simplify complicated boolean tests. Understanding complicated boolean tests in detail is rarely necessary for understanding program flow. Putting such a test into a function makes the code more readable because (1) the details of the test are out of the way and (2) a descriptive function name summarizes the purpose of the test.

Giving the test a function of its own emphasizes its significance. It encourages extra effort to make the details of the test readable inside its function. The result is that both the main flow of the code and the test itself become clearer. Simplifying a boolean test is an example of reducing complexity, which was discussed earlier.
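A minimal Python sketch of a boolean test moved into its own function follows. The Document class, its fields, and the rule for when a document can be saved are hypothetical and chosen only to give the test something to summarize.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical document state used only for this illustration."""
    is_open: bool
    has_unsaved_changes: bool
    is_read_only: bool


def can_save(document: Document) -> bool:
    """Name the test; the details stay out of the main flow."""
    return (
        document.is_open
        and document.has_unsaved_changes
        and not document.is_read_only
    )
```

The calling code then reads `if can_save(document):` instead of spelling out three conditions inline.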

Improve performance. You can optimize the code in one place instead of in several places. Having code in one place will make it easier to profile to find inefficiencies. Centralizing code into a routine means that a single optimization benefits all the code that uses that routine, whether it uses it directly or indirectly. Having code in one place makes it practical to recode the routine with a more efficient algorithm or in a faster, more efficient language.

To ensure all routines are small? No. With so many good reasons for putting code into a routine, this one is unnecessary. In fact, some jobs are performed better in a single large routine. (The best length for a routine is discussed in "How Long Can a Routine Be?")


For details on information hiding, see "Hide Secrets (Information Hiding)" in Design Building Blocks: Heuristics.

Operations That Seem Too Simple to Put Into Routines

One of the strongest mental blocks to creating effective routines is a reluctance to create a simple routine for a simple purpose. Constructing a whole routine to contain two or three lines of code might seem like overkill, but experience shows how helpful a good small routine can be.

Small routines offer several advantages. One is that they improve readability. I once had the following single line of code in about a dozen places in a program:

Example 7-2. Pseudocode Example of a Calculation

points = deviceUnits * ( POINTS_PER_INCH / DeviceUnitsPerInch() )

This is not the most complicated line of code you'll ever read. Most people would eventually figure out that it converts a measurement in device units to a measurement in points. They would see that each of the dozen lines did the same thing. It could have been clearer, however, so I created a well-named routine to do the conversion in one place:

Example 7-3. Pseudocode Example of a Calculation Converted to a Function

Function DeviceUnitsToPoints( deviceUnits: Integer ): Integer
   DeviceUnitsToPoints = deviceUnits *
      ( POINTS_PER_INCH / DeviceUnitsPerInch() )
End Function

When the routine was substituted for the inline code, the dozen lines of code all looked more or less like this one:

Example 7-4. Pseudocode Example of a Function Call to a Calculation Function

points = DeviceUnitsToPoints( deviceUnits )

This line is more readable—even approaching self-documenting.

This example hints at another reason to put small operations into functions: small operations tend to turn into larger operations. I didn't know it when I wrote the routine, but under certain conditions and when certain devices were active, DeviceUnitsPerInch() returned 0. That meant I had to account for division by zero, which took three more lines of code:

Example 7-5. Pseudocode Example of a Calculation That Expands Under Maintenance

Function DeviceUnitsToPoints( deviceUnits: Integer ): Integer
   if ( DeviceUnitsPerInch() <> 0 )
      DeviceUnitsToPoints = deviceUnits *
         ( POINTS_PER_INCH / DeviceUnitsPerInch() )
   else
      DeviceUnitsToPoints = 0
   end if
End Function

If that original line of code had still been in a dozen places, the test would have been repeated a dozen times, for a total of 36 new lines of code. A simple routine reduced the 36 new lines to 3.
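For readers who want to run the idea, here is a hedged Python rendering of the guarded conversion. Two things are assumptions: POINTS_PER_INCH is set to 72, the usual typographic convention, because the pseudocode does not give its value, and the device resolution is passed as a parameter rather than fetched by a DeviceUnitsPerInch() call so that the sketch is self-contained.

```python
POINTS_PER_INCH = 72  # assumed value; the pseudocode leaves it unspecified


def device_units_to_points(device_units: int, device_units_per_inch: int) -> float:
    """Convert a measurement in device units to points.

    Returns 0 when the device reports 0 units per inch, mirroring the
    guard added to the pseudocode routine.
    """
    if device_units_per_inch != 0:
        return device_units * (POINTS_PER_INCH / device_units_per_inch)
    return 0
```

Because every caller goes through this one function, the zero check exists once instead of at each call site.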

Summary of Reasons to Create a Routine

Here's a summary list of the valid reasons for creating a routine:

  • Reduce complexity

  • Introduce an intermediate, understandable abstraction

  • Avoid duplicate code

  • Support subclassing

  • Hide sequences

  • Hide pointer operations

  • Improve portability

  • Simplify complicated boolean tests

  • Improve performance

In addition, many of the reasons to create a class are also good reasons to create a routine:

  • Isolate complexity

  • Hide implementation details

  • Limit effects of changes

  • Hide global data

  • Make central points of control

  • Facilitate reusable code

  • Accomplish a specific refactoring

