CHAPTER 12

Code Construction

Mostly, when you see programmers, they aren’t doing anything. One of the attractive things about programmers is that you cannot tell whether or not they are working simply by looking at them. Very often they’re sitting there seemingly drinking coffee and gossiping, or just staring into space. What the programmer is trying to do is get a handle on all the individual and unrelated ideas that are scampering around in his head.

—Charles M. Strauss

Great software, likewise, requires a fanatical devotion to beauty. If you look inside good software, you find that parts no one is ever supposed to see are beautiful too. I’m not claiming I write great software, but I know that when it comes to code I behave in a way that would make me eligible for prescription drugs if I approached everyday life the same way. It drives me crazy to see code that’s badly indented, or that uses ugly variable names.

—Paul Graham, “Hackers and Painters,” 2003

Well, finally we’re getting to the real heart of software development – writing the code. The assumption here is that you already do know how to write code in at least one programming language; this chapter presents examples in a few languages, each chosen for the appropriate point being made. The purpose of this chapter is to provide some tips for writing better code. Because we can all write better code.

For plan-driven process folks (see Chapter 2), coding is the tail that wags the development-process dog. Once you finish detailed requirements, architecture, and detailed design, the code should just flow out of the final design, right? Not. In 20 years of industry software development experience, I never saw this happen. Coding is hard; translating even a good, detailed design into code takes a lot of thought, experience, and knowledge, even for small programs. Depending on the programming language you are using and the target system, programming can be a very time-consuming and difficult task. On the other hand, for very large projects that employ dozens or even hundreds of developers, having a very detailed design is critical to success, so don’t write off the plan-driven process just yet.

For the agile development process folks, coding is it. The agile manifesto (http://agilemanifesto.org) says it at the very beginning, “Working software over comprehensive documentation.” Agile developers favor creating code early and often; they believe in delivering software to their customers frequently, and using feedback from the customers to make the code better. They welcome changes in requirements and see them as an opportunity to refactor the code and make the product more usable for their customer. This doesn’t mean that coding gets any easier when using an agile process; it means that your focus is different. Rather than focus on requirements and design and getting them nailed down as early as possible, in agile processes you focus on delivering working code to your customer as quickly and as often as possible. You change the code often, and the entire team owns all the code and so has permission to change anything if it’s appropriate.

Your code has two audiences:

  • The machine that’s the target of the compiled version of the code, which is what will actually get executed.
  • The people, including yourself, who will read it in order to understand it and modify it.

To those ends, your code needs to fulfill the requirements, implement the design, and also be readable and easy to understand. We’ll be focusing on the readability and understandability parts of these ends first, and then look at some issues related to performance and process. This chapter will not give you all the hints, tips, and techniques for writing great code; there are entire books for that, some of which are in the references at the end of this chapter. Good luck!

Before we continue, I’d be remiss if I didn’t suggest the two best books on coding around. The first is Steve McConnell’s Code Complete 2: A Practical Handbook of Software Construction, a massive, 960-page tome that takes you through what makes good code.1 McConnell discusses everything from variable names, to function organization, to code layout, to defensive programming, to controlling loops. McConnell’s book is where the “software construction” metaphor comes from. The metaphor suggests that building a software application is similar to constructing a building. Small buildings (Fido’s dog house, for example) are easier to build, require less planning, and are easier to change (refactor) if something goes wrong. Larger buildings (your house) require more detail, more planning, and more coordination, largely because it’s more than a one-person job. Really big buildings (skyscrapers) require many detailed levels of both design and planning, close coordination, and many processes to handle change and errors. Although the building construction model isn’t perfect (it doesn’t handle incremental development well, and McConnell also talks about an accretion model in which one layer of software is added to an existing layer, much as a pearl is created in an oyster), the metaphor gives you a clear view of the idea that software gets much more complicated and difficult to build the larger it gets.

The second classic book is Hunt and Thomas’s The Pragmatic Programmer.2 The book is organized as 46 short sections containing 70 tips that provide a clear vision of how you should act as a programmer. It provides practical advice on a range of topics from source code control, to testing, to assertions, to the DRY principle, some of which we’ll cover later in this chapter. Hunt and Thomas themselves do the best job of describing what the book and pragmatic programming are all about:

__________

1 McConnell, S. Code Complete 2: A Practical Handbook of Software Construction. (Redmond, WA: Microsoft Press, 2004).

2 Hunt, A. and D. Thomas. The Pragmatic Programmer: From Journeyman to Master. (Boston, MA: Addison-Wesley, 2000).

Programming is a craft. At its simplest, it comes down to getting a computer to do what you want it to do (or what your user wants it to do). As a programmer, you are part listener, part advisor, part interpreter, and part dictator. You try to capture elusive requirements and find a way of expressing them so that a mere machine can do them justice. You try to document your work so that others can understand it, and you try to engineer your work so that others can build on it. What’s more, you try to do all this against the relentless ticking of the project clock. You work small miracles every day. It’s a difficult job.3

A coding example

In Code Complete, Steve McConnell gives an example of bad code that is worth examining so we can begin to see what the issues of readability, usability, and understandability are about. I’ve converted it from C++ to Java, but the example is basically McConnell’s.4 Here’s the code; we’ll look for what’s wrong with it.

void HandleStuff(CORP_DATA inputRec, int crntQtr, EMP_DATA empRec, Double estimRevenue,
   double ytdRevenue, int screenx, int screeny, Color newColor, Color prevColor, StatusType
   status, int expenseType) {
int i;
for ( i = 0; i < 100; i++ )
        {
        inputRec.revenue[i] = 0;
        inputRec.expense[i] = corpExpense[crntQtr][i];
        }
UpdateCorpDatabase( empRec );
estimRevenue = ytdRevenue * 4.0 / (double) crntQtr;
newColor = prevColor;
status = SUCCESS;
if ( expenseType == 1 ) {
        for ( i = 0; i < 12; i++ )
                profit[i] = revenue[i] - expense.type1[i];
        }
else if ( expenseType == 2 ) {
                profit[i] = revenue[i] - expense.type2[i];
        }
else if ( expenseType == 3 )
                profit[i] = revenue[i] - expense.type3[i];
                }

So what’s wrong with this code? Well, what isn’t? Let’s make a list:

  • Because this is Java, it should have a visibility modifier. No, it’s not required, but you should always put one in. You are not writing for the compiler here, you are writing for the human. Visibility modifiers make things explicit for the human reader.

__________

3 Hunt, 2000.

4 McConnell, 2004, p. 162.

  • The method name is terrible. HandleStuff doesn’t tell you anything about what the method does.
  • Oh, and the method does too many things. It seems to compute something called profit based on an expenseType. But it also seems to change a color and indicate a success. Methods should be small. They should do just one thing.
  • Where are the comments? There is no indication of what the parameters are or what the method is supposed to do. All methods should tell you at least that.
  • The layout is just awful. And it’s not consistent. The indentation is wrong. Sometimes the curly braces are part of the statement, and sometimes they’re separators. And are you sure that that last right curly brace really ends the method?
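  • The method doesn’t protect itself from bad data. If the crntQtr variable is zero, the statement that computes estimRevenue divides by zero. Because the operands are doubles, Java won’t throw an exception here; it quietly produces Infinity or NaN, and the bad value travels on into the rest of the program.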
  • The method uses magic numbers including 100, 4.0, 12, 2, and 3. Where do they come from? What do they mean? Magic numbers are bad.
  • The method has way too many input parameters. If we knew what the method was supposed to do maybe we could change this.
  • There are also at least two input parameters – screenx and screeny – that aren’t used at all. This is an indication of poor design; this method’s interface may be used for more than one purpose and so it is “fat,” meaning it has to accommodate all the possible uses.
  • The variables corpExpense and profit are not declared inside the method, so they are either instance variables or class variables. This can be dangerous. Because instance and class variables are visible inside every method in the class, we can also change their values inside any method, generating a side effect. Side effects are bad.
  • Finally, the method doesn’t consistently adhere to the Java naming conventions. Tsk, tsk.

So this example is terrible code for a bunch of different reasons. In the rest of the chapter we’ll take a look at the general coding rules that are violated here and give suggestions for how to make your code correct, readable, and maintainable.
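To make a few of these complaints concrete, here is a minimal sketch of what one small piece of HandleStuff might look like once it is pulled out on its own. This is my guess at the intent of the revenue estimate, not McConnell’s fix; the class name RevenueEstimator, the method name estimateAnnualRevenue, and the constant QUARTERS_PER_YEAR are all hypothetical names invented for the illustration.

```java
/** Hypothetical sketch: one small, named, validated piece of HandleStuff. */
final class RevenueEstimator {
    /** Quarters in a fiscal year; replaces the magic number 4.0. */
    private static final int QUARTERS_PER_YEAR = 4;

    private RevenueEstimator() { }  // utility class, never instantiated

    /**
     * Projects full-year revenue from the year-to-date revenue.
     *
     * @param ytdRevenue     revenue earned so far this year
     * @param currentQuarter the current quarter, 1 through 4
     * @return the projected revenue for the whole year
     * @throws IllegalArgumentException if currentQuarter is not in 1..4
     */
    public static double estimateAnnualRevenue(double ytdRevenue, int currentQuarter) {
        // protect the division below from a zero (or nonsense) quarter
        if (currentQuarter < 1 || currentQuarter > QUARTERS_PER_YEAR) {
            throw new IllegalArgumentException(
                "currentQuarter must be between 1 and 4, got " + currentQuarter);
        }
        return (ytdRevenue * QUARTERS_PER_YEAR) / currentQuarter;
    }
}
```

The method has a visibility modifier, a descriptive name, a header comment, two parameters instead of eleven, no magic numbers, and it refuses bad input instead of computing with it.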

Functions and Methods and Size, Oh My!

First things first. Your classes, functions, and methods should all do just one thing. This is the fundamental idea behind encapsulation. Having your methods do just one thing isolates errors and makes them easier to find. It encourages re-use because small, single feature methods are easier to use in different classes. Single feature (and single layer of abstraction) classes are also easier to re-use.

Single feature implies small. Your methods/functions should be small. And I mean small – 20 lines of executable code is a good upper bound for a function. Under no circumstances should you write 300 line functions. I know, I’ve done it. It’s not pretty. Back in Chapter 7 we talked about stepwise refinement and modular decomposition. Taking an initial function definition and re-factoring it so that it does just a single small thing will decompose your function into two or more smaller, easier to understand and easier to maintain functions. Oh, and as we’ll see in Chapter 14, smaller functions are easier to test because they require fewer unit tests (they have fewer ways to get through the code). As the book said, Small is Beautiful.
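As a rough illustration of single-feature methods, the sketch below splits a tiny reporting task into three methods that each do one thing. The class ScoreReport and its method names are made up for this example, and returning 0.0 for an empty list is just one convention I picked.

```java
/** Hypothetical sketch: a report built from small, single-purpose methods. */
final class ScoreReport {
    private ScoreReport() { }  // utility class, never instantiated

    /** Adds up all the scores. */
    public static int sum(int[] scores) {
        int total = 0;
        for (int score : scores) {
            total += score;
        }
        return total;
    }

    /** Computes the mean score; an empty list yields 0.0 by convention here. */
    public static double average(int[] scores) {
        if (scores.length == 0) {
            return 0.0;
        }
        return (double) sum(scores) / scores.length;
    }

    /** Builds the one-line report text; it does no arithmetic of its own. */
    public static String format(int[] scores) {
        return "count=" + scores.length
            + " sum=" + sum(scores)
            + " average=" + average(scores);
    }
}
```

Each method is short enough to test by itself, and a bug in the arithmetic can only live in sum or average, not in the formatting.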

Formatting, Layout, and Style

Formatting, layout, and style are all related to how your code looks on the page. It turns out, as we saw above, that how your code looks on the page is also related to its correctness. McConnell’s Fundamental Theorem of Formatting says “good visual layout shows the logical structure of a program.”5 Good visual layout not only makes the program more readable, it helps reduce the number of errors because it shows how the program is structured. The converse is also true; a good logical structure is easier to read. So the objectives of good layout and formatting should be

  • to accurately represent the logical structure of your program;
  • to be consistent so there are few exceptions to whatever style of layout you’ve chosen;
  • to improve readability for humans; and
  • to be open to modifications. (You do know your code is going to be modified, right?)

General Layout Issues and Techniques

Most layout issues have to do with laying out blocks of code; there are different types of block layout, some of which are built into languages, some you get to choose on your own. The three most prevalent kinds of block layouts are built-in block boundaries, begin-end block boundaries, and emulating built-in blocks.

Some languages have built-in block boundaries for every control structure in the language. In this case you have no choice; because the block boundary element is a language feature you must use it. Languages that have built-in block boundaries include Ada, PL/1, Lisp and Scheme, and Visual Basic. As an example, an if-then statement in Visual Basic looks like

if income > 25000 then
        statement1
        statement2
else
        statement3
        …
end if

You can’t write a control structure in Visual Basic without using the ending block element, so blocks are easier to find and distinguish.

__________

5 McConnell, 2004.

But, most languages don’t have built-in block boundary lexical elements. Most languages use a begin-end block boundary requirement. With this requirement, a block is a sequence of zero or more statements (where a statement has a particular definition) that is delimited by begin and end lexical elements. The most typical begin and end elements are the keywords begin and end, or left and right curly braces { and }. So, for example

Pascal:

if income > 25000 then
        begin
                statement1;
                statement2
        end
else
        statement3;

C/C++/Java:

if (income > 25000)
{
        statement1;
        statement2;
} else
        statement3;

Note in both examples that a single statement is considered a block and does not require the block delimiter elements. Note also in Pascal the semi-colon is the statement separator symbol, so is required between statements, but because else and end are not the end of a statement, you don’t use a semi-colon right before else or end (confused? most people are); in C, C++, and Java, the semi-colon is the statement terminator symbol, and must be at the end of every statement. This is easier to remember and write; you just pretty much put a semi-colon everywhere except after curly braces. Simplicity is good.

Finally, when we format a block we can try to emulate the built-in block boundary in languages that don’t have it by requiring that every block use the block delimiter lexical elements.

C/C++/Java:

if (income > 25000) {
        statement1;
        statement2;
} else {
        statement3;
}

In this example, we want to pretend that the left and right curly braces are part of the control structure syntax, and so we use them to delimit the block, no matter how large it is. To emphasize that the block delimiter is part of the control structure, we put it on the same line as the beginning of the control statement. We can then line up the closing block boundary element with the beginning of the control structure. This isn’t a perfect emulation of the built-in block element language feature, but it comes pretty close and has the advantage that you’re less likely to run into problems with erroneous indentation like the following:

C/C++/Java:

if (income > 25000)
        statement1;
        statement2;
        statement3;

In this example, the erroneous indentation for statement2 and statement3 can lead the reader to believe that they are part of the if statement. The compiler is under no such illusions.

Overall, using an emulating block-boundaries style works very well, is readable, and clearly illustrates the logical structure of your program. It’s also a great idea to put block boundaries around every block, including just single statement blocks. That lets you eliminate the possibility of the erroneous indentation error above. So if you say

if (income > 25000) {
        statement1;
}

it’s then clear that in

if (income > 25000) {
        statement1;
}
        statement2;
        statement3;

that statement2 and statement3 are not part of the block, regardless of their indentation. It also means that you can now safely add extra statements to the block without worrying about whether they are in the block or not

if (income > 25000) {
        statement1;
        statement2;
        statement3;
        statement4;
        statement5;
}
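If you want to convince yourself that the compiler really does ignore indentation, here is a small Java sketch you can run. The class BraceDemo and its two methods are hypothetical names used only for this demonstration; the income threshold is the one from the examples above.

```java
/** Hypothetical sketch: indentation does not decide what an if controls. */
final class BraceDemo {
    private BraceDemo() { }  // utility class, never instantiated

    /** Without braces, only the first statement belongs to the if. */
    public static int withoutBraces(int income) {
        int count = 0;
        if (income > 25000)
            count++;    // controlled by the if
            count++;    // indented to look controlled, but always runs
        return count;
    }

    /** With braces, both statements belong to the if. */
    public static int withBraces(int income) {
        int count = 0;
        if (income > 25000) {
            count++;
            count++;
        }
        return count;
    }
}
```

For an income of 1000, withoutBraces returns 1 while withBraces returns 0; the two agree only when the condition is true.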

White Space

White space is your friend. You wouldn’t write a book with no spaces between words, or line breaks between paragraphs, or no chapter divisions, would you? Then why would you write code with no white space? White space allows you to logically separate parts of the program and to line up block separators and other lexical elements. It also lets your eyes rest between parts of the program. Resting your eyes is a good thing. The following are some suggestions on the use of white space:

  • Use blank lines to separate groups (just like paragraphs).
  • Within a block align all the statements to the same tab stop (the default tab width is normally four spaces).
  • Use indentation to show the logical structure of each control structure and block.
  • Use spaces around operators.
  • In fact, use spaces around array references and function/method arguments as well.
  • Do not use double indentation with begin-end block boundaries.
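Here is a short sketch that tries to follow most of these suggestions at once: blank lines between groups of statements, one indentation level per block, and spaces around operators. The class WhiteSpaceDemo and the method averageOfPositives are hypothetical, chosen only to have something to lay out.

```java
/** Hypothetical sketch: white space used to show groups and structure. */
final class WhiteSpaceDemo {
    private WhiteSpaceDemo() { }  // utility class, never instantiated

    /** Averages the positive entries; returns 0.0 if there are none. */
    public static double averageOfPositives(double[] values) {
        double total = 0.0;
        int count = 0;

        // accumulate only the positive entries
        for (int i = 0; i < values.length; i++) {
            if (values[i] > 0.0) {
                total = total + values[i];
                count = count + 1;
            }
        }

        // avoid dividing by zero when nothing qualified
        if (count == 0) {
            return 0.0;
        }
        return total / count;
    }
}
```

The three groups (setup, accumulation, result) are visible at a glance, which is the whole point.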

Block and Statement Style Guidelines

As mentioned previously, the “emulating block boundaries” style works well for most block-structured languages.

  • Use more parentheses than you think you’ll need. I especially use parentheses around all my arithmetic expressions – mostly just to make sure I haven’t screwed up the precedence rules.
    fx = ((a + b) * (c + d)) / e;
  • Format single statement blocks consistently. Using the emulating block-boundaries technique:
    if (average < MIN_AVG) {
         average = MIN_AVG;
    }
  • For complicated conditional expressions, put separate conditions on separate lines.
    if (('0' <= inChar && inChar <= '9') ||
         ('a' <= inChar && inChar <= 'z') ||
         ('A' <= inChar && inChar <= 'Z')) {
              mytext.addString(inChar);
              mytext.length++;
    }
  • Wrap individual statements at column 70 or so. This is a holdover from the days of 80-column punch cards, but it’s also a great way to make your code more readable. Having very long lines of code forces your readers to scroll horizontally, or it makes them forget what the heck was at the beginning of the line!
  • Don’t use goto, no matter what Don Knuth says.6 Some languages, like Java, don’t even have goto statements. Most don’t need them (assembly languages excepted). Take the spirit of Knuth’s paper and only use gotos where they make real sense and make your program more readable and understandable.
  • Use only one statement per line. (Do not write code as if you were entering the annual International Obfuscated C Code Contest! www.ioccc.org.) This
    g.setColor(Color.blue); g.fillOval(100, 100, 200, 200);
    mytext.addString(inChar);mytext.length++;System.out.println();

__________

6 Knuth, D. “Structured Programming with goto Statements.” ACM Computing Surveys 6(4): 261-301. 1974.

    is legal, but just doesn’t look good, and it’s easy to just slide right over that statement in the middle. This
    g.setColor(Color.blue);
    g.fillOval(100, 100, 200, 200);

    mytext.addString(inChar);
    mytext.length++;
    System.out.println();
    looks much, much better.

Declaration Style Guidelines

Just like in writing executable code, your variable declarations need to be neat and readable.

  • Use only one declaration per line. Well, I go both ways on this one. I think that
    int max,min,top,left,right,average,bottom,mode;
    is a bit crowded, so I’d rewrite it as
    int max, min;
    int top, bottom;
    int left, right;
    int average, mode;
    That’s not one per line, but the variables that are related are grouped together. That makes more sense to me.
  • Declare variables close to where they are used. Most procedural and object-oriented programming languages have a declaration before use rule, requiring that you declare a variable before you can use it in any expression. In the olden days, say in Pascal, you had to declare variables at the top of your program (or subprogram) and you couldn’t declare variables inside blocks. This had the disadvantage that you might declare a variable pages and pages before you’d actually use it. (But see the section earlier in this chapter where I talk about how long your functions should be.)
    These days you can normally declare variables in any block in your program. The scope of that variable is the block in which it is declared and all the blocks inside that block.
    This tip says that it’s a good idea to declare those variables in the closest block in which they are used. That way you can see the declaration and the use of the variables right there.
  • Order declarations sensibly.
    • Group by types and usage (see the previous example).
    • Use white space to separate your declarations. Once again, white space is your friend. The key idea in these last couple of tips is to make your declarations visible and to keep them near the code where they will be used.
  • Don’t nest header files – ever! (This is for you C and C++ programmers.) Header files are designed so that you only need to define constants, declare global variables, and declare function prototypes once, and you can then re-use the header file in some (possibly large) number of source code files. Nesting header files hides some of those declarations inside the nested headers. This is bad – because visibility is good. Nesting also makes it easy to include a header file more than once by mistake, which leads to redefinitions of variables and macros, and to errors.

    The only header files you might nest in your own header files are system headers like stdio.h or stdlib.h and I’m not even sure I like that.

  • Don’t put source code in your header files – ever! (Again, this is for you C and C++ programmers.) Headers are for declarations, not for source code. Libraries are for source code. Putting a function in a header file means that the function will be re-defined every place you include the header. This can easily lead to multiple definitions – which the compiler may not catch until the link phase. The only source that should be in your headers is macro definitions in #define pre-processor statements, and even those should be used carefully.

Commenting Style Guidelines

Just like white space, comments are your friend. Every programming book in existence tells you to put comments in your code – and none of them (including this one) tell you just where to put comments, and what a good comment should look like. That’s because how to write good, informative comments falls in the “it depends” category of advice. A good, informative comment depends on the context in which you are writing it, so general advice is pretty useless. The only good advice about writing comments is – just do it. Oh, and since you’ll change your code – do it again. That’s the second hardest thing about comments – keeping them up to date. So here’s my piece of advice: write comments when you first write your program. This gives you an idea of where they should be. Then, when you finish your unit testing of a particular function, write a final set of comments for that function by updating the ones that are already there. That way, you’ll come pretty close to having an up-to-date set of comments in the released code.

  • Indent a comment with its corresponding statement. This is important for readability because then the comment and the code line up.
    /* make sure we have the right number of arguments */
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
        exit(1);
    }
  • Set off block comments with blank lines. Well, I go both ways on this one. If you line up the start and end of the block comments on lines by themselves, then you don’t need the blank lines. If, on the other hand, you stick the end of comment marker at the end of a line, you should use a blank line to set it apart from the source code. So if you do this
    /*
     * make sure we have the right number of arguments
     * from the command line
     */
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
        exit(1);
    }
    you don’t need the blank line; but if you do
    /* make sure we have the right number of arguments
       from the command line */

    if (argc < 2) {
        fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
        exit(1);
    }
    then you do (but I wouldn’t recommend this style in the first place).
  • Don’t let comments wrap – use block comments instead. This usually occurs if you tack a comment onto the end of a line of source code
    if (argc < 2) { // make sure we have the right number of arguments from the
    command line
    Don’t do this. Make this a block comment above the if statement instead (see the bullet point above). It’s just way easier to read.
  • All functions/methods should have a header block comment. The purpose of this bit of advice is so that your reader knows what the method is supposed to do. The necessity of this is mitigated if you use good identifier names for the method name and the input parameters. Still, you should tell the user what the method is going to do and what the return values, if any, are. See the tip below for the version of this advice for Java programmers. In C++ we can say:
    #include <string>
    /*
     * getSubString() - get a substring from the input string.
     *  The substring starts at index start
     *  and goes up to but does not include index stop.
     *  returns the resulting substring.
     */
    std::string getSubString(const std::string &str, int start, int stop) {
        return str.substr(start, stop - start);
    }
  • In Java use JavaDoc comments for all your methods. JavaDoc is built into the Java environment and all Java SDKs come with the program to generate JavaDoc web pages, so why not use it? JavaDoc can provide a nice overview of what your class is up to at very little cost. Just make sure and keep those comments up to date!
    /**
     * getSubString() - get a substring from the input string.
     *      The substring starts at index start
     *      and goes up to but doesn’t include index stop.
     *  @param str the input string
     *  @param start the integer starting index
     *  @param stop the integer stopping index
     *  @return the resulting substring.
     */
    String getSubString(String str, int start, int stop) {
        return str.substring(start, stop);
    }
  • Use fewer, but better comments. This is one of those useless motherhood and apple pie pieces of advice that everyone feels obliged to put in any discussion of comments. OK, so you don’t need to comment every line of code. Everyone knows that
    index = index + 1;      // add one to index
    is pretty stupid. So don’t do it. Enough said.
  • “Self-documenting code” is an ideal. Self documenting code is the Holy Grail of lazy programmers who don’t want to take the time to explain their code to readers. Get over it. Self documenting code is the Platonic ideal of coding that assumes that everyone who reads your code can also read your mind. If you have an algorithm that’s at all complicated, or input that is at all obscure, you need to explain it. Don’t depend on the reader to grok every subtlety of your code. Explain it. Just do it.
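As one example of code that no identifier name can fully explain, consider the sketch below. The method name isPowerOfTwo says what it does, but not why the one-line body works; that is the comment’s job. The class BitTricks is a hypothetical name for this illustration.

```java
/** Hypothetical sketch: a short method whose body needs explaining. */
final class BitTricks {
    private BitTricks() { }  // utility class, never instantiated

    /**
     * Tests whether n is a power of two.
     *
     * @param n the value to test
     * @return true if n is 1, 2, 4, 8, and so on; false otherwise
     */
    public static boolean isPowerOfTwo(int n) {
        /*
         * A power of two has exactly one bit set. Subtracting 1 clears
         * that bit and sets every bit below it, so n and (n - 1) share
         * no set bits. Zero and negative values are rejected first
         * because they are never powers of two.
         */
        return (n > 0) && ((n & (n - 1)) == 0);
    }
}
```

Without the block comment, the next reader has to rediscover the bit trick; with it, they can check the reasoning in a few seconds.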

Identifier Naming Conventions

As Rob Pike puts it so well in his terrific white paper on programming style, “Length is not a virtue in a name; clarity of expression is.”7 As Goldilocks would put it, you need identifier names that are not too long, not too short, but just right. Just like comments, this means different things to different people. Common sense and readability should rule.

  • All identifiers should be descriptive.
    Remember, someday you may be back to look at your code again. Or, if you’re working for a living, somebody else will be looking at your code. Descriptive identifiers make it much, much easier to read your code and figure out what you were trying to do at 3:00 AM. A variable called interestRate is much easier to understand than ir. Sure, ir is shorter and faster to type, but believe me, you’ll forget what it stood for about 10 minutes after you ship that program. Reasonably descriptive identifiers can save you a lot of time and effort.
  • OverlyLongVariableNamesAreHardToRead (and type)
    On the other hand, don’t make your identifiers too long. For one thing they are hard to read, for another they don’t really add anything to the context of your program, they use up too much space on the page, and finally, they’re just plain ugly.

__________

7 Pike, Rob, Notes on Programming in C, retrieved from http://www.literateprogramming.com/pikestyle.pdf on 29 September 2010. 1999.

  • Andtheyareevenharderwhenyoudontincludeworddivisions
    Despite what Rob Pike says [Pike80, p. 2], using camel case (those embedded capital letters that begin new words in your identifiers) can make your code easier to read. Especially if the identifier isn’t overly long. At least to me, maxPhysAddr is easier to read than maxphysaddr.
  • And single-letter variable names are cryptic, but useful.

    Using single-letter variable names for things like mortgage payments, window names, or graphics objects is not a good example of readability. The names m, w, and g don’t mean anything even in the context of your code; mortpmnt, gfxWindow, and gfxObj have more meaning. The big exception here is variables intended as index values: loop control variables and array index variables. Here, i, j, k, l, m, etc. are easily understandable, although I wouldn’t argue about using index or indx instead.

    for (int i = 0; i < myArray.length; i++) {
        myArray[i] = 0;
    }
    looks much better and is just as understandable as
      for (int arrayIndex = 0; arrayIndex < myArray.length; arrayIndex++) {
          myArray[arrayIndex] = 0;
      }
  • Adhere to the programming language naming conventions when they exist.
    • Somewhere, sometime, you’ll run into a document called Style Guide or something like that. Nearly every software development organization of any size has one. Sometimes you’re allowed to violate the guidelines, and sometimes during a code review you’ll get dinged for not following the guidelines and have to change your code.
    • If you work in a group with more than one developer, style guidelines are a good idea. They give all your code a common look and feel and they make it easier for one developer to make changes to code written by somebody else.
    • A common set of guidelines in a Style Guide is about naming conventions. Naming conventions tell you what your identifier names should look like for each of the different kinds of identifiers. Java has a common set of naming conventions:
    • For classes and interfaces: The identifier names should be nouns, using both upper and lowercase alphanumerics and with the first character of the name capitalized.
      public class Automobile {}
      public interface Shape {}
    • For methods: The identifier names should be verbs, using both upper and lowercase alphanumerics and with the first character of the name in lower case.
      private double computeAverage(int [] list)
    • For variables: The identifier names should use both upper and lowercase alphanumerics, with the first character of the name in lower case. Variable names should not start with $ or _ (underscore).
      double average;
      String firstSentence;
    • For all identifiers (except constants), camel case should be used, so that internal words are capitalized.
      long[] myLongArray;
    • For constants: All letters should be uppercase and words should be separated by underscores.
      static final int MAX_WIDTH = 80;
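
To see the conventions working together, here is a minimal sketch that puts all of them in one small file. Shape, Rectangle, and every member name are hypothetical, chosen only to illustrate the rules above.

```java
// Illustrative only: Shape, Rectangle, and their members are hypothetical names.
interface Shape {                            // interface: a noun, first letter capitalized
    double computeArea();                    // method: a verb, first letter lowercase
}

class Rectangle implements Shape {           // class: a noun, first letter capitalized
    static final int MAX_WIDTH = 80;         // constant: all uppercase, words joined by underscores

    private final double sideWidth;          // variables: camel case, first letter lowercase
    private final double sideHeight;

    Rectangle(double sideWidth, double sideHeight) {
        this.sideWidth = sideWidth;
        this.sideHeight = sideHeight;
    }

    @Override
    public double computeArea() {
        return sideWidth * sideHeight;
    }

    boolean exceedsMaxWidth() {
        return sideWidth > MAX_WIDTH;
    }

    public static void main(String[] args) {
        Rectangle smallBox = new Rectangle(3.0, 4.0);
        System.out.println(smallBox.computeArea());      // prints 12.0
        System.out.println(smallBox.exceedsMaxWidth());  // prints false
    }
}
```

Notice that you can tell what kind of thing each identifier is just from its shape, without looking up its declaration.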

Defensive Programming

By defensive programming we mean that your code should protect itself from bad data. The bad data can come from user input via the command line, a graphical text box or form, or a file. Bad data can also come from other routines in your program via input parameters like in the first example above.

How do you protect your program from bad data? Validate! As tedious as it sounds, you should always check the validity of data that you receive from outside your routine. This means you should do the following:

  • Check the number and type of command line arguments.
  • Check file operations.
    • Did the file open?
    • Did the read operation return anything?
    • Did the write operation write anything?
    • Did we reach EOF yet?
  • Check all values in function/method parameter lists.
    • Are they all the correct type and size?
  • You should always initialize variables and not depend on the system to do the initialization for you.

What else should you check for? Well, here’s a short list:

  • Null pointers (references in Java)
  • Zeros in denominators
  • Wrong type
  • Out of range values

As an example, here’s a C program that takes in a list of house prices from a file and computes the average house price from the list. The file is provided to the program from the command line.

/*
 * program to compute the average selling price of a set of homes.
 * Input comes from a file that is passed via the command line.
 * Output is the Total and Average sale prices for
 * all the homes and the number of prices in the file.
 *
 * jfdooley
 */
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv)
{
        FILE *fp;
        double totalPrice, avgPrice;
        double price;
        int numPrices;

        /* check that the user entered the correct number of args */
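        if (argc < 2) {
                fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
                exit(1);
        }

        /* try to open the input file */
        fp = fopen(argv[1], "r");
        if (fp == NULL) {
                fprintf(stderr, "File Not Found: %s\n", argv[1]);
                exit(1);
        }
        totalPrice = 0.0;
        numPrices = 0;

        /* keep reading for as long as fscanf() converts one price */
        while (fscanf(fp, "%10lf", &price) == 1) {
                totalPrice += price;
                numPrices++;
        }

        /* stopping before EOF means bad data or a read error */
        if (!feof(fp)) {
                fprintf(stderr, "Bad data in %s after %d prices\n", argv[1], numPrices);
                fclose(fp);
                exit(1);
        }
        fclose(fp);

        /* guard against a zero in the denominator */
        if (numPrices == 0) {
                fprintf(stderr, "No prices found in %s\n", argv[1]);
                exit(1);
        }

        avgPrice = totalPrice / numPrices;
        printf("Number of houses is %d\n", numPrices);
        printf("Total Price of all houses is $%10.2f\n", totalPrice);
        printf("Average Price per house is $%10.2f\n", avgPrice);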

        return 0;
}
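
The C program guards its command line and file input. The checks on method parameters from the lists above (null references, zeros in denominators, out-of-range values) might look something like the following in Java. This is only a sketch: PriceStats, averagePrice(), and the MAX_PRICE limit are assumptions made up for the illustration, and a real program would pick limits that fit its own data.

```java
class PriceStats {
    // Assumed upper bound on a sensible house price; purely illustrative.
    static final double MAX_PRICE = 1.0e9;

    /** Returns the average of prices, refusing any input it cannot trust. */
    static double averagePrice(double[] prices) {
        if (prices == null) {                          // null reference
            throw new IllegalArgumentException("prices must not be null");
        }
        if (prices.length == 0) {                      // would put a zero in the denominator
            throw new IllegalArgumentException("prices must not be empty");
        }

        double totalPrice = 0.0;                       // initialized explicitly
        for (int i = 0; i < prices.length; i++) {
            double price = prices[i];
            // NaN fails every comparison, so it needs its own test.
            if (Double.isNaN(price) || price < 0.0 || price > MAX_PRICE) {
                throw new IllegalArgumentException(
                        "price " + i + " is out of range: " + price);
            }
            totalPrice += price;
        }
        return totalPrice / prices.length;
    }

    public static void main(String[] args) {
        double[] prices = {100.0, 200.0, 300.0};
        System.out.println(averagePrice(prices));      // prints 200.0
    }
}
```

The "wrong type" check from the list doesn’t appear here because the Java compiler already rejects a caller that passes anything other than a double array.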

Assertions Can Be Your Friend

Defensive programming means that using assertions is a great idea if your language supports them. Java, C99, and C++ all support assertions. An assertion tests an expression that you give it, and if the expression is false, it raises an error and normally aborts the program. You should use error handling code for errors you think might happen – erroneous user input, for example – and use assertions for errors that should never happen – off-by-one errors in loops, for example. Assertions are great for testing your program, but because you should remove or disable them before giving programs to customers (you don’t want the program to abort on the user, right?), they aren’t good to use to validate input data. In Java, assertions are off unless you run the JVM with the -ea option; in C and C++, defining NDEBUG compiles them out.
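
Here is a small sketch of that division of labor in Java. The class name AverageCalc is hypothetical; computeAverage() borrows its name from the naming example earlier in the chapter. The argument check uses ordinary error handling because a caller really might pass an empty list, while the assert guards something that should never be false unless the loop itself is wrong.

```java
class AverageCalc {
    static double computeAverage(int[] list) {
        // Something that might happen: handle it with real error handling.
        if (list == null || list.length == 0) {
            throw new IllegalArgumentException("list must contain at least one value");
        }

        long sum = 0;
        int count = 0;
        for (int i = 0; i < list.length; i++) {
            sum += list[i];
            count++;
        }

        // Something that should never happen: an off-by-one error in the loop above.
        assert count == list.length : "visited " + count + " of " + list.length + " values";

        return (double) sum / count;
    }

    public static void main(String[] args) {
        System.out.println(computeAverage(new int[] {2, 4, 9}));   // prints 5.0
    }
}
```

Run it with java -ea AverageCalc to turn the assertion on. Without -ea the assert line is skipped, but the IllegalArgumentException check still runs, which is exactly why input validation doesn’t belong in an assertion.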

Exceptions and Error Handling

We’ve talked about using assertions to handle truly bad errors, ones that should never occur in production. But what about handling “normal” errors? Part of defensive programming is to handle errors in such a way that no damage is done to any data in the program or the files it uses, and so that the program stays running for as long as possible (making your program robust).

Let’s look at exceptions first. You should take advantage of built-in exception handling in whatever programming language you’re using. The exception handling mechanism will give you information about what bad thing has just happened. It’s then up to you to decide what to do. Normally in an exception handling mechanism you have two choices: handle the exception yourself, or pass it along to whoever called you and let them handle it. What you do and how you do it depends on the language you’re using and the capabilities it gives you. We’ll talk about exception handling in Java later.
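
As a quick preview, the two choices might be sketched in Java like this. FirstLineReader and both of its methods are hypothetical names invented for the illustration. The first method passes the exception along by declaring it, and the second handles it and substitutes a default.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

class FirstLineReader {
    /** Choice 1: pass the exception along. The throws clause hands it to the caller. */
    static String firstLine(String fileName) throws IOException {
        try (BufferedReader in = new BufferedReader(new FileReader(fileName))) {
            return in.readLine();   // null if the file is empty
        }
    }

    /** Choice 2: handle the exception here, report it, and fall back to a default. */
    static String firstLineOrDefault(String fileName, String fallback) {
        try {
            String line = firstLine(fileName);
            return (line != null) ? line : fallback;
        } catch (IOException e) {
            // Don't fail silently: say what went wrong before recovering.
            System.err.println("Could not read " + fileName + ": " + e.getMessage());
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLineOrDefault("NotAFile.txt", "(no data)"));
    }
}
```

Which choice is right depends on who knows enough to recover. A low-level routine usually can’t decide what a missing file means for the whole program, so it passes the exception up; the code that talks to the user is often the better place to catch it.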

Error Handling

Just like with validation, you’re most likely to encounter errors in input data, whether it’s command line input, file handling, or input from a graphical user interface form. Here we’re talking about errors that occur at run time. Compile time and testing errors are covered in the next chapter on debugging and testing. Other types of errors include data that your program computes incorrectly, errors in other programs that interact with yours (the operating system, for instance), race conditions, and interaction errors where your program is communicating with another program and yours is at fault.

The main purpose of error handling is to have your program survive and run correctly for as long as possible. When it gets to a point where your program cannot continue, it needs to report what is wrong as best as it can and then exit gracefully. Exiting is the last resort for error handling. So what should you do? Well, once again we come to the “it depends” answer. What you should do depends on what your program’s context is when the error occurs and what its purpose is. You won’t handle an error in a video game the same way you handle one in a cardiac pacemaker. In every case, your first goal should be – try to recover.

Trying to recover from an error will have different meanings in different programs. Recovery means that your program needs to try to either ignore the bad data, fix it, or substitute something else that is valid for the bad data. See McConnell (2004) for a further discussion of error handling. Here are a few examples of how to recover from errors:

  • You might just ignore the bad data and keep going, using the next valid piece of data. Say your program is a piece of embedded software in a digital pressure gauge. You sample the sensor that returns the pressure 60 times a second. If the sensor fails to deliver a pressure reading once, should you shut down the gauge? Probably not; a reasonable thing to do is just skip that reading and set up to read the next piece of data when it arrives. Now if the pressure sensor skips several readings in a row, then something might be wrong with the sensor and you should do something different (like yell for help).
  • You might substitute the last valid piece of data for a missing or wrong piece. Taking the digital pressure gauge again, if the sensor misses a reading, since each time interval is only a sixtieth of a second, it’s likely that the missing reading is very close to the previous reading. In that case you can substitute the last valid piece of data for the missing value.
  • There may be instances where you don’t have any previously recorded valid data. Say your application uses an asynchronous event handler, so you don’t have any history of data, but your program knows that the data should be in a particular range. For example, you’ve prompted the user for a salary amount and the value that you get back is a negative number. Clearly no one gets paid a salary of negative dollars, so the value is wrong. One way (probably not the best) to handle this error is to substitute the closest valid value in the range, in this case a zero. Although not ideal, at least your program can continue running with a valid data value in that field.
  • In C programs, nearly all system calls and most of the standard library functions return a value. You should test these values! Most system calls return a value that indicates success (a non-negative integer) or failure (a negative integer, usually -1), and library functions that return a pointer, such as fopen(), return NULL on failure. Some functions return a value that indicates how successful they were. For example, the printf() family of functions returns the number of characters printed, and the scanf() family returns the number of input elements read. When they fail, system calls and many library functions also set a global variable named errno to an integer value that is the number of the error that occurred. The list of error numbers is in a header file called errno.h. The value of errno is only meaningful right after a call has reported failure: no library function ever sets it back to zero, so a successful call can leave an old error number behind. Because the system tells you two things, (1) an error occurred, and (2) what it thinks is the cause of the error, you can do lots of different things to handle it, including just reporting the error and bailing out. For example, if we try to open a file that doesn’t exist, the program
    #include <stdio.h>
    #include <stdlib.h>
    #include <errno.h>

    int main(int argc, char **argv)
    {
        FILE *fd;
        char *fname = "NotAFile.txt";

        if ((fd = fopen(fname, "r")) == NULL) {
            perror("File not opened");
            exit(1);
        }
        printf("File exists\n");
        return 0;
    }
    will print the error message
      File not opened: No such file or directory
    if the file really doesn’t exist. The function perror() reads the errno variable and, using the string provided plus a standard string corresponding to the error number, writes an error message to standard error. This program could also prompt the user for a different file name or it could substitute a default file name. Either of these would allow the program to continue rather than exiting on the error.
  • There are other techniques to use in error handling and recovery. These examples should give you a flavor of what you can do within your program. The important idea to remember here is to attempt recovery if possible, but most of all, don’t fail silently!
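
The first three recovery strategies above (skip the bad value, reuse the last valid value, substitute the closest valid value) can be sketched in a few lines of Java. Everything here is an assumption made for illustration: the Recovery class, its method names, the use of null to stand for a missed sensor reading, and the limit on consecutive misses.

```java
import java.util.ArrayList;
import java.util.List;

class Recovery {
    /** Strategy 1: ignore missing readings (null) and keep only the valid ones. */
    static List<Double> skipMissing(Double[] readings) {
        List<Double> valid = new ArrayList<>();
        for (Double reading : readings) {
            if (reading != null) {
                valid.add(reading);
            }
        }
        return valid;
    }

    /**
     * Strategy 2: substitute the last valid reading for a missing one.
     * A miss with no history yet is skipped; too many misses in a row is
     * treated as a failed sensor and reported loudly.
     */
    static List<Double> holdLast(Double[] readings, int maxMissesInARow) {
        List<Double> result = new ArrayList<>();
        Double lastValid = null;
        int missesInARow = 0;
        for (Double reading : readings) {
            if (reading != null) {
                lastValid = reading;
                missesInARow = 0;
                result.add(reading);
                continue;
            }
            missesInARow++;
            if (missesInARow > maxMissesInARow) {
                throw new IllegalStateException(
                        "sensor missed " + missesInARow + " readings in a row");
            }
            if (lastValid != null) {
                result.add(lastValid);
            }
        }
        return result;
    }

    /** Strategy 3: substitute the closest valid value in the allowed range. */
    static double clampToRange(double value, double min, double max) {
        return Math.max(min, Math.min(max, value));
    }

    public static void main(String[] args) {
        Double[] readings = {101.2, null, 101.4, null};
        System.out.println(skipMissing(readings));                 // prints [101.2, 101.4]
        System.out.println(holdLast(readings, 2));                 // prints [101.2, 101.2, 101.4, 101.4]
        System.out.println(clampToRange(-500.0, 0.0, 1000000.0));  // prints 0.0
    }
}
```

None of these fail silently in the bad case: holdLast() throws as soon as the run of misses gets longer than the caller said was tolerable, which is the "yell for help" step from the pressure gauge example.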

Exceptions in Java

Some programming languages have built-in error reporting systems that will tell you when an error occurs, and leave it up to you to handle it one way or another. These errors that would normally cause your program to die a horrible death are called exceptions. Exceptions get thrown by the code that encounters the error. Once something is thrown, it’s usually a good idea if someone catches it. This is the same with exceptions. So there are two sides to exceptions that you need to be aware of when you’re writing code:

  • When you have a piece of code that can encounter an error you throw an exception. Systems like Java will throw some exceptions for you. These exceptions are listed in the Exception class in the Java API documentation (see http://download.oracle.com/javase/6/docs/api). You can also write your own code to throw exceptions. We’ll have an example later in the chapter.
  • Once an exception is thrown, somebody has to catch it. If you don’t do anything in your program, this uncaught exception will percolate through to the Java Virtual Machine (the JVM) and be caught there. The JVM will kill your program and provide you with a stack backtrace that should lead you back to the place that originally threw the exception and show you how you got there. On the other hand, you can also write code to encapsulate the calls that might generate exceptions and catch them yourself using Java’s try...catch mechanism. Java requires that some exceptions be caught or declared. We’ll see an example later.
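
Both sides fit in one short sketch. Here we throw an exception of our own from the code that detects the problem and catch it in the code that called it. InvalidSalaryException, Payroll, and parseSalary() are hypothetical names made up for this illustration; the negative salary comes from the recovery example earlier in the chapter.

```java
// A hypothetical checked exception of our own.
class InvalidSalaryException extends Exception {
    InvalidSalaryException(String message) {
        super(message);
    }
}

class Payroll {
    /** The throwing side: detect the error and throw. */
    static double parseSalary(String text) throws InvalidSalaryException {
        if (text == null) {
            throw new InvalidSalaryException("no salary entered");
        }
        double salary;
        try {
            salary = Double.parseDouble(text.trim());
        } catch (NumberFormatException e) {
            // Translate the library's exception into one that fits our program.
            throw new InvalidSalaryException("not a number: " + text);
        }
        if (!Double.isFinite(salary) || salary < 0.0) {
            throw new InvalidSalaryException("salary must be a non-negative number: " + text);
        }
        return salary;
    }

    /** The catching side: call the risky code inside try...catch. */
    public static void main(String[] args) {
        try {
            System.out.println(parseSalary("52000.50"));
            System.out.println(parseSalary("-100"));
        } catch (InvalidSalaryException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Because InvalidSalaryException extends Exception, the compiler insists that every caller of parseSalary() either catch it or declare it, so nobody can forget that a salary might be bad.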

Java has three different types of exceptions – checked exceptions, errors, and runtime exceptions. Checked exceptions are those that you should catch and handle yourself using an exception handler; they are exceptions that you should anticipate and handle as you design and write your code. For example, if your code asks a user for a file name, you should anticipate that they will type it wrong and be prepared to catch the resulting FileNotFoundException. Checked exceptions must be caught or declared in the method’s throws clause.

Errors on the other hand are exceptions that usually are related to things happening outside your program and are things you can’t do anything about except fail gracefully. You might try to catch the error exception and provide some output for the user, but you will still usually have to exit.

The third type of exception is the runtime exception. Runtime exceptions all result from problems within your program that occur as it runs and almost always indicate errors in your code. For example, a NullPointerException nearly always indicates a bug in your code and shows up as a runtime exception. Errors and runtime exceptions are collectively called unchecked exceptions (the compiler doesn’t check that you catch or declare them, and you usually don’t try to catch them). In the program below we deliberately cause a runtime exception:

public class TestNull {
  public static void main(String[] args) {
      String str = null;
      int len = str.length();
  }
}

This program will compile just fine, but when you run it you’ll get this as output:


Exception in thread "main" java.lang.NullPointerException
        at TestNull.main(TestNull.java:4)

This is a classic runtime exception. There’s no need to catch this exception because the only thing we can do is exit. If we do catch it, the program might look like:

public class TestNullCatch {
        public static void main(String[] args) {
                String str = null;

                try {
                        int len = str.length();
                } catch (NullPointerException e) {
                        System.out.println("Oops: " + e.getMessage());
                        System.exit(1);
                }
        }
}

which gives us the output


Oops: null

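Note that the getMessage() method will return a String containing whatever error message Java deems appropriate – if there is one. Otherwise it returns null, as it does here. This is somewhat less helpful than the default stack trace above. (JVMs from JDK 15 on attach a description of the null reference to a NullPointerException by default, so on those versions the message is no longer null.)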

Let’s rewrite the short C program above in Java and illustrate how to catch a checked exception.

import java.io.*;
import java.util.*;

public class FileTest
{
        public static void main(String [] args)
        {
                File fd = new File("NotAFile.txt");
                System.out.println("File exists " + fd.exists());

                try {
                        FileReader fr = new FileReader(fd);
                } catch (FileNotFoundException e) {
                        System.out.println(e.getMessage());
                }
        }
}

and the output we get when we execute FileTest is


File exists false
NotAFile.txt (No such file or directory)

By the way, if we don’t use the try-catch block in the above program, then it won’t compile. We get the compiler error message


FileTestWrong.java:11: unreported exception java.io.FileNotFoundException; must be caught or declared to be thrown
                FileReader fr = new FileReader(fd);
                                ^
1 error

Remember, checked exceptions must be caught or declared to be thrown. The compiler doesn’t enforce this for unchecked exceptions, so this type of error doesn’t show up for them. This is far from everything you should know about exceptions and exception handling in Java; start digging through the Java tutorials and the Java API!

The Last Word on Coding

Coding is the heart of software development. Code is what you produce. But coding is hard; translating even a good, detailed design into code takes a lot of thought, experience, and knowledge, even for small programs. Depending on the programming language you are using and the target system, programming can be a very time-consuming and difficult task. That’s why taking the time to make your code readable and have the code layout match the logical structure of your design is essential to writing code that is understandable by humans and that works. Adhering to coding standards and conventions, keeping to a consistent style, and including good, accurate comments will help you immensely during debugging and testing. And it will help you six months from now when you come back and try to figure out what the heck you were thinking here.

And finally,

I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.

—Douglas Adams, “Last Chance to See”

References

Hunt, A. and D. Thomas. The Pragmatic Programmer: From Journeyman to Master. (Boston, MA: Addison-Wesley, 2000).

Knuth, D. “Structured Programming with goto Statements.” ACM Computing Surveys 6(4): 261-301. 1974.

Krasner, G. E. and S. T. Pope. “A cookbook for using the Model-View-Controller user interface paradigm in Smalltalk-80.” Journal of Object-Oriented Programming 1(3): 26-49. 1988.

Lieberherr, K., I. Holland, et al. Object-Oriented Programming: An Objective Sense of Style. OOPSLA ’88, Association for Computing Machinery, 1988.

Martin, R. C. Agile Software Development: Principles, Patterns, and Practices. (Upper Saddle River, NJ: Prentice Hall, 2003).

McConnell, S. Code Complete 2: A Practical Handbook of Software Construction. (Redmond, WA: Microsoft Press, 2004).

Pike, R. Notes on Programming in C. 1999. Retrieved from http://www.literateprogramming.com/pikestyle.pdf on 29 September 2010.
