© Karl Beecher 2018
Karl BeecherBad Programming Practices 101https://doi.org/10.1007/978-1-4842-3411-2_6

6. Subroutines

Karl Beecher1 
(1)
Berlin, Germany
 

Objectives

In this chapter, you’ll learn:
  • How subroutine size affects program comprehension

  • About measures you can take to frustrate comprehension of subroutines, specifically by
    • naming them poorly;

    • making them overly complex; and

    • giving them too many purposes

  • How inputs to and outputs from subroutines can be abused

Prerequisites

It will help you if you’re already familiar with reference types and value types, as well as evaluation strategies like call-by-value and call-by-reference.

Introduction

It is a very foolish and bad habit [. . .] to start working at details before having understood the problem as a whole.

—George Pólya (1973)

If you’re unfortunate enough to have received training in programming, you probably learned about problem decomposition. That’s where, prior to coding, you break a problem down hierarchically into smaller pieces (see Figure 6-1).
../images/455320_1_En_6_Chapter/455320_1_En_6_Fig1_HTML.gif
Figure 6-1

A model of problem decomposition

Levels in the hierarchy end up corresponding to various parts of your program. Things in the middle levels become the organizational parts of your program (like modules, classes, or packages),1 while the lowest-level parts will describe the indivisible units of functionality the program provides. These parts of the problem correspond to subroutines in your program.

Subroutines are intended to help. You’re supposed to write subroutines that correspond with those individual units of functionality. To that end, your subroutines are expected to be logical, small, and simple.

But it’s not your raison d’être to help. By the time you get through with this chapter, you’ll have learned instead how to make subroutines flabby, incoherent, and frustratingly complex.

Super-Size Your Subroutines

When Chapter 2 briefly introduced subroutines, it demonstrated the first example of the anti-rule “In general, bigger is better.” It showed how monolithic programs tie everyone to a low-level view of the code, preventing anyone from seeing the forest for the trees. It also demonstrated how monoliths encourage something that really gets up programmers’ noses: code duplication .

Because of reasons like this, the taskmaster reviewing your code may well pressure you into breaking a large program down into smaller subroutines. However, that doesn’t mean you have to give up your predilection for enormous chunks of code. By investing minimal effort into creating the subroutines, you can instead break up a very large program into just a few still-rather-large subroutines.

Admittedly, this means you are replacing a monolithic piece of code with a series of subroutine calls. This raises the chance of accidentally making it easier to understand what the program is doing by allowing the reader to peruse the names of the subroutines, so keep in mind the anti-rule, “Names are important—so make sure you get them wrong.”2 Nevertheless, a few big subroutines can still provide a lot of trees to obscure the view.

Thumbs Down!

Experienced programmers object to large subroutines for many reasons. For example, large subroutines tend to include many details, which require a lot of effort to understand all at once. They also tend to perform more than one task, making them harder to reuse.

Large subroutines can also be harder to maintain. Making a change to one part of a subroutine risks causing knock-on effects to other parts of that subroutine, a risk that increases the larger the subroutine grows. This results in “brittle” code blocks that are resistant to change.

Also, the number of bugs in a subroutine tends to increase with size (Enders and Rombach, 2003). However, keep in mind that size is only one indicator among several, and it’s actually one of the weaker ones. Just writing smaller subroutines won’t guarantee that you’ve made them less buggy, but it’s a good place to start, and a smaller subroutine is cheaper to fix when it does go wrong (Selby and Basili, 1991).

Coming up with a concrete number for what constitutes “too large” is tricky. The programming language used matters, empirical evidence is patchy, and the numbers found in style guides and handbooks varies. At the lower end of the advice spectrum, you find limits like 20 lines of code (Martin, 2009). At the other end, you’ll find people urging you not to exceed 100 to 200 lines (McConnell, 2003), which in my own opinion is already very large.

Put Up Barriers to Understanding

A program can be made more comprehensible overall by being restructured into a series of smaller, easily understood subroutines. This section will discuss how to neutralize that benefit by making the subroutines themselves incomprehensible.

Bad Naming

Chapter 3 explained how naming a variable sensibly risks revealing that variable’s purpose, which makes things far too easy for the reader. Other things besides variables require naming, which is why we have the anti-rule “Names are important—so make sure you get them wrong.”

Subroutines need names too, so you should reuse ideas from Chapter 3 regarding poor naming (and, yes, I know “reuse” is a dirty word in this book, but reusing bad ideas is okay by me). Names like f, blah, and procFshDBCnt2 are fine examples for shrouding a subroutine’s purpose in secrecy, seeing as they either carry no meaning or verge on being cipher text.

If your colleagues block such poor names and insist on better ones, you could try names that are merely vague rather than incomprehensible. Labels like doProcess or runComputation are wonderfully weak because they contain no specific information.

You can also get up to mischief by giving a subroutine a name that fails to fully describe everything it does. By doing so, you can sneak unseen side effects past your colleagues. For example, someone might use your subroutine named searchInFile, which reports whether a string appears in a given file:

File myFile = new File(path);
if (searchInFile(myFile, "gold")) {
    // Do stuff if string found.
}

The programmer looking for “gold” didn’t look too deep into your subroutine. However, they subsequently noticed something strange: the files they searched through sometimes suddenly went missing. After a laborious debugging session, they finally took a look at the searchInFile method :

boolean searchInFile(File f, String text) {
    BufferedReader br = new BufferedReader(
            new FileReader(f));
    String line;
    while ( (line = br.readLine()) != null) {
        if (line.contains(text)) {
            return true;
        }
    }
    f.delete();
    return false;
}

Woah there! The unsuspecting programmer has let themselves in for a world of trouble, because your humble subroutine doesn’t only search for text in a file; it also deletes the file if the search term wasn’t found. However, the subroutine’s name made no mention of that.

Thumbs Down!

Poor names in the codebase generally lead to development that is lengthier and more problematic (Gorla et al., 1990). Colleagues will thank you for using subroutine names that are clear and complete. A subroutine called, say, processNumbers is named poorly because it’s too weak. A more explicative name would be calculateMedian.

High Complexity

You can think of a subroutine as something that “stitches together” different blocks of code . The result can be simple or complex depending on your stitching skills.

We’ve already met this idea of complexity. Earlier chapters showed us how to make overly complex conditionals (Chapter 4) and write brain-bending loops (Chapter 5). Thankfully, much of the same bad advice carries over to subroutines. This shouldn’t be surprising since a subroutine typically contains a mixture of things like conditionals and loops. That means, when you give a subroutine complex loops or conditionals, the subroutine suffers in turn. Go back and read those earlier chapters if you want to retread that ground.

Every time you stitch an extra loop or another conditional into a subroutine, you add more possible pathways through the subroutine, because constructs like if statements and while loops include decision points (as Figure 2-2 showed). Every pathway is another possibility you have to verify. Piling in more pathways increasingly burdens the reader’s mental capacity and adds to the risk they overlook problems in your code.

This is why complexity serves as more fertile ground for growing problems than does subroutine size alone. In fact, you can write relatively small subroutines that are nevertheless overly complex. For example:

for (CustomerOrder order : orders) {
    vatRate = 0;
    if (order.isDomestic()) {
        vatRate += 0.15;
    } else {
        for (Country country : countries) {
            if (country.equals(order.getOrigin()) &&
                country.hasNoVatException()) {
                vatRate += country.getVatRate();
            }
        }
    }
    itemAmount = 0;
    for (Item item : order.getItems()) {
        if (item.isDiscounted()) {
            itemAmount += item.getPrice();
        }
        else {
            itemAmount += item.getDiscountedPrice();
        }
    }
    orderAmount = itemAmount * vatRate;
}

This subroutine is only about 15 lines long, but the number of decisions means that it has 8 pathways through it. This number actually approaches the maximum level recommended by some of those goodie-goodie authors of software best practices.

Thumbs Down!

You can code complexity using a number of means, but this book has so far focused on a simple and readily accepted one: counting the number of possible pathways through a piece of code. It actually has a name: cyclometric complexity (McCabe, 1976).

Calculating cyclometric complexity gives you an indication of how complex a subroutine is. To do it, begin with the number 1 (every routine has at least one pathway). From there, count each decision point in the routine. A decision point is either
  • a conditional or looping construct that makes a comparison (denoted by a keyword like if, while, for, case, etc.); or

  • a binary operator in an expression (like && and ||).

An instance of each adds 1 to a subroutines’ cyclometric complexity . For example,

for (CustomerOrder order : orders)

adds 1, whereas

if (country.equals(order.getOrigin()) &&
    country.hasNoVatException())

adds 2, one each for the if and &&.

Your colleagues appreciate complexity’s being kept low because the more decision points a subroutine contains, the more effort it takes to understand it and verify that it works as intended. The recommended upper limit varies, but typical values hover around 10 to 15 (McConnell, 2003; Watson and McCabe, 1996).

Some simple steps toward reducing a subroutine’s complexity include
  • simplifying the expressions in decision points;

  • eliminating duplicated code within a subroutine; and

  • moving one complex part of the code into its own subroutine.3

The last example could be rewritten like this:

for (CustomerOrder order : orders) {
    vatRate = calculateTaxRate(order);
    itemAmount = calculateOrderTotal(order.items());
    orderAmount = itemAmount * vatRate;
}

Some of the code has been moved into subroutines. This reduces this subroutine’s own complexity to 2. The subroutines it calls each have a complexity of 5 or less.

Too Many Purposes

For surviving in the wild, which would you prefer: a single flimsy blade or a multi-purpose Swiss Army knife? In a role-playing video game, what’s the better choice of character: a weakling with one skill or a kick-ass warrior with all stats cranked up across the board? Obviously, multi-talented wins every time.

Applying this same logic to subroutines leads us to conclude that the best subroutines are multi-talented and carry out lots of diverse functions. The more things a subroutine can do, the better, right?

So, let’s say your task is to write code for accepting a customer order. The program needs to validate the order, print it to the screen, and store it in the database . In which case, you should write a subroutine that does all those tasks.

void acceptOrder(CustomerOrder order) {
    // Validate it
    if (order.getName().length() == 0 &&
        order.getItemNumber() == 0) {
        // Put up error message
    }
    // Print it
    System.out.println("Order: " + order.getId());
    System.out.println("Name: " + order.getName());
    System.out.println("Items:");
    for (OrderItem item : order.getItems()) {
        System.out.println(" - " + item);
    }
    // Save it
    DbConnection conn = openDbConnection();
    conn.saveOrder(order);
    conn.close();
}

Because it can do everything, acceptOrder is a Swiss Army knife, a multi-skilled video-game character. That makes it better.

Right?

Thumbs Down!

No. A subroutine should focus on a single task.

The main problem with acceptOrder stems from its being an “all or nothing” subroutine. Your program can either validate, print, and save a customer’s order all together or do none of those things. What if more granular control were needed? What if the program sometimes needed to accept an order silently, without printing it to the screen? What if it occasionally needed to validate and print an order, but store it temporarily on disk instead of in the database? The current version of acceptOrder doesn’t allow for any of that. This multi-purpose subroutine imposes a rigid order on proceedings.

You have various options to handle this. Among the least acceptable approaches would be to add tweaked copies of the original subroutine (e.g., acceptOrderSilently and acceptOrderStoreToDisk). Expect this solution to be rejected for excessive code duplication .

A more acceptable solution might be to keep acceptOrder multi-purpose, but add parameters allowing the caller to control the behavior. For example:

void acceptOrder(CustomerOrder order,
        boolean printOrder) {
    // Validate it
    if (order.getName().length() > 0 &&
        order.getItemNumber() > 0) {
        // Put up error message
    }
    // Print it
    if (printOrder) {
        System.out.println("Order: " + order.getId());
        // etc...

Better still would be to extract each individual task into its own subroutine:

boolean isValid(CustomerOrder order) {
    return order.getName().length() > 0 &&
        order.getItemNumber() > 0;
}
void printOrder(CustomerOrder order) {
    System.out.println("Order: " + order.getId());
    System.out.println("Name: " + order.getName());
    System.out.println("Items:");
    for (OrderItem item : order.getItems()) {
        System.out.println(" - " + item);
    }
}
void saveOrderToDb(CustomerOrder order) {
    DbConnection conn = openDbConnection();
    conn.saveOrder(order);
    conn.close();
}

That would give you the option of adding new use cases in the future by simply combining the simpler subroutines in different ways as necessary.

If anything should be flexible and multi-purpose it should be the program, not its individual subroutines. Think of the program as the Swiss Army knife and the subroutines—each finely attuned for a single purpose—as the individual blades.

(Ab)use Parameters

Subroutines need information to work with. The preferred way to supply it is via parameters. But just because it’s preferable, that doesn’t mean it isn’t open to abuse.

The More the Merrier

If you have a procedure with ten parameters, you probably missed some.

—Alan Perlis (1982)

Concerning the number of parameters a subroutine should accept, other people on your project probably toe the usual line, which advises keeping the number small—only up to about two or three, in fact. Your colleagues will expect you to do the same and might even point out a few examples like these:

boolean isOldEnough = isAdult(age);
String name = germanName.replace("ß", "ss");

They might also harp on about how keeping parameter lists small helps keep subroutines small and focused and all that sort of thing. But such colleagues misunderstand your mission. You don’t play by their rules. You play by anti-rules, ones like “The bigger the better.” That’s why you prefer big parameter lists.

void processCustomer(String forename,
        int age, List<Order> orders,
        String phoneNumber, String surname,
        Date dateOfBirth, String mothersMaidenName,
        boolean marketingEmails)

Imagine, if you can bear it, being a colleague faced with this code and the questions that would go through their mind. What on earth is this subroutine’s purpose?4 How much work does it do if it requires all those parameters? Are they all strictly necessary? And why are they in that order?

Thumbs Down!

A couple of very common reasons for parameter lists’ growing too long are as follows:
  1. 1.

    The subroutine is trying to do too much.

     
  2. 2.

    Most or all of the parameters would make more sense being unified into a new type.5

     

Regarding the first reason, this chapter already discussed in an earlier section how to handle subroutines’ doing too much (see “Put Up Barriers to Understanding”).

Regarding the second reason, let’s assume as an example that processCustomer has just one task, say, adding a new customer to the system. Yes, that requires several pieces of information, but see how much you can simplify the subroutine by gathering all those pieces together into a single, new type:

// We created this new class...
class Customer {
    String forename;
    String surname;
    Date dateOfBirth;
    String mothersMaidenName;
    boolean sendMarketingEmails;
    List<Order> orders;
}
// ...
// ... and replaced all the old parameters.
void addNewCustomer(Customer newCustomer)

Being Defensive

Defensive is not a word associated with winners. Winners never get defensive, as they always prefer the offensive. That’s true of all the great names from history: Caesar, Napoleon, Patton, Stone Cold Steve Austin. That’s why whenever anyone suggests writing subroutines defensively, you should treat them like the loser they clearly are.

A subroutine written defensively takes precautions against potentially problematic parameters. Look at this, for example:

void shoutMessage(String message) {
    // WINNERS SHOUT. To shout a message, turn the
    // whole thing to upper-case.
    System.out.println(message.toUpperCase());
}

Your colleagues will urge caution against using the message parameter without checking it. But losers are cautious. After all, did Caesar hesitate when the soothsayer cautioned him to beware the Ides of March? No, he fearlessly marched into the Senate without a bodyguard. (Admittedly, he then got stabbed to death, but that’s beside the point.)

Instead, you should follow the anti-rule, “Assume nothing will go wrong.” Your boldness will be rewarded. See what glory awaits you when the program is run:

Exception in thread "main" java.lang.NullPointerException
        at Main.shoutMessage(Main.java:16)
        at Main.main(Main.java:10)

Oh. Um . . . I guess shoutMessage was called with null as the parameter .

Which leads onto the next lesson: winners know exactly when to walk away exclaiming, “Too bad, but it’s not my problem!”

Thumbs Down!

A significant proportion of errors occur at the boundaries between subroutines (Basili and Perricone, 1984). Invalid data’s crossing those boundaries can cause errors. Therefore, parameters should always be checked before use.

Simple checking6 usually adds just a few extra lines of code. A safer version of shoutMessage would look like this:

void shoutMessage(String message) {
    if (message != null) {
        System.out.println(message.toUpperCase());
    }
}
Examples of checking parameters include the following:
  • Ensuring objects are non-null before trying to call their methods

  • Verifying numerical values are within expected mathematical bounds (e.g., divisors shouldn’t be zero, square roots shouldn’t be taken of a negative number)

  • Checking that specially formatted data conforms to the expected format (e.g., dates, times, credit card numbers)

  • Making sure files are open and readable before trying to access them

Surreptitious Subroutines

People like surprises. That’s why it’s a good idea to make your programs do unexpected things.

When it comes to parameters, a great way to cause surprise (not to mention consternation) is to make your subroutines alter parameter values when that’s not expected. A programmer faced with a method like this:

void addCustomerToList(Customer c,
        List<Customer> customers)

can reasonably expect that calling addCustomerToList will change the contents of the customers argument . However, they would expect that same list to remain unchanged after calling a method like this:

void outputList(List<Customer> customers)

Conscious of these sorts of expectations, you should learn to sneak surprising and unwanted side effects into your subroutines. Here’s a small example.

A DisplayBoard is an announcement screen like you might find in a train station. You can keep adding messages to the board until you run out of space (the board is limited to a maximum of 280 characters). You can check that your message fits before adding it, like this:

if (displayBoard.fits(message)) {
    displayBoard.add(message);
}

Here’s the first part of the DisplayBoard code, including only its data fields:

class DisplayBoard {
    // This is what gets displayed
    StringBuilder text;
    // etc.

Here’s the add method:

    public void add(String message) {
        text.append(message);
    }

Nothing surprising there. Here’s the fits method:

    public boolean fits(String message) {
        return text.append(message).length() <= 280;
    }

Ah, here we have a problem—one which might well be overlooked in review.

The fits method doesn’t just perform a check (as expected); it also performs a change. It doesn’t just simulate what the updated message’s length would be, it actually adds the new message to the board then returns whether or not the whole text has a length of less than or equal to 280 characters. Therefore, checking to see if a message fits before adding it to the board results in its appearing on the board twice!

Screw with Return Values

As well as accepting values, subroutines can also return them. As with parameters, there are certain things your colleagues would prefer you not do.

Here are some of them.

Return of the Harbinger

As Chapter 3 pointed out,7 null values can be dangerous, as they have the potential to kill a program quicker than you can say “null pointer exception .” Chapter 3 gave an example where return values from a subroutine were used without first verifying they were non-null, thus planting seeds from which should eventually bloom an error.

Now that we’re addressing the other side of the boundary between caller and receiver, we can revisit this case. Suffice it to say, you can also encourage bugs from the receiver side, where the author decides what gets returned. Sure, you will have to return a real object at some point, but many languages allow you to return from a subroutine at multiple points. Therefore, sprinkling around plenty of return null statements is a good start. For example, a subroutine that returns a collection could return null if there’s nothing to put into the collection, or a subroutine could return null if it encounters an error.

In short: if in doubt, return null.

Thumbs Down!

There’s sometimes a better alternative to returning nulls from a subroutine; for example:
  • A subroutine that returns a collection could return an empty collection instead of null.

  • A subroutine that encounters a problem could (and should) throw an exception .

  • If you return a custom type, it sometimes makes sense to have the idea of a default value for that type rather than null, similar to the idea of an empty string or a default date.

Some languages have stronger null safety built-in. They force the programmer to specify whether a variable is nullable and will refuse to even compile a program until every reference to a nullable return value includes code handling a null return.

Java isn’t one of these languages. However, as Chapter 3 explained, it does provide the Optional type , which makes it clear that an object may or may not be null and forces the caller of the subroutine to take that into account. However, since the language doesn’t force you to use it, the use of Optional is itself optional.

Fun with Output Parameters

Subroutines that alter parameter values can make for some wonderful confusion. Take a look at this example, a simple subroutine that moves a pair of x-y coordinates:

void move(int x, int xDistance,
        int y, int yDistance) {
    x = x + xDistance;
    y = y + yDistance;
}

A typical call to move would look like this:

move(x, 10, y, -20);

Nothing surprising there. Compare it to another subroutine that keeps a history of all the movements made:

void recordMovement(int x, List<Integer> xs,
        int y, List<Integer> ys) {
    xs.add(x);
    ys.add(y);
}

Again, pretty straightforward. So, think about what the output of this code would be:

int xPos = 5;
int yPos = 5;
List<Integer> xMoves = new ArrayList<>();
List<Integer> yMoves = new ArrayList<>();
System.out.println("X: " + xPos + ", Y: " + yPos);
move(xPos, 10, yPos, -20);
recordMovement(10, xMoves, -20, yMoves);
System.out.println("X-Movements: " + xMoves);
System.out.println("Y-Movements: " + yMoves);
System.out.println("X: " + xPos + ", Y: " + yPos);

What will the program output look like? What will appear in place of the following question marks?

X: ?, Y: ?
X-Movements: ?
Y-Movements: ?
X: ?, Y: ?

Thumbs Down!

Here’s the correct answer.

X: 5, Y: 5
X-Movements: [10]
Y-Movements: [-20]
X: 5, Y: 5

Notice how the values of xPos and yPos didn’t change, but xMoves and yMoves did? If you guessed differently, chances are Java’s evaluation strategy caught you out.

The evaluation strategy describes exactly what is sent to a subroutine when arguments are passed in a call. Different languages use different strategies, so you must learn which strategy your chosen language applies. Java always uses call-by-value , which means the value of the argument (rather than the argument itself) is copied into a new, local variable (i.e., the parameter). The original variable can’t be altered by the subroutine. However, Java’s type system (which divides all types into primitive and reference types ) throws up some complications.

A primitive type (like int) stores the variable’s actual value. In the middle of calling the move method, there are two variables, xPos and x, and x is local to move.

void move(int x, int xDistance,
        int y, int yDistance) {
    // x=5, y=5, xDistance=10, yDistance=-20
    x = x + xDistance;
    // At this point, x=15 and xPos=5
    y = y + yDistance;
}

That’s why executing the statement x = x + xDistance in the move method does nothing to alter the original value of xPos. It only alters the local parameter. That’s why xPos (and yPos) remained 5, even after the move method completed.

A reference type (like ArrayList) stores an object’s location in memory . Therefore, passing a reference type to a method means that the object’s memory location is copied to the corresponding parameter. That means the variable xMoves and the parameter xs are two different labels but they point to the same single object.

void recordMovement(int x, List<Integer> xs,
        int y, List<Integer> ys) {
    // x=10, xs=[], xMovements=[]
    xs.add(x);
    // At this point, xs=[10], xMovements=[10]
    ys.add(y);
}

Calling a mutator method on such a parameter sends a message to the original object to change itself. That’s why, in our example, the changes made by executing xs.add(x) remained visible after the move method completed.

In the example, the author intended to use output parameters , which are parameters passed to subroutines simply to have their values altered.
  • In the case of the recordMovements method, xMoves and yMoves were output parameters.

  • In the case of move, xPos and yPos were intended to be output parameters, but Java’s evaluation strategy prevented that.

There’s nothing inherently wrong with using output parameters—they have their place—but they’re often discouraged today as being confusing and awkward to use. Keep in mind your colleagues’ desire for consistency throughout the codebase, and also that a stronger preference for using immutable types8 is emerging these days, which disfavors output parameters .

In short, my advice is this: if a subroutine must update a variable, prefer creating a new value based on input parameters and return it, rather than using output parameters . Make output parameters an exception when they can be justified.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.223.171.12