Chapter 7. Coding aloud: Readability

This chapter covers

  • Writing readable code
  • Documenting contracts using Javadoc comments
  • Replacing implementation comments with self-documenting code

Source code serves two very different kinds of users: programmers and computers. Computers are as happy with messy code as they are with clean, well-structured systems. On the other hand, we programmers are utterly sensitive to the shape of the program. Even white space and indentation—completely irrelevant to the computer—make the difference between understandable and obscure code. (See appendix A for an extreme example.) In turn, easy-to-understand code boosts reliability, because it tends to hide fewer bugs, and maintainability, because it’s easier to modify.

In this chapter, I’ll show you some of the modern guidelines for writing readable code. As with the other chapters, my objective isn’t to provide a comprehensive survey of readability tips and tricks. I’ll focus on the main techniques that make sense on a small code unit and put them in practice on our usual running example.

7.1. Points of view on readability

Writing readable code is an undervalued art that schools seldom teach, but whose impact on software reliability, maintenance, and evolution is paramount. Programmers learn to express a set of desired functionalities in machine-friendly code. This encoding process takes time and inserts layer upon layer of abstraction to decompose those functionalities into smaller units. In Java parlance, these abstractions are packages, classes, and methods. If the overall system is large enough, no single programmer will dominate the entire codebase. Some developers will have a vertical view of a functionality: from its requirements to its implementation through all abstraction layers. Others may be in charge of one layer and supervise its API. From time to time, all of them will need to read and understand code their colleagues have written.

Promoting readability means minimizing the time that a reasonably knowledgeable programmer needs to understand a given piece of code. A more concrete characterization would be the time that someone who isn’t familiar with the code needs to feel confident enough to modify it without breaking it. Other names for this quality are learnability and understandability.

Pop Quiz 1

Which other code quality attributes are affected by readability?

How do you write readable programs? As early as 1974, when C was two years old, this problem was deemed significant enough to deserve systematic treatment, leading to the influential book The Elements of Programming Style. In it, Kernighan (of C fame) and Plauger take apart a number of small programs, all drawn from published textbooks, summing up their lucid and surprisingly modern observations in a list of programming-style aphorisms. The first aphorism on expressions summarizes well the whole readability issue:

  • Say what you mean, simply and directly.

Indeed, readability is about clearly expressing the intent of the code. Grady Booch, one of the architects of UML, puts forward a natural analogy:

  • Clean code reads like well-written prose.

Now, creating well-written prose isn’t something you can achieve by following a fixed set of rules. It takes years of practice, not only in writing, but also in reading well-written prose by established authors. The expressive capabilities of computer code are definitely limited compared with natural languages, so the process of creating clean code is luckily somewhat simpler, or at least more structured, than producing a beautiful essay. Still, mastering this process requires years of practice that no book (or book chapter!) can replace. In this chapter, we’ll explore some basic ways to improve the readability of your code, focusing on those techniques that you can apply to our recurring example.

In the last two decades, readability has been put on the front burner by the Agile movement, thanks to the focus on refactoring and clean code. Refactoring is the idea of restructuring a working system to improve its design so that future change is easier and safer. It’s one of the main ingredients in those lightweight development processes that favor fast development phases and iterative refinement of software.

Even if you or your company doesn’t subscribe to the whole Agile philosophy, you can’t miss the literature that comes with it, which is full of brilliant ideas about the bad (code smells), the good (clean code), and how to turn the first into the latter (refactoring). See the Further reading section at the end of this chapter for specific suggestions.

It would be nice to supplement the readability tips that well-known experts have developed with hard data on the effectiveness of those tips. Unfortunately, readability is inherently subjective, and it’s extremely hard to come up with objective means to measure it. This hasn’t stopped researchers from proposing a variety of formal models, all attempting to estimate readability with a combination of simple numerical measures, like the length of the identifiers, the number of parentheses occurring in an expression, and so on. This ongoing effort is still far from reaching a stable consensus, so I’ll focus on some established industry best practices, starting with a quick look at the style policies of the biggest IT players.

7.1.1. Corporate coding style guides

Some of the largest software companies publish coding style guides online, including the following:

  • Sun used to provide an “official” Java style guide, which hasn’t been updated since 1999. A frozen archival copy is available at http://mng.bz/adVx.
  • Google has a company-wide style guide: https://google.github.io/styleguide/javaguide.html.
  • Twitter provides a library of common Java utilities, accompanied by a style guide: http://mng.bz/gVAZ. The guide explicitly refers to Google’s and Oracle’s guides as inspirations.
  • Facebook also provides a style guide with its library of Java utility classes: http://mng.bz/eDyw.

These guides mostly agree on the general principles I set forth in this chapter and only differ on the level of detail they reach and on small cosmetic issues. For example, consider the sequence of import statements at the beginning of a source file. Here’s one such sequence in Google’s format:

import static com.google.common.base.Strings.isNullOrEmpty;
import static java.lang.Math.PI;

import java.util.LinkedList;
import javax.crypto.Cipher;
import javax.crypto.SealedObject;

Here’s Twitter’s recommended style for the same imports:

import java.util.LinkedList;

import javax.crypto.Cipher;
import javax.crypto.SealedObject;

import static com.google.common.base.Strings.isNullOrEmpty;

import static java.lang.Math.PI;

Both the order and the use of empty lines are different. Oracle and Facebook, on the other hand, are fine with any layout of imports.

Style guides ensure some uniformity across a company’s code base and are a nice addition to the welcome package for new employees, giving them something easy to sink their teeth into, before the real troubles begin. (Besides, when those troubles start biting back, they can say, “At least I’m following the style guide!”) For your long-term professional growth, though, it’ll be much more useful for you to peruse this chapter and then spend some time with the articulated style books I’ve listed at the end of the chapter, particularly Clean Code and Code Complete.

7.1.2. Readability ingredients

You can divide the ingredients contributing to readability into two categories:

  • Structural— Features that may affect the execution of the program; for example, its architecture, the choice of the API, the choice of control flow statements, and so on. You can further distinguish these features into three levels:

    • Architecture-level— Features involving more than one class
    • Class-level— Features involving a single class but transcending the boundaries of a single method
    • Method-level— Features that involve a single method
  • Exterior— Features that don’t affect execution; for example, comments, white space, and choice of variable names

In the following sections, I’ll briefly recall the main guidelines regarding each category. Then, I’ll guide you through applying those guidelines to the water container running example.

7.2. Structural readability features

Architecture-level features refer to the high-level structure of the program: how it’s split into classes and the relationships between them. Generally speaking, an architecture that’s easy to understand should be composed of small classes with coherent responsibilities (aka high cohesion), tied together by an uncomplicated network of dependencies (aka low coupling). Another readability-enhancing technique is to use standard design patterns whenever possible. Because most developers know them, they spark familiarity and convey additional contextual information to the reader.

Table 7.1. Summary of structural code features affecting readability

Structural Understandability

Level          Features                       Ways to improve
Architectural  Class responsibilities         Decrease coupling
                                              Increase cohesion
                                              Arch. patterns (MVC, MVP, etc.)
               Relationships between classes  Design patterns
                                              Refactorings (Extract Class, etc.)
Class          Control-flow                   Use the most specific loop type
               Expressions                    Show order of evaluation
               Local variables                Split complex expressions
               Method length                  Refactorings (Extract Method, etc.)

Each of these quick tips is tied to a large body of commentary and caveats. In the spirit of this book, which focuses on small-scale properties, I won’t delve into these architectural features, but you can find more information in the Further reading section at the end of this chapter. Table 7.1 summarizes the most relevant structural features and the corresponding best practices.

Class-level features pertain to the API of a given class and its organization in methods. For example, a golden rule is that long methods are harder to understand. At some point, certainly lower than 200 lines, you lose track of what was at the beginning of the method and end up going back and forth in your editor, trying to keep in your head what doesn’t fit on a single screen. I’m listing this principle among the class-level features because, even though the problem lies in a single method, its solution affects more than one method: you shorten a long method by splitting it into multiple methods, and the suggested way to do this is through the Extract Method refactoring rule, which I’ll present later in this chapter.

Now, let’s zoom in on some method-level features that affect readability. They include the choice of control flow statements, the way you write expressions, and the use of local variables.

7.2.1. Control flow statements

An interesting small-scale readability issue is the choice of the most appropriate loop construct for a given scenario. Java offers four basic types of loops: standard for, while, do-while, and enhanced for. It’s easy to see that the first three are equivalent, in the sense that you can convert any of them into any of the others with little effort. For example, you can convert the exit-checked loop

do {
   body
} while (condition);

into the following falsely entry-checked loop:

while (true) {
   body
   if (!condition) break;
}

Which of these two snippets is more readable? I’m sure you’ll agree the first is definitely better. The second is an ugly gimmick that will only puzzle the reader, because they’ll be acutely aware that there was a more natural way to accomplish that task. Your job when optimizing readability is to avoid that feeling and make the reading experience as smooth and uneventful as possible. That’s the meaning of clearly expressing intent.

So, a loop whose condition must be checked after each iteration should be written as a do-while loop. What about an entry-checked loop? Because there are three options, let’s compare their expressivity:

  • A while loop is like a for loop whose initialization and update bits have been chopped off. If your loop needs those features, and they’re reasonably compact, use a for loop—it’ll help the reader recognize the role of each component. For example, the familiar
    for (int i=0; i<n; i++) {
       ...
    }
    is more readable than the equivalent
    int i=0;
    while (i<n) {
       ...
       i++;
    }
  • An enhanced-for is a more specific form of a standard for loop because it applies only to arrays and to objects implementing the Iterable interface. Moreover, it doesn’t provide the loop body with an index or an iterator object.

To decide on a loop construct, you should apply a general rule known as the principle of least privilege and choose the most specific statement that fits your purposes. Is your loop over an array or a collection implementing Iterable? Use the enhanced for. Besides its readability value, it’ll guarantee that the iteration won’t go out of bounds.

Does your loop feature a compact initialization step and a similarly compact update step? Use a standard for loop. Otherwise, use a while loop.
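The decision procedure above can be sketched with three small methods, one per entry-checked loop type. The class and method names (LoopChoice, sum, indexOfMax, gcd) are hypothetical and serve only as illustration.

```java
import java.util.List;

class LoopChoice {

    // Visiting every element of a collection, no index needed: enhanced for.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    // Compact initialization and update, and the body needs the index:
    // standard for. Returns -1 for an empty array.
    static int indexOfMax(int[] array) {
        int best = -1;
        for (int i = 0; i < array.length; i++) {
            if (best == -1 || array[i] > array[best]) {
                best = i;
            }
        }
        return best;
    }

    // No initialization step, and the update takes several statements: while.
    static int gcd(int a, int b) {
        while (b != 0) {
            int remainder = a % b;
            a = b;
            b = remainder;
        }
        return a;
    }
}
```

In each case, the loop header tells the reader up front what kind of iteration to expect, before they even look at the body.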

Speaking of loops, starting from Java 8, you also have the option of using the stream library to produce functional-style looping constructs. For example, here’s how you print every object in a set:

Set<T> set = ...
set.stream().forEach(obj -> System.out.println(obj));

Is it more readable than the following old-fashioned enhanced for?

for (T obj: set) {
   System.out.println(obj);
}

Probably not. A good rule of thumb is to use the functional-style API when you have some other reason besides just looping, such as filtering or transforming the content of the stream in some way. One particularly good reason to use data streams is when you want to split the job among multiple threads. In that case, the library will take care of a lot of nasty details for you.
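As a sketch of this rule of thumb, the following hypothetical helper class (StreamExamples and its method names are made up for illustration) uses a stream because it filters and transforms, rather than just iterating. The second method shows how a parallel stream hands the thread management over to the library.

```java
import java.util.List;
import java.util.stream.Collectors;

class StreamExamples {

    // Filtering and transforming: a good fit for the functional-style API.
    static List<String> upperCaseLongWords(List<String> words, int minLength) {
        return words.stream()
                    .filter(word -> word.length() >= minLength)
                    .map(String::toUpperCase)
                    .collect(Collectors.toList());
    }

    // parallelStream() lets the library split the work among multiple threads.
    static long countEven(List<Integer> numbers) {
        return numbers.parallelStream()
                      .filter(n -> n % 2 == 0)
                      .count();
    }
}
```

Written as an enhanced for, the first method would need an explicit result list and an if statement, which is exactly the kind of bookkeeping the stream pipeline hides.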

Readability Tip

Choose the most natural and specific type of loop for the job.

Pop Quiz 2

What kind of loop would you use to initialize an array of n integers with the integers from 0 to n – 1?

7.2.2. Expressions and local variables

Expressions are the basic building blocks of any programming language and can grow extremely complicated, essentially without limits. To improve readability, you should consider splitting complex expressions into simpler subexpressions and assigning their values to extra local variables that you introduce for this purpose. Naturally, you should give those new local variables descriptive names illustrating the meaning of the corresponding subexpression. (I’ll return to variable names shortly.)

Reference already employs this readability-enhancing strategy when the method connectTo computes the amount of water that should be present in each container after the new connection is made. The shortest way to describe this calculation would be something like the following:

   public void connectTo(Container other) {
      ...
      double newAmount = (amount * group.size() +
                          other.amount * other.group.size()) /
                         (group.size() + other.group.size());
      ...
   }

As you can see, even split among three lines and aligned, the resulting expression is long and somewhat hard to parse. The reader is likely to struggle, or at least pause, to find the matching parentheses, because the closing parenthesis is far away from its opening. The clumsy repetitions of group.size() and other.group.size() don’t help either.

That’s why Reference introduces as many as four extra variables, just to improve readability:

   public void connectTo(Container other) {
      ...
      int size1 = group.size(),
          size2 = other.group.size();
      double tot1 = amount * size1,
             tot2 = other.amount * size2,
             newAmount = (tot1 + tot2) / (size1 + size2);
      ...
   }

Table 7.2. Summary of exterior code features affecting readability

Exterior Understandability

Features     Ways to improve
Comments     Detailed documentation comments, scarce implementation comments
Names        Descriptive names
White space  White space as punctuation
Indentation  Consistent indentation

You shouldn’t worry about the second, more readable version being less efficient. In general, the performance cost of using a few extra local variables is negligible, especially if you compare it with the readability benefit. In this particular case, the extra variables save two method invocations and may even lead to faster execution.[1]

[1] As a matter of fact, the bytecode for the readable version is three bytes shorter than the other version.

Martin Fowler has formalized this idea as one of the refactoring rules he has assembled. (See the Further reading section for more information.) Similar to how design patterns work, each refactoring rule is given a standard name to ease communication. The name of this rule is Extract Variable.

Readability Tip

Refactoring rule Extract Variable: Replace a subexpression with a new local variable with a descriptive name.
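Extract Variable isn’t limited to arithmetic. As a minimal sketch on a different, hypothetical example (the LeapYear class isn’t part of the container code), here’s the same rule applied to a Boolean condition:

```java
class LeapYear {

    // Before: the whole rule is packed into a single expression.
    static boolean isLeapCompact(int year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    // After Extract Variable: each subexpression gets a descriptive name.
    static boolean isLeap(int year) {
        boolean divisibleBy4 = year % 4 == 0;
        boolean divisibleBy100 = year % 100 == 0;
        boolean divisibleBy400 = year % 400 == 0;
        return (divisibleBy4 && !divisibleBy100) || divisibleBy400;
    }
}
```

Both methods compute the same result; the second one lets you read the return statement almost as a sentence.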

7.3. Exterior readability features

You can use three exterior traits to improve readability: comments, names, and white space. Table 7.2 summarizes the corresponding best practices, presented in the following subsections.

7.3.1. Comments

Code alone can’t satisfactorily document itself. Sometimes you have to use natural language to provide further insight or convey a more global perspective on some functionality. It’s useful to distinguish two kinds of comments:

  • Documentation or specification comments describe the contract of a method or of an entire class. They’re meant to explain the rules of a class to its potential clients. You can think of them as the public comments. You usually extract these comments from the class and put them into a convenient form (like HTML) for easy consultation. The Java tool that performs such extraction is Javadoc (explained later in this chapter).
  • Implementation comments provide insight about the internals of a class. They may explain the role of a field or the intent of a code fragment belonging to a tricky algorithm. You can think of them as the private comments.

To a certain extent, when and how often to insert comments is open for debate, but the modern trend is to be generous with documentation comments and stingy with implementation ones.

The motivations stem from the following reasoning: the API precedes the implementation, is generally more stable than it, and is the only part of a class that clients should know so they can correctly employ its services. Therefore, it’s particularly important for the health of the overall system that the responsibilities and contracts of each class and method be perfectly clear to its clients. As you saw in chapter 5, you can express contracts in code only up to a certain point, whose exact extent depends on the programming language of choice. Beyond that, natural language comments and other forms of documentation take over.

Pop Quiz 3

Is a comment describing the behavior of a private method a specification comment or an implementation comment?

Conversely, method bodies change often and are hidden from the clients. Because they change often, you need to update any comment inside them equally often, and programmers are known to forget to update comments (or to take any other action that has no immediate repercussions on the program’s behavior). You’ve probably been there: tasked with updating a piece of code, for a bug fix or a new feature, probably under a tight deadline. You’re likely to focus on functionality, on writing code that works and passes the tests. Unless your company adopts serious forms of code inspection, no downstream filter on the quality of the comments is in place. As such, it’s just natural to ignore the comments and deal with the active code lines.

If word spreads that some comments in a given codebase are unreliable because they may be stale, all of the comments immediately become pure noise, even if most of them are in fact good and up-to-date.

Readability Tip

Cut back implementation comments in favor of documentation comments, and make sure that all comments are up-to-date. (Code reviews can help.)

7.3.2. Naming things

According to a well-known quote by Phil Karlton, there are only two hard things in computer science: cache invalidation and naming things. Having touched on cache-related issues in chapter 4, it’s time to face the second hard problem. High-level programming allows you to assign arbitrary names to program elements. In Java, these are packages, classes, methods, and all kinds of variables, including fields. The language imposes some restrictions on these names (like no spaces), and practicality suggests that they should be relatively short.

I assume you’re already familiar with the basic lexical convention of Java (shared by many languages, including C# and C++) based on so-called camel case. Here are some general guidelines about the types of names suggested for different circumstances:

  • Names should be descriptive so that a reader unfamiliar with your code can surmise at least a general idea of the role of the element named. This doesn’t necessarily mean that names must be long. For instance, in several cases single-letter names are fine:

    • i is a good name for an array index because it’s a customary, and therefore clear, choice.
    • For the same reason, x is a good name for the horizontal coordinate in a Cartesian plane.
    • a and b are good names for the two parameters of a simple comparator:
        Comparator<String> stringComparatorByLength =
           (a,b) -> Integer.compare(a.length(), b.length());        
      In this context, the reader doesn’t need more descriptive names to figure out your intent. (On the other hand, notice the long name for the comparator itself.)
    • T is a good name for a type parameter (as in class LinkedList<T>) because of conventions, and because most type parameters would be called “typeOfElements” anyway.
  • Class names should be nouns, and method names should be verbs.
  • Names shouldn’t use nonstandard abbreviations.

Readability Tip

Use descriptive names, avoid abbreviations, and follow established conventions.

Pop Quiz 4

What name is the most appropriate for the field holding the monthly salary in an Employee class: salary, s, monthlySalary, or employeeMonthlySalary?

7.3.3. White space and indentation

Finally, most languages, including Java, allow ample freedom regarding the visual layout of code. You can split lines at (almost) every point, freely insert white space around symbols, and insert empty lines everywhere. You should use this freedom not to express your artistic creativity (there’s ASCII art for that), but to lessen the cognitive burden on the fellow programmer who’s going to read your code later on.

Correct indentation is absolutely essential, but I trust you already know and practice it. One step beyond basic indentation, you can use white space to align two parts of a split line. A common case is methods with many parameters, like this String instance method:

public boolean regionMatches(int toffset,
                             String other,
                             int ooffset,
                             int len)

Regarding empty lines in code, think of them as punctuation. If a method is akin to a paragraph of text, both in length and in internal coherence, an empty code line is comparable to a period. Don’t use it when a simple comma would do. You should use empty lines to visually separate code sections that are conceptually diverse, including separating different methods or disparate parts of the same method. You can see an example of the latter in the connectTo method, in both Reference (listing 7.3) and Readable (listing 7.4).

Readability Tip

Use an empty line like a sentence-ending period in a paragraph of text.

In the next section, we’ll develop a readability-optimized version of the container class, nicknamed Readable.

7.4. Readable containers    [Readable]

Let’s start from Reference and use the following techniques to improve its readability:

  • Add comments to the class as a whole and to its public methods, in a standard format that can be easily converted into HTML documentation. This step will be the only change we make to addWater and getAmount because their bodies are simple enough to need no further explanation.
  • Apply refactoring rules to the body of connectTo to improve its structural features.

First, it’s important to familiarize yourself with the standard format for Java documentation comments: Javadoc.

7.4.1. Documenting the class header with Javadoc

Javadoc is the Java tool that extracts specially composed comments (using the sort of tags shown in tables 7.3 and 7.4) from source files and lays them out in nicely formatted HTML, thus producing easily navigable documentation. Javadoc generates the familiar online documentation for the Java API, as well as the documentation snippets that common IDEs provide on request.

Comments intended for Javadoc consumption must start with /**. Most HTML tags are allowed, such as

  • <p>, to start a new paragraph
  • <i>...</i>, to typeset text in italics
  • <code>...</code>, to typeset code snippets

Table 7.3. Summary of common Javadoc tags

Tag                    Meaning
@author                Class author (mandatory)
@version               Class version (mandatory)
@return                Description of a method return value
@param                 Description of a method parameter
@throws or @exception  Description of the conditions for a given exception to be thrown
{@link ...}            Generates a link to another program element (class, method, etc.)
{@code ...}            Typesets a code snippet

Table 7.4. Summary of common Javadoc-compatible HTML tags

Tag               Meaning
<code>...</code>  Typesets a code snippet
<p>               Starts a new paragraph
<i>...</i>        Italics
<b>...</b>        Bold

Moreover, Javadoc recognizes various additional tags, all starting with the “@” symbol (not to be confused with Java annotations). For example, in the comment describing the whole class, you’re supposed to insert the self-explanatory tags @author and @version. Both tags are supposedly mandatory for the class description, but Javadoc won’t complain if they’re missing.

C# documentation comments

In C#, documentation comments should start with “///” (a triple slash) and can include a variety of XML tags. The compiler itself lifts those comments from the source files and stores them in a separate XML file. Visual Studio then uses the information in that file to enrich its contextual help functionalities, and the programmer can summon an external tool to arrange the comments into a readable layout, such as HTML. A popular open-source solution is the DocFX tool, which supports multiple languages besides C#, including Java.

Rather than presenting each Javadoc tag individually, let’s apply them right away to obtain a readability-optimized version of Container. At the very top of the Container source file, add the introductory comment shown in listing 7.1, providing a general description for the class. Such a comment is also the right place to introduce class-specific terminology, such as the word group to indicate the set of containers connected to this one.

By using the <code> HTML tag or the Javadoc {@code ...} tag, you can typeset code snippets. Tables 7.3 and 7.4 summarize the Javadoc and HTML tags you’re most likely to use in a comment.

Listing 7.1. Readable: The class header
/**
 *  A <code>Container</code> represents a water container
 *  with virtually unlimited capacity.
 *  <p>
 *  Water can be added or removed.
 *  Two containers can be connected with a permanent pipe.
 *  When two containers are connected, directly or indirectly,
 *  they become communicating vessels, and water will distribute
 *  equally among all of them.
 *  <p>
 *  The set of all containers connected to this one is called the
 *  <i>group</i> of this container.
 *
 *  @author Marco Faella
 *  @version 1.0
 */
public class Container {
   private Set<Container> group;
   private double amount;

In this listing, /** marks the beginning of a Javadoc comment, <p> shows that most HTML tags are allowed, and @author and @version are Javadoc tags.

Figure 7.1 shows the HTML page that Javadoc generates from the comment in listing 7.1.

Figure 7.1. A snapshot of Javadoc-generated HTML documentation for Readable, including a class description and a list of constructors

Next, the constructor and the getAmount method are so simple that they need no readability enhancements, except for short documentation comments. Use the @return tag to describe the return value for a method.

Listing 7.2. Readable: Constructor and getAmount
   /** Creates an empty container. */
   public Container() {
      group = new HashSet<Container>();
      group.add(this);
   }

   /** Returns the amount of water currently held in this container.
     *
     * @return the amount of water currently held in this container
     */
   public double getAmount() {
      return amount;
   }

The redundancy in the comment for getAmount is justified by the way Javadoc displays the information. Every method is presented twice in the HTML page for the class: first, in a brief summary of all methods (see figure 7.2); then, in a more extensive section, describing each method in detail (see figure 7.3). The first sentence of the comment is included in the summary of all methods, so you can’t omit it. The @return line is only included in the detailed description of the method.

Figure 7.2. A snapshot of Javadoc-generated HTML documentation: the summary of the public methods of Readable

Figure 7.3. A snapshot of Javadoc-generated HTML documentation: a detailed description of the getAmount method
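To see more of the tags from tables 7.3 and 7.4 in action, here’s a sketch of how a documented addWater method might look. The stripped-down class DocumentedContainer is hypothetical. The method body assumes that water is spread equally among the containers in the group, and the exception check is an assumption of mine, added only so that @throws has something to describe.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * A stripped-down water container, used only to illustrate Javadoc tags.
 */
class DocumentedContainer {
    private Set<DocumentedContainer> group = new HashSet<>();
    private double amount;

    /** Creates an empty container. */
    DocumentedContainer() {
        group.add(this);
    }

    /**
     * Adds water to this container.
     * A negative {@code amount} indicates removal of water.
     * The water is distributed equally among all containers
     * in the group of this container.
     * Use {@link #getAmount()} to read the resulting amount.
     *
     * @param amount the amount of water to add (negative to remove)
     * @throws IllegalArgumentException if removing the water would leave
     *         a negative amount in this container (assumed contract)
     */
    void addWater(double amount) {
        double amountPerContainer = amount / group.size();
        if (this.amount + amountPerContainer < 0) {
            throw new IllegalArgumentException("Not enough water to remove.");
        }
        for (DocumentedContainer c : group) {
            c.amount += amountPerContainer;
        }
    }

    /**
     * Returns the amount of water currently held in this container.
     *
     * @return the amount of water currently held in this container
     */
    double getAmount() {
        return amount;
    }
}
```

Notice that the first sentence of each comment works on its own, because that’s the part Javadoc copies into the summary of all methods.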

7.4.2. Cleaning connectTo

We now turn our attention to the connectTo method, which can use some refactoring to improve its readability. First, recall the implementation of this method in Reference, reproduced here for convenience:

Listing 7.3. Reference: The connectTo method
   public void connectTo(Container other) {

      // If they are already connected, do nothing
      if (group==other.group) return;

      int size1 = group.size(),
          size2 = other.group.size();
      double tot1 = amount * size1,
             tot2 = other.amount * size2,
             newAmount = (tot1 + tot2) / (size1 + size2);

      // Merge the two groups
      group.addAll(other.group);
      // Update group of containers connected with other
      for (Container c: other.group) { c.group = group; }
      // Update amount of all newly connected containers
      for (Container c: group) { c.amount = newAmount; }
   }

You can replace comments like “Update group of containers connected with other” with a properly named support method.

I already pointed out one of the defects of the reference implementation in chapter 3: an abundance of in-method comments, trying to explain every single line. Adding such comments is the natural course of action for programmers who care about making their code understandable by fellow humans. However, it’s not the most efficient way to achieve this excellent objective. A better alternative is the Extract Method refactoring technique.

Readability Tip

Refactoring rule Extract Method: Move a coherent block of code into a new method with a descriptive name.

Method connectTo offers ample opportunities to apply this technique. In fact, you can apply it five times and obtain as many new support methods, as well as a new, much more readable version of connectTo, as shown in the following listing.

Listing 7.4. Readable: The connectTo method
   /** Connects this container with another.
     *
     *  @param other The container that will be connected to this one
     */
   public void connectTo(Container other) {
      if (this.isConnectedTo(other))
         return;

      double newAmount = (groupAmount() + other.groupAmount()) /
                         (groupSize() + other.groupSize());
      mergeGroupWith(other.group);
      setAllAmountsTo(newAmount);
   }

The @param Javadoc tag documents a method parameter. It’s followed by the parameter name and by its description. Compared to Reference, the method is much shorter and more readable. If you’re not convinced, try reading the body aloud and notice how it almost makes sense as a short paragraph of text.

You achieve this effect by introducing five aptly named support methods. Indeed, Long Method is one of the code smells that Fowler identifies, and Extract Method is the refactoring technique aimed at getting rid of that smell. In agile parlance, the new version of connectTo in listing 7.4 is five extract-methods away from its old version in Reference.

Whereas adding a comment only explains some code, Extract Method both explains and hides the code, pushing it away in a separate method. In this way, it keeps the abstraction level in the original method at a higher and more uniform height, avoiding the cumbersome swing between high-level explanations and low-level implementations in listing 7.3.

Replace Temp with Query is another refactoring technique that you can use on connectTo.

Readability Tip

Refactoring rule Replace Temp with Query: Replace a local variable with the invocation to a new method that computes its value.

You could apply this technique to the local variable newAmount, which is assigned only once and then used as the argument of setAllAmountsTo. A straightforward application of the technique would lead to removing the variable newAmount and replacing the last two lines of connectTo with the following:

    mergeGroupWith(other.group);
    setAllAmountsTo(amountAfterMerge(other));

Here, amountAfterMerge is a new method responsible for computing the correct amount of water in each container after the merge. However, a little thought reveals that amountAfterMerge needs to jump through hoops to fulfill its task because the groups have already been merged when the method is invoked. In particular, the set that this.group points to already contains all the elements from other.group.

A good compromise would be to encapsulate the expression for the new amount into a new method, but keep the local variable as well, so that we can compute the new amount before merging the groups:

   final double newAmount = amountAfterMerge(other);
   mergeGroupWith(other.group);
   setAllAmountsTo(newAmount);

All in all, I wouldn’t recommend this refactoring because the expression assigned to newAmount in listing 7.4 is quite readable and doesn’t need to be hidden away in a separate method. The Replace Temp with Query rule tends to be more useful when the expression it replaces is more complicated or occurs multiple times throughout the class.
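For concreteness, here is a minimal sketch of the compromise. The chapter only names amountAfterMerge, so its body below is an assumption that reuses groupAmount and groupSize, and the Container class is trimmed down to what the example needs (getAmount is included only so the result can be inspected).

```java
import java.util.HashSet;
import java.util.Set;

// Trimmed-down sketch of Container, just enough to show the compromise.
class Container {
   private Set<Container> group = new HashSet<>();
   private double amount;

   Container() { group.add(this); }

   public double getAmount() { return amount; }

   public void addWater(double amount) {
      double amountPerContainer = amount / group.size();
      for (Container c: group) { c.amount += amountPerContainer; }
   }

   public boolean isConnectedTo(Container other) { return group == other.group; }
   public int groupSize() { return group.size(); }
   public double groupAmount() { return amount * group.size(); }

   public void connectTo(Container other) {
      if (this.isConnectedTo(other))
         return;

      // Compute the new amount while the two groups are still separate
      final double newAmount = amountAfterMerge(other);
      mergeGroupWith(other.group);
      setAllAmountsTo(newAmount);
   }

   // Assumed body: only correct if called before the groups are merged
   private double amountAfterMerge(Container other) {
      return (groupAmount() + other.groupAmount()) /
             (groupSize() + other.groupSize());
   }

   private void mergeGroupWith(Set<Container> otherGroup) {
      group.addAll(otherGroup);
      for (Container x: otherGroup) { x.group = group; }
   }

   private void setAllAmountsTo(double amount) {
      for (Container x: group) { x.amount = amount; }
   }
}
```

The comment on amountAfterMerge is the price of this variant: the new method carries an ordering constraint that the inline expression in listing 7.4 makes visible without any explanation.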

Now, let’s have a look at the five new methods that support the readable version of connectTo. Of these five, two are better declared private because they may leave the object in an inconsistent state, so you shouldn’t call them from outside the class. They are mergeGroupWith and setAllAmountsTo.

Method mergeGroupWith merges two groups of containers without updating their water amount. If someone were to invoke it in isolation, it would most likely leave a wrong amount of water in some or all containers. This method only makes sense in the exact context where it’s used: at the end of connectTo, immediately followed by a call to setAllAmountsTo. In fact, it’s debatable whether it should really be a separate method. On the one hand, having it separate allows us to document its intent with its name, instead of using a comment like we did in Reference. On the other hand, a separate method runs the risk of being called in the wrong context. Because we’re optimizing for clarity in this chapter, we’ll leave it separate. A similar argument holds for setAllAmountsTo.

The code for these two methods is shown in the following listing.

Listing 7.5. Readable: Two new private methods supporting connectTo
   private void mergeGroupWith(Set<Container> otherGroup) {
      group.addAll(otherGroup);
      for (Container x: otherGroup) {
         x.group = group;
      }
   }

   private void setAllAmountsTo(double amount) {
      for (Container x: group) {
         x.amount = amount;
      }
   }

Private methods aren’t deemed worthy of Javadoc comments. They’re only used inside the class, so few people should ever feel the need to understand them in detail. Hence, the potential benefit of a comment doesn’t repay its cost.

The cost of a comment isn’t limited to the time spent writing it. Just like any other source line, it needs to be maintained, or it may become stale—that is to say, out of sync with the code it’s supposed to clarify. Remember: a stale comment is worse than no comment!

Replacing comments with descriptive names doesn’t rule out this particular risk. Without the proper coding discipline and processes, you may still end up with stale names, which are just as bad as stale comments.

The other three new support methods are innocuous read-only functionalities that may as well be declared public. This is not to say that you should take lightly the decision to make them public. The future maintainability cost of adding any public member to a class is much greater than the cost of adding the same member with private visibility. Additional costs for a public method include

  • appropriate documentation describing its contract
  • precondition checks to withstand interactions with possibly incorrect clients
  • a set of tests providing confidence in its correctness

In this particular case, these costs are arguably quite limited because the three methods under consideration are simple read-only functionalities with no preconditions to speak of.[2] Besides, these three methods provide information to the clients that isn’t otherwise available. As such, they significantly improve the class testability, as discussed in chapter 5.

2

To be precise, isConnectedTo requires its argument to be non-null. This is such a trivial precondition that you don’t need to document or actively check it. Violating it will raise an NPE just as expected.

Listing 7.6. Readable: Three new public methods supporting connectTo
   /** Checks whether this container is connected to another one.
    *
    *  @param other the container whose connection with this will be checked
    *  @return <code>true</code> if this container is connected
    *                            to <code>other</code>
    */
   public boolean isConnectedTo(Container other) {
      return group == other.group;
   }

   /** Returns the number of containers in the group of this container.
    *
    *  @return the size of the group
    */
   public int groupSize() {
      return group.size();
   }

   /** Returns the total amount of water in the group of this container.
    *
    *  @return the amount of water in the group
    */
   public double groupAmount() {
      return amount * group.size();
   }

Incidentally, the isConnectedTo method also improves the testability of our class by making directly observable something that we could only surmise in all previous implementations.
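As a rough illustration of that testability gain, the following sketch checks the effect of connectTo through the three public methods alone. The Container class is a compact stand-in for the Readable version, and ConnectionChecks with its check helper is a hypothetical hand-rolled harness; a real project would more likely rely on a testing framework.

```java
import java.util.HashSet;
import java.util.Set;

// Compact stand-in for the Readable version of Container.
class Container {
   private Set<Container> group = new HashSet<>();
   private double amount;

   Container() { group.add(this); }

   public boolean isConnectedTo(Container other) { return group == other.group; }
   public int groupSize() { return group.size(); }
   public double groupAmount() { return amount * group.size(); }

   public void addWater(double amount) {
      double amountPerContainer = amount / group.size();
      for (Container c: group) { c.amount += amountPerContainer; }
   }

   public void connectTo(Container other) {
      if (this.isConnectedTo(other))
         return;
      double newAmount = (groupAmount() + other.groupAmount()) /
                         (groupSize() + other.groupSize());
      group.addAll(other.group);
      for (Container x: other.group) { x.group = group; }
      for (Container x: group) { x.amount = newAmount; }
   }
}

// Hypothetical checks that observe the state only through public methods.
class ConnectionChecks {
   static void check(boolean condition, String message) {
      if (!condition) throw new AssertionError(message);
   }

   public static void main(String[] args) {
      Container a = new Container(), b = new Container(), c = new Container();
      a.addWater(9);
      a.connectTo(b);

      check(a.isConnectedTo(b), "a and b should be connected");
      check(b.isConnectedTo(a), "connection should be symmetric");
      check(!a.isConnectedTo(c), "c should still be isolated");
      check(a.groupSize() == 2, "group of a should have two members");
      check(b.groupAmount() == 9.0, "connecting should preserve the total");
      System.out.println("All checks passed");
   }
}
```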

All six methods that make up the connectTo functionality are very short, the longest being connectTo itself, at six lines. Brevity is one of the main tenets of clean code.

7.4.3. Cleaning addWater

Finally, there’s addWater. Its body doesn’t change compared to Reference. We just improve its documentation to better reflect its contract, using Javadoc syntax.

Listing 7.7. Readable: The addWater method
   /** Adds water to this container.
    *  A negative <code>amount</code> indicates removal of water.
    *  In that case, there should be enough water in the group
    *  to satisfy the request.
    *
    *  @param amount the amount of water to be added
    */
   public void addWater(double amount) {
      double amountPerContainer = amount / group.size();
      for (Container c: group) {
         c.amount += amountPerContainer;
      }
   }

Compare this Javadoc method description with the contract for addWater I presented in chapter 5:

  • Precondition— If the argument is negative, there’s enough water in the group.
  • Postcondition— Distributes water equally to all containers in the group.
  • Penalty— Throws IllegalArgumentException.

Notice how the comments in the listing don’t mention the reaction to the client violating the precondition by removing more water than is actually present. That’s because this implementation (just like Reference) doesn’t check that condition and allows containers to hold a negative amount of water. Looking back at figure 7.2, you can witness the HTML page that Javadoc generates from those comments.

What if the implementation checked that condition and actually implemented the penalty that the contract established by throwing IllegalArgumentException? Both the Javadoc style guide and the Effective Java book suggest documenting unchecked exceptions using the @throws or @exception tags (which are equivalent).[3] In this case, you would add lines like the following to the method comment:

@throws IllegalArgumentException
        if an attempt is made to remove more water than actually present

3

See Item 74 in Effective Java, 3rd ed.

A quick look at the official Java API documentation shows that this is indeed standard practice. As an example, the documentation for the get(int index) method from ArrayList, returning the element at position index in the list, reports that the method will throw the unchecked exception IndexOutOfBoundsException if the index is out of the proper range.
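Returning to addWater, here is a sketch of how the method might look if it enforced its precondition and documented the penalty. The check itself is an assumption (this chapter's implementation performs none), and the Container class is reduced to the bare minimum, with getAmount added only for inspection.

```java
import java.util.HashSet;
import java.util.Set;

// Minimal stand-in for Container, limited to what addWater needs.
class Container {
   private Set<Container> group = new HashSet<>();
   private double amount;

   Container() { group.add(this); }

   public double getAmount() { return amount; }

   /** Adds water to this container.
    *  A negative <code>amount</code> indicates removal of water.
    *
    *  @param amount the amount of water to be added
    *  @throws IllegalArgumentException
    *          if an attempt is made to remove more water than actually present
    */
   public void addWater(double amount) {
      double amountPerContainer = amount / group.size();
      // Assumed check: all containers in a group hold the same amount,
      // so inspecting this one is enough.
      if (this.amount + amountPerContainer < 0) {
         throw new IllegalArgumentException(
                   "Not enough water to match the addWater request.");
      }
      for (Container c: group) {
         c.amount += amountPerContainer;
      }
   }
}
```

Because the check runs before the loop, a rejected request leaves every container untouched.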

Pop Quiz 5

Suppose a public method may throw an AssertionError if it detects a violation of a class invariant. Would you document this circumstance in the Javadoc for this method?

7.5. Final thoughts on readability

This chapter is somewhat different from the previous ones in that you can readily apply its advice to most, if not all, practical scenarios. Even though I said in chapter 1 that readability may contrast with other quality objectives, such as time or space efficiency, in most of these conflicts it’s readability that should prevail. Human readability is a huge benefit when a given piece of software will inevitably need to evolve, because of bugs being found or new features being requested.

Still, we shouldn’t confuse code clarity with algorithmic simplicity. I’m not suggesting that you shun an efficient algorithm in favor of a naive one in the name of readability. Rather, you should pick the best algorithm for the job and then strive to code it in the cleanest possible way. Clarity rightfully defies performance hacks, not proper engineering.

For the sake of completeness, I should mention a couple of scenarios in which readability is either a luxury or something to be actively avoided. Examples of the first are tightly timed programming challenges like hackathons or coding competitions. Those scenarios require contestants to quickly write throwaway code that just works. Any delay is a cost, and style considerations go out the window.

Another special scenario arises when companies don’t want their source code to be analyzed by others, including the legitimate users of their software. By hiding or obfuscating their source code, such companies hope to hide their algorithms or data. In such cases, it may seem natural to abandon code readability and go for the most cryptic lines that get the job done. In fact, there’s a specific type of software, called an obfuscator, whose job is precisely to translate a program into another program that is functionally equivalent to the first, but extremely hard to understand for a human reader. You can obfuscate all programming languages,[4] from machine code to Java bytecode or source code. Just googling “Java obfuscator” provides a rich selection of open source and commercial tools for this task. Given the availability of such tools, even the most secretive company can benefit from internally handling clean, self-explanatory code, which is then rendered obscure before being publicly released.

4

Some languages are designed to be unreadable and hardly need any obfuscation. Do you know any? Hint: 3

7.6. And now for something completely different

In this section, you’ll apply the guidelines for readable code to a different example. It’s a single method that accepts a two-dimensional array of doubles and . . . does something to it. I’ve written the method’s body in an intentionally sloppy style; not exactly obscure, but not very readable either. As an exercise, try to understand what it does before reading ahead.

   public static void f(double[][] a) {
      int i = 0, j = 0;
      while (i<a.length) {
         if (a[i].length != a.length)
            throw new IllegalArgumentException();
         i++;
      }
      i = 0;
      while (i<a.length) {
         j = 0;
         while (j<i) {
            double temp = a[i][j];
            a[i][j] = a[j][i];
            a[j][i] = temp;
            j++;
         }
         i++;
      }
   }

Did you feel the pain? Those while loops and meaningless variable names really put a strain on your brain. Imagine a whole program written in the same style!

As you might have guessed, the mystery method transposes a square matrix, a standard operation that swaps rows with columns. The first while loop checks whether the provided matrix is square-shaped—has as many rows as columns. Since Java matrices can be irregular, this entails checking that each row has the same length as the number of rows. Here’s an annotated version of the same method, to help you recognize the various parts:

   public static void f(double[][] a) {
      int i = 0, j = 0;
      while (i<a.length) {                // 1 For each row
         if (a[i].length != a.length)     // 2 If the row length is "wrong"
            throw new IllegalArgumentException();
         i++;
      }
      i = 0;
      while (i<a.length) {                // 3 For each row
         j = 0;
         while (j<i) {                    // 4 For each column less than i
            double temp = a[i][j];        // 5 Swap a[i][j] and a[j][i]
            a[i][j] = a[j][i];
            a[j][i] = temp;
            j++;
         }
         i++;
      }
   }

It’s time to improve the readability of this method using this chapter’s guidelines. First, the initial squareness check is the ideal occasion for the Extract Method refactoring rule: it’s a coherent operation with a clearly specified contract. Once you put it in a separate method, it might also be useful in other contexts. That’s why I’m declaring it public and equipping it with a full Javadoc comment.

Since the squareness check doesn’t modify the matrix, you can use an enhanced for as its main loop:

  /** Checks whether a matrix is square-shaped
    *
    * @param matrix a matrix
    * @return {@code true} if the given matrix is square
    */
   public static boolean isSquare(double[][] matrix) {
      for (double[] row: matrix) {
         if (row.length != matrix.length) {
            return false;
         }
      }
      return true;
   }

Then, the transpose method itself invokes isSquare and then performs its job with two straightforward for loops. An enhanced for would be useless here, because you need row and column indices to perform the swap.

Along the way, improve the names of the variables, and of the method itself, by making them more descriptive. You can keep names i and j for the row and column indices because those are standard names for array indices.

  /** Transposes a square matrix
    *
    * @param matrix a matrix
    * @throws IllegalArgumentException if the given matrix is not square
    */
   public static void transpose(double[][] matrix) {
      if (!isSquare(matrix)) {
         throw new IllegalArgumentException(
                   "Can’t transpose a nonsquare matrix.");
      }
      for (int i=0; i<matrix.length; i++) {  // 1 For each row
         for (int j=0; j<i; j++) {           // 2 For each column less than i
            double temp = matrix[i][j];      // 3 Swap matrix[i][j] and matrix[j][i]
            matrix[i][j] = matrix[j][i];
            matrix[j][i] = temp;
         }
      }
   }
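To see the two methods at work, the sketch below wraps them in a hypothetical class called MatrixDemo (the chapter doesn't name the enclosing class) and calls transpose on a small square matrix and on an irregular one. Javadoc comments are omitted here for brevity.

```java
import java.util.Arrays;

// Hypothetical host class for the two methods shown above.
class MatrixDemo {
   public static boolean isSquare(double[][] matrix) {
      for (double[] row: matrix) {
         if (row.length != matrix.length) {
            return false;
         }
      }
      return true;
   }

   public static void transpose(double[][] matrix) {
      if (!isSquare(matrix)) {
         throw new IllegalArgumentException(
                   "Can't transpose a nonsquare matrix.");
      }
      for (int i=0; i<matrix.length; i++) {
         for (int j=0; j<i; j++) {
            double temp = matrix[i][j];
            matrix[i][j] = matrix[j][i];
            matrix[j][i] = temp;
         }
      }
   }

   public static void main(String[] args) {
      double[][] square = { {1, 2}, {3, 4} };
      transpose(square);
      // Prints [[1.0, 3.0], [2.0, 4.0]]
      System.out.println(Arrays.deepToString(square));

      double[][] irregular = { {1, 2, 3}, {4} };
      try {
         transpose(irregular);
      } catch (IllegalArgumentException e) {
         // The squareness check rejects the irregular matrix
         System.out.println("Rejected: " + e.getMessage());
      }
   }
}
```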

7.7. Real-world use cases

You’ve seen and applied some very important principles to improve the readability of this code. Here are a couple of use cases to help you understand the practical importance of this trait.

  • Imagine you’re one of the cofounders of a small startup that has won a bid to develop software for the company that manages gas infrastructure. The objective of the project is to implement regulatory law. Things look good: you’ve been assigned a prestigious project and, because legislation doesn’t change easily, you realize that after delivering you’ll be able to enjoy the fruit of your labor for the duration of the maintenance contract. You and your colleagues make a strategic decision to deliver your solution as fast as possible to impress your client. To achieve that, you decide to cut back on luxuries such as readability, documentation, unit tests, and so on. After a couple of years, your company has grown, but half of the original team has left the company, and you still have the contract with the gas operator. Then one day, the impossible happens: legislation changes, and you’re asked to modify your software to implement the new requirements. You learn the hard way that figuring out how your existing code works is harder than implementing new requirements. Code readability is so important that it’s a determining factor for how teams operate in software companies (http://mng.bz/pyKE).
  • You’re an enthusiastic, talented developer eager to contribute to the open source community. You have a great idea (or at least, so you think), and your goal is to share your code on GitHub, hoping that it will attract contributors and eventually be used by people for real projects. You realize that readability is the key to attracting contributors, who will initially be unfamiliar with your code base and probably reluctant to ask questions about it.

The following examples show how seriously the programming world has taken the idea of readability.

  • Working hard to make your code readable is something you have to do regardless of the programming language you’re using. However, for some programming languages, readability is a design characteristic. Python is among the most popular languages, and one of the reasons for this popularity is arguably its inherent readability. In fact, readability is considered so important that the language designer introduced the famous PEP 8 (Python Enhancement Proposal 8), a coding style guide whose basic goal is (surprise!) to improve readability.
  • Let’s talk about Python again. (Yes, this book features Java, but these principles are universal.) Python is a dynamically typed language, so you don’t have to specify the type of function parameters and return values. However, PEP 484 introduced optional type hints in Python 3.5, providing a standard way to declare those types. These hints have absolutely no effect on performance, nor do they provide runtime type inference. Their purpose is to enhance readability and support more static type checks, thus also improving reliability.

7.8. Applying what you learned

Exercise 1

Given the following data:

List<String> names;
double[] lengths;

What kind of loop would you use to accomplish the following tasks?

  1. Print all names in the list.
  2. Remove from the list all names longer than 20 characters.
  3. Compute the sum of all lengths.
  4. Set a Boolean flag to true if the array contains a zero length.

Exercise 2

As you might know, the method charAt from the class String returns the character of this string at a given index:

public char charAt(int index)

Write a Javadoc comment describing the contract of this method and then compare it to the official documentation.

Exercise 3

Examine the following method. Guess what it does and make it more readable. (Don’t forget to add a Javadoc method comment.) You can find the source code for this exercise and the next one in the online repository (https://bitbucket.org/mfaella/exercisesinstyle).

   public static int f(String s, char c) {
      int i = 0, n = 0;
      boolean flag = true;
      while (flag) {
         if (s.charAt(i) == c)
            n++;
         if (i == s.length() -1)
            flag = false;
         else
            i++;
      }
      return n;
   }

Exercise 4

The following method comes from a collection of algorithms hosted in a GitHub repository (starred by 10k people and forked 4k times). The method performs a breadth-first visit of a graph, represented as an adjacency matrix of type byte. You don’t have to know this algorithm to complete this exercise. Just know that the a[i][j] cell contains 1 if there’s an edge from node i to node j, and 0 otherwise.

Improve the method readability in two steps. First, make only exterior changes to variable names and comments. Then, make structural changes. All changes must preserve both the API (types of parameters) and the visible behavior (the on-screen output).

/**
 * The BFS implemented in code to use.
 *
 * @param a Structure to perform the search on a graph, adjacency matrix etc.
 * @param vertices The vertices to use
 * @param source The Source
 */
public static void bfsImplement(byte [][] a,int vertices,int source){
                           //passing adjacency matrix and number of vertices
   byte []b=new byte[vertices];    //flag container containing status
                                  //of each vertices
   Arrays.fill(b,(byte)-1);   //status initialization
   /*       code   status
            -1  =  ready
             0  =  waiting
             1  =  processed       */

   Stack<Integer> st = new Stack<>();     //operational stack
   st.push(source);                                     //assigning source
   while(!st.isEmpty()){
      b[st.peek()]=(byte)0;                      //assigning waiting status
      System.out.println(st.peek());
      int pop=st.peek();
      b[pop]=(byte)1;               //assigning processed status
      st.pop();                  //removing head of the queue
      for(int i=0;i<vertices;i++){
         if(a[pop][i]!=0 && b[i]!=(byte)0 && b[i]!=(byte)1 ){
            st.push(i);
            b[i]=(byte)0;                        //assigning waiting status
         }}}
}

Summary

  • Readability is a major factor contributing toward reliability and maintainability.
  • You can promote readability through both structural and exterior means.
  • One of the objectives of common refactorings is to improve readability.
  • Self-documenting code is preferable to implementation comments.
  • You should detail and format documentation comments in standard ways to make them easily browsable.

Answers to quizzes and exercises

Pop Quiz 1

Readability positively affects maintainability and reliability because readable code is easier to understand and modify in a safe manner.

Pop Quiz 2

You can’t use an enhanced for because you need to modify the array’s entries, and you need an index for that. The best choice for iterating over a whole array using an explicit index is a standard for loop.
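The difference is easy to see in a small sketch. The task used here, doubling every entry of an array, is only a hypothetical stand-in for the one in the quiz, and the class and method names are made up for the example.

```java
// Hypothetical example: doubling every entry of an array.
class LoopChoice {
   // Looks plausible but changes nothing: the loop variable of an
   // enhanced for is a local copy of each entry.
   static void doubleAllWithEnhancedFor(double[] values) {
      for (double value: values) {
         value = value * 2;
      }
   }

   // A standard for loop provides the index needed to write back.
   static void doubleAll(double[] values) {
      for (int i=0; i<values.length; i++) {
         values[i] = values[i] * 2;
      }
   }
}
```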

Pop Quiz 3

You should consider a comment describing the behavior of a private method an implementation comment. Private methods are not exposed to the clients.

Pop Quiz 4

The most appropriate name is probably monthlySalary. Alternatives s and salary contain too little information, whereas employeeMonthlySalary needlessly repeats the class name.

Pop Quiz 5

You shouldn’t document an AssertionError because that kind of exception is only thrown if an internal error occurs.

Exercise 1

  1. An enhanced for is the ideal loop for the first task:
       for (String name: names) {
          System.out.println(name);
       }
  2. This is the job for an iterator:
       Iterator<String> iterator = names.iterator();
       while (iterator.hasNext()) {
          if (iterator.next().length() > 20) {
             iterator.remove();
          }
       }      
  3. Once again, use an enhanced for:
       double totalLength = 0;
       for (double length: lengths) {
          totalLength += length;
       }
    or the following stream-based one-liner:
      double totalLength = Arrays.stream(lengths).sum();
  4. Common wisdom suggests using a while loop when the data (the content of the array) determines the exit condition. I think an enhanced for plus a break statement is at least as appropriate, as it automatically takes care of the case when the whole array needs to be scanned.
       boolean containsZero = false;
       for (double length: lengths) {
          if (length == 0) {
             containsZero = true;
             break;
          }
      }
    The stream library provides a handy alternative:
    boolean containsZero = Arrays.stream(lengths).anyMatch(
                           length -> length == 0);
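For the second task, the collections library also offers a shortcut worth knowing: since Java 8, every Collection has a removeIf method that accepts a predicate. The sketch below wraps it in a hypothetical helper class; note that the list must be modifiable for the removal to succeed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper showing removeIf as an alternative to the iterator.
class NameFilter {
   static void removeLongNames(List<String> names) {
      names.removeIf(name -> name.length() > 20);
   }

   public static void main(String[] args) {
      // Arrays.asList alone returns a fixed-size list, hence the copy
      List<String> names = new ArrayList<>(Arrays.asList(
            "Ada", "A name that is certainly too long", "Grace"));
      removeLongNames(names);
      System.out.println(names);   // prints [Ada, Grace]
   }
}
```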

Exercise 2

Here’s a slightly simplified version of the Javadoc from OpenJDK 12:

    /**
     * Returns the {@code char} value at the
     * specified index. An index ranges from {@code 0} to
     * {@code length() - 1}. The first {@code char} value of the sequence
     * is at index {@code 0}, the next at index {@code 1},
     * and so on, as for array indexing.
     *
     * @param      index   the index of the {@code char} value.
     * @return     the {@code char} value at the specified index of this string.
     *             The first {@code char} value is at index {@code 0}.
     * @exception  IndexOutOfBoundsException  if the {@code index}
     *             argument is negative or not less than the length of this
     *             string.
     */

Exercise 3

It’s easy to see that the method simply counts the occurrences of a character inside a string. The while loop and the flag are useless detours, replaced by a simple for loop in the following solution:

  /** Counts the number of occurrences of a character in a string.
    *
    * @param s a string
    * @param c a character
    * @return The number of occurrences of {@code c} in {@code s}
    */
   public static int countOccurrences(String s, char c) {
      int count = 0;
      for (int i=0; i<s.length(); i++) {
         if (s.charAt(i) == c) {
            count++;
         }
      }
      return count;
   }

The stream library also allows an alternative implementation, where the method body consists of the following one liner:

return (int) s.chars().filter(character -> character == c).count();

The cast to int is due to the fact that the terminal operation count returns a value of type long. A more robust implementation would take precautions against overflow.
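One possible precaution, sketched here under the assumption that failing loudly is preferable to wrapping around, is to replace the cast with Math.toIntExact, which throws an ArithmeticException when the value doesn't fit in an int. The class name is hypothetical.

```java
// Hypothetical host class for the stream-based variant.
class OccurrenceCounter {
   /** Counts the number of occurrences of a character in a string.
    *
    * @param s a string
    * @param c a character
    * @return The number of occurrences of {@code c} in {@code s}
    */
   public static int countOccurrences(String s, char c) {
      long count = s.chars().filter(character -> character == c).count();
      // Fails with ArithmeticException instead of silently truncating
      return Math.toIntExact(count);
   }
}
```

With a String argument the exception should never actually fire, because a string can't hold more than Integer.MAX_VALUE characters. The explicit conversion mainly documents the intent and keeps the code safe if the data source later changes.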

Exercise 4

Let’s jump to the final version, including both exterior and structural improvements. First, notice that the algorithm maintains a status for each node, which can take one of three values: fresh (not encountered yet), enqueued (put in the stack but not visited yet), and processed (visited). In the original implementation, this information is encoded in the array of bytes b. The first structural improvement is to use an enumeration for this purpose. Unfortunately, in versions of Java before 16, enumerations can’t be local to a method, so you have to put the following declaration in class scope (outside the method):

private enum Status { FRESH, ENQUEUED, PROCESSED };

Now you can refactor the main method, taking advantage of this enumeration, improving variable names, removing implementation comments, and fixing white space and indentation. You should end up with something like this:

  /** Visits the nodes in a directed graph in breadth-first order,
    * printing the index of each visited node.
    *
    * @param adjacent     the adjacency matrix
    * @param vertexCount  the number of vertices
    * @param sourceVertex the source vertex
    */
   public static void breadthFirst(
                      byte[][] adjacent, int vertexCount, int sourceVertex) {
      Status[] status = new Status[vertexCount];
      Arrays.fill(status, Status.FRESH);

      Stack<Integer> stack = new Stack<>();
      stack.push(sourceVertex);

      while (!stack.isEmpty()) {
         int currentVertex = stack.pop();
         System.out.println(currentVertex);
         status[currentVertex] = Status.PROCESSED;
         for (int i=0; i<vertexCount; i++) {
            if (adjacent[currentVertex][i] != 0 && status[i] == Status.FRESH)
            {
               stack.push(i);
               status[i] = Status.ENQUEUED;
            }
         }
      }
   }

In the previous method, I left the use of the Stack class because it doesn’t affect readability, but you should know that the Stack class has been superseded by LinkedList and ArrayDeque.
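If you do want to move away from Stack, the following sketch shows one way to do it with ArrayDeque. The enclosing class name GraphVisit is hypothetical. Since push and pop on a Deque both operate on the head, the last-in, first-out order of the original is preserved and the on-screen output doesn't change.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical host class; only the stack type differs from the solution above.
class GraphVisit {
   private enum Status { FRESH, ENQUEUED, PROCESSED }

   public static void breadthFirst(
                      byte[][] adjacent, int vertexCount, int sourceVertex) {
      Status[] status = new Status[vertexCount];
      Arrays.fill(status, Status.FRESH);

      // ArrayDeque used as a stack: push and pop work on the head
      Deque<Integer> stack = new ArrayDeque<>();
      stack.push(sourceVertex);

      while (!stack.isEmpty()) {
         int currentVertex = stack.pop();
         System.out.println(currentVertex);
         status[currentVertex] = Status.PROCESSED;
         for (int i=0; i<vertexCount; i++) {
            if (adjacent[currentVertex][i] != 0 && status[i] == Status.FRESH)
            {
               stack.push(i);
               status[i] = Status.ENQUEUED;
            }
         }
      }
   }
}
```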

Further reading

  • R. C. Martin. Clean Code. Prentice Hall, 2009. A detailed and comprehensive style guide written by one of the authors of the “Manifesto for Agile Software Development.” You can find related higher level design recommendations in the follow-up book, Clean Architecture (Prentice Hall, 2017).
  • S. McConnell. Code Complete. Microsoft Press, 2004. A wide-ranging, well-researched, nicely typeset handbook on coding practices, from the fine points of proper variable naming all the way to project scheduling and team management.
  • Brian W. Kernighan and P. J. Plauger. The Elements of Programming Style. McGraw-Hill, Inc., 1974. Arguably the first book to systematically tackle the code readability problem. Examples are in Fortran and PL/I. An updated second edition followed in 1978. Kernighan returned to the same topic 20 years later with R. Pike in the first chapter of The Practice of Programming (Addison-Wesley, 1999).
  • Martin Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley, 2018. The second edition of the classic book that popularized and standardized the notion of refactoring. You can take a look at the catalogue of refactoring rules from the book on the author’s website at https://martinfowler.com. The most popular IDEs let you apply many of these rules with a simple click or two.
  • Donald E. Knuth. Literate Programming. Center for the Study of Language and Information, 1995. A collection of essays promoting programming as an art form akin to literature.
  • How to Write Doc Comments for the Javadoc Tool. The official Javadoc style guide, as of this writing, available at http://mng.bz/YeDe.