Source code serves two very different kinds of users: programmers and computers. Computers are as happy with messy code as they are with clean, well-structured systems. On the other hand, we programmers are utterly sensitive to the shape of the program. Even white space and indentation—completely irrelevant to the computer—make the difference between understandable and obscure code. (See appendix A for an extreme example.) In turn, easy-to-understand code boosts reliability, because it tends to hide fewer bugs, and maintainability, because it’s easier to modify.
In this chapter, I’ll show you some of the modern guidelines for writing readable code. As with the other chapters, my objective isn’t to provide a comprehensive survey of readability tips and tricks. I’ll focus on the main techniques that make sense on a small code unit and put them in practice on our usual running example.
Writing readable code is an undervalued art that schools seldom teach, but whose impact on software reliability, maintenance, and evolution is paramount. Programmers learn to express a set of desired functionalities in machine-friendly code. This encoding process takes time and inserts layer upon layer of abstraction to decompose those functionalities into smaller units. In Java parlance, these abstractions are packages, classes, and methods. If the overall system is large enough, no single programmer will dominate the entire codebase. Some developers will have a vertical view of a functionality: from its requirements to its implementation through all abstraction layers. Others may be in charge of one layer and supervise its API. From time to time, all of them will need to read and understand code their colleagues have written.
Promoting readability means minimizing the time that a reasonably knowledgeable programmer needs to understand a given piece of code. A more concrete characterization would be the time that someone who isn’t familiar with the code needs to feel confident enough to modify it without breaking it. Other names for this quality are learnability and understandability.
Which other code quality attributes are affected by readability?
How do you write readable programs? As early as 1974, when C was two years old, this problem was deemed significant enough to deserve systematic treatment, leading to the influential book The Elements of Programming Style. In it, Kernighan (of C fame) and Plauger take apart a number of small programs, all drawn from published textbooks, summing up their lucid and surprisingly modern observations in a list of programming-style aphorisms. The first aphorism on expressions summarizes well the whole readability issue:
Indeed, readability is about clearly expressing the intent of the code. Grady Booch, one of the architects of UML, puts forward a natural analogy:
Now, creating well-written prose isn’t something you can achieve by following a fixed set of rules. It takes years of practice, not only in writing, but also in reading well-written prose by established authors. The expressive capabilities of computer code are definitely limited compared with natural languages, so the process of creating clean code is luckily somewhat simpler, or at least more structured, than producing a beautiful essay. Still, mastering this process requires years of practice that no book (or book chapter!) can replace. In this chapter, we’ll explore some basic ways to improve the readability of your code, focusing on those techniques that you can apply to our recurring example.
In the last two decades, readability has been put on the front burner by the Agile movement, thanks to the focus on refactoring and clean code. Refactoring is the idea of restructuring a working system to improve its design so that future change is easier and safer. It’s one of the main ingredients in those lightweight development processes that favor fast development phases and iterative refinement of software.
Even if you or your company doesn’t subscribe to the whole Agile philosophy, you can’t miss the literature that comes with it, which is full of brilliant ideas about the bad (code smells), the good (clean code), and how to turn the first into the latter (refactoring). See the Further reading section at the end of this chapter for specific suggestions.
It would be nice to supplement the readability tips that well-known experts have developed with hard data on the effectiveness of those tips. Unfortunately, readability is inherently subjective, and it’s extremely hard to come up with objective means to measure it. This hasn’t stopped researchers from proposing a variety of formal models, all attempting to estimate readability with a combination of simple numerical measures, like the length of the identifiers, the number of parentheses occurring in an expression, and so on. This ongoing effort is still far from reaching a stable consensus, so I’ll focus on some established industry best practices, starting with a quick look at the style policies of the biggest IT players.
Some of the largest software companies publish coding style guides online, including the following:
These guides mostly agree on the general principles I set forth in this chapter and only differ on the level of detail they reach and on small cosmetic issues. For example, consider the sequence of import statements at the beginning of a source file. Here’s one such sequence in Google’s format:
    import static com.google.common.base.Strings.isNullOrEmpty;
    import static java.lang.Math.PI;

    import java.util.LinkedList;
    import javax.crypto.Cipher;
    import javax.crypto.SealedObject;
Here’s Twitter’s recommended style for the same imports:
    import java.util.LinkedList;

    import javax.crypto.Cipher;
    import javax.crypto.SealedObject;

    import static com.google.common.base.Strings.isNullOrEmpty;

    import static java.lang.Math.PI;
Both the order and the use of empty lines are different. Oracle and Facebook, on the other hand, are fine with any layout of imports.
Style guides ensure some uniformity across a company’s code base and are a nice addition to the welcome package for new employees, giving them something easy to sink their teeth into, before the real troubles begin. (Besides, when those troubles start biting back, they can say, “At least I’m following the style guide!”) For your long-term professional growth, though, it’ll be much more useful for you to peruse this chapter and then spend some time with the articulated style books I’ve listed at the end of the chapter, particularly Clean Code and Code Complete.
You can distinguish the ingredients contributing to readability into two categories:
In the following sections, I’ll briefly recall the main guidelines regarding each category. Then, I’ll guide you through applying those guidelines to the water container running example.
Architectural-level features refer to the high-level structure of the program: how it’s split into classes and the relationships occurring between them. Generally speaking, an architecture that’s easy to understand should be composed of small classes with coherent responsibilities (aka high cohesion), tied together by an uncomplicated network of dependencies (aka low coupling). Another readability-enhancing technique is to use standard design patterns whenever possible. Because most developers know them, they spark familiarity and convey a complement of contextual information to the reader.
Level | Features | Ways to improve |
---|---|---|
Architectural | Class responsibilities | Decrease coupling |
Architectural | Class responsibilities | Increase cohesion |
Architectural | Class responsibilities | Arch. patterns (MVC, MVP, etc.) |
Architectural | Relationships between classes | Design patterns |
Architectural | Relationships between classes | Refactorings (Extract Class, etc.) |
Class | Control-flow | Use the most specific loop type |
Class | Expressions | Show order of evaluation |
Class | Local variables | Split complex expressions |
Class | Method length | Refactorings (Extract Method, etc.) |
Each of these quick tips is tied to a large body of commentary and caveats. In the spirit of this book, which focuses on small-scale properties, I won’t delve into these architectural features, but you can find more information in the Further reading section at the end of this chapter. Table 7.1 summarizes the most relevant structural features and the corresponding best practices.
Class-level features pertain to the API of a given class and its organization in methods. For example, a golden rule is that long methods are harder to understand. At some point, certainly higher than 200 lines, you lose track of what was at the beginning of the method and end up going back and forth in your editor, trying to keep in your head what doesn’t fit on a single screen. I’m listing this principle among the class-level features because, even though the problem lies in a single method, its solution affects more than one method: you shorten a long method by splitting it into multiple methods, and the suggested way to do this is through the Extract Method refactoring rule, which I’ll present later in this chapter.
Now, let’s zoom in on some method-level features that affect readability. They include the choice of control flow statements, the way you write expressions, and the use of local variables.
An interesting small-scale readability issue is the choice of the most appropriate loop construct for a given scenario. Java offers four basic types of loops: standard for, while, do-while, and enhanced for. It’s easy to see that the first three are equivalent, in the sense that you can convert any of them into any of the others with little effort. For example, you can convert the exit-checked loop
    do {
        body
    } while (condition);
into the following falsely entry-checked loop:
    while (true) {
        body
        if (!condition) break;
    }
Which of these two snippets is more readable? I’m sure you’ll agree the first is definitely better. The second is an ugly gimmick that will only puzzle the reader, because they’ll be acutely aware that there was a more natural way to accomplish that task. Your job when optimizing readability is to avoid that feeling and make the reading experience as smooth and uneventful as possible. That’s the meaning of clearly expressing intent.
If a loop whose condition is checked after each iteration is best written as a do-while loop, what about an entry-checked loop? Because there are three options, let’s compare their expressivity:
    for (int i=0; i<n; i++) {
        ...
    }
is more readable than the equivalent
    int i=0;
    while (i<n) {
        ...
        i++;
    }
To decide on a loop construct, you should apply a general rule known as the principle of least privilege and choose the most specific statement that fits your purposes. Is your loop over an array or a collection implementing Iterable? Use the enhanced for. Besides its readability value, it’ll guarantee that the iteration won’t go out of bounds.
Does your loop feature a compact initialization step and a similarly compact update step? Use a standard for loop. Otherwise, use a while loop.
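As a rough illustration of these rules, the following sketch pairs each entry-checked construct with a task that suits it. The class `LoopChoice` and its three methods are hypothetical names invented for this example; they aren't part of the container code.

```java
import java.util.List;

class LoopChoice {

    // Enhanced for: we visit every element of a collection and never need an index.
    static int total(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Standard for: compact initialization (i = 1) and compact update (i++).
    static int sumOfSquaresUpTo(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i * i;
        }
        return sum;
    }

    // While: no separate loop variable to initialize; the parameter itself
    // shrinks until the condition fails. Assumes number is non-negative.
    static int countDigits(int number) {
        int digits = 1;
        while (number >= 10) {
            number /= 10;
            digits++;
        }
        return digits;
    }
}
```

Each method could be rewritten with either of the other constructs, but the reader would then have to work out why the less specific form was chosen.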
Speaking of loops, starting from Java 8, you also have the option of using the stream library to produce functional-style looping constructs. For example, here’s how you print every object in a set:
    Set<T> set = ...
    set.stream().forEach(obj -> System.out.println(obj));
Is it more readable than the following old-fashioned enhanced for?
    for (T obj: set) {
        System.out.println(obj);
    }
Probably not. A good rule of thumb is to use the functional-style API when you have some other reason besides just looping, such as filtering or transforming the content of the stream in some way. One particularly good reason to use data streams is when you want to split the job among multiple threads. In that case, the library will take care of a lot of nasty details for you.
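To make the rule of thumb concrete, here is a minimal sketch of a case where the stream API arguably earns its place, because the loop also filters and transforms. The class `StreamChoice` and both method names are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class StreamChoice {

    // Functional style: each step of the pipeline names one operation.
    static List<String> longWordsUpperCase(List<String> words, int minLength) {
        return words.stream()
                    .filter(word -> word.length() >= minLength)
                    .map(String::toUpperCase)
                    .collect(Collectors.toList());
    }

    // Equivalent enhanced for, shown for comparison.
    static List<String> longWordsUpperCaseWithLoop(List<String> words, int minLength) {
        List<String> result = new ArrayList<>();
        for (String word : words) {
            if (word.length() >= minLength) {
                result.add(word.toUpperCase());
            }
        }
        return result;
    }
}
```

Both versions return the same list. Which one reads better is a matter of taste at this size, but the pipeline tends to scale more gracefully as further steps are added.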
Choose the most natural and specific type of loop for the job.
What kind of loop would you use to initialize an array of n integers with the integers from 0 to n – 1?
Expressions are the basic building blocks of any programming language and can grow extremely complicated, essentially without limits. To improve readability, you should consider splitting complex expressions into simpler subexpressions and assigning their values to extra local variables that you introduce for this purpose. Naturally, you should give those new local variables descriptive names illustrating the meaning of the corresponding subexpression. (I’ll return to variable names shortly.)
Reference already employs this readability-enhancing strategy when the method connectTo computes the amount of water that should be present in each container after the new connection is made. The shortest way to describe this calculation would be something like the following:
    public void connectTo(Container other) {
        ...
        double newAmount = (amount * group.size() +
                            other.amount * other.group.size()) /
                           (group.size() + other.group.size());
        ...
    }
As you can see, even split among three lines and aligned, the resulting expression is long and somewhat hard to parse. The reader is likely to struggle, or at least pause, to find the matching parentheses, because the closing parenthesis is far away from its opening. The clumsy repetitions of group.size() and other.group.size() don’t help either.
That’s why Reference introduces as many as four extra variables, just to improve readability:
    public void connectTo(Container other) {
        ...
        int size1 = group.size(),
            size2 = other.group.size();
        double tot1 = amount * size1,
               tot2 = other.amount * size2,
               newAmount = (tot1 + tot2) / (size1 + size2);
        ...
    }
Features | Ways to improve |
---|---|
Comments | Detailed documentation comments, scarce implementation comments |
Names | Descriptive names |
White space | White space as punctuation |
Indentation | Consistent indentation |
You shouldn’t worry about the second, more readable version being less efficient. In general, the performance cost of using a few extra local variables is negligible, especially if you compare it with the readability benefit. In this particular case, the extra variables save two method invocations and may even lead to faster execution.[1]
As a matter of fact, the bytecode for the readable version is three bytes shorter than the other version.
Martin Fowler has formalized this idea as one of the refactoring rules he has assembled. (See the Further reading section for more information.) Similar to how design patterns work, each refactoring rule is given a standard name to ease communication. The name of this rule is Extract Variable.
Refactoring rule Extract Variable: Replace a subexpression with a new local variable with a descriptive name.
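The same rule applies well beyond the container example. The sketch below is an independent illustration on the Gregorian leap-year test; the class `ExtractVariableDemo` and its method names are hypothetical.

```java
class ExtractVariableDemo {

    // Before: the whole rule is packed into a single expression.
    static boolean isLeapYearCompact(int year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    // After Extract Variable: each subexpression gets a descriptive name.
    static boolean isLeapYear(int year) {
        boolean divisibleBy4 = year % 4 == 0;
        boolean divisibleBy100 = year % 100 == 0;
        boolean divisibleBy400 = year % 400 == 0;
        return (divisibleBy4 && !divisibleBy100) || divisibleBy400;
    }
}
```

The two methods compute the same result; the second one simply lets the final line read almost like the rule stated in words.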
You can use three exterior traits to improve readability: comments, names, and white space. Table 7.2 summarizes the corresponding best practices, presented in the following subsections.
Code alone can’t satisfactorily document itself. Sometimes you have to use natural language to provide further insight or convey a more global perspective on some functionality. It’s useful to distinguish two kinds of comments:
To a certain extent, when and how often to insert comments is open for debate, but the modern trend is to be generous with documentation comments and stingy with implementation ones.
The motivations stem from the following reasoning: the API precedes the implementation, is generally more stable than it, and is the only part of a class that clients should know so they can correctly employ its services. Therefore, it’s particularly important for the health of the overall system that the responsibilities and contracts of each class and method be perfectly clear to its clients. As you saw in chapter 5, you can express contracts in code only up to a certain point, whose exact extent depends on the programming language of choice. Beyond that, natural language comments and other forms of documentation take over.
Is a comment describing the behavior of a private method a specification comment or an implementation comment?
Conversely, method bodies change often and are hidden from the clients. Because they change often, you need to update any comment inside them equally often, and programmers are known to forget to update comments (or to perform any other action that has no immediate repercussions on the program’s behavior). You’ve probably been there: tasked with updating a piece of code, for a bug fix or a new feature, probably under a tight deadline. You’re likely to focus on functionality, on writing code that works and passes the tests. Unless your company adopts serious forms of code inspection, no downstream filter on the quality of the comments is in place. As such, it’s just natural to ignore the comments and deal with the active code lines.
If word spreads that some comments in a given codebase are unreliable because they may be stale, all of the comments immediately become pure noise, even if most of them are in fact good and up-to-date.
Cut back implementation comments in favor of documentation comments, and make sure that all comments are up-to-date. (Code reviews can help.)
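As a small illustration of this balance, the following sketch documents a method's contract in a documentation comment (written in Javadoc syntax) and leaves the body free of implementation comments. The class `CommentStyleDemo` and its method are hypothetical names chosen for this example.

```java
class CommentStyleDemo {

    /**
     * Converts a temperature from degrees Celsius to degrees Fahrenheit.
     *
     * @param celsius the temperature in degrees Celsius
     * @return the same temperature in degrees Fahrenheit
     */
    static double toFahrenheit(double celsius) {
        return celsius * 9 / 5 + 32;
    }
}
```

The documentation comment describes what clients can rely on, which rarely changes. The body is short enough that a comment restating the formula would only add one more line to keep in sync.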
According to a well-known quote by Phil Karlton, there are only two hard things in computer science: cache invalidation and naming things. Having touched on cache-related issues in chapter 4, it’s time to face the second hard problem. High-level programming allows you to assign arbitrary names to program elements. In Java, these are packages, classes, methods, and all kinds of variables, including fields. The language imposes some restrictions on these names (like no spaces), and practicality suggests that they should be relatively short.
I assume you’re already familiar with the basic lexical convention of Java (shared by many languages, including C# and C++) based on so-called camel case. Here are some general guidelines about the types of names suggested for different circumstances:
    Comparator<String> stringComparatorByLength =
        (a,b) -> Integer.compare(a.length(), b.length());
In this context, the reader doesn’t need more descriptive names to figure out your intent. (On the other hand, notice the long name for the comparator itself.)
Use descriptive names, avoid abbreviations, and follow established conventions.
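To see how much names alone can carry, compare the two methods in the sketch below, which share the same logic. The class `NamingDemo` and everything in it are invented for this example.

```java
class NamingDemo {

    // By convention, constants are written in upper case with underscores.
    static final int SECONDS_PER_MINUTE = 60;

    // Cryptic version: the reader must work out what f, s, m, and r stand for.
    static String f(int s) {
        int m = s / 60;
        int r = s % 60;
        return m + ":" + (r < 10 ? "0" : "") + r;
    }

    // Descriptive version: same logic, but the names state the intent.
    // Assumes totalSeconds is non-negative.
    static String formatDuration(int totalSeconds) {
        int minutes = totalSeconds / SECONDS_PER_MINUTE;
        int seconds = totalSeconds % SECONDS_PER_MINUTE;
        String padding = (seconds < 10) ? "0" : "";
        return minutes + ":" + padding + seconds;
    }
}
```

For instance, `NamingDemo.formatDuration(125)` returns `"2:05"`, and so does `NamingDemo.f(125)`; only the second call leaves the reader guessing about what it does.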
What name is the most appropriate for the field holding the monthly salary in an Employee class: salary, s, monthlySalary, or employeeMonthlySalary?
Finally, most languages, including Java, allow ample freedom regarding the visual layout of code. You can split lines at (almost) every point, freely insert white space around symbols, and insert empty lines everywhere. You should use this freedom not to express your artistic creativity (there’s ASCII art for that), but to lessen the cognitive burden on the fellow programmer who’s going to read your code later on.
Correct indentation is absolutely essential, but I trust you already know and practice it. One step beyond basic indentation, you can use white space to align two parts of a split line. A common case is methods with many parameters, like this String instance method:
    public boolean regionMatches(int toffset,
                                 String other,
                                 int ooffset,
                                 int len)
Regarding empty lines in code, think of them as punctuation. If a method is akin to a paragraph of text, both in length and in internal coherence, an empty code line is comparable to a period. Don’t use it when a simple comma would do. You should use empty lines to visually separate code sections that are conceptually diverse, including separating different methods or disparate parts of the same method. You can see an example of the latter in the connectTo method, in both Reference (listing 7.3) and Readable (listing 7.4).
Use an empty line like a sentence-ending period in a paragraph of text.
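Here is a minimal sketch of the idea, using a hypothetical method unrelated to the container class. A single empty line separates the part that scans the data from the part that builds the result.

```java
class BlankLineDemo {

    // Assumes samples contains at least one element.
    static String describe(int[] samples) {
        int min = samples[0];
        int max = samples[0];
        for (int sample : samples) {
            min = Math.min(min, sample);
            max = Math.max(max, sample);
        }

        int range = max - min;
        return "min=" + min + ", max=" + max + ", range=" + range;
    }
}
```

Adding further empty lines inside either section would be the equivalent of putting a period where a comma would do.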
In the next section, we’ll develop a readability-optimized version of the container class, nicknamed Readable.
Let’s start from Reference and use the following techniques to improve its readability:
First, it’s important to familiarize yourself with the standard format for Java documentation comments: Javadoc.
Javadoc is the Java tool that extracts specially composed comments (using the sort of tags shown in tables 7.3 and 7.4) from source files and lays them out in nicely formatted HTML, thus producing easily navigable documentation. Javadoc generates the familiar online documentation for the Java API, as well as the documentation snippets that common IDEs provide on request.
Comments intended for Javadoc consumption must start with /**. Most HTML tags are allowed, such as those listed in table 7.4.
Tag | Meaning |
---|---|
@author | Class author (mandatory) |
@version | Class version (mandatory) |
@return | Description of a method return value |
@param | Description of a method parameter |
@throws or @exception | Description of the conditions for a given exception to be thrown |
{@link ...} | Generates a link to another program element (class, method, etc.) |
{@code ...} | Typesets a code snippet |
Tag | Meaning |
---|---|
<code>...</code> | Typesets a code snippet |
<p> | Starts a new paragraph |
<i>...</i> | Italics |
<b>...</b> | Bold |
Moreover, Javadoc recognizes various additional tags, all starting with the “@” symbol (not to be confused with Java annotations). For example, in the comment describing the whole class, you’re supposed to insert the self-explanatory tags @author and @version. Both tags are supposedly mandatory for the class description, but Javadoc won’t complain if they’re missing.
In C#, documentation comments should start with “///” (a triple slash) and can include a variety of XML tags. The compiler itself lifts those comments from the source files and stores them in a separate XML file. Visual Studio then uses the information in that file to enrich its contextual help functionalities, and the programmer can summon an external tool to arrange the comments into a readable layout, such as HTML. A popular open-source solution is the DocFX tool, which supports multiple languages besides C#, including Java.
Rather than presenting each Javadoc tag individually, let’s apply them right away to obtain a readability-optimized version of Container. At the very top of the Container source file, add the introductory comment shown in listing 7.1, providing a general description for the class. Such a comment is also the right place to introduce class-specific terminology, such as the word group to indicate the set of containers connected to this one.
By using the <code> HTML tag or the Javadoc {@code ...} tag, you can typeset code snippets. Tables 7.3 and 7.4 summarize the Javadoc and HTML tags you’re most likely to use in a comment.
    /**
     * A <code>Container</code> represents a water container
     * with virtually unlimited capacity.
     * <p>
     * Water can be added or removed.
     * Two containers can be connected with a permanent pipe.
     * When two containers are connected, directly or indirectly,
     * they become communicating vessels, and water will distribute
     * equally among all of them.
     * <p>
     * The set of all containers connected to this one is called the
     * <i>group</i> of this container.
     *
     * @author Marco Faella
     * @version 1.0
     */
    public class Container {
        private Set<Container> group;
        private double amount;
In this listing, /** marks the beginning of a Javadoc comment, most HTML tags (such as <p>) are allowed inside it, and @author and @version are Javadoc tags.
Figure 7.1 shows the HTML page that Javadoc generates from the comment in listing 7.1.
Next, the constructor and the getAmount method are so simple that they need no readability enhancements, except for short documentation comments. Use the @return tag to describe the return value for a method.
    /** Creates an empty container. */
    public Container() {
        group = new HashSet<Container>();
        group.add(this);
    }

    /** Returns the amount of water currently held in this container.
     *
     *  @return the amount of water currently held in this container
     */
    public double getAmount() {
        return amount;
    }
The redundancy in the comment for getAmount is justified by the way Javadoc displays the information. Every method is presented twice in the HTML page for the class: first, in a brief summary of all methods (see figure 7.2); then, in a more extensive section, describing each method in detail (see figure 7.3). The first sentence of the comment is included in the summary of all methods, so you can’t omit it. The @return line is only included in the detailed description of the method.
We now turn our attention to the connectTo method, which can use some refactoring to improve its readability. First, recall the implementation of this method in Reference, reproduced here for convenience:
    public void connectTo(Container other) {

        // If they are already connected, do nothing
        if (group==other.group) return;

        int size1 = group.size(),
            size2 = other.group.size();
        double tot1 = amount * size1,
               tot2 = other.amount * size2,
               newAmount = (tot1 + tot2) / (size1 + size2);

        // Merge the two groups
        group.addAll(other.group);
        // Update group of containers connected with other
        for (Container c: other.group) { c.group = group; }
        // Update amount of all newly connected containers
        for (Container c: group) { c.amount = newAmount; }
    }
You can replace comments like “Update group of containers connected with other” with a properly named support method.
I already pointed out one of the defects of the reference implementation in chapter 3: an abundance of in-method comments, trying to explain every single line. Adding such comments is the natural course of action for programmers who care about making their code understandable by fellow humans. However, it’s not the most efficient way to achieve this excellent objective. A better alternative is the Extract Method refactoring technique.
Refactoring rule Extract Method: Move a coherent block of code into a new method with a descriptive name.
Method connectTo offers ample opportunities to apply this technique. In fact, you can apply it five times and obtain as many new support methods, as well as a new, much more readable version of connectTo, as shown in the following listing.
    /** Connects this container with another.
     *
     *  @param other The container that will be connected to this one
     */
    public void connectTo(Container other) {
        if (this.isConnectedTo(other))
            return;

        double newAmount = (groupAmount() + other.groupAmount()) /
                           (groupSize() + other.groupSize());
        mergeGroupWith(other.group);
        setAllAmountsTo(newAmount);
    }
The @param Javadoc tag documents a method parameter. It’s followed by the parameter name and by its description. Compared to Reference, the method is much shorter and more readable. If you’re not convinced, try reading the body aloud and notice how it almost makes sense as a short paragraph of text.
You achieve this effect by introducing five aptly named support methods. Indeed, Long Method is one of the code smells that Fowler identifies, and Extract Method is the refactoring technique aimed at getting rid of that smell. In Agile parlance, the new version of connectTo in listing 7.4 is five Extract Methods away from its old version in Reference.
Whereas adding a comment only explains some code, Extract Method both explains and hides the code, pushing it away in a separate method. In this way, it keeps the abstraction level in the original method at a higher and more uniform height, avoiding the cumbersome swing between high-level explanations and low-level implementations in listing 7.3.
Replace Temp with Query is another refactoring technique that you can use on connectTo.
Refactoring rule Replace Temp with Query: Replace a local variable with the invocation to a new method that computes its value.
You could apply this technique to the local variable newAmount, which is assigned only once and then used as the argument of setAllAmountsTo. A straightforward application of the technique would lead to removing the variable newAmount and replacing the last two lines of connectTo with the following:
    mergeGroupWith(other.group);
    setAllAmountsTo(amountAfterMerge(other));
Here, amountAfterMerge is a new method responsible for computing the correct amount of water in each container after the merge. However, a little thought reveals that amountAfterMerge needs to jump through hoops to fulfill its task because the groups have already been merged when the method is invoked. In particular, the set that this.group points to already contains all the elements from other.group.
A good compromise would be to encapsulate the expression for the new amount into a new method, but keep the local variable as well, so that we can compute the new amount before merging the groups:
    final double newAmount = amountAfterMerge(other);
    mergeGroupWith(other.group);
    setAllAmountsTo(newAmount);
All in all, I wouldn’t recommend this refactoring because the expression assigned to newAmount in listing 7.4 is quite readable and doesn’t need to be hidden away in a separate method. The Replace Temp with Query rule tends to be more useful when the expression it replaces is more complicated or occurs multiple times throughout the class.
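For readers who want to see the compromise in context, here is a minimal, self-contained sketch. The chapter doesn't show a body for amountAfterMerge, so the one below is an assumption: it simply wraps the expression that listing 7.4 assigns to newAmount. The class name `SketchContainer` is hypothetical, and the class is a stripped-down stand-in for the container class, not the actual Readable implementation.

```java
import java.util.HashSet;
import java.util.Set;

// Stripped-down stand-in for the chapter's container class (hypothetical name).
class SketchContainer {
    private Set<SketchContainer> group = new HashSet<>();
    private double amount;

    SketchContainer() {
        group.add(this);
    }

    double getAmount() {
        return amount;
    }

    void addWater(double amount) {
        double amountPerContainer = amount / group.size();
        for (SketchContainer c : group) {
            c.amount += amountPerContainer;
        }
    }

    void connectTo(SketchContainer other) {
        if (group == other.group)
            return;

        // The query runs while the two groups are still separate.
        final double newAmount = amountAfterMerge(other);
        mergeGroupWith(other.group);
        setAllAmountsTo(newAmount);
    }

    // Assumed body: the expression that listing 7.4 assigns to newAmount.
    private double amountAfterMerge(SketchContainer other) {
        return (groupAmount() + other.groupAmount()) /
               (groupSize() + other.groupSize());
    }

    private int groupSize() {
        return group.size();
    }

    private double groupAmount() {
        return amount * group.size();
    }

    private void mergeGroupWith(Set<SketchContainer> otherGroup) {
        group.addAll(otherGroup);
        for (SketchContainer x : otherGroup) {
            x.group = group;
        }
    }

    private void setAllAmountsTo(double amount) {
        for (SketchContainer x : group) {
            x.amount = amount;
        }
    }
}
```

Calling amountAfterMerge after mergeGroupWith would give a wrong result, because both group sizes would already reflect the merged group. That ordering constraint is exactly why the local variable stays.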
Now, let’s have a look at the five new methods that support the readable version of connectTo. Of these five, two are better declared private because they may leave the object in an inconsistent state, so you shouldn’t call them from outside the class. They are mergeGroupWith and setAllAmountsTo.
Method mergeGroupWith merges two groups of containers without updating their water amount. If someone were to invoke it in isolation, it would most likely leave a wrong amount of water in some or all containers. This method only makes sense in the exact context where it’s used: at the end of connectTo, immediately followed by a call to setAllAmountsTo. In fact, it’s debatable whether it should really be a separate method. On the one hand, having it separate allows us to document its intent with its name, instead of using a comment like we did in Reference. On the other hand, a separate method runs the risk of being called in the wrong context. Because we’re optimizing for clarity in this chapter, we’ll leave it separate. A similar argument holds for setAllAmountsTo.
The code for these two methods is shown in the following listing.
    private void mergeGroupWith(Set<Container> otherGroup) {
        group.addAll(otherGroup);
        for (Container x: otherGroup) {
            x.group = group;
        }
    }

    private void setAllAmountsTo(double amount) {
        for (Container x: group) {
            x.amount = amount;
        }
    }
Private methods aren’t deemed worthy of Javadoc comments. They’re only used inside the class, so few people should ever feel the need to understand them in detail. Hence, the potential benefit of a comment doesn’t repay its cost.
The cost of a comment isn’t limited to the time spent writing it. Just like any other source line, it needs to be maintained, or it may become stale—that is to say, out of sync with the code it’s supposed to clarify. Remember: a stale comment is worse than no comment!
Replacing comments with descriptive names doesn’t rule out this particular risk. Without the proper coding discipline and processes, you may still end up with stale names, which are just as bad as stale comments.
The other three new support methods are innocuous read-only functionalities that may as well be declared public. This is not to say that you should take lightly the decision to make them public. The future maintainability cost of adding any public member to a class is much greater than the cost of adding the same member with private visibility. Additional costs for a public method include
In this particular case, these costs are arguably quite limited because the three methods under consideration are simple read-only functionalities with no preconditions to speak of.[2] Besides, these three methods provide information to the clients that isn’t otherwise available. As such, they significantly improve the class testability, as discussed in chapter 5.
To be precise, isConnectedTo requires its argument to be non-null. This is such a trivial precondition that you don’t need to document or actively check it. Violating it will raise an NPE just as expected.
    /** Checks whether this container is connected to another one.
     *
     *  @param other the container whose connection with this will be checked
     *  @return <code>true</code> if this container is connected
     *          to <code>other</code>
     */
    public boolean isConnectedTo(Container other) {
        return group == other.group;
    }

    /** Returns the number of containers in the group of this container.
     *
     *  @return the size of the group
     */
    public int groupSize() {
        return group.size();
    }

    /** Returns the total amount of water in the group of this container.
     *
     *  @return the amount of water in the group
     */
    public double groupAmount() {
        return amount * group.size();
    }
Incidentally, the isConnectedTo method also improves the testability of our class by making directly observable something that we could only surmise in all previous implementations.
All six methods that make up the connectTo functionality are very short, the longest being connectTo itself, at six lines. Brevity is one of the main tenets of clean code.
Finally, there’s addWater. Its body doesn’t change compared to Reference. We just improve its documentation to better reflect its contract, using Javadoc syntax.
    /** Adds water to this container.
     *  A negative <code>amount</code> indicates removal of water.
     *  In that case, there should be enough water in the group
     *  to satisfy the request.
     *
     *  @param amount the amount of water to be added
     */
    public void addWater(double amount) {
        double amountPerContainer = amount / group.size();
        for (Container c: group) {
            c.amount += amountPerContainer;
        }
    }
Compare this Javadoc method description with the contract for addWater I presented in chapter 5:
Notice how the comments in the listing don’t mention the reaction to the client violating the precondition by removing more water than is actually present. That’s because this implementation (just like Reference) doesn’t check that condition and allows containers to hold a negative amount of water. Looking back at figure 7.2, you can witness the HTML page that Javadoc generates from those comments.
What if the implementation checked that condition and actually implemented the penalty that the contract established by throwing IllegalArgumentException? Both the Javadoc style guide and the Effective Java book suggest documenting unchecked exceptions using the @throws or @exception tags (which are equivalent).[3] In that case, you would add a line like the following to the method’s Javadoc comment:
    @throws IllegalArgumentException if an attempt is made to remove more water than actually present
[3] See Item 74 in Effective Java, 3rd ed.
A quick look at the official Java API documentation shows that this is indeed standard practice. As an example, the documentation for the get(int index) method from ArrayList, returning the element at position index in the list, reports that the method will throw the unchecked exception IndexOutOfBoundsException if the index is out of the proper range.
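As a rough sketch of what such a checked version could look like, the following listing adds the precondition check and the corresponding @throws tag. The class name CheckedContainer and the getAmount accessor are hypothetical, introduced only for illustration, and the class is a simplified stand-in: it omits connectTo, so each group contains a single container.

```java
import java.util.HashSet;
import java.util.Set;

/** Simplified, hypothetical stand-in for the chapter's Container class. */
class CheckedContainer {
    private final Set<CheckedContainer> group = new HashSet<>();
    private double amount;

    CheckedContainer() {
        group.add(this);   // each container starts in its own group
    }

    /** Adds water to this container.
     *  A negative <code>amount</code> indicates removal of water.
     *
     *  @param amount the amount of water to be added
     *  @throws IllegalArgumentException if an attempt is made
     *          to remove more water than actually present
     */
    public void addWater(double amount) {
        double amountPerContainer = amount / group.size();
        // Precondition check: no container may end up with negative water
        if (this.amount + amountPerContainer < 0) {
            throw new IllegalArgumentException(
                "Not enough water to satisfy the request.");
        }
        for (CheckedContainer c : group) {
            c.amount += amountPerContainer;
        }
    }

    /** Hypothetical accessor, added only to observe the result. */
    public double getAmount() {
        return amount;
    }
}
```

With this version, a call such as addWater(-7) on a container holding 6 units leaves the state untouched and reports the violation through the documented exception.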
Suppose a public method may throw an AssertionError if it detects a violation of a class invariant. Would you document this circumstance in the Javadoc for this method?
This chapter is somewhat different from the previous ones in that you can readily apply its advice to most, if not all, practical scenarios. Even though I said in chapter 1 that readability may conflict with other quality objectives, such as time or space efficiency, in most of these conflicts it’s readability that should prevail. Human readability is a huge benefit when a given piece of software will inevitably need to evolve, because bugs are found or new features are requested.
Still, we shouldn’t confuse code clarity with algorithmic simplicity. I’m not suggesting that you shun an efficient algorithm in favor of a naive one in the name of readability. Rather, you should pick the best algorithm for the job and then strive to code it in the cleanest possible way. Clarity rightfully defies performance hacks, not proper engineering.
For the sake of completeness, I should mention a couple of scenarios in which readability is either a luxury or something to be actively avoided. Examples of the first are tightly timed programming challenges like hackathons or coding competitions. Those scenarios require contestants to quickly write throwaway code that just works. Any delay is a cost, and style considerations go out the window.
Another special scenario arises when companies don’t want their source code to be analyzed by others, including the legitimate users of their software. By hiding or obfuscating their source code, such companies hope to hide their algorithms or data. In such cases, it may seem natural to abandon code readability and go for the most cryptic lines that get the job done. In fact, there’s a specific type of software, called an obfuscator, whose job is precisely to translate a program into another program that is functionally equivalent to the first, but extremely hard to understand for a human reader. You can obfuscate all programming languages,[4] from machine code to Java bytecode or source code. Just googling “Java obfuscator” provides a rich selection of open source and commercial tools for this task. Given the availability of such tools, even the most secretive company can benefit from internally handling clean, self-explanatory code, which is then rendered obscure before being publicly released.
[4] Some languages are designed to be unreadable and hardly need any obfuscation. Do you know any? Hint: 3
In this section, you’ll apply the guidelines for readable code to a different example. It’s a single method that accepts a two-dimensional array of doubles and . . . does something to it. I’ve written the method’s body in an intentionally sloppy style; not exactly obscure, but not very readable either. As an exercise, try to understand what it does before reading ahead.
    public static void f(double[][] a) {
        int i = 0, j = 0;
        while (i<a.length) {
            if (a[i].length != a.length)
                throw new IllegalArgumentException();
            i++;
        }
        i = 0;
        while (i<a.length) {
            j = 0;
            while (j<i) {
                double temp = a[i][j];
                a[i][j] = a[j][i];
                a[j][i] = temp;
                j++;
            }
            i++;
        }
    }
Did you feel the pain? Those while loops and meaningless variable names really put a strain on your brain. Imagine a whole program written in the same style!
As you might have guessed, the mystery method transposes a square matrix, a standard operation that swaps rows with columns. The first while loop checks whether the provided matrix is square-shaped—has as many rows as columns. Since Java matrices can be irregular, this entails checking that each row has the same length as the number of rows. Here’s an annotated version of the same method, to help you recognize the various parts:
    public static void f(double[][] a) {
        int i = 0, j = 0;
        while (i<a.length) {                  // For each row
            if (a[i].length != a.length)      // If the row length is "wrong"
                throw new IllegalArgumentException();
            i++;
        }
        i = 0;
        while (i<a.length) {                  // For each row
            j = 0;
            while (j<i) {                     // For each column less than i
                double temp = a[i][j];        // Swap a[i][j] and a[j][i]
                a[i][j] = a[j][i];
                a[j][i] = temp;
                j++;
            }
            i++;
        }
    }
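The remark that Java matrices can be irregular is easy to verify with a small experiment. The following sketch builds a two-dimensional array whose rows have different lengths; the class name JaggedDemo and the helper rowLengths are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;

class JaggedDemo {
    /** Returns the length of each row of the given two-dimensional array. */
    static int[] rowLengths(double[][] matrix) {
        int[] lengths = new int[matrix.length];
        for (int i = 0; i < matrix.length; i++) {
            lengths[i] = matrix[i].length;
        }
        return lengths;
    }

    public static void main(String[] args) {
        // Each row is an independent array, so rows may differ in length
        double[][] jagged = {
            {1.0, 2.0, 3.0},
            {4.0},
            {5.0, 6.0}
        };
        System.out.println(Arrays.toString(rowLengths(jagged)));  // prints [3, 1, 2]
    }
}
```

This array has three rows, but only its first row has length three, which is why the squareness check must inspect every row.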
It’s time to improve the readability of this method using this chapter’s guidelines. First, the initial squareness check is the ideal occasion for the Extract Method refactoring rule: it’s a coherent operation with a clearly specified contract. Once you put it in a separate method, it might also be useful in other contexts. That’s why I’m declaring it public and equipping it with a full Javadoc comment.
Since the squareness check doesn’t modify the matrix, you can use an enhanced for as its main loop:
    /** Checks whether a matrix is square-shaped
     *
     *  @param matrix a matrix
     *  @return {@code true} if the given matrix is square
     */
    public static boolean isSquare(double[][] matrix) {
        for (double[] row: matrix) {
            if (row.length != matrix.length) {
                return false;
            }
        }
        return true;
    }
Then, the transpose method itself invokes isSquare and then performs its job with two straightforward for loops. An enhanced for would be useless here, because you need row and column indices to perform the swap.
Along the way, improve the names of the variables, and of the method itself, by making them more descriptive. You can keep names i and j for the row and column indices because those are standard names for array indices.
    /** Transposes a square matrix
     *
     *  @param matrix a matrix
     *  @throws IllegalArgumentException if the given matrix is not square
     */
    public static void transpose(double[][] matrix) {
        if (!isSquare(matrix)) {
            throw new IllegalArgumentException(
                "Can’t transpose a nonsquare matrix.");
        }
        for (int i=0; i<matrix.length; i++) {     // For each row
            for (int j=0; j<i; j++) {             // For each column less than i
                double temp = matrix[i][j];       // Swap matrix[i][j] and matrix[j][i]
                matrix[i][j] = matrix[j][i];
                matrix[j][i] = temp;
            }
        }
    }
You’ve seen and applied some very important principles to improve the readability of this code. Here are a couple of use cases to help you understand the practical importance of this trait.
The following examples show how seriously the programming world has taken the idea of readability.
Given the following data:
    List<String> names;
    double[] lengths;
What kind of loop would you use to accomplish the following tasks?
As you might know, the method charAt from the class String returns the character of this string at a given index:
public char charAt(int index)
Write a Javadoc comment describing the contract of this method and then compare it to the official documentation.
Examine the following method. Guess what it does and make it more readable. (Don’t forget to add a Javadoc method comment.) You can find the source code for this exercise and the next one in the online repository (https://bitbucket.org/mfaella/exercisesinstyle).
    public static int f(String s, char c) {
        int i = 0, n = 0;
        boolean flag = true;
        while (flag) {
            if (s.charAt(i) == c)
                n++;
            if (i == s.length() -1)
                flag = false;
            else
                i++;
        }
        return n;
    }
The following method comes from a collection of algorithms hosted in a GitHub repository (starred by 10k people and forked 4k times). The method performs a breadth-first visit of a graph, represented as an adjacency matrix of type byte. You don’t have to know this algorithm to complete this exercise. Just know that the a[i][j] cell contains 1 if there’s an edge from node i to node j, and 0 otherwise.
Improve the method’s readability in two steps. First, make only exterior changes to variable names and comments. Then, make structural changes. All changes must preserve both the API (types of parameters) and the visible behavior (the on-screen output).
    /**
     * The BFS implemented in code to use.
     *
     * @param a Structure to perform the search on a graph, adjacency matrix etc.
     * @param vertices The vertices to use
     * @param source The Source
     */
    public static void bfsImplement(byte [][] a,int vertices,int source){
                      //passing adjacency matrix and number of vertices
        byte []b=new byte[vertices];  //flag container containing status
                                      //of each vertices
        Arrays.fill(b,(byte)-1);      //status initialization
        /* code   status
            -1  =  ready
             0  =  waiting
             1  =  processed */
        Stack<Integer> st = new Stack<>();  //operational stack
        st.push(source);                    //assigning source
        while(!st.isEmpty()){
            b[st.peek()]=(byte)0;           //assigning waiting status
            System.out.println(st.peek());
            int pop=st.peek();
            b[pop]=(byte)1;                 //assigning processed status
            st.pop();                       //removing head of the queue
            for(int i=0;i<vertices;i++){
                if(a[pop][i]!=0 && b[i]!=(byte)0 && b[i]!=(byte)1 ){
                    st.push(i);
                    b[i]=(byte)0;           //assigning waiting status
                }}}
    }
Readability positively affects maintainability and reliability because readable code is easier to understand and modify in a safe manner.
You can’t use an enhanced for because you need to modify the array’s entries, and you need an index for that. The best choice for iterating over a whole array using an explicit index is a standard for loop.
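A minimal sketch can make the difference concrete. It assumes a hypothetical task, doubling every entry of an array of doubles; the class and method names are invented for this illustration.

```java
class ScaleDemo {
    /** Doubles every entry of the given array in place. */
    static void doubleAll(double[] values) {
        for (int i = 0; i < values.length; i++) {
            values[i] *= 2;   // writes back into the array through the index
        }
    }

    /** Ineffective attempt: the loop variable is only a copy of each entry. */
    static void doubleAllBroken(double[] values) {
        for (double value : values) {
            value *= 2;       // changes the local copy, not the array
        }
    }
}
```

Calling doubleAllBroken leaves the array unchanged, whereas doubleAll modifies it as intended.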
You should consider a comment describing the behavior of a private method an implementation comment. Private methods are not exposed to the clients.
The most appropriate name is probably monthlySalary. Alternatives s and salary contain too little information, whereas employeeMonthlySalary needlessly repeats the class name.
You shouldn’t document an AssertionError because that kind of exception is only thrown if an internal error occurs.
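The distinction between the two kinds of exceptions can be sketched as follows. The Tank class is hypothetical and unrelated to the chapter’s containers: its Javadoc documents the exception that clients can trigger by violating a precondition, and stays silent about the assertion that guards the class invariant.

```java
/** Hypothetical water tank, used only to illustrate the point. */
class Tank {
    private double amount;   // class invariant: amount >= 0

    /** Adds water to this tank.
     *
     *  @param liters the amount of water to add
     *  @throws IllegalArgumentException if {@code liters} is negative
     */
    public void fill(double liters) {
        if (liters < 0) {
            throw new IllegalArgumentException("Negative amount.");
        }
        amount += liters;
    }

    /** Removes water from this tank.
     *
     *  @param liters the amount of water to remove
     *  @throws IllegalArgumentException if {@code liters} is negative
     *          or exceeds the current amount
     */
    public void drain(double liters) {
        if (liters < 0 || liters > amount) {
            throw new IllegalArgumentException("Invalid amount.");
        }
        amount -= liters;
        // Internal sanity check, deliberately absent from the Javadoc:
        // it can only fail if this class itself contains a bug
        assert amount >= 0 : "Class invariant violated";
    }

    /** Returns the current amount of water in this tank. */
    public double getAmount() {
        return amount;
    }
}
```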
    for (String name: names) {
        System.out.println(name);
    }

    Iterator<String> iterator = names.iterator();
    while (iterator.hasNext()) {
        if (iterator.next().length() > 20) {
            iterator.remove();
        }
    }

    double totalLength = 0;
    for (double length: lengths) {
        totalLength += length;
    }
or the following stream-based one-liner:
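Assuming the task is the one suggested by the iterator-based loop (removing every name longer than 20 characters), the same effect can also be sketched with the removeIf method that Java 8 added to the Collection interface. The class and method names below are hypothetical.

```java
import java.util.List;

class RemoveLongNames {
    /** Removes from the given list all names longer than 20 characters. */
    static void removeLongNames(List<String> names) {
        // removeIf visits the elements and removes those matching the predicate
        names.removeIf(name -> name.length() > 20);
    }
}
```

Like the explicit iterator, this only works on modifiable lists; an unmodifiable list throws UnsupportedOperationException when an element would have to be removed.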
double totalLength = Arrays.stream(lengths).sum();
    boolean containsZero = false;
    for (double length: lengths) {
        if (length == 0) {
            containsZero = true;
            break;
        }
    }
The stream library provides a handy alternative:
    boolean containsZero = Arrays.stream(lengths).anyMatch(
        length -> length == 0);
Here’s a slightly simplified version of the Javadoc from OpenJDK 12:
    /**
     * Returns the {@code char} value at the
     * specified index. An index ranges from {@code 0} to
     * {@code length() - 1}. The first {@code char} value of the sequence
     * is at index {@code 0}, the next at index {@code 1},
     * and so on, as for array indexing.
     *
     * @param index the index of the {@code char} value.
     * @return the {@code char} value at the specified index of this string.
     *         The first {@code char} value is at index {@code 0}.
     * @exception IndexOutOfBoundsException if the {@code index}
     *            argument is negative or not less than the length of this
     *            string.
     */
It’s easy to see that the method simply counts the occurrences of a character inside a string. The while loop and the flag are useless detours, replaced by a simple for loop in the following solution:
    /** Counts the number of occurrences of a character in a string.
     *
     *  @param s a string
     *  @param c a character
     *  @return The number of occurrences of {@code c} in {@code s}
     */
    public static int countOccurrences(String s, char c) {
        int count = 0;
        for (int i=0; i<s.length(); i++) {
            if (s.charAt(i) == c) {
                count++;
            }
        }
        return count;
    }
The stream library also allows an alternative implementation, where the method body consists of the following one-liner:
    return (int) s.chars().filter(character -> character == c).count();
The cast to int is due to the fact that the terminal operation count returns a value of type long. A more robust implementation would take precautions against overflow.
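One possible precaution, sketched here under the assumption that failing loudly is preferable to silently wrapping around, is to replace the cast with Math.toIntExact, which throws ArithmeticException when its argument doesn’t fit in an int. The enclosing class name is hypothetical.

```java
class OccurrenceCounter {
    /** Counts the number of occurrences of a character in a string.
     *
     *  @param s a string
     *  @param c a character
     *  @return The number of occurrences of {@code c} in {@code s}
     */
    public static int countOccurrences(String s, char c) {
        long count = s.chars().filter(character -> character == c).count();
        // Throws ArithmeticException instead of silently truncating the value
        return Math.toIntExact(count);
    }
}
```

For a String, whose length is itself an int, the count can never exceed the int range, so here the check mostly serves to make the intent explicit.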
Let’s jump to the final version, including both exterior and structural improvements. First, notice that the algorithm maintains a status for each node, which can take one of three values: fresh (not encountered yet), enqueued (put in the stack but not visited yet), and processed (visited). In the original implementation, this information is encoded in the array of bytes b. The first structural improvement is to use an enumeration for this purpose. Unfortunately, enumerations can’t be local to a method, so you have to put the following declaration in class scope (outside the method):
private enum Status { FRESH, ENQUEUED, PROCESSED };
Now you can refactor the main method, taking advantage of this enumeration, improving variable names, removing implementation comments, and fixing white space and indentation. You should end up with something like this:
    /** Visits the nodes in a directed graph in breadth-first order,
     *  printing the index of each visited node.
     *
     *  @param adjacent     the adjacency matrix
     *  @param vertexCount  the number of vertices
     *  @param sourceVertex the source vertex
     */
    public static void breadthFirst(
            byte[][] adjacent, int vertexCount, int sourceVertex) {
        Status[] status = new Status[vertexCount];
        Arrays.fill(status, Status.FRESH);

        Stack<Integer> stack = new Stack<>();
        stack.push(sourceVertex);
        while (!stack.isEmpty()) {
            int currentVertex = stack.pop();
            System.out.println(currentVertex);
            status[currentVertex] = Status.PROCESSED;
            for (int i=0; i<vertexCount; i++) {
                if (adjacent[currentVertex][i] != 0
                        && status[i] == Status.FRESH) {
                    stack.push(i);
                    status[i] = Status.ENQUEUED;
                }
            }
        }
    }
In the previous method, I kept the Stack class because it doesn’t affect readability, but you should know that the Stack class has been superseded by LinkedList and ArrayDeque.
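As a sketch of that replacement, the following variant swaps Stack for ArrayDeque, accessed through the Deque interface. Since push and pop on a Deque operate on the same end, the visit order and the printed output should match the Stack-based version. The enclosing class name GraphVisit is hypothetical; everything else follows the solution above.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class GraphVisit {
    private enum Status { FRESH, ENQUEUED, PROCESSED }

    /** Same visit as in the solution above, with ArrayDeque replacing Stack. */
    public static void breadthFirst(
            byte[][] adjacent, int vertexCount, int sourceVertex) {
        Status[] status = new Status[vertexCount];
        Arrays.fill(status, Status.FRESH);

        // push and pop both work on the head of the deque, like a stack
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(sourceVertex);
        while (!stack.isEmpty()) {
            int currentVertex = stack.pop();
            System.out.println(currentVertex);
            status[currentVertex] = Status.PROCESSED;
            for (int i = 0; i < vertexCount; i++) {
                if (adjacent[currentVertex][i] != 0
                        && status[i] == Status.FRESH) {
                    stack.push(i);
                    status[i] = Status.ENQUEUED;
                }
            }
        }
    }
}
```

Unlike Stack, which inherits synchronized methods from Vector, ArrayDeque is unsynchronized, which is usually what single-threaded code like this wants.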