Chapter 3. Tackling Complexity

Try to solve the following simple puzzle by listening to your intuition; don’t try to solve it with mathematics or calculation.

A baseball bat and ball cost $1.10. The bat costs a dollar more than the ball. How much does the ball cost?

Take note of your immediate response.

This seems like an easy question. Since this is supposed to be a book about engineering, an intellectually demanding discipline, you probably suspect a trap.

We’ll return to the bat and ball shortly.

This chapter takes a step back and attempts to answer a fundamental question: Why is software development so difficult?

The answer that it proposes is equally fundamental. It has to do with how the human brain works. This is the central thesis of the entire book. Before discussing how to write code that fits in your head, we must discuss what does fit in your head.

Subsequent chapters then put this knowledge into practice.

3.1 Purpose

After reading the first two chapters, you may be underwhelmed. Perhaps you thought that software engineering was going to be a cerebral, sophisticated, arcane, and esoteric discipline. We can easily make it more sophisticated than what you’ve seen so far, but we have to start somewhere. Why not start with the easy parts? As figure 3.1 suggests, climbing a hill starts at ground level.

Figure 3.1: Climbing a hill starts at ground level.

Before we continue I think that we should pause and discuss the problem that we’re trying to address. Which problem is that?

The problem that this book is trying to solve is one of sustainability. Not in the usual, environmental sense of the word, but in the sense that code can sustain the organisation that owns it.

3.1.1 Sustainability

An organisation creates software for various reasons. Often, it’s to make money. Sometimes, it’s to save money. Once in a while, governments institute software projects to supply digital infrastructure for their citizens; there are no direct profits or savings to be gained from the software, but there’s a mission to fulfil.

It often takes a long time to develop a complex piece of software. Months, if not years.

Much software lives for years or decades. During its lifetime, it undergoes changes, gets new features, has bugs fixed, and so on. This requires regular work on the code base.

The software exists to support the organisation in some way or other. When you add new features, or address defects, you support the organisation. It’s best served if you can support it as well today as you could half a year ago, and if you can still support it as well half a year from now.

This is a continual effort. It must be sustainable.

As Martin Fowler explains: without an eye to internal quality, you soon lose the ability to make improvements in reasonable time.

“This is what happens with poor internal quality. Progress is rapid initially, but as time goes on it gets harder to add new features. Even small changes require programmers to understand large areas of code, code that’s difficult to understand. When they make changes, unexpected breakages occur, leading to long test times and defects that need to be fixed.”[32]

This is the situation I believe that software engineering should address. It should make the software development process more regular. It should sustain its organisation. For months and years and decades.

Software engineering should make the software development process more regular. It should sustain its organisation.

3.1.2 Value

Software exists to serve a purpose. It should provide value. I often run into software professionals who seem blinded by that word. If the code you wrote doesn’t provide value, then why did you write it?

A certain focus on value seems warranted. I’ve also met more than one programmer who, if left to him- or herself, would while away hours dilly-dallying with some clever framework of their own devising.

This happens to commercial companies as well. Richard P. Gabriel tells the story of the rise and fall of a company called Lucid[38]. While they were tinkering with the perfect commercial implementation of Common Lisp, C++ came along and took over the market for cross-platform software development languages.

The Lucid people considered C++ inferior to Common Lisp, but Gabriel ultimately came to understand why customers chose it. C++ may be less consistent and more complicated, but it worked and was available to customers. Lucid’s product wasn’t. This led Gabriel to formulate the aphorism that worse is better. Lucid went out of business.

People who tinker with technology without regard for its purpose occupy the right-hand side of figure 3.2.

Figure 3.2: Some programmers never consider the value of the code they write, while others have difficulty seeing past immediately quantifiable results. Sustainability lies somewhere in-between.

The focus on value seems to be a reaction to this mindset. It makes sense to ask whether code serves a purpose. The term value is often used as a proxy for purpose, despite the fact that you can’t measure it. There’s a school of project management based on the idea[87] that you should:

1. form a hypothesis about the impact of the change you’re about to make

2. make the change

3. measure the impact and compare it to your prediction

This isn’t a book about project management, but that seems a reasonable approach. It fits the observations of Accelerate[29].

The notion that code should produce value unfortunately leads to the logical fallacy that code not producing value is prohibited. The notion that worse is better isn’t far off.

This is a fallacy because some code produces no immediately measurable value. You might, on the other hand, be able to measure the absence of it. A straightforward example is security. You may not be able to measure the value of adding authentication to an online system, but you can probably measure the absence of it.

The same goes for Fowler’s argument about internal quality[32]. A lack of architecture is going to be measurable, but only when it’s too late. I’ve seen more than one company go out of business because of poor internal quality.

Sustainability occupies the middle ground in figure 3.2. It discourages technology for technology’s sake, but it also advises against a myopic focus on value.

Software engineering ought to encourage sustainability. By following checklists, by treating warnings as errors, etcetera, you prevent some cruft[32] from forming. None of the methodologies and heuristics presented in this book guarantee a perfect result, but they pull in the right direction. You’ll still have to use your experience and judgment. This is, after all, the art of software engineering.

3.2 Why programming is difficult

What makes software development so hard? There’s more than one reason. One is, as discussed in section 1.1, that we’re using the wrong metaphors. That clouds our thinking, but that’s not the only reason.

Another problem is that a computer is quantitatively different from a brain. Yes, that’s another problematic metaphor.

3.2.1 The brain metaphor

It seems obvious to liken a computer to a brain, and vice versa. Surely, there are superficial similarities. Both can perform calculations. Both can recall events that happened in the past. Both can store and retrieve information.

Is the brain like a computer? Don’t be misled by the obvious similarities.

Is a computer like a brain? I think that there are more differences than similarities. A computer can’t make intuitive inferences. It doesn’t interpret sight and sound well¹. It doesn’t have intrinsic motivation.

¹ So-called AI has made advances in recent years, but the problems researchers are struggling with are still at a level that a toddler can easily solve. Show a computer a children’s book with drawings of farm animals and ask it what’s in each picture.

Is a brain like a computer? Compared to a computer, our ability to calculate is glacially slow, and our memory is so unreliable as to be disreputable. We forget important things. Memories can be fabricated or manipulated[108], and you’re not even aware that this happens. You’re certain that you were at a particular party twenty years ago with your best friend, but she’s sure she never went. Either your memory is wrong, or hers is.

What about working memory? A computer can keep track of millions of things in RAM. Human short-term memory can hold from four to seven² pieces of information[79][108].

² You may also have encountered the magical number seven, plus or minus two. I don’t consider the exact number important. What I do find crucial is that it’s orders of magnitude less than a computer’s working memory.

This has profound implications for programming. Even a modest subroutine can easily create dozens of variables and branching instructions. When you try to understand what source code does, you’re essentially running an emulator of the programming language in your mind. If too many things are going on, you can’t keep track of it all.

How much is too much?

This book uses the number seven as a token for the limit of the brain’s short-term memory. You may be able to keep track of nine things from time to time, but seven represents the concept well.

3.2.2 Code is read more than it’s written

This brings us to a fundamental problem of programming.

You spend more time reading code than writing it.

You write a line of code once and read it multiple times[61]. You rarely get to work with a pristine code base. When you work with an existing code base, you must understand it before you can successfully edit it. When you add a new feature, you read the existing code to figure out how to best reuse what’s already there and to learn what new code you’ll have to add. When you struggle to fix a bug, you must first understand what causes it. You’ll typically spend the majority of your programming time reading existing code.

Optimise code for readability.

You constantly hear about new programming languages, new libraries, new frameworks, or new IDE features that enable you to produce more code faster. As the Lucid story shows, it sells well, but is hardly a good strategy for sustainable software development. More code faster means more code that you’ll have to read. The more you produce, the more you have to read. Automated code generation only makes matters worse.

As Martin Fowler writes about low code quality:

“Even small changes require programmers to understand large areas of code, code that’s difficult to understand.”[32]

Code that’s difficult to understand slows you down. On the other hand, every minute you invest in making the code easier to understand pays itself back tenfold.

3.2.3 Readability

It’s easy to say that you should favour readable code over code that’s easy to write, but what, exactly, is readable code?

Have you ever looked at some code and asked yourself: Who wrote this crap?! Then, once you investigate³, it turns out that it was you?

³ git blame is a great tool for such forensics.

This happens to everyone. When you’re writing code, you’re in a situation where you’re aware of all the context that gives rise to the code. When you’re reading code, all that contextual information is gone.

Ultimately, the code is the only artefact that matters. Documentation may be out of date, or absent. The person who wrote the code may be on vacation, or may have left the organisation.

To add insult to injury, the brain performs poorly when reading and evaluating formal statements. How did you respond to the baseball-bat-and-ball question at the beginning of this chapter?

The number that immediately jumped into your head was 10. That’s the answer that most people give[51].

It’s the wrong answer. If the ball costs 10 cents then the bat must cost $1.10, and the total price would be $1.20. The correct answer is 5 cents.
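The arithmetic is easy to check once you write it down instead of trusting intuition. The following Python snippet is only a sketch of that check; the function and variable names are invented for illustration, and prices are expressed in whole cents to avoid rounding issues.

```python
# Work in whole cents to sidestep floating-point rounding.
TOTAL_CENTS = 110       # bat + ball
DIFFERENCE_CENTS = 100  # bat - ball


def bat_price(ball_cents: int) -> int:
    """The bat costs a dollar more than the ball."""
    return ball_cents + DIFFERENCE_CENTS


def total_price(ball_cents: int) -> int:
    """Price of bat and ball together, given the ball's price."""
    return ball_cents + bat_price(ball_cents)


# The intuitive guess: a 10-cent ball implies a $1.10 bat and a $1.20 total.
intuitive_total = total_price(10)

# The correct answer: a 5-cent ball implies a $1.05 bat and a $1.10 total.
correct_total = total_price(5)

# Solving ball + (ball + 100) = 110 for the ball's price.
ball_cents = (TOTAL_CENTS - DIFFERENCE_CENTS) // 2
```

Here intuitive_total overshoots the stated total by ten cents, while correct_total matches it.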

The point is that we make mistakes all the time. When we solve trivial maths problems, and when we read code.

How do you write readable code? You can’t trust your intuition. You’ll need something more actionable. Heuristics, checklists... software engineering. We’ll return to this topic throughout the book.

3.2.4 Intellectual work

Have you ever driven your car somewhere, and after ten minutes of driving, you suddenly ‘wake up’ and, horrified, ask yourself: how did I get here?

I have. Not that I’ve literally fallen asleep behind the wheel, but I’ve been so lost in thought that I’ve been oblivious that I was driving. I’ve also accomplished the feat of bicycling past my own home, as well as trying to unlock the door to my downstairs neighbour instead of my own.

Based on these confessions I realise that you probably don’t want to get into a car with me, but my point isn’t that I’m easily distracted. The point is that the brain works even when you aren’t aware of it.

You know that your brain controls your breathing, even when you aren’t thinking of it. It takes care of a lot of motor functions without your explicit control. It seems that it does much more than that.

After one of the incidents where I’d found myself behind the wheel of my car, wondering how I got where I was, I was as astounded as I was appalled. I’d been driving in my home city of Copenhagen, and I must have performed a series of complex manoeuvres to get where I was. Stopping for red, turning left, turning right without hitting any of the city’s omnipresent bicyclists, correctly navigating to my destination. Yet I had no recollection of doing any of that.

Your conscious awareness isn’t a required ingredient for complex intellectual work.

Have you ever been in the zone while programming? Looking up from the screen and realising that it’s suddenly dark outside and that you’ve been at it for hours? In psychology, this mental state is called flow[51]. In it, you’re so fully engrossed in your activity that you lose awareness of the self.

You can program without deliberate thinking. Of course, you can also write code while being aware that you’re doing it. The point is that a lot goes on in your brain that you’re not explicitly aware of. Your brain performs the work; your consciousness may be nothing but a passive spectator.

You’d think that intellectual work would be a hundred percent deliberate thinking, but the truth is probably that a lot of involuntary activity also takes place. Psychologist and Nobel laureate Daniel Kahneman suggests a model of thought comprised of two systems: System 1 and System 2.

“System 1 operates automatically and quickly, with little or no effort and no sense of voluntary control.

“System 2 allocates attention to the effortful mental activities that demand it, including complex computations. The operations of System 2 are often associated with the subjective experience of agency, choice, and concentration.”[51]

You probably think of programming as belonging exclusively to the realm of System 2, but that doesn’t have to be the case. It seems that System 1 is always running in the background, trying to make sense of the code it’s looking at. The problem is that System 1 is fast, but not particularly accurate. It can easily make incorrect inferences. That’s what’s happening when 10 is the first number that comes into your mind when confronted with the baseball-bat-and-ball puzzle.

In order to organise source code such that our brains can make sense of it, you have to keep System 1 from going off the rails. Kahneman also writes:

“An essential design feature [of System 1] is that it represents only activated ideas. Information not retrieved (even unconsciously) from memory might as well not exist. System 1 excels at constructing the best possible story that incorporates ideas currently activated, but it does not (cannot) allow for information it does not have.

“The measure of success for System 1 is the coherence of the story it manages to create. The amount and quality of the data on which the story is based are largely irrelevant. When information is scarce, which is a common occurrence, System 1 operates as a machine for jumping to conclusions.”[51]

There’s a machine for jumping to conclusions in your brain⁴, and it’s looking at your code. You’d better organise code so that the relevant information is activated. As Kahneman puts it, what you see is all there is (WYSIATI)[51].

⁴ Why is System 1 running all the time, while System 2 may not be? One reason could be that effortful thinking burns more glucose[51]. That would imply that System 1 is an energy-saving mechanism.

This already goes a long way towards explaining why global variables and hidden side effects make code obscure. A global variable is typically not visible when you look at a piece of code. Even if your System 2 knows about it, that knowledge is not activated, so System 1 doesn’t take it into account.

Place related code close together. All the dependencies, variables, and decisions required should be visible at the same time. This is a theme that runs throughout the book, so you’ll see plenty of examples, particularly in chapter 7.
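As a small illustration of the difference, consider the following Python sketch. It isn’t taken from the book’s code base; the names and the 25 percent rate are hypothetical and exist only to contrast a hidden dependency with a visible one.

```python
# Hypothetical example: all names and the rate are invented for illustration.
VAT_RATE = 0.25  # module-level state, possibly defined far from where it's used


def price_with_hidden_dependency(net: float) -> float:
    # Nothing in the signature reveals that the result depends on VAT_RATE.
    # A reader looking only at a call site can't see what influences the result.
    return net * (1 + VAT_RATE)


def price_with_visible_dependency(net: float, vat_rate: float) -> float:
    # Everything that influences the result is in view at the call site.
    return net * (1 + vat_rate)


hidden = price_with_hidden_dependency(100.0)
visible = price_with_visible_dependency(100.0, vat_rate=0.25)
```

Both calls return the same number, but only the second one tells the reader why.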

3.3 Towards software engineering

The purpose of software engineering should be to support the organisation that owns the software. You should be able to make changes at a sustainable pace.

But writing code is difficult because it’s so intangible. You spend more time reading code than writing it, and the brain is easily misled, even by unremarkable matters like the bat-and-ball problem.

Software engineering must address this problem.

3.3.1 Relationship to computer science

Can computer science help? I don’t see why not, but computer science isn’t (software) engineering, just like physics isn’t the same as mechanical engineering.

Such disciplines can interact, but they aren’t the same. Successful practices can provide inspiration and insight for scientists, and results from science can be applied to engineering, as suggested by figure 3.3.

For example, results from computer science can be encapsulated in reusable packages.

Figure 3.3: Science and engineering interact, but aren’t the same.

I had a couple of years of professional experience with software development before I learned about sorting algorithms. I don’t have a formal education in computer science; I taught myself to code. If I needed to sort an array in C++, Visual Basic, or VBScript, I’d call a method.

You don’t have to be able to implement quicksort or merge sort to sort collections. You don’t have to know about hash indexes, SSTables, LSM-trees, and B-trees to query a database⁵.

⁵ These are some of the data structures that power databases[55].

Computer science helps the software development industry to progress, but the knowledge gained there can often be packaged into reusable software. It doesn’t hurt to know about computer science, but you don’t have to. You can still do software engineering.
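To make the point concrete, here’s a minimal Python sketch that sorts a list and queries a database using only the standard library. The sample data and the table name are made up for illustration. The calling code never mentions a sorting algorithm or an index structure; those details stay inside the packages.

```python
import sqlite3

words = ["pear", "apple", "fig"]  # made-up sample data

# Sorting is a single call; the algorithm is the library's concern.
sorted_words = sorted(words)

# Querying works the same way: the storage engine's data structures
# stay hidden behind SQL.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE fruit (name TEXT PRIMARY KEY)")
connection.executemany(
    "INSERT INTO fruit (name) VALUES (?)",
    [(word,) for word in words],
)
rows = connection.execute(
    "SELECT name FROM fruit WHERE name > ? ORDER BY name",
    ("b",),
).fetchall()
connection.close()

names_after_b = [name for (name,) in rows]
```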

3.3.2 Humane code

Sorting algorithms can be encapsulated and distributed as reusable libraries. Sophisticated storage and retrieval data structures can be packaged as general-purpose database software, or offered as cloud-based infrastructure.

You still have to write code.

You have to organise it in a sustainable manner. You must structure it in such a way that it fits in your brain.

As Martin Fowler put it:

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.”[34]

The brain comes with cognitive constraints that are completely different from a computer’s limits. A computer can keep track of millions of things in RAM. Your brain can keep track of seven.

A computer will only make decisions based on the information it’s instructed to consult. Your brain tends to jump to conclusions. What you see is all there is.

Obviously, you must write code such that the resulting software works as desired. That’s no longer the main problem of software engineering. The challenge is to organise it so that it fits in your brain. Code must be humane.

This implies writing small, self-contained functions. Throughout this book, I’ll use the number seven as a proxy for the limits of human short-term memory. Humane code, then, implies fewer than seven dependencies, that cyclomatic complexity is at most seven, and so on.
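A threshold like seven only helps if you can check it. The following Python sketch estimates cyclomatic complexity for a snippet of source code by counting branching constructs with the standard ast module. It’s a rough approximation under stated assumptions, not a complete definition of the metric: the set of counted node types, the function name estimate_complexity, and the sample function are all choices made for this illustration, and the count covers the whole snippet rather than each function separately.

```python
import ast

# Node types that each add one decision point. This selection is a
# simplification for illustration; real tools count more constructs.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def estimate_complexity(source: str) -> int:
    """Return 1 plus the number of branching constructs found in source."""
    tree = ast.parse(source)
    complexity = 1  # a straight-line snippet has exactly one path
    for node in ast.walk(tree):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has three operands and adds two decisions.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical function to analyse; it exists only as sample input.
SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

THRESHOLD = 7  # the token limit used throughout this book

sample_complexity = estimate_complexity(SAMPLE)
fits_in_head = sample_complexity <= THRESHOLD
```

The sample contains two if statements, one for loop, and one and operator, so the estimate is five, which stays below the threshold.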

The devil’s in the details, though, so I’ll show you plenty of examples.

3.4 Conclusion

The core problem that software engineering should solve is that software is so complex that it doesn’t fit in the human brain. Fred Brooks offered this analysis in 1986:

“Many of the classical problems of developing software products derive from this essential complexity and its nonlinear increases with size [...] From the complexity comes the difficulty of enumerating, much less understanding, all the possible states of the program”[14]

I use the term complexity in the same way that Rich Hickey uses it[45]: as an antonym to simplicity. Complex means ‘assembled from parts’, as opposed to simple, which implies unity.

The human brain can deal with limited complexity. Our short-term memory can only keep track of seven objects. If we don’t pay attention, we can easily write code that handles more than seven things at once. The computer doesn’t care, so it’s not going to stop us.

Software engineering should be the deliberate process of preventing complexity from growing.

Perhaps you recoil from all of this. You may think that it’s going to slow you down.

Yes, that’s the point. To paraphrase J.B. Rainsberger[85], you probably need slowing down. The faster you type, the more code you make that everyone has to maintain. Code isn’t an asset; it’s a liability[76].

As Martin Fowler argues, it’s by applying good architecture that you can keep a sustainable pace[32]. Software engineering is a means to that end. It’s an attempt to shift software development from being a pure art towards being a methodology.
