Insist on Correctness

Tip 2

Insist on Correctness

[White Belt] These considerations are essential to your coding from day one.

In toy programs it’s easy to tell the difference between correct and incorrect. Does factorial(n) return the correct number? That’s easy to check: one number goes in, and another number comes out. But in big programs, there are potentially many inputs—not just function parameters, but also state within the system—and many outputs or other side effects. That’s not so easy to check.

Isolation and Side Effects

Textbooks love to use math problems for programming examples, partly because computers are good at math, but mostly because it’s easy to reason about numbers in isolation. You can call factorial(5) all day long, and it’ll return the same thing. Network connections, files on disk, or (especially) users have a nasty habit of not being so predictable.

When a function changes something outside its local variables—for example, it writes data to a file or a network socket—it’s said to have side effects. The opposite, a pure function, always returns the same thing when given the same arguments and does not change any outside state. Obviously, pure functions are a lot easier to test than functions with side effects.
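As a minimal sketch of the distinction (the method names passing? and record_result are hypothetical, not part of the grade importer shown later), the first method below is pure, while the second has a side effect because it modifies an array owned by its caller:

```ruby
# Pure: the result depends only on the argument, and nothing outside
# the method is read or changed.
def passing?(grade)
  grade >= 60
end

# Impure: appends to a collection owned by the caller (a side effect).
def record_result(grade, log)
  log << "grade=#{grade} passing=#{passing?(grade)}"
  nil
end

log = []
record_result(75, log)
record_result(75, log)
```

Calling passing?(75) twice tells you nothing new; calling record_result(75, log) twice leaves log with two entries, so a test has to inspect the state around the call as well as the return value.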

Most programs have a mix of pure and impure code; however, not many programmers think about which parts are which. You might see something like this:

ReadStudentGrades.rb
	def self.import_csv(filename)
	  File.open(filename) do |file|
	    file.each_line do |line|
	      name, grade = line.split(',')

	      # Convert numeric grade to letter grade
	      grade = case grade.to_i
	              when 90..100 then 'A'
	              when 80..89 then 'B'
	              when 70..79 then 'C'
	              when 60..69 then 'D'
	              else 'F'
	              end

	      Student.add_to_database(name, grade)
	    end
	  end
	end

This function is doing three things: reading lines from a file (impure), doing some analysis (pure), and updating a global data structure (impure). As this is written, you can’t easily test any one piece.

Said this way, it’s obvious that each task should be isolated so it can be tested separately. We’ll discuss the file part shortly in Interactions. Let’s pull the analysis bit into its own method:

ReadStudentGrades2.rb
	def self.numeric_to_letter_grade(numeric)
	  case numeric
	  when 90..100 then 'A'
	  when 80..89 then 'B'
	  when 70..79 then 'C'
	  when 60..69 then 'D'
	  when 0..59 then 'F'
	  else raise ArgumentError.new(
	    "#{numeric} is not a valid grade")
	  end
	end

Now numeric_to_letter_grade is a pure function that’s easy to test in isolation:

ReadStudentGrades2.rb
	def test_convert_numeric_to_letter_grade
	  assert_equal 'A',
	    Student.numeric_to_letter_grade(100)
	  assert_equal 'B',
	    Student.numeric_to_letter_grade(85)
	  assert_equal 'F',
	    Student.numeric_to_letter_grade(50)
	  assert_equal 'F',
	    Student.numeric_to_letter_grade(0)
	end

	def test_raise_on_invalid_input
	  assert_raise(ArgumentError) do
	    Student.numeric_to_letter_grade(-1)
	  end

	  assert_raise(ArgumentError) do
	    Student.numeric_to_letter_grade("foo")
	  end

	  assert_raise(ArgumentError) do
	    Student.numeric_to_letter_grade(nil)
	  end
	end

This example may be trivial, but what happens when the business logic is complex and it’s buried in a function that has five different side effects? (Answer: it doesn’t get tested very well.) Teasing apart the knots of pure and impure code can help you test correctness both for new code and when maintaining legacy code.
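The line-splitting step can be teased out the same way. The following is a sketch only: parse_grade_line is a hypothetical helper, not part of the original listing, and it assumes each line holds exactly a name and an integer grade separated by a comma.

```ruby
# Hypothetical helper: turns one CSV line into [name, numeric_grade].
# Pure: no file or database access, so plain strings are enough to test it.
def parse_grade_line(line)
  name, grade = line.chomp.split(',', 2)
  if name.nil? || name.strip.empty? || grade.nil?
    raise ArgumentError, "malformed line: #{line.inspect}"
  end
  # Integer() raises ArgumentError for non-numeric text such as "abc" or "".
  [name.strip, Integer(grade.strip, 10)]
end

p parse_grade_line("Alice,99\n")
```

With the parsing isolated, malformed input (a missing comma, an empty name, a non-numeric grade) can be exercised without creating a single file on disk.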

Interactions

Now what about those side effects? It’s a huge pain to augment your code with constructs like “If in test mode, don’t actually connect to the database….” Instead, most languages have a mechanism for creating test doubles that take the place of the resource your function wants to use.

Let’s say we rewrote the previous example so that import_csv() handles only the file processing and passes the rest of the work off to Student.new():

ReadStudentGrades3.rb
	def self.import_csv(filename)
	  File.open(filename) do |file|
	    file.each_line do |line|
	      name, grade = line.split(',')

	      Student.new(name, grade.to_i)
	    end
	  end
	end

What we need is a test double for the file, something that will intercept the call to File.open() and yield some canned data. We need the same for Student.new(), ideally intercepting the call in a way that verifies the data passed into it.

Ruby’s Mocha framework allows us to do exactly this:

ReadStudentGrades3.rb
	def test_import_from_csv
	  File.expects(:open).yields('Alice,99')
	  Student.expects(:new).with('Alice', 99)

	  Student.import_csv(nil)
	end

This illustrates two points about testing interactions between methods:

Unit tests must not pollute the state of the system by leaving stale file handles around, objects in a database, or other cruft. A framework for test doubles should let you intercept these.
This kind of test double is known as a mock object, which verifies expectations you program into it. In this example, if Student.new() was not called or was called with different parameters than we specified in the test, Mocha would fail the test.
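The same interaction can also be checked without a mocking library by passing the collaborators in as arguments. The sketch below is an assumption about how the import could be restructured, not the original code: import_grades and RecordingFactory are hypothetical names, and StringIO from Ruby's standard library stands in for the file.

```ruby
require 'stringio'

# Hypothetical variant of import_csv that receives its collaborators
# instead of reaching for File and Student directly.
def import_grades(io, factory)
  io.each_line do |line|
    name, grade = line.chomp.split(',')
    factory.new(name, grade.to_i)
  end
end

# Hand-rolled test double: records every call to new so a test can
# inspect the arguments afterward.
class RecordingFactory
  attr_reader :calls

  def initialize
    @calls = []
  end

  def new(name, grade)
    @calls << [name, grade]
  end
end

factory = RecordingFactory.new
import_grades(StringIO.new("Alice,99\nBob,72\n"), factory)
p factory.calls # prints [["Alice", 99], ["Bob", 72]]
```

Strictly speaking, RecordingFactory is closer to a spy than to a Mocha-style mock: it records what happened and leaves the checking to the test, whereas a mock is given its expectations up front and fails the test on its own.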

Of course, Ruby and Mocha make the problem too easy. What about those of us who suffer with million-line C programs? Even C can be instrumented with test doubles, but it takes more effort.

You can generalize the problem to this: how do you replace one set of functions at runtime with another set of functions? (If you’re nerdy enough to think “That sounds like a dynamic dispatch table,” you’re right.) Sticking with the example of opening and reading a file, here’s one approach:

TestDoubles.c
	#include <stdio.h>

	struct fileops {
	    FILE* (*fopen)
	        (const char *path,
	         const char *mode);
	    size_t (*fread)
	        (void *ptr,
	         size_t size,
	         size_t nitems,
	         FILE *stream);
	    // ...
	};

	FILE*
	stub_fopen(const char *path, const char *mode)
	{
	    // Just return fake file pointer; it must never be dereferenced
	    return (FILE*) 0x12345678;
	}

	// ...

	struct fileops real_fileops = {
	    .fopen = fopen
	};

	struct fileops stub_fileops = {
	    .fopen = stub_fopen
	};

The fileops structure has pointers to functions that match the standard C library API. In the case of the real_fileops structure, we fill in these pointers with the real functions. In the case of stub_fileops, they point to our own stubbed-out versions. Using the structure isn’t much different from just calling a function:

TestDoubles.c
	// Assume that ops is a function parameter or global
	struct fileops *ops;
	ops = &stub_fileops;

	FILE* file = (ops->fopen)("foo", "r");
	// ...

Now the program can flip between “real mode” and “test mode” by just reassigning a pointer.

On January 15, 1990, AT&T’s phone network was humming along just fine. Until, that is, at 2:25 p.m. when a phone switch performed a self-test operation and reset itself. Switches don’t reset often, but the network can handle it, and the switch takes a mere four seconds to reset and resume normal operation. Only this time, other switches started to reset, too, and within seconds all 114 of AT&T’s backbone switches were endlessly resetting themselves. The mighty AT&T phone system ground to a halt.

It turns out that when the first switch reset itself, it sent a message to neighboring switches saying it was resuming normal operation. The exchange of messages caused the neighboring switches to crash. They, in turn, automatically reset and sent messages to their neighbors about resuming operation, and so on…thus creating an endless reset/resume/reset cycle.

It took AT&T engineers nine hours to get the phone system working again. It’s estimated the outage cost AT&T $60 million in dropped calls, and it’s impossible to gauge the economic damage to others who relied on their phones to do business.^[3]

What was the cause of the problem? A mistaken break statement. In C, someone had written this:

	if (condition) {
	    // do stuff...
	}
	else {
	    break;
	}

On the surface, the code reads like “If the condition is true, then do stuff; else, do nothing.” But in C, break does not break out of an if() statement; it breaks out of other blocks like while() or switch(). What happened is that the break broke out of an enclosing block much too early, corrupted a data structure, and caused the phone switch to reset. Because all the phone switches were running the same software and this bug was in the code that handled messages from peers about a reset recovery, the failure cascaded back and forth through the whole network.

Type Systems

When you refer to something like 42 in code, is that a number, a string, or what? If you have a function like factorial(n), what kind of thing is supposed to go into it, and what’s supposed to come out? The type of elements, functions, and expressions is very important. How a language deals with types is called its type system.

The type system can be an important tool for writing correct programs. For example, in Java you could write a method like this:

	public long factorial(long n) {
	    // ...
	}

In this case, both the reader (you) and the compiler can easily deduce that factorial() should take a number and return a number. Java is statically typed because it checks types when code is compiled. Code that tries to pass in a string simply won’t compile.

Compare this with Ruby:

	def factorial(n)
	  # ...
	end

What is acceptable input to this method? You can’t tell just by looking at the signature. Ruby is dynamically typed because it waits until runtime to verify types. This gives you tremendous flexibility but also means that some failures that would be caught at compile time won’t be caught until runtime.

Both approaches to types have their pros and cons, but for the purposes of correctness, keep in mind the following:

Static types help to communicate the proper use of functions and provide some safety from abuse. If your factorial function takes a long and returns a long, the compiler won’t let you pass it a string instead. However, it’s not a magic bullet: if you call factorial(-1), the type system won’t complain, so the failure will happen at runtime.
To make good use of a static type system, you have to play by its rules. A common example is the use of const in C++: when you start using const to declare that some things cannot be changed, then the compiler gets really finicky about every function properly declaring the const-ness of its parameters. It’s valuable if you completely play by the rules; it’s just a huge hassle if your commitment is anything less than 100 percent.
Dynamically typed languages may let you play fast and loose with types, but it still doesn’t make sense to call factorial() on a string. You need to use contract-oriented unit tests, discussed in Tip 3, Design with Tests, to ensure that your functions adequately check the sanity of their parameters.
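As a sketch of what checking the sanity of parameters can look like in Ruby (the validation rule and error message here are illustrative choices, not taken from the original text), factorial can reject anything that is not a non-negative Integer before doing any work:

```ruby
# Sketch: factorial that refuses arguments it cannot sensibly handle.
def factorial(n)
  unless n.is_a?(Integer) && n >= 0
    raise ArgumentError, "expected a non-negative Integer, got #{n.inspect}"
  end
  # An empty range (n == 0) reduces to the initial value, 1.
  (1..n).reduce(1, :*)
end

p factorial(5) # prints 120
```

A unit test can then pin down the contract from both sides: valid inputs such as 0 and 5 return the expected values, and inputs such as -1, "5", or nil raise ArgumentError instead of failing somewhere deeper in the code.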

Regardless of the language’s type system, get in the habit of documenting your expectations of each parameter—they usually aren’t as self-explanatory as the factorial(n) example. See Tip 6, Be Stylish for further discussion of documentation and code comments.

The Misnomer of 100 Percent Coverage

A common (but flawed) metric for answering “Have I tested enough?” is code coverage. That is, what percentage of your application code is exercised by running the unit tests? Ideally, every line of code in your application gets run at least once while running the unit tests—coverage is 100 percent.

Less than 100 percent coverage means you have some cases that are not tested. Junior programmers will assume that the converse is true: when they hit 100 percent coverage, they have enough tests. However, that’s not true: 100 percent coverage absolutely does not mean that all cases are covered.

Consider the following C code:

BadStringReverse.c
	#include <assert.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	void reverse(char *str) // BAD BAD BAD
	{
	    int len = strlen(str);
	    char *copy = malloc(len);

	    for (int i = 0; i < len; i++) {
	        copy[i] = str[len - i - 1];
	    }
	    copy[len] = 0;

	    strcpy(str, copy);
	}

	int main()
	{
	    char str[] = "fubar";
	    reverse(str);
	    assert(strcmp(str, "rabuf") == 0);
	    printf("Ta-da, it works!\n"); // Not quite
	}

The test covers 100 percent of the reverse function. Does that mean the function is correct? No: the memory allocated by malloc() is never freed, and the allocated buffer is one byte too small.

Don’t be lulled into complacency by 100 percent coverage: it means nothing about the quality of your code or your tests. Writing good tests, just like writing good application code, requires thought, diligence, and good judgment.
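The same trap exists in any language. In the Ruby sketch below (average is a hypothetical method invented for illustration), the single check executes every line of the method, so a line-coverage tool would report 100 percent, yet two obvious cases are never exercised:

```ruby
# Hypothetical example: the one check at the bottom runs every line of
# average, so line coverage is complete.
def average(numbers)
  numbers.sum / numbers.size
end

raise 'unexpected average' unless average([2, 4]) == 3

# Cases the check never exercises:
#   average([])      raises ZeroDivisionError
#   average([1, 2])  returns 1, because Integer division truncates
```

Coverage counts which lines ran; it says nothing about which inputs were tried.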

Less Than 100 Percent Coverage

Some cases can be extremely hard to unit test. Here are some examples:

Kernel drivers that interface with hardware rely on hardware state changes outside your code’s control, and creating a high-fidelity test double is near impossible.
Multithreaded code can have timing problems that require sheer luck to fall into.
Third-party code provided as binaries often can’t be provoked to return failures at will.

So, how do you get 100 percent coverage from your tests? With enough wizardry, it’s surely possible, but is it worth it? That’s a value judgment that may come down to no. In those situations, discuss the issue with your team’s tech lead. They may be able to think of a test method that’s not too painful. If nothing else, you will need them to review your code.

Don’t be dissuaded if you can’t hit 100 percent, and don’t use that as an excuse to punt on testing entirely. Prove what’s reasonable with tests; subject everything else to review by a senior programmer.

Actions

Look up the unit testing frameworks available for each programming language you use. Most languages will have both the usual bases covered (assertions, test setup, and teardown) and some facility for fake objects (mocks, stubs). Install any tools you need to get these running.
This tip has bits and pieces of a program that reads lines of comma-separated data from a file, splits them apart, and uses them to create objects. Create a program that does this in the language of your choice, complete with unit tests that assure the correctness of every line of application code.
