Have you ever bought something significant, like a house, a plot of land, a company, or a car?
If so, you probably signed a contract. A contract stipulates a set of rights and obligations on both sides. The seller promises to hand over the property. The buyer commits to pay the specified amount at a prescribed time. Seller may give some guarantees as to the state of the property. Buyer may promise not to hold the seller liable for damages after the transaction completes. And so on.
A contract introduces and formalises a level of trust that would otherwise not be present. Why should you trust a stranger? It’s too risky to do that, but the institution of a contract fills the gap.
That’s what encapsulation is about. How can you trust an object to behave reasonably? By making objects engage in contracts.
The previous chapter closed without resolving an unbearable tension. Listing 4.15 shows how the Post method saves a hard-coded reservation while it ignores the data it received.
This is a defect. To fix it, you have to add some code, and that puts us in a good position to start discussing encapsulation. Since this kills two birds with one stone, let’s do that first.
Don’t forget to use a driver if you can. The hard-coded values in listing 4.15 were driven by a single test case. How can you improve the situation?
It’s tempting to just fix the code. After all, what has to happen is hardly rocket science. When I coach teams, I constantly have to remind developers to slow down. Write production code as answers to drivers like tests or analysers. Moving forward in small steps reduces the risk of mistakes.
When you edit code, you transform it from one valid state to another. This doesn’t happen atomically. During modification, the code may not compile. Keep the time when the code is invalid as short as possible, as implied by figure 5.1. This reduces the number of things your brain has to keep track of.
In 2013 Robert C. Martin published a prioritised list of code transformations[64]. While he only intended it as a preliminary suggestion, I find it useful as a guideline. It goes like this:
• ({}→nil) no code at all → code that employs nil
• (nil→constant)
• (constant→constant+) a simple constant to a more complex constant
• (constant→scalar) replacing a constant with a variable or an argument
• (statement→statements) adding more unconditional statements
• (unconditional→if) splitting the execution path
• (scalar→array)
• (array→container)
• (statement→recursion)
• (if→while)
• (expression→function) replacing an expression with a function or algorithm
• (variable→assignment) replacing the value of a variable
The list is ordered roughly so that the simpler transformations are at the top, and the more complex changes are at the bottom.
Don’t worry if some of the words seem cryptic or obscure. Like so many other guidelines in this book, they’re food for thought rather than rigid rules. The point is to move in small increments, for example by using a hard-coded constant instead of null¹, or by turning a singular value into an array.
¹ In the article[64] Robert C. Martin calls an undefined value nil, but from the context, it seems that he means null. Some languages (e.g. Ruby) call null nil.
At the moment, the Post method saves a constant, but it ought to save data from dto: a set of scalar values. This is the constant→scalar transformation (or a set of them).
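As a minimal sketch of what a constant→scalar step looks like in isolation, consider a hypothetical Greeter class. It isn't part of the restaurant code base, and its names are assumptions made up for illustration. The first method returns a hard-coded constant; the second replaces part of that constant with an argument, which is the same kind of step the Post method needs.

```csharp
using System;

// Hypothetical example; not part of the book's code base.
public static class Greeter
{
    // Before: the method returns a constant, driven by a single test case.
    public static string GreetConstant()
    {
        return "Hello, Julia";
    }

    // After (constant→scalar): part of the constant is replaced by an argument.
    public static string Greet(string name)
    {
        return "Hello, " + name;
    }
}
```

A second test case with a different name is what would force the move from GreetConstant to Greet, just as a second [InlineData] case forces Post to stop ignoring its input.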
The point with the Transformation Priority Premise is that we should aim to make changes to our code using the small transformations from the list.
Since we’ve identified the change we’re aiming for as one of the warranted changes, let’s go ahead and make it.
The idea behind the Transformation Priority Premise is that once you’ve identified which transformation to aim for, you should write a test driving that change.
You could write a new test method, but it’d be a duplicate of listing 4.10, just with some different property values for the dto. Instead, turn the existing test into a Parametrised Test[66].
Listing 5.1 shows the change. Instead of the [Fact] attribute, it uses the [Theory]² attribute to indicate a Parametrised Test, as well as two [InlineData] attributes that supply the data. Notice that the top [InlineData] attribute supplies the same test values as listing 4.10, while the second attribute contains a new test case.
² This is xUnit.net’s API for Parametrised Tests. Other frameworks provide that feature in similar or not-so-similar ways. A few unit testing frameworks don’t support this at all. In my opinion, that’s reason enough to find another framework. The ability to write Parametrised Tests is one of the most important features of a unit testing framework.
One thing that should bother you is that the assertion phase of the test now seems to duplicate what would essentially be the implementation code. That’s clearly not perfect. You shouldn’t trust your brain to write production code without some sort of double-entry bookkeeping, but that only works if the two views are different. That’s not the case here.
    [Theory]
    [InlineData("2023-11-24 19:00", "[email protected]", "Julia Domna", 5)]
    [InlineData("2024-02-13 18:15", "[email protected]", "Xenia Ng", 9)]
    public async Task PostValidReservationWhenDatabaseIsEmpty(
        string at,
        string email,
        string name,
        int quantity)
    {
        var db = new FakeDatabase();
        var sut = new ReservationsController(db);

        var dto = new ReservationDto
        {
            At = at,
            Email = email,
            Name = name,
            Quantity = quantity
        };
        await sut.Post(dto);

        var expected = new Reservation(
            DateTime.Parse(dto.At, CultureInfo.InvariantCulture),
            dto.Email,
            dto.Name,
            dto.Quantity);
        Assert.Contains(expected, db);
    }
Perfect, however, is the enemy of the good. While this change introduces a problem in the test code, its purpose is to demonstrate that the Post method doesn’t work. And indeed, when you run the test suite, the new test case fails.
Listing 5.2 shows the simplest transformation you can make to the Post method to make all tests pass.
    public async Task Post(ReservationDto dto)
    {
        if (dto is null)
            throw new ArgumentNullException(nameof(dto));

        var r = new Reservation(
            DateTime.Parse(dto.At!, CultureInfo.InvariantCulture),
            dto.Email!,
            dto.Name!,
            dto.Quantity);
        await Repository.Create(r).ConfigureAwait(false);
    }
This seems like an improvement compared to listing 4.15, but there are still issues that you ought to address. Fight the urge to make further improvements right now. By adding the test case shown in listing 5.1, you’ve driven a small transformation. While the code isn’t perfect, it’s improved. All tests pass. Commit the changes to Git and push them through your deployment pipeline.
If you’re wondering about the exclamation marks after dto.At, dto.Email, and dto.Name, those are some of the remaining imperfections.
This code base uses C#’s nullable reference types feature, and most of the dto properties are declared as nullable. Without the exclamation mark, the compiler complains that the code accesses a nullable value without checking for null. The ! operator suppresses the compiler’s complaints. With the exclamation marks, the code compiles.
This is a terrible hack. While the code compiles, it could easily cause a NullReferenceException at run time. Trading a compile-time error for a run-time exception is a poor trade-off. We should do something about that.
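To make the risk concrete, here is a small sketch that assumes a hypothetical NoteDto class with a single nullable property. It stands in for ReservationDto and isn't part of the code base. The ! operator only silences the compiler; it performs no check at run time.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for a DTO with a nullable property.
public sealed class NoteDto
{
    public string? Text { get; set; }
}

public static class NullForgivingDemo
{
    // Compiles without warnings because of '!', but the operator
    // does nothing at run time.
    public static int UnsafeLength(NoteDto dto)
    {
        return dto.Text!.Length;
    }

    // Returns true if UnsafeLength throws NullReferenceException
    // when Text was never assigned.
    public static bool ThrowsOnMissingText()
    {
        try
        {
            UnsafeLength(new NoteDto());
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

Calling ThrowsOnMissingText() returns true: the code compiled cleanly, yet dereferencing the unassigned property fails as soon as it runs.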
Another potential run-time exception lurking in listing 5.2 is that there’s no guarantee that the DateTime.Parse method call succeeds. We should do something about that as well.
With the code in listing 5.2, what happens if some client posts a JSON document without an at property?
You might think that Post would throw a NullReferenceException, but in reality, DateTime.Parse throws an ArgumentNullException instead. At least that method performs input validation. You should do the same.
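You can observe this behaviour outside the controller. The following sketch uses a hypothetical helper, ParseDemo.ExceptionFromParse, that reports which exception type DateTime.Parse throws for a given input. The helper is an illustration only, not part of the code base.

```csharp
#nullable enable
using System;
using System.Globalization;

public static class ParseDemo
{
    // Returns the name of the exception type thrown by DateTime.Parse,
    // or "none" if parsing succeeds.
    public static string ExceptionFromParse(string? candidate)
    {
        try
        {
            // '!' mirrors listing 5.2: it silences the compiler only.
            DateTime.Parse(candidate!, CultureInfo.InvariantCulture);
            return "none";
        }
        catch (Exception e)
        {
            return e.GetType().Name;
        }
    }
}
```

With a null argument the helper returns "ArgumentNullException", and with a string such as "not a date" it returns "FormatException". Either way, an unhandled exception escapes from Post.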
The ASP.NET framework translates an unhandled exception to a 500 Internal Server Error response. That’s not what we want in this case.
When input is invalid, an HTTP API should return 400 Bad Request[2]. That’s not what happens. Add a test that reproduces the problem.
    [Theory]
    [InlineData(null, "[email protected]", "Jay Xerxes", 1)]
    public async Task PostInvalidReservation(
        string at,
        string email,
        string name,
        int quantity)
    {
        var response =
            await PostReservation(new { at, email, name, quantity });
        Assert.Equal(HttpStatusCode.BadRequest, response.StatusCode);
    }
Listing 5.3 shows how to test what happens when the reservation date and time is missing. You may wonder why I wrote it as a [Theory] with only a single test case. Why not a [Fact]?
I admit that I cheated a bit. Once again, the art of software engineering manifests itself. This is based on the shifting sands of individual experience[4] - I know that I’m going to add more test cases soon, so I find it easier to start with a [Theory].
The test fails because the response’s status code is 500 Internal Server Error.
You can easily pass the test with the code in listing 5.4. The major difference from listing 5.2 is the addition of the Null Guard.
The C# compiler is clever enough to detect the Guard Clause, which means that you can remove the exclamation mark after dto.At.
You can add another test case where the email property is missing, but let’s fast-forward one more step. Listing 5.5 contains two new test cases.
The bottom [InlineData] attribute contains a test case with a missing email property, while the middle test case supplies an at value that’s not a date and time.
    public async Task<ActionResult> Post(ReservationDto dto)
    {
        if (dto is null)
            throw new ArgumentNullException(nameof(dto));
        if (dto.At is null)
            return new BadRequestResult();

        var r = new Reservation(
            DateTime.Parse(dto.At, CultureInfo.InvariantCulture),
            dto.Email!,
            dto.Name!,
            dto.Quantity);
        await Repository.Create(r).ConfigureAwait(false);

        return new NoContentResult();
    }
    [Theory]
    [InlineData(null, "[email protected]", "Jay Xerxes", 1)]
    [InlineData("not a date", "[email protected]", "Wk Hd", 8)]
    [InlineData("2023-11-30 20:01", null, "Thora", 19)]
    public async Task PostInvalidReservation(
        string at,
        string email,
        string name,
        int quantity)
    {
        var response =
            await PostReservation(new { at, email, name, quantity });
        Assert.Equal(HttpStatusCode.BadRequest, response.StatusCode);
    }
    public async Task<ActionResult> Post(ReservationDto dto)
    {
        if (dto is null)
            throw new ArgumentNullException(nameof(dto));
        if (dto.At is null)
            return new BadRequestResult();
        if (!DateTime.TryParse(dto.At, out var d))
            return new BadRequestResult();
        if (dto.Email is null)
            return new BadRequestResult();

        var r = new Reservation(d, dto.Email, dto.Name!, dto.Quantity);
        await Repository.Create(r).ConfigureAwait(false);

        return new NoContentResult();
    }
Listing 5.6 passes all tests. Notice that I could remove another exclamation mark by guarding against a null email.
Consider listing 5.6. It’s grown in complexity since listing 4.15. Can you make it simpler?
This is an important question to regularly ask. In fact, you should ask it after each test iteration. It’s part of the Red Green Refactor[9] cycle.
• Red. Write a failing test. Most test runners render a failing test in red.
• Green. Make as minimal a change as possible to pass all tests. Test runners often render passing tests in green.
• Refactor. Improve the code without changing its behaviour.
Once you’ve moved through all three phases, you start over with a new failing test. Figure 5.2 illustrates the process.
So far in the book’s running example, you’ve only seen oscillations of red-green, red-green, red-green. It’s time to add the third phase.
In the refactor phase, you consider the code you wrote in the green phase. Can you improve it? If so, that would be refactoring.
“Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure.”[34]
How do you know that you don’t change the externally visible behaviour? It’s difficult to prove a universal conjecture, but it’d be easy to disprove. If just one automated test were to fail after a change, you’d know that you broke something. Thus, a minimum bar is that if you change the structure of the code, all tests should still pass.
Can listing 5.6 be improved while still passing all tests? Yes, it turns out that the null guard of dto.At is redundant. Listing 5.7 shows the simplified Post method.
    public async Task<ActionResult> Post(ReservationDto dto)
    {
        if (dto is null)
            throw new ArgumentNullException(nameof(dto));
        if (!DateTime.TryParse(dto.At, out var d))
            return new BadRequestResult();
        if (dto.Email is null)
            return new BadRequestResult();

        var r = new Reservation(d, dto.Email, dto.Name!, dto.Quantity);
        await Repository.Create(r).ConfigureAwait(false);

        return new NoContentResult();
    }
Why does this still work? It works because DateTime.TryParse already checks for null, and if the input is null, the return value is false.
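If you'd rather verify that claim than take it on trust, a sketch like the following can help. TryParseDemo.IsDate is a hypothetical wrapper introduced only for this illustration; it calls the same DateTime.TryParse overload that the Post method uses.

```csharp
#nullable enable
using System;

public static class TryParseDemo
{
    // DateTime.TryParse reports failure through its return value instead of
    // throwing, including when the input is null.
    public static bool IsDate(string? candidate)
    {
        return DateTime.TryParse(candidate, out _);
    }
}
```

Both IsDate(null) and IsDate("not a date") return false, which is why a separate null guard in front of the TryParse call adds nothing.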
How could you have known that? I’m not sure that I can give an answer that leads to reproducible results. I thought of this refactoring because I knew the behaviour of DateTime.TryParse. This is another example of programming based on the shifting sands of individual experience[4] - the art in software engineering.
Encapsulation is more than just checking for null. It’s a contract that describes valid interactions between objects and callers. One way to specify validity is to state what’s considered invalid. By implication all else is valid.
When you prohibit null references, you’re implicitly allowing all non-null objects. Unless you add more constraints, that is. Listing 5.7 already does that for dto.At. Not only is null prohibited, but the string must also represent a proper date and time.
What about the other constituent elements of a reservation? Using C#’s static type system, the ReservationDto class shown in listing 4.11 already (by its lack of the ? symbol) declares that Quantity can’t be null. But would any integer be an appropriate reservation quantity? 2? 0? -3?
2 seems like a reasonable number of people, but clearly not -3. What about 0? Why would you want to make a reservation for no people?
It seems to make most sense that a reservation quantity is a natural number. In my experience, this frequently happens when you evolve a Domain Model[33][26]. A model is an attempt to describe the real world³, and in the real world, natural numbers abound.
³ Even when the ‘real world’ is only a business process.
Listing 5.8 shows the same test method as listing 5.5, but with two new test cases with invalid quantities.
    [Theory]
    [InlineData(null, "[email protected]", "Jay Xerxes", 1)]
    [InlineData("not a date", "[email protected]", "Wk Hd", 8)]
    [InlineData("2023-11-30 20:01", null, "Thora", 19)]
    [InlineData("2022-01-02 12:10", "[email protected]", "3 Beard", 0)]
    [InlineData("2045-12-31 11:45", "[email protected]", "Gil Tan", -1)]
    public async Task PostInvalidReservation(
        string at,
        string email,
        string name,
        int quantity)
    {
        var response =
            await PostReservation(new { at, email, name, quantity });
        Assert.Equal(HttpStatusCode.BadRequest, response.StatusCode);
    }
These new test cases in turn drove the revision of the Post method you can see in listing 5.9. The new Guard Clause[7] only accepts natural numbers.
Most programming languages come with built-in data types. There are typically several integer data types: 8-bit integers, 16-bit integers, and so on. Normal integers, however, are signed. They describe negative numbers as well as positive numbers. That’s frequently not what you want.
You can sometimes get around the issue by using unsigned integers, but it wouldn’t work in this case, because an unsigned integer would still allow zero. To reject a reservation for no people, you’d still need a Guard Clause.
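A short sketch illustrates the point. It uses a hypothetical QuantityDemo class that isn't part of the code base. Changing the parameter type to uint removes negative numbers from the picture, but zero is still a legal value, so the check remains.

```csharp
using System;

public static class QuantityDemo
{
    // An unsigned type rules out negative numbers at compile time,
    // but zero is still a legal uint value, so a guard is still needed.
    public static bool IsValidQuantity(uint quantity)
    {
        return quantity >= 1;
    }
}
```

IsValidQuantity(0) returns false and IsValidQuantity(2) returns true, while a call such as IsValidQuantity(-3) doesn't compile at all.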
    public async Task<ActionResult> Post(ReservationDto dto)
    {
        if (dto is null)
            throw new ArgumentNullException(nameof(dto));
        if (!DateTime.TryParse(dto.At, out var d))
            return new BadRequestResult();
        if (dto.Email is null)
            return new BadRequestResult();
        if (dto.Quantity < 1)
            return new BadRequestResult();

        var r = new Reservation(d, dto.Email, dto.Name!, dto.Quantity);
        await Repository.Create(r).ConfigureAwait(false);

        return new NoContentResult();
    }
The code in listing 5.9 compiles and all tests pass. Commit the changes in Git, and consider pushing them through your deployment pipeline.
Let’s recapitulate the process so far. What constitutes a valid reservation? The date must be a proper date, and the quantity must be a natural number. It’s also a requirement that Email isn’t null, but is that it?
Shouldn’t we require a valid email address? And what about the name?
Email addresses are notoriously difficult to validate[41], and even if you had a full implementation of the SMTP specification, what good would it do you?
Users can easily give you a bogus email address that fits the spec. The only way to really validate an email address is to send a message to it and see if that provokes a response (such as the user clicking on a validation link). That would be a long-running asynchronous process, so even if you’d want to do that, you can’t do it as a blocking method call.
The bottom line is that it makes little sense to validate the email address, apart from checking that it isn’t null. For that reason, I’m not going to validate it more than I’ve already done.
What about the name? It’s mostly a convenience. When you show up at the restaurant, the maître d’ will ask for your name rather than your email address or a reservation ID. If you never gave your name when you made the reservation, the restaurant can probably find you by email address instead.
Instead of rejecting a null name, you can convert it to an empty string. That design decision follows Postel’s law, because you’re being liberal with the input name.
In the green phase, make the test pass. Listing 5.11 shows one way to do that. You could have used a standard ternary operator, but C#’s null coalescing operator (??) is a more compact alternative. In a way, it replaces the ! operator, but it’s a good trade-off, because ?? doesn’t suppress the compiler’s null-check engine.
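To compare the two alternatives side by side, here is a small sketch with a hypothetical NameDemo class. The class and its method names are assumptions for illustration and don't appear in the code base. Both methods compile without warnings when nullable reference types are enabled, and neither needs the ! operator.

```csharp
#nullable enable

public static class NameDemo
{
    // Ternary version: an explicit null check.
    public static string NormaliseWithTernary(string? name)
    {
        return name is null ? "" : name;
    }

    // Null coalescing version: same behaviour, more compact.
    public static string Normalise(string? name)
    {
        return name ?? "";
    }
}
```

In both cases the compiler can see that the result is never null, so the return type can be the non-nullable string.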
    [Theory]
    [InlineData("2023-11-24 19:00", "[email protected]", "Julia Domna", 5)]
    [InlineData("2024-02-13 18:15", "[email protected]", "Xenia Ng", 9)]
    [InlineData("2023-08-23 16:55", "[email protected]", null, 2)]
    public async Task PostValidReservationWhenDatabaseIsEmpty(
        string at,
        string email,
        string name,
        int quantity)
    {
        var db = new FakeDatabase();
        var sut = new ReservationsController(db);

        var dto = new ReservationDto
        {
            At = at,
            Email = email,
            Name = name,
            Quantity = quantity
        };
        await sut.Post(dto);

        var expected = new Reservation(
            DateTime.Parse(dto.At, CultureInfo.InvariantCulture),
            dto.Email,
            dto.Name ?? "",
            dto.Quantity);
        Assert.Contains(expected, db);
    }
    public async Task<ActionResult> Post(ReservationDto dto)
    {
        if (dto is null)
            throw new ArgumentNullException(nameof(dto));
        if (!DateTime.TryParse(dto.At, out var d))
            return new BadRequestResult();
        if (dto.Email is null)
            return new BadRequestResult();
        if (dto.Quantity < 1)
            return new BadRequestResult();

        var r =
            new Reservation(d, dto.Email, dto.Name ?? "", dto.Quantity);
        await Repository.Create(r).ConfigureAwait(false);

        return new NoContentResult();
    }
In the refactor phase, you ought to consider if you can make any improvements to the code. I think that you can, but that’s going to be a longer discussion. There’s no rule that prohibits a check-in between the green and the refactor phases. For now, commit the current changes to Git and push them through your deployment pipeline.
Do you see anything wrong with listing 5.11? How does it look?
If we’re concerned with complexity, it doesn’t look too bad. Visual Studio comes with a built-in calculator of simple code metrics, like cyclomatic complexity, depth of inheritance, lines of code, and so on. The metric I mostly pay attention to is cyclomatic complexity. If it exceeds seven⁴ I think you should do something to reduce the number, but currently it’s at six.
⁴ Recall from section 3.2.1 that I use the number seven as a token for the brain’s short-term memory limit.
On the other hand, if you consider the entire system, there’s more going on. While the Post method checks the preconditions of what constitutes a valid reservation, that knowledge is immediately lost. It calls the Create method on its Repository. Recall that this method is implemented by the SqlReservationsRepository class in listing 4.19.
If you’re a maintenance programmer, and the first glimpse you get of the code base is listing 4.19, you may have questions about the reservation parameter. Is At a proper date? Is Email guaranteed to not be null? Is Quantity a natural number?
You can look at the Reservation class in listing 4.12 and see that, indeed, Email is guaranteed to not be null, because you’ve used the type system to declare it non-nullable. The same is true for the date, but what about the quantity? Can you be sure that it isn’t negative, or zero?
At the moment, the only way you can answer that question is by some detective work. What other code calls the Create method? Currently, there’s only one call site, but this could change in the future. What if there were multiple callers? That’s a lot to keep track of in your head.
Wouldn’t it be easier if there were some way to guarantee that the object has already been validated?
Reduced to the essence, encapsulation should guarantee that an object can never be in an invalid state. There are two dimensions to that definition: validity and state.
You’ve already encountered heuristics such as Postel’s law that help you think about what’s valid and invalid. What about state?
The state of an object is the combination of its constituent values. That combination should always be valid. If an object supports mutation then each operation that changes its state must guarantee that the operation doesn’t result in an invalid state.
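As a sketch of what that means for a mutable object, consider a hypothetical MutableReservation class. It is an assumption for illustration only; the book's Reservation class is immutable. Every operation that can change the quantity has to pass through the same guard, including the constructor.

```csharp
using System;

// Hypothetical mutable counterpart to the Reservation class.
public sealed class MutableReservation
{
    private int quantity;

    public MutableReservation(int quantity)
    {
        // The constructor goes through the same guard as the setter.
        Quantity = quantity;
    }

    public int Quantity
    {
        get { return quantity; }
        set
        {
            // Reject the change before it can corrupt the object's state.
            if (value < 1)
                throw new ArgumentOutOfRangeException(
                    nameof(value),
                    "The value must be a positive (non-zero) number.");
            quantity = value;
        }
    }
}
```

A failed assignment throws before the field is touched, so the object keeps its previous, valid quantity. The more mutating members a class has, the more places need this kind of care.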
One of the many attractive qualities of immutable objects is that you only have to consider validity in one place: the constructor. If initialisation succeeded, the object should be in a valid state. That’s currently not true for the Reservation class shown in listing 4.12.
That’s an imperfection. You should make sure that you can’t create a Reservation object with a negative quantity. Use a Parametrised Test[66] like listing 5.12 to drive this change.
I chose to parametrise this test method because I consider the value zero fundamentally different from negative numbers. Perhaps you think that zero is a natural number. Perhaps you don’t. Like with so many other things⁵ there’s no consensus. Despite this, the test makes it clear that zero is an invalid quantity. It also uses -1 as an example of a negative number.
⁵ What’s a unit? What’s a mock?
    [Theory]
    [InlineData( 0)]
    [InlineData(-1)]
    public void QuantityMustBePositive(int invalidQuantity)
    {
        Assert.Throws<ArgumentOutOfRangeException>(
            () => new Reservation(
                new DateTime(2024, 8, 19, 11, 30, 0),
                "[email protected]",
                "Marie Ilsøe",
                invalidQuantity));
    }
The test asserts that when you try to initialise a Reservation object with an invalid quantity, it should throw an exception. Notice that it doesn’t assert on the exception message. The text of an exception message isn’t part of the object’s behaviour. That’s not to say that the message isn’t important, but there’s no reason to couple tests to implementation details more than necessary. It would only mean that if you later want to change the exception message, you’d have to edit both the System Under Test and the test. Don’t repeat yourself[50].
In the red phase of Red Green Refactor this test fails. Move to the green phase by making it pass. Listing 5.13 shows the resulting constructor.
Since the Reservation class is immutable, this effectively guarantees that it’ll never be in an invalid state⁶. This means that all code that handles Reservation objects can dispense with defensive coding. The At, Email, Name, and Quantity properties are guaranteed to be populated, and the Quantity will be a positive number. Subsection 7.2.5 returns to the Reservation class to take advantage of these guarantees.
⁶ I’m pretending that FormatterServices.GetUninitializedObject doesn’t exist. Don’t use that method.
    public Reservation(
        DateTime at,
        string email,
        string name,
        int quantity)
    {
        if (quantity < 1)
            throw new ArgumentOutOfRangeException(
                nameof(quantity),
                "The value must be a positive (non-zero) number.");

        At = at;
        Email = email;
        Name = name;
        Quantity = quantity;
    }
Encapsulation is one of the most misunderstood concepts of object-oriented programming. Many programmers believe that it’s a prohibition against exposing class fields directly - that class fields should be ‘encapsulated’ behind getters and setters. That has little to do with encapsulation.
The most important notion is that an object should guarantee that it’ll never be in an invalid state. That’s not the callers’ responsibility. The object knows best what ‘valid’ means, and how to make that guarantee.
The interaction between an object and a caller should obey a contract. This is a set of pre- and postconditions.
The preconditions describe the responsibilities of the caller. If the calling code fulfils those obligations, the postconditions describe the guarantees that the object gives in return.
Pre- and postconditions together form invariants. You can use Postel’s law to design a useful contract. The less you ask of the caller, the easier it is for the caller to interact with the object. The better guarantees you can give, the less defensive code the caller has to write.
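The following sketch shows one way such a contract could look in code. It assumes a hypothetical Booking class, which is deliberately not the book's Reservation class: it asks little of the caller by tolerating a missing name, and in return it guarantees that its properties are always populated.

```csharp
#nullable enable
using System;

// Hypothetical type illustrating a contract; not the book's Reservation class.
public sealed class Booking
{
    public Booking(string? name, int quantity)
    {
        // Precondition (the caller's responsibility): quantity is positive.
        if (quantity < 1)
            throw new ArgumentOutOfRangeException(
                nameof(quantity),
                "The value must be a positive (non-zero) number.");

        // Liberal in what it accepts: a missing name is tolerated.
        Name = name ?? "";
        Quantity = quantity;
    }

    // Postconditions (the object's guarantees): Name is never null,
    // and Quantity is always at least 1.
    public string Name { get; }
    public int Quantity { get; }
}
```

Because the class is immutable and the constructor is the only way in, callers that receive a Booking never need to re-check either property.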