Chapter 12. Test Cleanly

Working tests are the most precious asset in any software development project. They not only show that your system is still useful in some way, but they also document what the system does—and you can check that with the press of a button if you have automated your tests.

Since working tests are so valuable to your development project, you should keep them working under all circumstances—or make an informed decision to get rid of them. With today’s version control systems it’s no shame to throw away things you no longer need. In fact, it gives you some relief. Teams report not only that fewer tests give them more flexibility (see [Adz11]), but also that a leaner test base relieves some of the burden on our brains [Hun08].

You may argue that a good organization of test cases helps you do that, too. But up to what point do you want to keep all of your tests? A few years back I was part of a project for a customer in Brazil where the testers automated every test that came to mind. Brazil’s tax legislation allows each of its twenty-seven states to apply its own tax rules, and in our system the programmers had to configure one subset for each of these tax regimes.

The first shot at automated tests consisted of roughly 300 test cases times 27 tax variations, on the order of 8,000 test cases in all. The overall test suite took more than two days to execute in a single run. Despite all the effort put into the automation code and the tests themselves, that test automation was completely worthless by the time I entered the project.

The automated tests did not provide a useful answer about any update of the software in a timely manner. As a result, business decisions were made without reference to test outcomes that only turned up two days later. The whole test automation effort had painted itself into a corner.

So, what did we do to get the automation back under control? We had to throw away test cases. We analyzed the approach used to generate each product and identified the risks that came with that approach. Then we tackled those risks rather than every risk that could occur with any approach. By doing that we were able to throw out the majority of the test cases altogether. Today, I would also apply a pairwise approach to reduce the number of test cases (see Chapter 9, “Pairwise Testing,” and [HK11]).

The underlying problem was that the tests were not maintained. They were not clean; instead they contained a lot of noise that burdened the testing team. The business stakeholders made decisions that overruled any test outcome, and in the end the bugs that came back burdened the programmers as well, because they had to deal with the rework.

All of this can be avoided if you keep an eye on your automated tests and prevent them from becoming broken windows. The Broken Window Rule [HT99] is based on an observation about uninhabited houses: nothing much may happen to such a house for quite some time after the last inhabitant has left, but once the first broken window goes unrepaired, more and more windows get broken. The same happens to our code base as well as to our entire test base.

For your code base, there are very good books on the topic. For example, you may consider Clean Code [Mar08a] by Robert Martin, or Refactoring to Patterns [Ker04] by Joshua Kerievsky. For test code, xUnit Test Patterns [Mes07] by Gerard Meszaros should give you a good start. Many of these patterns apply quite well to acceptance tests, too.

Over the years I have drawn three main lessons from the literature on the topic. First of all, test automation is software development. This implies that I have to use proper software development practices to develop my test automation, and that I also need to test my test automation code. Second, your tests will start to smell bad at one point or another. When that happens, it pays to follow the smell and seek opportunities to improve not only the software but also the software development process that created the smell. Last, but not least, one of the most painful gaps in software test automation at the time of this writing is the lack of refactoring tools that change not only your test automation code but also your tests. I will provide some clues to the patterns I have seen so far in the different frameworks, patterns that probably could be automated.

Develop Test Automation

One of the main motivations for incremental and iterative software development is that, for ever-larger systems, it is hard to anticipate all requirements and functions up front. Previously, software was developed with the mindset that all the requirements for the next version could be gathered before any coding started, only to find that the business processes had changed in the meantime and the system had to be patched to adapt to them.

Iterative and incremental development builds upon existing features. The idea is to start with the most basic version of the feature that you can imagine. When you are done implementing this version, you add the next increment to refine it. For example, you might start with a Walking Skeleton (see [Coc06]) and start to add more flesh to it later. You might leave out validation code that rejects invalid input values to the system. By building the most valuable cut through the software system first, then dealing with the special cases, incremental development becomes a success.

The iterative and incremental approach also works for test automation, in particular for the glue code between the textual representation of your examples and the system under test. In the airport example we saw how Tony and Alex made the first test pass and then automated one example after the other. Later, Tony built incrementally on that basis by adding the remaining parking lots to the code.

The key to this basis is a design that is flexible in the dimension you will extend later. Once different start and end parking times could be entered, selecting the proper parking lot was a no-brainer. Design decisions influence the flexibility of the test automation code. In the airport parking lot example the problem was simple: Alex and Tony could easily foresee future design decisions and therefore ended up with a flexible design.

On real projects things often turn out far uglier than in this simple example. Even in the traffic lights example, things turned out to be more complicated than we might have anticipated. In more complex domains it therefore pays to focus on the most basic flow through the system first, and then continue from that base camp.

When I worked at a company on a replacement for its test automation system, we had many business use cases that we would have to automate in the replacement system. We could have started with a long analysis, laid out the plan for the new design, and implemented everything we knew. Instead, we analyzed all the use cases and their flows for likelihood of failure and impact on the business. Then we assigned each business use case a criticality, and we started with the most critical use cases in our new test automation approach, since that gave us the biggest return on investment early. Eighteen weeks later we found ourselves finished with the transition. Having dealt with the biggest problems first, it was very easy to replace the tests for the use cases with lower business criticality. Indeed, the use cases we automated first addressed many of the risks and gave us a basis on which to build the remaining use cases later.

One major lesson is to avoid putting all the logic into one large glue code class. In FIT, for example, you can decide to put the code that translates the textual representation into business objects such as money (a double plus a currency) directly into the Fixture class. You can also add large behavior flows, like creating an account with an address, directly to that class. This approach is as bad as putting all the business logic into the GUI classes of your application.

First of all, you create a code base that is hard to test on its own through automated unit tests. The object model of your test code lacks commonly used domain objects like Money or an AccountCreator. Eventually your team members will not understand how to deal with the code you created and will keep reinventing the wheel. In the worst case, you end up with a large test support code base that is copied and pasted together. At that point you have already shot yourself in the foot with your test automation efforts, because new features become harder and harder to automate on top of it.
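To make this concrete, here is a minimal sketch of the alternative, with the translation code and the account-creation flow living in small helpers of their own rather than in the fixture class. The class names are hypothetical and I leave out the FIT plumbing; this illustrates the structure, it is not code from the book’s examples.

import java.math.BigDecimal;

// Value object for "a double plus a currency."
class Money {
    final BigDecimal amount;
    final String currency;

    Money(BigDecimal amount, String currency) {
        this.amount = amount;
        this.currency = currency;
    }

    // Translates the textual form used in the examples, e.g. "$ 2.00".
    static Money parse(String text) {
        String trimmed = text.trim();
        return new Money(new BigDecimal(trimmed.substring(1).trim()),
                         trimmed.substring(0, 1));
    }
}

// Deliberately small stand-in for the real domain object.
class Account {
    final String name;
    String address;
    Account(String name) { this.name = name; }
}

// Encapsulates the "account with an address" flow instead of inlining it
// into the fixture.
class AccountCreator {
    Account createWithAddress(String name, String address) {
        Account account = new Account(name);
        account.address = address;
        return account;
    }
}

// The fixture only translates between table cells and the helpers; it holds
// no domain logic of its own, and each helper can be unit tested in isolation.
public class AccountFixture {
    private final AccountCreator creator = new AccountCreator();

    public void createAccountWithAddress(String name, String address) {
        creator.createWithAddress(name, address);
    }

    public String amountOf(String moneyCell) {
        return Money.parse(moneyCell).amount.toPlainString();
    }
}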

If you build your support code in components, you create independent helpers that you can include or extend in the future. In the traffic lights example we extracted the concept of a CrossingController as a component. Although we actually put it in our domain model in the end, the crossing controller is an example of such an independent component.

As we saw later, there is another advantage to building small components for your test automation support code: independent components are easier to unit test on their own. Since test automation is software development, you should consider unit testing the more complex flows through the support code in isolation. As I mentioned in Chapter 9, I once heard of a team where test-driving the test automation code led to a major increase (10–15%) in test coverage of the whole system. If you find yourself adding one if or switch statement after another, consider refactoring your code toward a more object-oriented structure such as a strategy pattern [GHJV94] or an independent component.
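To illustrate the last point, here is a minimal Java sketch of how a growing if/else chain in support code could be replaced by a strategy. The pricing rules and class names are invented for the sketch and do not reflect the real parking calculator tariffs.

import java.util.Map;

// Each parking lot's pricing becomes its own small strategy object.
interface PricingStrategy {
    double costFor(int hours);
}

class EconomyPricing implements PricingStrategy {
    // Simplified, invented rate for the sketch.
    public double costFor(int hours) { return Math.min(hours * 2.0, 9.0); }
}

class ValetPricing implements PricingStrategy {
    // Simplified, invented rate for the sketch.
    public double costFor(int hours) { return hours <= 5 ? 12.0 : 18.0; }
}

// The support code looks the strategy up instead of branching on lot names.
class ParkingCostSupport {
    private final Map<String, PricingStrategy> strategies = Map.of(
            "Economy Parking", new EconomyPricing(),
            "Valet Parking", new ValetPricing());

    double costFor(String lot, int hours) {
        return strategies.get(lot).costFor(hours); // replaces the if/else chain
    }
}

Each strategy can now be unit tested on its own, and supporting a new parking lot means adding a class and a map entry rather than yet another branch.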

Listen to the Tests

Unit tests can teach you a lot about the design of your classes [FP09]. If the unit tests are awkward to write—and read for that matter—they may give you hints to a missing concept in your design.

In my experience this holds not only for unit tests but also for acceptance tests. If you need a lot of support code—especially complicated code with many nested ifs—that points to a missing concept either in your application or in your support code. If you have built your acceptance criteria on business examples, it is most likely your application that is missing a domain concept.

We saw an example of this with the traffic lights example. There we took the tests and the support code to motivate the domain concepts in the code. This is one way to listen to your tests and drive your implementation from these.

But some teams don’t give their testers access to the production code and domain model. In that case it becomes absolutely necessary to have collaboration between testers and programmers in place. If testers notice problems with long, hard-to-read tests but can’t do anything about them, the code base will start to degenerate quickly.

On the other hand, if testers can provide the programmers the feedback that something is wrong based on what they see happening to the tests, then the programmers might become aware of a problem in the code and come up with creative solutions together with the testers.

Ideally, you have worked completely from the outside in on your code base; then the probability of ending up with long tests or hard-to-read examples is small. I found myself painted into a corner more often when I retrofitted acceptance-level tests onto already existing code. One of the major drawbacks was that hooking the tests up was inconvenient or even awkward, since I hadn’t anticipated all the necessary entry points for end-to-end tests. If you work from the outside in, you create all the hooks by definition.

Listening to your tests works on three different levels if you apply ATDD. First, you can listen to your examples. If they end up long, and rather reflect a complete flow through the system with many steps, then they indicate a missing abstraction between the business focus and the technical implementation details.

Consider the inlined example for the airport parking lot calculator in Listing 12.1. This test consists of keywords from the Selenium library for Robot Framework. It tests the condition that we parked a car in the Economy Parking lot for one day, 23 hours, and 11 minutes. There are several flaws in this example. The most prevalent one is that it will give us a headache once we try to maintain it in the long run: it does not express its intent, and the text is far too verbose. We should clearly refactor this test.

Listing 12.1. A verbose example for the airport parking lot calculator

 1 Basic Test
 2     Open Browser  http://www.shino.de/parkcalc/  firefox
 3     Set Selenium Speed  0
 4     Title Should Be  Parking Calculator
 5     Select From List  Lot  Economy Parking
 6     Input Text  EntryTime  01:23
 7     Select Radio Button  EntryTimeAMPM  AM
 8     Input Text  EntryDate  02/28/2000
 9     Input Text  ExitTime  12:34
10     Select Radio Button  ExitTimeAMPM  AM
11     Input Text  ExitDate  03/01/2000
12     Click Button  Submit
13     Page Should Contain   (1 Days, 23 Hours, 11 Minutes)
14     [Teardown]  Close Browser

Contrast this with the examples for economy parking that Tony, Phyllis, and Bill identified in the airport example (see Listing 12.2).

Listing 12.2. The Economy Parking Lot automated examples (reprint from Listing 3.4)

 1 Feature: Economy Parking feature
 2   The parking lot calculator can calculate costs for Economy parking.
 3
 4   Scenario Outline: Calculate Economy Parking Cost
 5     When I park my car in the Economy Parking Lot for <parking duration>
 6     Then I will have to pay <parking costs>
 7
 8   Examples:
 9   | parking duration    | parking costs |
10   | 30 minutes          | $ 2.00        |
11   | 1 hour              | $ 2.00        |
12   | 4 hours             | $ 8.00        |
13   | 5 hours             | $ 9.00        |
14   | 6 hours             | $ 9.00        |
15   | 24 hours            | $ 9.00        |
16   | one day, one hour   | $ 11.00       |
17   | one day, three hours| $ 15.00       |
18   | one day, five hours | $ 18.00       |
19   | six days            | $ 54.00       |
20   | six days, one hour  | $ 54.00       |
21   | seven days          | $ 54.00       |
22   | one week, two days  | $ 72.00       |
23   | three weeks         | $ 162.00      |

One way to solve this is to create an abstraction layer in a keyword-driven manner. Tony did this in the airport example when he automated the examples based on the business requirements. But you may also discover the necessity for such a layer later. Then you should come up with additional scenarios, as we did in the traffic lights example for the invalid light state combinations in the first crossing controller. There we added the abstraction of an invalid state. In retrospect, what led us there was the observation that our table had many repeated entries for the invalid light states, all of which led to yellow blinking lights. This was an example of listening to the tests when your examples become long and redundant.
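Coming back to the airport example: a business-facing step like the one in Listing 12.2 has to be backed by glue code that performs the low-level interactions of Listing 12.1 behind the scenes. The following Java sketch shows what such a layer could look like with Selenium WebDriver. It is purely illustrative, since the airport example in this book is automated with Robot Framework keywords; the class name, the choice of locators, and the result lookup are assumptions of mine.

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.support.ui.Select;

// Illustrative page helper: one business-level entry point that hides the
// low-level field names (Lot, EntryTime, ExitDate, ...) from Listing 12.1.
public class ParkingCalcPage {
    private final WebDriver driver;

    public ParkingCalcPage(WebDriver driver) {
        this.driver = driver;
        driver.get("http://www.shino.de/parkcalc/");
    }

    // Backs a step such as "When I park my car in the Economy Parking Lot
    // for <parking duration>" once the duration has been translated into
    // entry and exit date/time values.
    public void parkInEconomyLot(String entryDate, String entryTime,
                                 String exitDate, String exitTime) {
        new Select(driver.findElement(By.id("Lot")))
                .selectByVisibleText("Economy Parking");
        type("EntryDate", entryDate);
        type("EntryTime", entryTime);
        type("ExitDate", exitDate);
        type("ExitTime", exitTime);
        // Assumes "Submit" is the button's name attribute, matching the
        // locator used in Listing 12.1.
        driver.findElement(By.name("Submit")).click();
    }

    public String pageText() {
        // The calculated cost appears somewhere in the page body; real glue
        // code would use a more precise locator.
        return driver.findElement(By.tagName("body")).getText();
    }

    private void type(String field, String value) {
        // Assumes the fields are reachable by their IDs, as in Listing 12.1.
        driver.findElement(By.id(field)).clear();
        driver.findElement(By.id(field)).sendKeys(value);
    }
}

With such a helper in place, the examples stay at the business level while the knowledge about the page structure lives in exactly one spot.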

Another way to solve the problem of long tests is to introduce domain concepts in the support code or domain code. We saw this in the traffic lights example when we introduced the domain concept of a light state to the domain code based upon our acceptance tests. Later we found out about the necessity for a state validator based upon our acceptance tests for a crossing.

Once I worked on a project where we had to deal with different types of accounts. These could have multiple child accounts together with different subscriptions. Our first approach in test automation there was to describe the whole hierarchy for the test we needed. Once we had automated most business flows in this manner, we eventually realized that this was hard to maintain in the long term.

At that point the tests told us that we were working at the wrong level of abstraction as well. By sitting down with the business side we came up with the concept of tariff structures to express in the test examples. Instead of describing an account hierarchy with one top-level account and one subscription account holding product XYZ, we called the whole branch of the hierarchy a subscription in tariff XYZ and hid the implied selling of the individual products to the subscriber behind that term. See Listing 12.3 for an example of such a table.

Listing 12.3. A setup table preparing three different account hierarchies, with the tariffs and their associated products hidden behind the scenes

1 !|Account Creator                      |
2 |account name    |tariff               |
3 |Peter Prepaid   |Prepaid Basic 50     |
4 |Paul Postpaid   |Postpaid Business 100|
5 |Claus Convergent|Convergent Family 150|

In that project we as testers didn’t have access to the source code, so we put the responsibility for selling the products into the support code for our tests. We created a lookup for the different business tariff names and applied several operations to a standard account for each tariff. When we finished, we noticed that we had moved the accidental complexity out of the test examples and into the support code, thereby making the tests easier to handle and maintain.
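A minimal Java sketch of such a lookup might look like the following. The tariff-to-product mapping and the operations are invented for the sketch; the real support code on that project was considerably larger.

import java.util.List;
import java.util.Map;

// Simplified stand-in for the project's real account object.
class SubscriberAccount {
    final String name;
    SubscriberAccount(String name) { this.name = name; }
    void sell(String product) { /* would call the system under test */ }
}

// Maps a business-facing tariff name to the operations that have to be
// applied to a standard account behind the scenes.
class TariffCatalog {
    private static final Map<String, List<String>> PRODUCTS_PER_TARIFF = Map.of(
            "Prepaid Basic 50", List.of("Prepaid Voice", "SMS Bundle"),
            "Postpaid Business 100", List.of("Postpaid Voice", "Data Bundle"),
            "Convergent Family 150", List.of("Postpaid Voice", "Data Bundle", "TV"));

    SubscriberAccount createAccountIn(String accountName, String tariff) {
        SubscriberAccount account = new SubscriberAccount(accountName);
        for (String product : PRODUCTS_PER_TARIFF.get(tariff)) {
            account.sell(product); // the accidental complexity now lives here
        }
        return account;
    }
}

The Account Creator table from Listing 12.3 can then delegate each row straight to createAccountIn, keeping the examples at the level the business side talks about.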

The second way to listen to your tests is to look at the glue or support code. In the traffic lights example we saw this happening when we identified the need for an enumeration of light states. The glue code told us that there were going to be multiple if-then-else constructs adding unnecessary complexity to it.

The glue code there really told us that we needed a higher-level concept. Our examples were easy to read and quite short at that point in time, but our code wasn’t. That’s where listening to the glue code rather than to the examples helped us make the decision to come up with the concept of a LightState.
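As a rough illustration of where such a concept starts, consider the following Java sketch. The names follow the traffic lights example, but this is not the code from the earlier chapters; it merely shows how a single enumeration can replace string comparisons scattered across the glue code.

// Illustrative sketch only; the real implementation may differ.
enum LightState {
    RED, RED_YELLOW, GREEN, YELLOW, YELLOW_BLINK;

    // One single place that translates the textual representation from the
    // test tables, instead of repeated if-then-else blocks in the glue code.
    static LightState parse(String text) {
        switch (text.trim().toLowerCase()) {
            case "red":          return RED;
            case "red, yellow":  return RED_YELLOW;
            case "green":        return GREEN;
            case "yellow":       return YELLOW;
            case "yellow blink": return YELLOW_BLINK;
            default:
                throw new IllegalArgumentException("Unknown light state: " + text);
        }
    }
}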

At times your support code may become hard to handle. It is then telling you that it lacks some concept. Stop, reflect on your code, and see whether you can spot that new concept arising. If you can’t, give it some more time. If you do see the missing concept lurking in the code, try to extract it.

Finally, there is a third way to listen to your tests. It applies if you drive your test automation code using test-driven development. Your unit tests might be hard to write. At this point all the lessons from Steve Freeman and Nat Pryce in Growing Object-Oriented Software, Guided by Tests [FP09] apply. Suffice it to say that you listen to your unit tests in much the same way as you listen to your acceptance tests. I wholeheartedly recommend Freeman and Pryce’s book if you want to dig deeper.

Refactor Tests

At the time of this writing, one of the final frontiers for automating your tests is the ability to restructure existing tests. I hope to see some advancement in the years to come. There are already some tools available that close the gap, but so far I have seen only add-ons to existing tools to do that.

In programming, refactoring refers to changing the internal structure of the code without changing its functionality, that is, what it does [FBB+99]. The internal structure changes; the observable behavior is preserved.

Initially, refactorings were small, well-defined steps such as renaming a method or extracting a variable. Over time, more complex refactorings came to life by combining several of these lower-level ones. Extract Superclass, for example, is such a higher-level refactoring.

A few years back, integrated development environments (IDEs) didn’t come with refactoring support. Back in those days, refactoring was a time-consuming activity that could break the whole code base. For Smalltalk, automated refactoring tools existed, but not for Java or C++. Automated refactorings are safer than following the steps by hand, since they are only executed if they can preserve the existing functionality.

Today there is hardly an IDE without automated refactoring support. Refactorings such as renaming a class or extracting a method from a code snippet are easy to reach, safe to use, and widely used among programmers.

Unfortunately, Agile-friendly test automation tools mostly lack refactoring support for your tests. While a simple renaming, such as changing the color red to blue, may be achieved with a shell script using, for example, sed and regular expressions, more complex changes like swapping two columns in a table or extending a table by one column are tedious manual tasks.

ReFit from Johannes Link is one of the add-on tools that I would like to see incorporated into the acceptance test frameworks around today. It provides the ability to search and replace in existing data as well as to restructure the existing examples to some degree. An integrated development environment for acceptance tests could perhaps overcome this shortcoming as well, but my experience with such IDEs for testers has been that they come with even more baggage than necessary.

Most of the tools that claim to be test automation super tools follow this pattern:

• They come with a set of standard functions that someone requested, e.g., database queries.

• They come with some graphical representation of the examples.

• They are bundled with a license model that forces companies to buy just enough licenses for their testers, but not the programmers.

The first shortcoming results in a large number of functions being available to testers. Unfortunately, this may also lead to testers using these lower-level functions directly in their tests. At that point the tests become coupled to how the application is implemented rather than abstracting from it and focusing on the business goals.

The second shortcoming results in higher complexity when refactoring or updating your tests. Eventually you will have to redraw all the tiny graphics that make working in these IDEs so convenient. The amount of test maintenance necessary to change anything can pretty much blow any project off course.

The last shortcoming results in poor collaboration in the team. Since the programmers have licenses only for their IDEs and not for the test tool, they won’t run the functional acceptance tests before checking in any code. This will, of course, lengthen the feedback loop when they break something. By the time you find a bug in their changes they will have moved on to the next task, and they will have a hard time remembering what they did there in the first place.

Regarding the refactoring of your tests, the second shortcoming is a showstopper for graphical tools. They might provide an interface that lets you work without any programming knowledge. But it also means that when your programmers execute an automated refactoring in their IDE, the interfaces your tests rely on will not be adapted, and you will have a lot of manual work whenever something needs to be updated. Although programmers nowadays introduce abstraction layers of their own to decouple changes in the application from changes in the test examples, the flexibility to refactor your tests, say to an additional data set, still depends on other tools such as a spreadsheet program.

Getting back to Agile-friendly test automation tools, all of the tools available today lack the ability to easily restructure the examples. I see great potential in such a feature, because it would simplify very tedious tasks for us testers in the long run, once a new feature makes massive changes to previously existing tests necessary.

In the meantime, I would like to describe two refactorings that I found convenient. These refer back to similar refactorings for source code, and I think the two build a basis for more advanced refactorings of tests in the future.

Extract Variable

The first refactoring on that list is to extract a variable. A variable is a placeholder that makes it convenient to reuse repeated values. In the traffic lights example, for instance, we might extract the value yellow blink into a variable. We can then give this variable the name invalid configuration and easily replace a whole bunch of occurrences of yellow blink by changing just the contents of the variable.

You will find this refactoring necessary if you can foresee a lot of changes in a particular dimension. While a change to the handling of invalid combinations might seem unlikely in the traffic lights example, consider shipping our traffic lights system to different countries where different configurations for invalid states might exist; the variable prepares the tests for that configurability of the application. If this is a likely future evolution point of the system, we can incorporate it right from the start.

If you already have tests and want to extract a variable from the existing data, here is a sequence I often use.

1. Find the value that you would like to store in a variable.

2. Define the variable and fill it with the desired value. Run your tests so that you see any unwanted changes, such as an already existing variable with the same name.

3. Replace the first occurrence of the value with the newly defined variable. Run your tests to see that you didn’t break anything.

4. For each additional occurrence repeat the previous step. After each change run your tests.

Extract Keyword

Whenever I see the same step being used in several tests, I consider extracting a keyword so that my tests become easier to read. In a source code context, this refactoring is similar to extracting a method from a code snippet.

In the traffic lights example we saw this nearly happen when we introduced the concept of an invalid light state keyword (Listing 12.4 repeats the examples). We combined multiple steps in the test table to create a new keyword for this commonly used flow. Another use might be logging into the system by entering a username and a password; login may well turn out to be a function that most tests need to execute at some point.

Listing 12.4. Extracted keyword for invalid light state combinations (repeated from Listing 7.15)

 1 ...
 2 !2 Invalid combinations
 3
 4 !|scenario       |invalid combination|firstLight||secondLight|
 5 |set first light |@firstLight                                |
 6 |set second light|@secondLight                               |
 7 |execute                                                     |
 8 |check           |first light        |yellow blink           |
 9 |check           |second light       |yellow blink           |
10
11 !|script|FirstLightSwitchingCrossingController|
12
13 !|invalid combination    |
14 |firstLight  |secondLight|
15 |green       |red, yellow|
16 |green       |green      |
17 |green       |yellow     |
18 |yellow      |red, yellow|
19 |yellow      |green      |
20 |yellow      |yellow     |
21 |red, yellow |red, yellow|
22 |red, yellow |green      |
23 |red, yellow |yellow     |
24 |red         |red, yellow|
25 |red         |green      |
26 |red         |yellow     |
27 |yellow blink|red        |
28 |yellow blink|red, yellow|
29 |yellow blink|green      |
30 |yellow blink|yellow     |

Compared to variables, keywords can take parameters. When you extract a keyword you are really creating a function, usually one with no return value.

I have worked with systems where we created multiple levels of keywords: higher-level keywords covering higher-level concepts were glued together into even higher-level ones, and lower-level keywords were reused by several of the higher-level keywords. But there is a drawback to too many levels. Since today’s tools don’t let you easily dive into the keyword hierarchy, it can become awkward to create, find, and maintain too many levels of keywords.

Here are the steps I usually take to extract a keyword from an existing example.

1. Identify the flow you would like to name more conveniently.

2. Create a new keyword covering all the parameters that you will need. Run your tests to notice side effects of redundant keywords.

3. Replace the first occurrence of the flow with your new keyword, passing all necessary parameters. Run your tests to check that you didn’t break anything.

4. Repeat the previous step for each occurrence of the flow. Run your tests after each change.

Summary

In order to build clean tests, you have to treat test automation as software development. That means incorporating the development practices that you also apply to your production code, which includes not only unit testing your test automation code but also refactoring it. Finally, keep decoupled designs and architectures in mind when building your mediation code.

At times you will find that a particular test is hard to write or hard to automate. This is a case where you should listen to your tests and look for the problem in how you have built your application so far. Use the information that your tests reveal to fix the underlying problem in your code.

If something in the domain changes, you will face the struggle of having to change a large number of tests. For source code, the term refactoring refers to structural changes that do not change anything in the behavior of the program. In order to build and maintain your tests cleanly, we will need similar tool support for basic test refactorings. We have seen two here: extract a variable and extract a keyword. I expect more to be discovered in the near future.
