4

Test Types, Cases, and Environments

A tourist asks a local, “Which way is it to Dublin?”

The local answers, “Well, if I were going to Dublin, I wouldn’t start from here.”

– An old joke

Before we can start detailed considerations of test cases and plans, we must be clear about what each test case should include and evaluate the different approaches to writing them. Should they explicitly prescribe every action to take or merely describe the intended test, leaving individual testers to implement the details?

You have to choose your test environment. You could test on developers' machines, where new code was written seconds before; on your live instance, which is available to your users; or somewhere in between, such as a dedicated test area. Using VMs and containers, as well as defining infrastructure as code, lets you create consistent environments. Choosing the right one is vital to building a successful, reproducible test process and making sure you start your testing from the right place.

Your testing must also have the conflicting strengths of being systematic and rigorous, carefully trying each possibility, while simultaneously being driven by imagination and curiosity and guided by feedback on previous errors. The specification is your guide and you need to thoroughly test it, but also improve it as you go along. We’ll explore how to achieve those conflicting aims simultaneously.

We’ll cover the following topics in this chapter:

  • Understanding different levels of testing
  • Defining test cases
  • Prescriptive and descriptive test steps
  • Evaluating different test environments
  • Setting the correct version, configuration, and environment
  • Performing systematic testing
  • Testing in the release cycle
  • Using curiosity and feedback

In this book, we refer to system tests, which test the whole production system working together. Unit tests should also be written to cover individual functions, and integration tests are needed to drive separate modules or services. These are generally easier to write because the system under test is simpler. You should definitely use those tests too, and they are discussed more in Chapter 6, White-Box Functional Testing. However, there is a class of issues that only appears during system tests, and since your customers run the entire system together, these are the most realistic and complete tests you can perform. The following section describes these types of tests in more detail and considers their strengths and weaknesses.

Understanding different levels of testing

Consider the following simplified architecture of an example system as shown in Figure 4.1. It has two user-facing inputs: a web interface and an API, labeled A and B on the left-hand side of the diagram, respectively. Those requests are processed by Module 1, which has several functions, before being passed over an internal API, labeled C, to Module 2. Module 2 also comprises several functions. It communicates with Module 3 using a backend API, labeled D. Module 2 also has two external, read-only interfaces: a web report, labeled E, and automated emails, labeled F:

Figure 4.1 – Diagram of an example system, showing unit, integration, and system tests

Unit tests are the lowest test level, breaking the code into the smallest testable entities. That means trying individual functions and classes to ensure they correctly respond to inputs. As shown in the diagram, that would mean testing Function 2 of Module 3 independent of the rest of the system to ensure it functions correctly in isolation. The same could be done to any of the individual subsystems shown in the diagram.

Integration testing isolates a particular module. As highlighted in Figure 4.1, to test Module 1, you need to control its three interfaces, A, B, and C. The web interface and the API are public-facing and are likely to be well-documented and understood. The output of Module 1 is the internal API, C. To test that, you would need to be able to mimic the behavior of Module 2 by building a stub or a mock program.

Stubs are simple pieces of code that return fixed values – for example, always indicating success. They are stateless and hardcoded and are therefore closely coupled with the tests that use them. They’re most helpful for checking the state of the system under test.

Mocks are more complex. They are configurable via some interface and have a broader range of behavior, making them less closely coupled to the tests that use them. They check the communication with the system under test and the interface between them.
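
To make the distinction concrete, here is a minimal Python sketch based on Figure 4.1, with an invented fragment of Module 1 and its client for interface C. Only unittest.mock is a real library; the class and method names are hypothetical:

from unittest.mock import Mock

class Module1:
    """Hypothetical fragment of Module 1 from Figure 4.1."""
    def __init__(self, client):
        self.client = client        # interface C: how Module 1 talks to Module 2
        self.pending_signups = 0

    def handle_signup(self, name):
        self.pending_signups += 1
        response = self.client.create_user(name)
        if response["status"] == "success":
            self.pending_signups -= 1

class StubModule2Client:
    """Stub: stateless and hardcoded, always reports success."""
    def create_user(self, name):
        return {"status": "success", "user_id": 42}

def test_signup_completes_with_stub():
    # A stub lets us check the *state* of the system under test
    module1 = Module1(client=StubModule2Client())
    module1.handle_signup("alice")
    assert module1.pending_signups == 0

def test_signup_sends_correct_message_with_mock():
    # A mock checks the *communication* across interface C
    mock_client = Mock()
    mock_client.create_user.return_value = {"status": "success", "user_id": 42}
    module1 = Module1(client=mock_client)
    module1.handle_signup("alice")
    mock_client.create_user.assert_called_once_with("alice")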

In our example, we would need a stub or a mock on interface C to ensure that Module 1 sends the right messages. Then, we can completely ignore the behavior of Module 2 and Module 3; they don’t even need to be running. This is especially useful in a big project to decouple the requirements. If you have a new feature and the change is finished in Module 1 but not in Module 2, you can’t system-test it entirely yet. However, by adding it to the mock program on interface C, you can still perform integration testing on Module 1 in isolation.

Finally, you can perform system testing on the system as a whole. Then, you can ignore internal interfaces C and D and only test the inputs on interfaces A and B and the outputs on interfaces E and F. Your tests should only focus on those interfaces, although the tests can be informed by the contents of interfaces C and D when you design white-box tests (see Chapter 6, White-Box Functional Testing).

System testing is closely related to end-to-end testing, which also tests the external interfaces of a product. End-to-end testing is a subset of system testing, specifically focusing on the flow through the different states of your product and the data integrity during each state. How to consider your system’s state machine is described in Chapter 6, White-Box Functional Testing. System testing is broader and encompasses usability, maintainability, load testing, and the other subjects of subsequent chapters, so I will discuss end-to-end tests as part of system testing.

The next section illustrates the different types of testing with a worked example.

Test level examples

To illustrate these levels of testing, let’s consider the example of creating a new user within the system shown in Figure 4.1.

At the lowest level, unit tests should focus only on a single function in a single module of the code – in this case, the function that creates the user. So the test would call that function and assert that the user is present in the database with all columns set to their correct default values.

This test needs the code and database to be running but doesn’t depend on the user interface (UI) or any internal APIs. Even within this limited test, note that you need to check every column to ensure it is correctly populated – don’t just check for the presence of the user. Even with a small and seemingly simple test, it is possible to leave gaps.
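
As a minimal sketch of such a unit test, here is a pytest-style example against an in-memory SQLite database. The schema, column names, and default values are assumptions for illustration:

import sqlite3

def create_user(db, username):
    """Function under test: creates a user row with default values."""
    db.execute(
        "INSERT INTO users (username, display_name, country, is_admin) "
        "VALUES (?, '', 'US', 0)",
        (username,),
    )

def test_create_user_sets_every_column():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE users "
        "(username TEXT, display_name TEXT, country TEXT, is_admin INTEGER)"
    )
    create_user(db, "alice")
    row = db.execute(
        "SELECT username, display_name, country, is_admin "
        "FROM users WHERE username = 'alice'"
    ).fetchone()
    # Don't just check the user exists - verify every column's default value
    assert row == ("alice", "", "US", 0)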

The intermediate level is integration testing – for example, internal APIs between different modules within the code. An example test would send an internal API message to the database to create a new user. Check that the user is available with its data correctly populated.

This API message should be identical to the one that the frontend system sends to the database. Aim to mimic the real usage as closely as possible, which means updating the test if the frontend behavior changes. You’ll need to keep these tests up to date or risk missing bugs, as they behave differently to the live system.

When the user is created, check the result at the same level by sending another command over the same API. This ensures you are testing a single interface but checks the whole module's behavior, including the database, processing, and transmission of data. If you check the database directly instead, as in the previous unit test, you will miss bugs on the internal API.
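
As a sketch of that integration test, assuming the internal API happens to be HTTP and inventing the endpoint paths:

import requests

BASE = "http://test-env.internal:8080"   # hypothetical internal API address

def test_create_user_over_internal_api():
    # Send the same message the frontend would send; keep this in sync
    # with frontend behavior or risk missing bugs
    resp = requests.post(f"{BASE}/internal/users",
                         json={"username": "alice", "country": "US"})
    assert resp.status_code == 201

    # Check at the same level: read back over the same API, exercising
    # the module's storage, processing, and transmission together
    user = requests.get(f"{BASE}/internal/users/alice").json()
    assert user["country"] == "US"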

Finally, the system tests use the entire system in a realistic way. An example test would create a user and ensure all their information is visible on their profile screen.

This test requires the UI and all the internal steps. It relies on the UI working, as well as its connection to the backend and the database. The check is performed at the same level again by reading the UI to verify the entire application. The system test is the most complex and has the most dependencies; it is also the slowest to run. However, it provides the most complete testing. If that passes, the internal interfaces must be working.
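
For comparison, here is a system-test sketch using Selenium WebDriver to drive the real UI end to end. The URLs and element locators are hypothetical:

from selenium import webdriver
from selenium.webdriver.common.by import By

def test_new_user_details_appear_on_profile():
    driver = webdriver.Chrome()   # needs a browser driver on the test machine
    try:
        driver.get("https://staging.example.com/signup")
        driver.find_element(By.NAME, "username").send_keys("alice")
        driver.find_element(By.NAME, "country").send_keys("Ireland")
        driver.find_element(By.ID, "submit").click()

        # Check at the same level: read the UI, which exercises the
        # frontend, the backend, and the database working together
        driver.get("https://staging.example.com/profile/alice")
        profile = driver.find_element(By.ID, "profile").text
        assert "alice" in profile and "Ireland" in profile
    finally:
        driver.quit()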

The unit test example covered a backend function, but you should also have unit tests for the frontend, of course. For instance, check that user screens are identical after each code change that is supposed to leave them untouched.

These are unit, integration, and system-level tests. The majority of this book will describe system tests, as they are the most complex and have many possible variations, but next, we consider how the different levels of tests should work together.

Test ordering

In this book, I have described exploratory testing being run first, along with specifying the feature. In practice, unit tests are likely to be written by the developer first during the initial implementation. Recall the testing spiral from Chapter 1, Making the Most of Exploratory Testing; the first round of unit testing is completed along with the initial implementation.

Here, I imagine a tester picking up the code for testing after that initial implementation, or possibly working with the developer to review and consult on their work. At that stage, you will need to explore and thoroughly specify the feature – then, you can check on any tests that are already present before expanding them and designing more.

How many tests you should aim to have at each level is considered in the next section.

The testing pyramid

The testing pyramid was described by Mike Cohn in his book Succeeding with Agile. While I have tweaked the details of his description, the overall philosophy is very useful. Cohn proposed that unit tests form the base of the pyramid, with service tests above that and UI tests at the peak. Here, we will translate those into integration tests and system tests, respectively, to apply the same ideas:

  • Write many small, isolated tests covering limited functionality (unit tests)
  • The more functionality tests check, the fewer of them you should have (integration and system tests)

It’s important to have tests at multiple levels. The analysis of advantages and disadvantages in the next section doesn’t conclude that one type is superior, but rather that you need all kinds of tests to mitigate the weaknesses of the others.

The test pyramid, following the terms used here, looks like this:

Figure 4.2 – The testing pyramid in terms of unit, integration, and system tests

The key strengths of unit tests are how small and fast they are to execute. Because each test is so limited, you will need lots of them, but that’s fine because they are quick to write and run.

At the next level are integration tests. These cover a wide range of functionality but are still limited to one area of the system. You should aim to have fewer of those since they are harder to write, run, and maintain.

Finally, at the top level are the system tests. These are the slowest to run and the hardest to maintain, but provide invaluable coverage of the entire system working together. There may be fewer of them, but they give you the most confidence that your product will work in practice. Due to their scope and complexity, this book focuses on system tests.

The following sections describe the strengths and weaknesses of each kind of test in more detail.

Advantages and disadvantages of unit tests

Now that we’ve seen how the different types of tests work, we can consider their strengths and weaknesses. First, let’s consider unit tests. Because they check such small pieces of code, the number of variables and possible outcomes is also small. This simplicity is their main strength, as shown in this table:

| Advantages | Disadvantages |
| --- | --- |
| Simple | More code to change during refactoring |
| Reliable | Require developer time |
| Easy for the developer to write | Hard for anyone else to write |
| Don’t require a full system | Ill-suited to UI tests |
| Increase code documentation | Risk being unrealistic |
| Find problems early | Limits to what they can cover |
| Fast to run | |
| Easier to maintain | |
| Easier to debug | |

Table 4.1 – Advantages and disadvantages of unit tests

Unit tests tend to be small and reliable, testing a particular function with a known and limited set of inputs and outputs. They are tightly coupled to the code, making them easy to write for the developer associated with that feature but harder for anyone else. Because the tests are limited, they don’t require a whole system to run nor do they require stubs or mock interfaces; this makes them quicker and simpler to set up than integration tests. By showing how the code behaves, they help to increase the documentation, and as they are written at the same time as the feature, they find problems early in the release cycle.

Because they test small functions and run only part of the code, unit tests are usually faster than integration or system tests, which can require frameworks or applications to run. Because each one tests a very specific function, they have very few dependencies, making them more reliable. Crucially, that makes them easier to debug and maintain, which is the major ongoing cost of automated tests.

On the downside, unit tests are closely coupled with the code they test. If that code is refactored, you will also have to refactor the unit tests, which isn’t the case for integration and system-level tests, which only interact with the interfaces to the system. Since developers usually write unit tests, these cost the developer time and can’t easily be handed over to the test team. However, if the test team can be trained to write these tests, it is useful for them and the project. Unit tests are also poorly suited to graphical outputs such as UIs and are the least realistic test type considered here. You can show that a function behaves correctly with some inputs, but it may never receive those inputs in the running system. Conversely, it may receive inputs that the unit tests don’t cover. It can be hard to make unit tests cover realistic scenarios, which is easier with other forms of testing.

In summary, unit tests are a quick and reliable way to find bugs early. They will improve the quality of your code and should be a part of your test strategy. However, they will never be complete, so you should augment them with integration and system tests.

Advantages and disadvantages of integration tests

Integration tests are more realistic than unit tests because they test an entire module or service. With many functions working together, you can test their interactions without requiring the whole system to be running. The strengths and weaknesses of this approach are as follows:

| Advantages | Disadvantages |
| --- | --- |
| Can decouple sections of large projects | It takes time to set up stubs and mocks |
| Match testing to functional units | Risk being unrealistic |
| Match testing to organizational units | Take developer time |
| Failures are isolated | More complicated than unit tests |
| Simpler than system tests | Limited coverage |
| Agnostic to implementation | |
| Don’t require a full system | |

Table 4.2 – Advantages and disadvantages of integration tests

Integration tests help decouple different parts of projects. If one team has completed the work in their module, you can test that in isolation and gain confidence, even if other necessary work elsewhere isn’t complete yet. You need further system testing when both sections are finished, but testing can start before that. Integration tests match up with the functional units of the code, making it easier to identify who is responsible for any issues you find. If a single team owns that module, that team needs to fix the problems you encounter there. The failures are quicker to diagnose because they come from a single module, making them easier to debug than system tests where multiple modules work together.

Unlike unit tests, integration tests are agnostic to the implementation. You can refactor the code completely, but as long as the behavior on the module’s interfaces is unchanged, the tests will keep passing without being updated. Integration tests also don’t require a complete system, making them cheaper and less resource-intensive.

The disadvantage of not running on the whole system is that you must substitute stubs or mock programs on the interfaces to other parts of the system. Those take time to write, have to be set up and maintained, and need to be kept up to date to accurately reflect the behavior of the other modules. That takes development time, and stubs can be unrealistic even then. This might invalidate the testing, which is always a risk with integration tests.

An integration test could fail because a specific module returns an invalid response to a particular input. However, if it never receives that input in the actual system, the bug would never be triggered. Integration tests can find such an issue, which should be fixed in case a future code change reveals it. However, that isn’t a realistic bug, so it’s a false positive. System testing avoids such false positives by only performing realistic testing.

Integration tests are far more complicated than unit tests to run and debug, but even with that extra complexity, there are still classes of issues they can’t find, such as issues between modules. Two modules may work individually and pass their respective integration tests, but they can still fail when they run together. For instance, if one module counts users including disabled users while another counts users excluding disabled users, their respective totals and behaviors won’t match. You can only find that class of bug with system testing.

Integration tests can be helpful in systems that are difficult to set up and test as a whole. Still, they take effort to run and maintain while lacking the simplicity of unit tests and the comprehensive coverage of system tests.

Advantages and disadvantages of system tests

System tests are the only tests capable of finding all classes of realistic bugs. For all their disadvantages, that fact makes them indispensable. The following table shows their overall strengths and weaknesses:

| Advantages | Disadvantages |
| --- | --- |
| Most realistic | Most complicated |
| Easiest to set up | Difficult to diagnose issues |
| Can be performed by any user | Issues have to be assigned to the right team |
| Agnostic to implementation | Require all components to be available |
| The only way to find certain issues | Slowest to run |
| | Least reliable |

Table 4.3 – Advantages and disadvantages of system tests

System tests are the most complicated to run. Considering all possible system states and inputs, the potential test matrix becomes impossibly large for an application of any size. You always have to choose and prioritize which tests you will perform. As well as being complicated to run, issues are challenging to diagnose, as they could come from any part of the system. The root cause of a problem may be far from the visible symptom. That needs to be investigated and understood before issues can be assigned to the right team, which can add confusion and delays that don’t affect unit and integration tests. System tests also require all components to be available. You can’t carry out system tests on a new feature that spans different modules until all the modules have code ready to test, making it the last form of testing you can run.

System tests are usually the slowest to run, requiring a complete system with messages traversing every stage. If you add shortcuts or mock interfaces, you’ll lose the realism that makes system tests so useful, so there is a limit to how fast they can be executed. Because they use a whole system, they have the most dependencies, which makes them the least reliable form of testing. A failure in any part of the system can result in an error, making it the hardest to maintain.

Despite those disadvantages, system testing is the most realistic. System tests can accurately represent real behavior if you use hardware and data comparable to that of your live product. System tests don’t require developers to write dedicated tests or stub programs; testing can start with a tester pretending to be a new user and stepping through the initial setup. Creating realistic load levels or comparable data can still be challenging, but that is ultimately down to the limitations of your setup rather than the approach: system tests can perform those checks if the tools are available.

Because they involve using your product for real, any user can perform system tests, from developers and testers to product owners, and from the documentation team to external testing companies. As with integration tests, they are agnostic to the implementation: the development team can completely refactor the code, but the same tests will pass if the interfaces are unchanged. Most crucially, there are some bugs that only system testing can find. Due to that unique completeness, system tests must form part of your test strategy, despite their complexities and difficulties, and they are the form of testing this book focuses on, showing you how to resolve their challenges and mitigate their weaknesses.

Before running our first systematic test case, we need to be clear about what a test case is, which is described in the next section.

Defining test cases

You will be running many test cases as part of your testing, so it’s important to be clear about exactly what they entail. Each test comprises four elements:

  • Prerequisites
  • Setup
  • Procedure
  • Result

First, you must set up the necessary prerequisites: are you running the correct services with the correct version and configuration? It’s obvious, but it’s easy to miss a critical step and waste test time. The upcoming Setting the correct version, configuration, and environment section describes testing prerequisites in more detail. Those only need to be prepared once for a whole series of tests.

For each test, you need to make sure the setup is correct. Do you need a user or a particular set of data, or is the requirement explicitly to start without any information? Consider the initial state you need for your test and make it explicit. The trick is to make your assumptions clear. There are an infinite number of variables that you could specify; you need to pick the variables relevant to this test. While some are obvious, others might not be. Do users need to clear their cookies, uninstall a program, or clear registry entries before a test? What state might be left behind by previous tests that could confound your tests?

Real-world example – Too many upgrades

We tested many development builds during a waterfall development cycle, finding and fixing scores of bugs. We finally released it on our live system, only for the upgrade to immediately fail. We couldn’t upgrade it at all and had to roll back. We’d never seen that problem in any of our tests, but every live upgrade failed. What had gone wrong?

The live system jumped from the last build on our previous branch, 5.0.17, to the final build on our new branch, 5.1.23. However, it hadn’t gone through all the intermediate steps of 5.1.1, 5.1.2, 5.1.3, and so on. One of our database migrations was faulty and would only work if performed in stages. We had gradually upgraded all our test systems between those builds, but the live system hadn’t been. We needed a new blocking test to perform the upgrade precisely as it would apply to live customers. We required an explicit prerequisite not to go through intermediate builds.

With the procedure and setup clearly described, you can perform the planned tests, and this book describes many possible test procedures for common design patterns. Finally, you need to document the result. As with the setup, there are many different states you can check. The skill in designing the tests is carefully choosing the checks you will perform. Those considerations are discussed in the upcoming Determining what to check section.

You need to document each part of the test case, so you need a test management system to track them. As well as having explicit steps for each test, you also need to be able to reference them. Every test should have an ID that links to the requirements it covers and the bugs it reveals. That lets you track which tests were most useful.
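
One way to keep those four elements explicit, and give every test a referenceable ID, is to record each case as structured data. The Python structure below is purely illustrative (your test management system will have its own format) and encodes the upgrade example above:

from dataclasses import dataclass

@dataclass
class TestCase:
    test_id: str                  # referenced from requirements and bug reports
    requirements: list[str]       # specification statements this test covers
    prerequisites: list[str]      # versions, services, and configuration
    setup: list[str]              # the state needed before this specific test
    procedure: list[str]          # the steps to perform
    expected_result: list[str]    # the checks to make at the end

upgrade_test = TestCase(
    test_id="UPG-017",            # hypothetical ID scheme
    requirements=["REQ-4.12"],
    prerequisites=["Build 5.1.23 available", "Live-like dataset restored"],
    setup=["Install 5.0.17 directly; do NOT step through intermediate builds"],
    procedure=["Upgrade straight from 5.0.17 to 5.1.23"],
    expected_result=["Upgrade succeeds; all database migrations complete"],
)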

Each test should follow these six principles:

Figure 4.3 - Principles when writing test cases

  • Be independent of other tests: Ideally, each test would pass or fail on its own. That way, where there is a failure, there is a clear indicator, rather than many tests failing all at once. In practice, this is hard to achieve and many tests depend on previous tests passing. In that case, make tests as linear as possible, so it’s obvious that nothing beyond a certain step worked. Again, that makes investigation much easier.
  • Each test should check one thing: In addition to being independent of others, each test should have one purpose and check only one piece of functionality, as far as possible. Then, you know the cause of the failure and can easily debug it.
  • Don’t be afraid of having many tests: A corollary to each test checking only one thing is that you may need a lot of them. Again, that is fine. Lay them out clearly.
  • Group your tests together: With lots of tests to manage, ensure they are divided up into different sections and folders so you know where to find them and where new ones should be added.
  • Give tests clear names: When looking down a list of failures, make it easy to see what went wrong by including clear names. This may mean you need long names and that’s fine.
  • Prioritize test cases: Plan to have lots of test cases, but not for them all to be equal. If they start taking too long to run, focus on the happy path tests and areas of known weakness where you’ve found bugs before.
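
As a brief illustration of several of these principles (independence, one check per test, grouping, clear names, and prioritization), here is a pytest-style sketch; the fixture, marker name, and checks are invented for illustration:

import pytest

# tests/users/test_user_creation.py - grouped with related tests

class TestUserCreation:

    @pytest.fixture
    def new_user(self):
        # Independent tests: each one creates its own user rather than
        # depending on a previous test having run
        return {"name": "alice", "country": "US", "enabled": True}

    # One check per test, with a long, descriptive name
    def test_new_user_is_enabled_by_default(self, new_user):
        assert new_user["enabled"] is True

    @pytest.mark.smoke   # prioritized happy-path test for shorter runs
    def test_new_user_gets_default_country(self, new_user):
        assert new_user["country"] == "US"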

Following these principles also makes test statistics more meaningful. How many tests have you run and how many are left to do? Is the pass rate rising or falling? You need graphs and charts to see the status of your testing at a glance. That comes from accurate reporting of the test cases and their results. If you aren’t documenting that already, set up a system to track your next round of testing.

Once your documentation process is in place, you can consider the content of the tests, starting with the level of detail they should contain.

Prescriptive and descriptive test steps

When writing manual tests, you can choose how precise your descriptions are. Either you can specify exactly what to do and how to do it, or you can only say what to do, leaving the details for each tester to decide.

For instance, a descriptive test might say: “On the user settings page, upload an avatar .jpeg image between 100 KB and 5 MB in size.”

Note that while there are instructions on what to do, the tester can use any image they like that matches those criteria. In contrast, a prescriptive test might say exactly which file to use: “On the user settings page, upload /fixtures/image1.jpeg as an avatar image.”

Here, image1.jpeg is a suitable image within that size range. If you are testing a text input, you can specify the exact strings to try or just describe what contents they should have.

Prescriptive tests are also known as low-level test cases, with complete detail on what to do and how to do it. Descriptive tests are also known as high-level cases, stating the goal of the test but leaving out implementation details. Both approaches have different strengths, but overall, there is a benefit to not being overly prescriptive in your test cases.

The advantages and disadvantages of prescriptive test cases are as follows:

| Advantages | Disadvantages |
| --- | --- |
| Quicker for manual testers to perform | It takes more time and effort to design them |
| The test is more likely to be run correctly | Limit creativity |
| Even junior team members can run tests correctly | May make the testing role dull for testers |
| Easily automatable | Limit test coverage |
| Easier to describe the required outcomes | |

Table 4.4 – Advantages and disadvantages of prescriptive tests

If the test plan describes an exact input, that is quicker for manual testers since they don’t need to decide what text to use, for instance, or find a suitable .jpeg image. They are also more likely to perform the test correctly since there is less choice about what they are doing. That means they are easier for junior testers to run because they can see exactly what to do. Prescriptive tests are easy to automate since there are exact steps to take, which means there is a single expected outcome for both automated systems and manual testers to check. Because of that, automated tests are almost always prescriptive.

However, writing a prescriptive test case takes more time and effort because you need to work out precisely the steps to take. That limits the tester’s creativity; they don’t get to choose what to do, discouraging their curiosity and exploration. That makes the role of a tester less interesting, making it harder to find and retain the best team members. Worst of all, though, is that prescriptive tests limit the test coverage you achieve. If you run the same test repeatedly, it will catch regressions in that area, but it will never uncover a new issue.

By allowing testers to use different files in upload, for instance, you add variety that is more likely to discover possible issues and is more like customer behavior. Maybe a new format or option, such as higher-resolution files, has gained popularity; a test that always uses a single file is fixed in time. If you are still running that same test five years later, you will never try new formats, but by using a new file each time, you have the chance to find that change. If every tester uploads a different image in their tests, you have tested a wider range of inputs. Ideally, you could think this through in advance and review and update tests to use any new options or formats that appear. However, allowing testers some leeway gives them a chance to catch such changes.

Descriptive tests, in contrast, don’t specify the exact method to follow, allowing testers to make choices for themselves. The strengths and weaknesses of descriptive test cases are as follows:

| Advantages | Disadvantages |
| --- | --- |
| Provide more varied testing | Less reproducible |
| Keep up to date with changes | Require more experienced staff |
| Help to avoid the pesticide paradox | May be performed incorrectly |
| Quicker to define | |
| Use the tester’s experience | |

Table 4.5 – The advantages and disadvantages of descriptive test cases

As noted, descriptive tests provide more varied testing and keep up to date with changes in how your users use your product. Varying inputs also helps avoid what the ISTQB syllabus calls the pesticide paradox. One of its seven software testing principles is that tests become less and less likely to find issues if you keep running the same ones. That’s analogous to using the same pesticide, which gradually becomes less effective, as only pests with resistance to that chemical will remain. In the software domain, that is because the code in that area becomes more mature and stable, with fewer changes after its initial implementation, so tests that run the same code in the same way will keep passing. You need a more varied approach to find issues, and altering the inputs is an excellent way to achieve that.

Because you don’t have to specify the exact steps to take, descriptive tests are quicker to write, making the most of a tester’s experience. If a tester can think up interesting new cases to try, they can perform them as part of that test run.

On the downside, descriptive test cases are less reproducible. If you see a problem with one particular file, for instance, you may not know what about that file caused the test to fail. Your bug reports need to include the details of your choices because those details aren’t included in the test plan.

Descriptive tests need more experience to run, both to try out interesting new cases and ensure they are run correctly. Their vagueness increases the chance of mistakes. For instance, a tester might choose a file that doesn’t meet the test criteria and so does not test the particular case they were meant to exercise. Testers will need more training and monitoring to make sure they understand precisely what they should be doing.

This isn’t an area where compromise works well, so don’t have a mixture of prescriptive and descriptive tests. Pick one approach and stick to it. Mixing tests means that testers will be surprised by a lack of information in the descriptive sections and may not be ready to add the extra detail themselves.

Overall, the superior test coverage provided by descriptive tests is more important. If a tester needs help, you can suggest what they should do, or a test plan could have examples of scenarios to run in case testers are unsure. Ideally, they should not use the old examples, however, but use their own system in their own way to find new bugs. You always find the best bugs when you stray from the test plan.

This discussion of prescriptive and descriptive tests applies to manual testers running a written test plan. For automated testing, you don’t have that choice: a manual tester can decide which file to use, but an automated test must have its file, or set of files, specified in advance. The automated tests then repeat that example on every test run, so they suffer from the disadvantages of prescriptive testing. To mitigate that, you could have a bank of different options that the tests choose between each time, but the possibilities are all specified as part of the auto-test system.

The data you use for automated tests should be regularly reviewed and updated to avoid the pesticide paradox and give you the chance to find new issues. To simplify that, ensure that there is a clear separation between your tests and their data, so you can easily check their values and add new entries.
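
As a sketch of that separation, the test below reads its options from a standalone data file and picks one each run. The file path, its contents, and the client helper are all hypothetical:

import json
import random
from pathlib import Path

# The data bank lives in its own file, separate from the test code, so it
# is easy to review its values and add new formats as they appear
AVATAR_BANK = json.loads(Path("fixtures/avatars.json").read_text())
# e.g. ["image1.jpeg", "photo_hdr.jpeg", "large_4mb.jpeg"]

def test_avatar_upload_with_varied_data(client):
    # Choose a different file each run, and log the choice so that any
    # failure report includes the detail the test plan leaves out
    avatar = random.choice(AVATAR_BANK)
    print(f"Uploading avatar {avatar}")
    response = client.upload_avatar(f"fixtures/{avatar}")   # hypothetical helper
    assert response.ok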

With the test cases defined and documented, we can start to put the prerequisites in place, the first of which is the test environment. There are several options for that, as described in the next section.

Evaluating different test environments

For system testing, you need a test environment. Unit testing and some integration testing can be carried out on individual modules and components, but to perform system testing, you need a system to test, as the name suggests. That cannot be development machines, where the developers may be making constant changes, and it cannot be the live environment, where it is too late to prevent the damage bugs cause. You need a test environment between the development and live installations to perform your testing. If you don’t have one, setting that up is your first task before you can do any testing.

The test environment could be a blank installation that you spin up as needed, picking up the latest code. Alternatively, you could have a staging area or beta environment that constantly runs. Both approaches have different benefits, considered next.

Using temporary test environments

This table shows the advantages and disadvantages of temporary test environments:

| Advantages | Disadvantages |
| --- | --- |
| Guaranteed to be running the latest code | All users need to be able to create the environment |
| Always start from a known state | Need configuring for non-standard tests |
| No ongoing maintenance | No real historical usage |
| No ongoing running costs | |
| Easy to test multiple changes in parallel | |

Table 4.6 – Advantages and disadvantages of temporary testing environments

With temporary testing environments, you always start from a known state and are guaranteed to be running the latest code. This is great for development when you can try out a new change compared to the old behavior, and that means there are no ongoing maintenance or running costs when the system isn’t in use. Best of all, many different users can create environments, so they can all make many changes simultaneously. That makes this arrangement necessary for development. Testers can also test changes in isolation when they are new and more likely to have issues. When they have passed that initial testing, then changes can be combined.

On the downside, all the users need to be able to create that environment; there isn’t one sitting there ready to be used. That is easier for developers who are more used to the tools and spend most of their time working on the system. For users who only use the test system as a part of their job, such as product owners or the support or documentation teams, setting up a new environment can be more of a challenge, depending on how simple and reliable that process is. Ideally, use infrastructure as code to specify and create the environment in the same way every time it is needed. There should be as few steps as possible for users so that it is quicker and easier with fewer places where it could go wrong.

Because the system is always newly created, you’ll need to configure any non-standard settings you need for your testing and it doesn’t have any real historical usage. You can simulate this by generating test data, but it may not reveal genuine issues.
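
For example, the whole environment could be created by a single pytest fixture driving an infrastructure-as-code definition. This sketch assumes a Docker Compose file in the repository and a hypothetical seeding script:

import subprocess
import pytest

@pytest.fixture(scope="session")
def test_environment():
    # The environment is defined as code (a compose file in the repo),
    # so every user creates it the same way with a single step
    subprocess.run(["docker", "compose", "up", "-d", "--wait"], check=True)
    # Temporary environments start blank: apply non-standard configuration
    # and simulated historical data on each creation
    subprocess.run(["python", "scripts/seed_test_data.py"], check=True)
    yield "http://localhost:8080"   # assumed address of the system under test
    subprocess.run(["docker", "compose", "down", "-v"], check=True)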

Using permanent staging areas

In contrast, staging areas run constantly, bringing different strengths and weaknesses. This table lists their advantages and disadvantages:

| Advantages | Disadvantages |
| --- | --- |
| Quick to use for short tests | Require upgrades |
| Easier to monitor | Require maintenance |
| Can find issues due to running long term | Can contain bad data from development builds |
| Accumulate historical data | Issues affect many users simultaneously |

Table 4.7 – Advantages and disadvantages of a permanent testing environment

Staging areas are great for short tests. They run continually, so you can quickly go to the feature you need and try out your test case. That’s useful for product owners trying out new features or testers answering a quick question. You can set up monitoring on the staging area that matches the live system to report errors or system issues. Staging areas also make it easier to spot problems that appear over time, such as memory leaks or excessive disk usage. They naturally accumulate historical data, such as configuration created on old versions that has since been migrated. This means you get some background testing for free, although it’s hard to judge its extent and coverage.

On the downside, staging areas require upgrades and maintenance. You can automate upgrades, but breakages need to be investigated and triaged and that can be a difficult and unpopular task among the team. Staging areas can also generate spurious errors due to bad data written by the development code. If an early version wrote incorrect data, you have to find and fix the code that went wrong and remove or correct the erroneous data. These can create subtle problems and I’ve seen issues appear over a year after the initial bug. For instance, a new version of code might add additional checks that reject invalid data, whereas before, there were no visible symptoms. A temporary test environment with new data for each installation avoids those uninteresting problems.

And finally, because many users share staging environments, any outages cause problems for many people. You can’t perform destructive tests (see Chapter 11, Destructive Testing) without giving a warning and potentially disrupting other people.

Temporary testing environments and staging areas are helpful for different tasks and I believe you always need both. Temporary environments are useful for developers to work on and staging areas are useful for more realistic testing or quick tests, such as those performed by product owners. Testers sit between those two, so you can choose where is best to base your testing.

Once you have selected your test environment and successfully run it, you need to ensure you are on the correct version with the correct configuration, as described next.

Setting the correct version, configuration, and environment

One of the biggest wastes of time when testing is using the wrong version or configuration. Testing goes quickly when everything works as planned; as soon as something goes wrong, you must stop and investigate, and that investigation time should be reserved for genuine bugs. Anything you can fix yourself, you should fix first.

Before you can test the correct version, your product must have proper versioning. If your development team lets you test whatever code happens to be in the repository, that is the first thing that has to change. The test team needs to use a stable, numbered build. That isn’t required for exploratory testing or lower-level tests, such as component and integration tests, which can run on every build with nothing more than a build number. However, by the time you come to run comprehensive functional tests, you need to know exactly which version of the code you were running.

Within fully Continuous Integration/Continuous Delivery (CI/CD) deployments, where every change is tested and pushed live, versioning becomes much more fluid, but even there, you can specify individual builds or tag particular builds to run tests on. That tag might be as simple as “nightly build 1,432”, but by the time you are putting effort into designing and running system tests, using the whole system together, getting build numbers from the development team is a hard requirement.

These builds should be available every day or so during a project to ensure that changes are being picked up regularly, but not so often that the system under test is constantly changing. For system tests, builds being produced more than once a day is too often. Similarly, if you’re only getting one build per week, that allows too much time to be wasted between introducing defects and them being tested and discovered. A build every day or so keeps everything moving. If upgrades push new code to the test environment automatically, then all the better.

Development team processes vary massively, usually as a function of team size. Perhaps you work in a large multinational, where the lack of regular, numbered builds would instantly lead to chaos. Alternatively, you may be working as a small, independent software developer with legacy systems, where the idea of having CI is a distant dream. The only requirement is that once you have a test team large enough, specialized enough, and with enough time to be reading this book, you need numbered builds to run against.

Given that you have numbered builds, updated every few days, and an environment on which to run them, you can prepare the system to be tested. Which versions do you need to enable this feature? These details should be tracked in the feature specification. There will often be an obvious machine you must upgrade, but are there any others you need?

Any complex system is likely to have multiple independent builds with different build numbers, so as well as the obvious version you intend to test, you need to know all its dependencies. What other systems do you need to upgrade to use the feature successfully? The developer responsible should provide a complete list, so make sure that you have everything in place. These details aren’t part of the main feature specification but should be part of the notes describing the implementation.

Next, you must check that the configuration has enabled the new feature. While the developers have made it possible to turn it on, is it actually enabled? Again, this is an easy way to waste time. Ensure that all the requirements are documented in the feature specification and then check and enable them in your test environment before you begin.

Using feature flags is a great way to control the testing and rollout of features. They let you turn a feature on for a subset of users or only in certain situations. While useful, they add another setting you have to check as a tester. If you’re wondering why you don’t see your new feature working, check the feature flags first.
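
As a small sketch of that check, a test suite could verify the flag before running the dependent tests, skipping with a clear message rather than failing mysteriously. The endpoint and flag-listing format here are hypothetical:

import pytest

def require_flag(client, flag_name):
    """Skip with a clear message if the flag is off, rather than letting
    the tester debug a 'missing' feature."""
    flags = client.get("/api/feature-flags").json()   # hypothetical endpoint
    if not flags.get(flag_name, False):
        pytest.skip(f"Feature flag {flag_name!r} is disabled in this environment")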

When you have all the correct versions running, with all the necessary dependencies and configuration enabled, systematic testing can begin.

Performing systematic testing

The first step for functional testing is to systematically work through the feature specification, expanding each requirement into all the test cases required to cover it. When well-written, with sufficient detail, and each independent of the others, many requirements should need only a single test case to cover them. However, some, especially around error cases or freeform inputs such as textboxes, will require many tests to check all the possible valid and invalid cases.

The feature specification is vital here since it describes all the core functionality. By carefully writing and reviewing the specification, the test plan becomes a secondary document that is easy to prepare. Make sure you use the specification to guide the testing. This section concentrates on the happy-path cases: the working routes that exercise the entire feature without hitting errors. Other sections will build on this by adding invalid cases, the performance under load, or the logging that should be written to track the system’s actions. All those other forms of testing rely on the feature working, so you have to run these tests first to find any issues that block subsequent testing.

There may be tests you want to run but don’t yet have the tools to carry out – maybe you need a tool to receive and parse emails or generate large amounts of load. If that’s the case, add them to the test plan anyway. Make sure they aren’t forgotten. Even if you have to skip them for now and raise them as longer-term test tasks, write the tests you want to perform. While someone performs the test plan, others should set those systems up. Within a test team, some members should constantly be working on improving the tools and environment, while others perform the testing of new features.

The functional specification should describe this feature’s effects on all the system’s interfaces. As shown in Figure 4.1, defining the interfaces is key to system testing. Those might be web interfaces, APIs, apps, emails, instant messages, Simple Network Management Protocol (SNMP), SSH File Transfer Protocol (SFTP), or many others. On hardware products, interfaces can include screens and serial interfaces. Ensure you have an explicit list of all your product’s interfaces that you can consider in your testing. As you start systematic testing, check each of those interfaces and the tests you will run on each of them.

Once you have listed your interfaces, you can consider each variable within them. That might be a field in an API, data entered by the user, different responses from third parties, or different states your system can be in. For each variable, consider all possible values it can take and how you can group them. That is described in more detail in the Understanding equivalence partitioning section.

All variables and relevant values should be explicitly listed in the functional specification, but designing your test plan is the last chance to find any you have missed. Once you can see the space of possible inputs and states, you can then determine which are significantly different and require individual tests. That is described in more detail in the Mapping dependent and independent variables section.

Track which tests cover each requirement from the functional specification. A test tracking system will give you tools to make those references to see whether there are any requirements you haven’t covered. If a requirement needs multiple tests, it’s harder to spot, and you need to check that you have them all, so pay particular attention to those.

The next section considers when tests should be run as part of your release cycle.

Testing in the release cycle

You need to choose when you will run these different tests within your release cycle. You can select a subset of your automated tests to run against every product change as part of a CI/CD pipeline, separate from new feature testing. These CI/CD tests are vital checks to avoid bugs and issues from running live, especially if they are set to block the release process. However, these tests have strict requirements:

  • Speed: They must run fast enough that they don’t overly delay releases
  • Reliability: They must work consistently and only fail when there is a real issue
  • Coverage: They must cover all critical aspects of your product

There is a natural trade-off between speed and coverage: the more you test, the slower it goes, so you need to carefully judge the tests to include. These tests also need to be highly reliable, as they are run so often and can delay releases. Unit tests are ideal here as they have fewer dependencies, but you also need a few system tests to ensure the overall application still works.

CI/CD tests are just one place in the release cycle you might run tests, however. Test plans can be categorized into four levels, as shown here:

Figure 4.4 - Subsets of test cases

The base layer is Comprehensive new feature tests, covering anything and everything you can imagine. This book is concerned with that block of tests, which expands as you add new features to your product. Some of these tests might only be run once when a new feature is added. For instance, if you have a country dropdown with 200 options, you might try them all but only once. If Albania works, there’s no reason to suspect that Armenia won’t, but you would want to try it before it goes live for the first time.

At the next level are Regression tests. These are almost as extensive as the comprehensive tests but only include the tests you want to rerun whenever this feature is modified. When the code in that area hasn’t changed, you don’t need to run them.

At the next level are Manual/Nightly tests. These have a time limit – for instance, 12 hours for an overnight run – so they need to be carefully prioritized. They find issues that would cause minor outages for live users, and cover cases that are too time-consuming to include in the CI/CD tests.

Finally, CI/CD tests quickly cover the critical functions of your program, providing confidence after every code change.

These levels, along with examples, are summarized in this table:

| Level | When it is run | Example test |
| --- | --- | --- |
| Comprehensive tests | Once, when a new feature is added | Test that users can be created with every possible country setting |
| Regression tests | Any time that feature is changed | Test that users can be created with all localized languages and a selection of others |
| Manual/nightly tests | Nightly or on-demand | Test that all pages are localized for each language |
| CI/CD tests | After every code change | Test that one screen is localized for each language |

Table 4.8 – Examples of different test plan levels
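
If your tests are automated with pytest, one way to encode these levels is with markers, so each pipeline stage selects only the subset it needs (for instance, pytest -m cicd on every change and pytest -m nightly overnight). This is a sketch: the marker names, routes, languages, and client fixture are all assumptions, and the markers would need registering in pytest.ini:

import pytest

ALL_PAGES = ["/", "/login", "/settings"]   # assumed routes
SUPPORTED_LANGS = ["en", "fr", "de"]       # assumed locales

@pytest.mark.cicd      # fast, reliable: runs on every change and blocks release
def test_login_page_loads(client):
    assert client.get("/login").status_code == 200

@pytest.mark.nightly   # broader coverage that fits the overnight time budget
def test_every_page_is_localized(client):
    for page in ALL_PAGES:
        for lang in SUPPORTED_LANGS:
            assert client.get(page, headers={"Accept-Language": lang}).ok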

The design of testing within a cycle and the prioritization of tests is beyond the scope of this book. Here, we focus on writing comprehensive tests covering many eventualities. With a large library of tests for your system, you can choose which to promote to the shorter test runs.

However, you can only promote tests to CI/CD testing from your library if you have an extensive test library to start with, so that is what this book aims to describe. In the next section, we consider the importance of curiosity and imagination when designing your test plan.

Using curiosity and feedback

Stepping through the test cases that the feature specification suggests may sound like a dull, robotic task, but expanding them into detailed test cases requires curiosity and investigation. As described later, it’s impossible to consider every possible combination of inputs, so you must constantly pick and choose which particular options you will try. It’s also always possible to extend tests – once you’re in a given state, you can vary conditions and perform checks that the specification and test plan don’t mention.

Throughout the entire test design and execution, maintain the inventiveness and curiosity you needed during the exploratory tests. Constantly be on the lookout for other interactions, other things to try, and other things to check. Once you are running manual tests or coding automated tests, you are spending longer examining this feature’s behavior, with more information, than anyone else in the project. Make the most of these advantages to think everything through in more depth and find conditions that hadn’t been previously considered.

Recall the feedback diagrams from Chapter 1, Making the Most of Exploratory Testing. You are reviewing the specification as well as the implementation. Does this feature make sense as implemented? Is there a simpler way to achieve the same aims? Are any interactions surprising or unclear? Despite all the work you put into the specification, it is never too late to identify improvements and even fundamental changes. As a tester, you are the harbinger of bad news, so lean into that and never shy away from suggesting changes, no matter their size.

While the feature specification and test plan should be as comprehensive as possible, you find the best bugs when you go beyond the test plan. Use the tests as a guide, but it’s only when trying situations that haven’t been considered before that you find the biggest problems. In Chapter 1, Making the Most of Exploratory Testing, I mentioned stepping through a test plan on a particular web page, then being interrupted and making the window smaller. The test plan worked fine but shrinking the browser window triggered a bug. Look out for edge cases of this kind so they can become test steps in your future test plans. The Testing variable types section lists common, general variables and cases to get you started. Still, you should expand these based on your situation and your product’s particular behavior and weaknesses.

The test plan is a living document and should not be considered fixed. Look out for areas that regularly uncover errors and expand your testing in those areas. The results of exploratory testing and early parts of the system testing should feed back into the rest of the test plan to guide later testing. Bugs aren’t evenly distributed throughout your system; they cluster in certain areas and around particular functions. Find those areas and add as much detail as possible there. Feedback is vital to uncover further bugs in those areas.

If necessary, explicitly schedule a time to consider the bugs you’ve found so far and how to expand the test plan before you sign it off. That can remind you to add this feedback. If you write a test plan and then simply follow it from beginning to end, you’ve missed the chance to improve. Iterate and learn as you go.

Testing needs to combine the rigor of carefully stepping through every case with the freedom and randomness of curiosity and exploration. A good tester will be able to call on both these contradictory skills. Your test plan is just the starting point, ready to grow and evolve based on the bugs it finds. It should be a great, detailed document, but also one you’re not afraid to constantly change and improve.

Summary

In this chapter, you have learned about the different levels of testing and their respective strengths and weaknesses. Unit tests are simple and quick to write but require detailed knowledge of the code and cannot find classes of bugs that only appear when the whole system works together. System tests are the only way to find that class of bug, but they are harder to write, which is why this book focuses on them.

We looked at what a test case should include – its prerequisites, setup, procedure, and result. These are obvious enough to list, but it takes discipline to document them all consistently. You will save valuable time by being clear about what a test requires and what results you should see, so make sure they are a part of your test plans.

Prescriptive tests define precisely what steps you should run, but it can be more powerful to use descriptive tests, which allow variations within given parameters. Those tests should be run on a carefully chosen environment – not live, and not on developers’ machines, but a temporary or permanent environment dedicated to testing.

We saw that you should always explicitly ensure that the environment and versions are correct since using the wrong settings is the easiest way to waste time while testing. While running the tests, you also need to aim for the conflicting goals of being both rigorous and systematic while also being curious and responding to feedback. Only by embodying both these roles can you perform top-class testing.

With these considerations in place, we can turn to the details of the tests themselves, starting as a user with little system knowledge checking the main platform functionality: black-box testing, which is covered in the next chapter.

What we learned from Part 1, Preparing to Test

Part 1, Preparing to Test, has covered the necessary preparation for testing. In an ideal world, every feature would have a detailed written description of exactly what it does. In that case, you could skip Part 1, Preparing to Test, entirely and jump straight to Part 2, Functional Testing, where we consider how to test features.

However, in my experience, the test team isn’t always provided with feature specifications, and those that do exist lack sufficient detail. It’s almost always up to testers to ask further questions and fill in the gaps in the description. We saw three important stages to achieve that.

Using exploratory testing, you map out a feature’s behavior, considering the many approaches described in this book: functional testing, error handling, usability, security, maintainability, and non-functional testing. Exploratory testing also lets you quickly find issues that would block more detailed tests.

Next, you write a feature specification to list the detailed functionality. Those specifications have a particular style – independent, testable statements that describe the behavior but not the implementation.

You cannot write the specification alone, however. You always require input from at least the product owner and lead developer. An official review meeting should step through every requirement in turn to uncover any objections or concerns. With those included, you have an excellent document on which to base your testing.

Finally, Part 1, Preparing to Test, discussed the different levels of testing and types of test cases. We evaluated different test environments and saw the importance of curiosity throughout the test process.

Turning your specification into test cases requires you to consider many other possible scenarios, which forms the skill of testing. Part 2, Functional Testing, shows how to design thorough functional test plans quickly and successfully.
