Types of tests

Testing is performed for many reasons, but there are at least two that we have to mention. One is to hunt bugs and make the code as error-free as possible. The other is to prove that the application is usable and can be utilized for the purpose it was meant for. The latter is important from the enterprise point of view and considers a lot of aspects that unit testing does not. While a unit test focuses on one unit and is thus an extremely good tool for pointing out where an error is, it is of no use when it comes to discovering bugs that come from erroneous interfaces between modules. Unit tests mock the external modules and thus test only that the unit works as expected. However, if there is an error in this expectation, and the other modules do not behave in the same way as the mocks used in the unit test, the error will not be discovered.
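The following minimal sketch illustrates how a mock can hide an interface error. All the names in it (`PriceService`, `OrderCalculator`, `RealPriceService`) and the cents versus whole-units mismatch are hypothetical and serve only as an illustration. The mock is hand-written so that the example needs no mocking library.

```java
// Hypothetical interface between two modules.
interface PriceService {
    long priceOf(String productId);
}

// The unit under test. Its author assumes that priceOf returns cents.
class OrderCalculator {
    private final PriceService prices;

    OrderCalculator(PriceService prices) {
        this.prices = prices;
    }

    long totalInCents(String productId, int quantity) {
        return prices.priceOf(productId) * quantity;
    }
}

// The real module, which (in this made-up scenario) returns whole
// currency units instead of cents.
class RealPriceService implements PriceService {
    @Override
    public long priceOf(String productId) {
        return 5;
    }
}

class MockMismatchDemo {
    public static void main(String[] args) {
        // The mock encodes the same wrong expectation as the unit: 5.00 as 500 cents.
        PriceService mock = productId -> 500;
        long unitTestResult = new OrderCalculator(mock).totalInCents("book", 2);
        System.out.println("with mock: " + unitTestResult);       // 1000, the unit test passes

        // Only wiring the real module in reveals the mismatch.
        long integratedResult = new OrderCalculator(new RealPriceService()).totalInCents("book", 2);
        System.out.println("with real module: " + integratedResult); // 10, not 1000
    }
}
```

The unit test that uses the mock stays green because the mock and the unit share the same wrong assumption. The discrepancy only shows up when the two real classes are exercised together.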

To discover the errors on this level, which is the next level above unit test, we have to use integration tests. During integration tests, we test how individual units can work together. When we program in Java, the units are usually classes; thus, the integration test will test how the different classes work together. While there is a consensus (more or less) about what a unit test is in Java programming, this is less so in the case of integration tests.

In this regard, the external dependencies, such as other modules reachable via the network or database layers, may be mocked or may be set up using some test instance during integration testing. The argument is not about whether these parts should be mocked or not, only about the terminology. Mocking some components, such as the database, has advantages as well as drawbacks. As in the case of any mock, the drawback is the cost of setting up the mock as well as the fact that the mock behaves differently from the real system. Such a difference may result in some bugs remaining in the system and lurking there until a later phase of testing or, God forbid, until the system is used in production.

Integration tests are usually automated in a way similar to unit tests. However, they usually require more time to execute. This is the reason why these tests are not executed at each source code change. Usually, a separate Maven or Gradle project is created that has a dependency on the application JAR and contains only integration test code. This project is usually compiled and executed daily.

It may happen that daily execution is not frequent enough to discover the integration issues in a timely manner, but a more frequent execution of the integration tests is still not feasible. In such a case, a subset of the integration test cases is executed more frequently, for example, every hour. This type of testing is called smoke testing.
The following diagram shows the position of the different testing types:

When the application is tested in a fully set up environment, the testing is called system testing. Such testing should discover all the integration bugs that may have been lurking undiscovered during the previous testing phases. The different types of system tests can also discover non-functional issues. Both functional testing and performance testing are done on this level.

Functional testing checks the functions of the application. It ensures that the application functions as expected or at least has functions that make it worth installing in the production environment and that can lead to cost savings or increased profit. In real life, programs almost never deliver all the functions that were envisioned in the requirement documentation, but if the program is usable in a sane manner, it is worth installing, assuming that there are no security issues or other issues.

In case there are a lot of functions in the application, functional testing may cost a lot. In such a case, some companies perform a sanity test. This test does not check the full functionality of the application, only a subset to ensure that the application reaches a minimal quality requirement and it is worth spending the money on the functional testing.

There may be some cases that were not envisioned when the application was designed, and thus there is no test case for them in the functional test plan. It may be some weird user action, such as a user pressing a button on the screen when nobody thought it was possible. Users, even if benevolent, happen to press or touch anything and enter all possible unrealistic inputs into a system. Ad-hoc testing tries to make up for this shortcoming. During ad-hoc testing, a tester tries all the possible ways of using the application that he or she can imagine at the moment the test is executed.

This is also related to security testing, also called penetration testing, during which the vulnerabilities of the system are discovered. These are special types of tests that are performed by professionals whose core area of expertise is security. Developers usually do not have that expertise, but they should at least be able to discuss the issues that are discovered during such a test and amend the program to fix the security holes. This is extremely important in the case of Internet applications.

Performance testing checks that the application, in a reasonable environment, can handle the expected load that the users put on the system. A load test emulates the users who attack the system and measures the response times. If the response time is appropriate, that is, lower than the required maximum under the maximum load, then the test passes; otherwise, it fails. If a load test fails, it is not necessarily because of a software error. It may so happen that the application needs more or faster hardware. Load tests usually test the functionality of the application in only a limited way, and they typically exercise only those use scenarios that put read load on the application.
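The pass or fail rule described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name `LoadTestVerdict` is hypothetical, the response times are assumed to have been collected under the maximum load, and the rule is applied to every single measurement, although a real test plan may apply it to an average or a percentile instead.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that decides whether a load test run passed.
class LoadTestVerdict {

    // The run passes only if at least one response was measured and
    // every response time is below the required maximum.
    static boolean passes(List<Long> responseTimesMillis, long requiredMaxMillis) {
        if (responseTimesMillis.isEmpty()) {
            return false; // nothing was measured, so nothing was proven
        }
        for (long millis : responseTimesMillis) {
            if (millis >= requiredMaxMillis) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up measurements, in milliseconds, against a 2-second requirement.
        List<Long> measured = Arrays.asList(120L, 850L, 1900L);
        System.out.println(passes(measured, 2000L));
    }
}
```

Note that such a verdict says nothing about whether the responses were correct. It only compares the measured times with the required maximum.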

Many years ago, we were testing a web application that had to have a response time of 2 seconds. The load test was very simple: issue GET requests so that there are a maximum of 10,000 requests active at the same time. We started with 10 clients, and then a script increased the number of concurrent users to 100, then to 1,000, and then stepped it up by a thousand every minute. This way, the load test was 12 minutes long. The script printed the average response time, and we were ready to execute the load test at 4:40 pm on a Friday.
The average response time started from a few milliseconds and went up to 1.9 seconds as the load was increased to 5,000 concurrent users, and from there, it went down to 1 second as the load was increased to 10,000 users. You can understand the attitude of the people on a Friday afternoon, being happy that we met the requirements. My colleagues left for the weekend happily. I remained testing a bit more because I was bothered by the phenomenon that the response time decreased as we increased the load above 5,000. First, I reproduced the measurement and then started looking at the log files. By 7 pm, I knew what the reason was.
When the load went above 5,000, the connections the Apache server was managing started to run out and the web server started to send back 500 Internal Server Error responses. That is something that Apache can do very effectively. It is very fast in telling you that you cannot be served. When the load was around 10,000 concurrent users, 70% of the responses were already 500 errors. The average went down, but the users were actually not served. I reconfigured the Apache server so that it could serve all the requests and forward each to our application, just to learn that the response time of our application was around 10 seconds at the maximum load. Around 10 pm, when my wife was calling my mobile for the third time, I also knew how much memory I should set in the JVM options in the Tomcat startup file to get the desired 2-second response time in the case of 10,000 concurrent users.
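The effect in this story can be reproduced with simple arithmetic. The sketch below is not the script we used. The class names and the sample numbers are made up for illustration. It shows how quickly rejected requests pull the overall average down while the requests that were really served are slow.

```java
import java.util.ArrayList;
import java.util.List;

// One measured response: HTTP status code and response time (hypothetical type).
final class Sample {
    final int status;
    final long millis;

    Sample(int status, long millis) {
        this.status = status;
        this.millis = millis;
    }
}

class ResponseStats {

    // Average over every response, including the fast error responses.
    static double averageAll(List<Sample> samples) {
        return samples.stream().mapToLong(s -> s.millis).average().orElse(Double.NaN);
    }

    // Average over the responses that were not server errors.
    static double averageServed(List<Sample> samples) {
        return samples.stream()
                .filter(s -> s.status < 500)
                .mapToLong(s -> s.millis)
                .average()
                .orElse(Double.NaN);
    }

    // Fraction of the responses that were 5xx errors.
    static double errorRate(List<Sample> samples) {
        if (samples.isEmpty()) {
            return 0.0;
        }
        long errors = samples.stream().filter(s -> s.status >= 500).count();
        return (double) errors / samples.size();
    }

    public static void main(String[] args) {
        // Made-up numbers: 7 requests rejected in 50 ms, 3 served in 3,200 ms.
        List<Sample> samples = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            samples.add(new Sample(500, 50));
        }
        for (int i = 0; i < 3; i++) {
            samples.add(new Sample(200, 3200));
        }
        System.out.println("average of all responses: " + averageAll(samples));
        System.out.println("average of served responses: " + averageServed(samples));
        System.out.println("error rate: " + errorRate(samples));
    }
}
```

With these numbers, the average of all responses is 995 ms, which is well under a 2-second limit, while the served requests took 3,200 ms each and 70% of the requests failed. A load test script that reports the error rate next to the average makes this kind of false success much harder to miss.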

A stress test is another type of performance test that you may face. This type of test increases the load on the system until it cannot handle the load. The test should ensure that the system can recover from the extreme load, automatically or manually, and that in no case will it do something that it should never do. For example, a baking system should not ever commit an unconfirmed transaction, no matter how big the load is. If the load is too high, then it should leave the dough raw but should not bake extra bread.

The most important test at the top of the hierarchy is the user acceptance test. This is usually an official test that the customer, who buys the software, executes and in the case of successful execution, pays the price for the software. Thus, this is extremely important in professional development.
