Selecting what to benchmark

Knowing whether each change makes your program more or less efficient is a great idea, but you might be wondering how to measure that improvement or regression properly. This is actually one of the most important aspects of benchmarking: if done properly, it will clearly show your improvements or regressions but, if done poorly, you might think your code is improving while it is actually regressing.

Depending on the program you want to benchmark, there are different parts of its execution you should be interested in measuring. For example, a program that processes some information and then ends (an analyzer, a CSV converter, a configuration parser...) would benefit from a whole-program benchmark. This means it might be interesting to have some test input data and see how much time it takes to process it. You should use more than one input set, so that you can see how the performance changes with the input data.
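As a rough illustration of a whole-program benchmark over several input sets, here is a minimal sketch that uses only the standard library. The `count_fields` function is a hypothetical stand-in for the real work of your program, and `make_input` and `time_run` are names made up for this example. The input sizes are arbitrary.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the program's real work: counts the
/// comma-separated fields in a CSV-like input.
fn count_fields(input: &str) -> usize {
    input.lines().map(|line| line.split(',').count()).sum()
}

/// Builds a synthetic input with `rows` lines of three fields each.
fn make_input(rows: usize) -> String {
    "alpha,beta,gamma\n".repeat(rows)
}

/// Times one full pass over `input` and returns the result together
/// with the elapsed wall-clock time.
fn time_run(input: &str) -> (usize, Duration) {
    let start = Instant::now();
    let fields = count_fields(input);
    (fields, start.elapsed())
}

fn main() {
    // Several input sizes, so we can see how the time grows with the data.
    for rows in [1_000, 10_000, 100_000] {
        let input = make_input(rows);
        let (fields, elapsed) = time_run(&input);
        println!("{} rows -> {} fields in {:?}", rows, fields, elapsed);
    }
}
```

A single timed run like this is noisy, so treat the numbers as a first approximation only. Build with `--release` before reading anything into them.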

A program that has an interface and requires some user interaction, though, is difficult to benchmark this way. The best approach is to take the most relevant pieces of code and benchmark them. In the previous chapter, we learned how to find the most relevant pieces of code in our software. With profiling techniques, we can understand which functions and sections of code impact the execution of our application the most, so we can decide to benchmark those.

Usually, you will want most of your benchmarks to be fine-grained. This way, you will be able to detect a change in one of the small pieces of code that affect the overall performance of the application. If you only have broader benchmarks, you might know that the overall performance of one part of the application has regressed, but it will be difficult to tell which change in the code caused it.
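To make the difference concrete, the following sketch times two stages of a small pipeline separately instead of timing the pipeline as a whole. The stages `parse_numbers` and `sum_squares` and the helper `time_it` are hypothetical names invented for this example. `std::hint::black_box` is used as a hint so the compiler is less likely to optimize the measured work away.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stage 1: parse whitespace-separated integers,
/// silently skipping tokens that are not valid numbers.
fn parse_numbers(text: &str) -> Vec<u64> {
    text.split_whitespace()
        .filter_map(|token| token.parse().ok())
        .collect()
}

/// Hypothetical stage 2: sum of the squares of the parsed numbers.
fn sum_squares(numbers: &[u64]) -> u64 {
    numbers.iter().map(|n| n * n).sum()
}

/// Runs `f` once and returns its result with the elapsed time.
fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = black_box(f());
    (value, start.elapsed())
}

fn main() {
    let text = (1..=50_000u64)
        .map(|n| n.to_string())
        .collect::<Vec<_>>()
        .join(" ");

    // Each stage gets its own measurement, so a regression can be
    // attributed to parsing or to arithmetic rather than to "the pipeline".
    let (numbers, parse_time) = time_it(|| parse_numbers(black_box(&text)));
    let (total, sum_time) = time_it(|| sum_squares(black_box(&numbers)));

    println!("parse: {:?}, sum: {:?}, total = {}", parse_time, sum_time, total);
}
```

If only the combined time were recorded, a slowdown in parsing and a slowdown in the arithmetic would look the same.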

In any case, as we will see later, having continuous integration for benchmarks is a good idea, with alerts raised if a particular commit regresses the performance. It's also important for all benchmarks to run in environments that are as similar as possible. This means that the computer they are running on should not change from one run to the next, and it should be running only the benchmarks, so that the results are as representative and comparable as possible.
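The idea of an alert on regression can be sketched as a simple comparison against a stored baseline. This is only an illustration of the concept, not the continuous integration setup discussed later. The function name `is_regression` and the 5% tolerance are assumptions made for this example.

```rust
use std::time::Duration;

/// Returns `true` when `current` is slower than `baseline` by more than
/// `tolerance`, expressed as a fraction (0.05 means 5%).
/// The tolerance absorbs the normal noise between runs.
fn is_regression(baseline: Duration, current: Duration, tolerance: f64) -> bool {
    current.as_secs_f64() > baseline.as_secs_f64() * (1.0 + tolerance)
}

fn main() {
    // Hypothetical numbers: in practice the baseline would come from a
    // previous run on the same machine.
    let baseline = Duration::from_millis(100);
    let current = Duration::from_millis(120);

    if is_regression(baseline, current, 0.05) {
        println!("regression: {:?} -> {:?}", baseline, current);
    } else {
        println!("within tolerance");
    }
}
```

A comparison like this is only meaningful when both measurements come from the same, otherwise idle machine, which is why the environment matters so much.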

Another issue is that, as we saw in the previous chapter, the first time we run something on a computer, it goes slower. Caches have to be populated, the branch predictor has to learn the program's branching patterns, and so on. This is why you should run benchmarks multiple times, and we will see how Rust will do this for us. There is also the option to warm the caches up for a few seconds before starting to measure, and there are libraries that do this for us.
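The warm-up and repetition described above can be sketched by hand with the standard library. This is not what the benchmarking tools covered in this chapter do internally; it is a simplified illustration, and the names `bench` and `mean` as well as the iteration counts are assumptions made for this example.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Calls `f` `warmup` times without measuring, so caches and the branch
/// predictor can settle, then returns the time of each of `runs` timed calls.
fn bench<T>(warmup: usize, runs: usize, mut f: impl FnMut() -> T) -> Vec<Duration> {
    for _ in 0..warmup {
        black_box(f());
    }
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            black_box(f());
            start.elapsed()
        })
        .collect()
}

/// Arithmetic mean of the samples, or zero when there are none.
fn mean(samples: &[Duration]) -> Duration {
    if samples.is_empty() {
        return Duration::ZERO;
    }
    let total: Duration = samples.iter().sum();
    total / samples.len() as u32
}

fn main() {
    // Arbitrary workload: sum the first million integers.
    let samples = bench(10, 50, || (0..black_box(1_000_000u64)).sum::<u64>());

    let fastest = samples.iter().min().copied().unwrap_or(Duration::ZERO);
    println!("mean: {:?}, fastest: {:?}", mean(&samples), fastest);
}
```

Reporting both the mean and the fastest sample gives a quick sense of how noisy the runs were: a large gap between the two suggests interference from the rest of the system.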

So, for the rest of the chapter, you should take all this into account. Create micro-benchmarks, select the most relevant sections of your code to benchmark, and run them in a known, unchanging environment.

Also, note that creating benchmarks does not mean that you can skip writing unit tests, a mistake I have seen more than once. Benchmarks will only tell you how fast your code runs, not whether it produces correct results. Unit testing is out of the scope of this book, but you should test your software thoroughly before even thinking about benchmarking it.
