© Avi Tsadok 2020
A. Tsadok, Pro iOS Testing, https://doi.org/10.1007/978-1-4842-6382-2_8

8. Cover Another Aspect of Your App – Performance Testing

Avi Tsadok, Tel Mond, Israel

Just as athletes can’t win without a sophisticated mixture of strategy, form, attitude, tactics, and speed, performance engineering requires a good collection of metrics and tools to deliver the desired business results.

—Todd DeCapua

Introduction

Performance Tests are another aspect of software testing. We can say that performance tests are not about “whether things work” but rather “how well things work,” and that sets them apart from the other test methods.

In this chapter, you will learn
  • What the basic idea of performance testing is

  • How the measure() function works and how to define a baseline

  • The different metrics you can use starting from Xcode 11

  • How to configure your tests

  • How to write asynchronous performance tests

  • Where Xcode saves your test baseline information so you can adjust it to your CI/CD environment

The Basic Idea of Performance Tests

Unlike other tests such as Unit or Integration tests, Performance Tests are a little tricky. They have several unique characteristics that make them less predictable.

For example, running performance tests on an old device probably produces different results than running them on a new one.

Also, one run can give you a certain result that differs from the second or the third run. Not to mention other factors such as machine state, CPU load, free memory, caching, and more.

So, based on these details, we understand that performance tests work a little differently:
  • Each tested block of code runs several times to prevent any one-time anomaly from skewing our results. At the end of the test run, the final result is based on the average of all executions.

  • Because the average result can differ from run to run, the average alone is not enough. We still need to set a baseline to make sure any change is not too big and stays within a reasonable range.

  • The last issue is also a major one – the baseline is linked to a specific device based on its UUID. The reason is obvious – not only does each device have different hardware, it also has different settings and installed software.

So, the unpredictable nature of performance tests makes them a unique creature in our testing suite, and we should use them for specific use cases or flows that may suffer performance issues in future changes.

The Basic Measuring Function

Let’s start with writing our first performance test:
import XCTest

class PerformanceTests: XCTestCase {
    func testPerformance() {
        let imageProcessor = ImageProcessor()
        measure {
            // The closure runs several times; XCTest reports the average duration.
            _ = imageProcessor.generateImage()
        }
    }
}

In the preceding test, we have a class named ImageProcessor with a function called generateImage(). We know that the function generateImage() is doing some heavy task, and we want to execute this code as part of the measure function.

The measure() function is part of XCTestCase, and it's the most basic performance method we have. It has a single parameter, a closure. What measure() does is execute the closure ten times and calculate the average time at the end.

Let’s run the test (Figure 8-1).
../images/496602_1_En_8_Chapter/496602_1_En_8_Fig1_HTML.jpg
Figure 8-1

Running our first performance test

We see some interesting information after our first run. First, we see the average time, 1.127 seconds. Second, we see a message saying there is no baseline time. And this leads to our third insight – you can see that our test actually passed even without a baseline.

Unlike other tests, performance tests don't use assertions. Instead, we define a baseline for our metric to make sure our result stays below it.

Define the Baseline

You don't have to work hard to set a baseline for your test. Pressing the gray diamond next to the message “No baseline average for Time” opens a small popup window with more details and functionality (see Figure 8-2).
../images/496602_1_En_8_Chapter/496602_1_En_8_Fig2_HTML.jpg
Figure 8-2

Performance Result Settings window

In this popup, you can see additional information about your run and an option to set a baseline easily by just pressing a button.

On the lower part of the popup, you can see your executions over time.

Note

It's not rare for the first execution to be much longer than the others. It has to do with things like caching or internal behaviors of the Swift language. This is part of the reason we run this test several times – to get a score that is closer to a real-life state.

Pressing the “Set Baseline” button changes the state of the popup window to Edit mode (see Figure 8-3).
../images/496602_1_En_8_Chapter/496602_1_En_8_Fig3_HTML.jpg
Figure 8-3

Performance test baseline edit

Tapping the “Accept” button sets the current average result as our baseline for the next test.

You can also edit the baseline manually just by tapping it and typing a new value.

To confirm the change, just press “Save”.

What Does the “Baseline” Mean for Our Test?

Performance tests are based on two important values – Baseline and Max STDDEV.

The Baseline value is the bar your test is measured against. If your code runs more than 10% slower than the baseline, your test will fail. For example, with a baseline of 1.0 second, an average result of 1.12 seconds fails because it is 12% slower.

Another value being calculated is the standard deviation, the STDDEV. If the relative deviation between your runs is more than 10% (a value that can be changed easily), your test will fail as well.

Why Is the Deviation Important?

When running performance tests, we need to make sure our score is reliable. If you get a high deviation in your tests, it might be a code smell that points to two things:
  • It may be an indication of a problem in your code. Basically, you should expect heavily loaded code to perform similarly across multiple runs. If this is not the case, it means your code executes in an unexpected manner and may be affected by external values or states.

  • A big deviation means that some executions are much slower than the average score you get. It also means that the average score is misleading and some users may experience poor performance even though your test result is below the baseline.

If your test fails because of high deviation, don’t increase the bar for no reason. You should investigate the behavior of your code before making any changes.

measure(metrics:) Function

Up until Xcode 11, the only metric you could measure was execution time.

But Xcode 11 brought new metrics to the table:
  • XCTClockMetric – This is the execution time metric, similar to what we measured in the previous section.

  • XCTCPUMetric – This metric gives you information about the CPU activity during the run.

  • XCTMemoryMetric – Measures the memory allocated during the test.

  • XCTStorageMetric – Records the number of bytes written to disk.

  • XCTOSSignpostMetric – Measures execution time for a specific part of your code, marked externally in your code using os_signpost calls (see the sketch right after this list).
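
Here is a minimal sketch of a custom signpost metric. The subsystem, category, and signpost name below are hypothetical, and it assumes the app code wraps its heavy work in an os_signpost interval:
// In the app target – mark the interval you want to measure:
import os.signpost

let signpostLog = OSLog(subsystem: "com.example.myapp", category: "ImageProcessing")

func generateImage() {
    os_signpost(.begin, log: signpostLog, name: "GenerateImage")
    // ... heavy image generation work (return value omitted in this sketch) ...
    os_signpost(.end, log: signpostLog, name: "GenerateImage")
}

// In the test target (inside your XCTestCase subclass) – measure only that interval:
func testGenerateImageSignpost() {
    let imageProcessor = ImageProcessor()
    let metric = XCTOSSignpostMetric(subsystem: "com.example.myapp",
                                     category: "ImageProcessing",
                                     name: "GenerateImage")
    measure(metrics: [metric]) {
        _ = imageProcessor.generateImage()
    }
}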

The most basic metric developers use is the time/clock metric, but there are many cases where you would want to check other metrics as well.

This doesn't mean you need a separate performance test for each of your metrics – you can pass an array of metrics and get results for all of them:
func testGeneratingImageWithAllMetrics() {
    let imageProcessor = ImageProcessor()
    measure(metrics: [XCTClockMetric(), XCTCPUMetric(), XCTStorageMetric(), XCTMemoryMetric()]) {
        _ = imageProcessor.generateImage()
    }
}
Running the test while passing all the metrics gives you the same popup as before, but now with information for each of your metrics (see Figure 8-4).
../images/496602_1_En_8_Chapter/496602_1_En_8_Fig4_HTML.jpg
Figure 8-4

Setting baselines to all metrics

This is plenty of information! Let’s try to dig in and understand what it means.

Analyzing the Metrics

Clock Monotonic Time

This measurement is part of the XCTClockMetric, and it measures the exact duration of your execution block. At this point, I want to explain what exactly Monotonic Time means.

If you want to measure code without using the measure function, you can do something like this:
let startTime = Date()
_ = imageProcessor.generateImage()
let endTime = Date()
let duration = endTime.timeIntervalSince(startTime)

We measure the time before the execution and the time after the execution. Obviously, the elapsed time between them is the duration of the execution, right?

Well, not exactly. Doing that would be wrong.

There are two different clocks in almost every modern operating system – the Wall Clock and Monotonic Clock.

The Wall Clock is the clock that is presented to the user (and the application). This is the time we get when we use the Date() function to get the current time. Wall Clock time is affected by NTP (Network Time Protocol) and can be re-synchronized while the application is running. Therefore, not only might the elapsed time be inaccurate; it can even be negative.

The Monotonic Clock, on the other hand, cannot be affected by any external influence. It doesn't aim to give you the current time, since it doesn't have a meaningful “starting point.” What it does give you is a stable duration measurement, and this is why it is used in performance tests.
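
If you ever need a quick duration measurement outside of XCTest, a minimal sketch using a monotonic clock might look like this (DispatchTime is based on Mach absolute time and is not affected by wall-clock changes):
let start = DispatchTime.now()
_ = imageProcessor.generateImage()
let end = DispatchTime.now()
// uptimeNanoseconds is monotonic, so the difference can't be negative or skewed by clock adjustments
let durationInSeconds = Double(end.uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000_000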

CPU Cycles, CPU Time, and CPU Instructions

OK, so we have a clock time, why do we need a “CPU Time”? And what is it anyway?

So, first, CPU Time doesn’t represent the total execution time, but only the time the CPU was busy executing your instructions. For example, the total execution duration also includes any I/O operations or even network requests (although it’s not recommended to include network time in your performance tests).

So, if you want to eliminate any external factors and focus on your processing time, CPU Time under XCTCPUMetric is the way to go.

So, what are CPU Instructions Retired and CPU cycles?

CPU Cycles is the metric that shows you how hard your CPU worked during the block execution, and the CPU Instructions Retired metric contains the number of actual instructions completed – in general, a lower number of instructions for the same task points to better efficiency and lower power consumption.

Checking Your Writing Activity with XCTStorageMetric

XCTStorageMetric covers another interesting aspect of performance tests. Instead of measuring time, it measures your write-to-disk activity. This might not sound like an interesting metric on its own, but when combined with the clock metric, it's a great way to help you optimize your code.

Writing to disk is considered a much heavier task than writing to memory, and it is best practice to avoid it when possible. A big increase in this metric can explain poor results in the clock metric and can be an indication of unnecessary write activity.
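
For example, a sketch of a storage-focused test might look like this (saveGeneratedImageToDisk() is a hypothetical method that writes the processor's output to a file):
func testImageSavingStorage() {
    let imageProcessor = ImageProcessor()
    // Combining the storage metric with the clock metric helps correlate
    // slow runs with excessive disk writes.
    measure(metrics: [XCTStorageMetric(), XCTClockMetric()]) {
        imageProcessor.saveGeneratedImageToDisk()
    }
}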

More Configuration with XCTMeasureOptions

Using performance metrics is pretty straightforward. In fact, they are so useful and effective that you don't really need any configuration for them. But still, there is an option that can help you tune your performance tests to get more accurate results.

The way to do that is by passing an object of type XCTMeasureOptions. XCTMeasureOptions was added along with the performance test metrics, and it has two properties you can configure.

iterationCount

The first property you can update is iterationCount. This property defines the number of times your test runs. The default is 5, but you should be aware that XCTest always adds another iteration and ignores it (it actually ignores the first one).

Why would we want to change the number of iterations? There could be two reasons – the first is heavy, time-consuming performance tests that you want to run no more than one or two times. The second is the opposite – very small performance tests that you need to run many times to get results as accurate as possible.

In 95% of cases, you don't need to change the default value. Also, if you run your test without passing an XCTMeasureOptions object at all, the number of iterations will be ten and not five, as described earlier in this chapter.
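
If you do need to change it, a minimal sketch looks like this (three iterations is just an arbitrary value for an assumed heavy test):
func testGeneratingImageFewIterations() {
    let imageProcessor = ImageProcessor()
    let options = XCTMeasureOptions()
    options.iterationCount = 3
    measure(metrics: [XCTClockMetric()], options: options) {
        _ = imageProcessor.generateImage()
    }
}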

invocationOptions

Performance tests are great, but they still have one major drawback, and that’s controlling the start and the end of the measured part of your code.

I’ll explain – we know that performance tests run multiple times, and they all should start from the same state. In fact, they are exactly like any other tests – you need to have some setup code before you start and do a cleanup when you finish.

The problem is that you need to execute the setup and cleanup code inside the measured block, which means that all the metrics cover these parts of your block as well.

The invocationOptions property lets you define how your measurements are taken. It's an option set with two options – manuallyStart and manuallyStop.

If invocationOptions contains manuallyStart, measurement starts only when you call self.startMeasuring() in your execution code. If manuallyStop is included in invocationOptions, Xcode stops the measurement when you call self.stopMeasuring().

Look at the following code:
    func testGeneratingImageWithAllMetrics() {
        let imageProcessor = ImageProcessor()
        let options = XCTMeasureOptions()
        options.invocationOptions = [.manuallyStart, .manuallyStop]
        measure(metrics: [XCTClockMetric(), XCTCPUMetric(), XCTStorageMetric(), XCTMemoryMetric()], options: options) {
            // do some preparations
            self.startMeasuring()
            _ = imageProcessor.generateImage()
            self.stopMeasuring()
            // do some cleanup
        }
    }

Looking at the code, you can see we can easily insert some setup and cleanup code inside our execution closure and define exactly what part we want to measure.

Measuring App Launch

One great way you can make use of performance tests is to measure your app launch.

App launch time is extremely important to your app's user experience and, in many cases, is the root of ongoing frustration among users.

Setting up a test for that mission is very easy. In fact, you don’t need to do anything – any new UI testing target comes with a predefined app launch test:
    func testLaunchPerformance() throws {
        if #available(macOS 10.15, iOS 13.0, tvOS 13.0, *) {
            // This measures how long it takes to launch your application.
            measure(metrics: [XCTOSSignpostMetric.applicationLaunch]) {
                XCUIApplication().launch()
            }
        }
    }

It is pretty amazing that in two lines we can measure our app launch time.

This test also has a baseline, just like all the other performance tests, and since it's already written for you, it's recommended to include it in your test bundle.

Asynchronous Performance Tests

So, we can see how easy it is to measure the performance of a specific function/method by just wrapping it inside the measuring closure. But what if we want to measure an asynchronous function?

In general, it is much simpler to measure synchronous functions, but it is still possible to test asynchronous functions using the XCTestExpectation tool we learned about in previous chapters.

Note

If you don’t remember how to use XCTestExpectation, go back to the unit test chapters and go over this part.

The basic steps to create a performance test for an asynchronous function are as follows:
  • Open the measuring closure and let the measurement start automatically (the default; don't pass the manuallyStart option).

  • Create the XCTestExpectation inside the closure. This step is important – creating the expectation object outside the closure will raise an exception.

  • Wait for the expectation to be fulfilled inside the closure as well, just like the expectation's creation.

Let’s see an example:
func testImageProcessingAsync() {
    measure(metrics: [XCTClockMetric()]) {
        let expectation = XCTestExpectation(description: "Image processing")
        let imageProcessor = ImageProcessor()
        imageProcessor.generateImageAsync {
            expectation.fulfill()
        }
        wait(for: [expectation], timeout: 2.0)
    }
}

Remember that executions run one after the other, so the wait() function halts the run until the expectation is fulfilled before it continues to the next one.

Also, you need to be careful about the wait timeout duration – if it's too low, say, lower than the baseline, the wait can time out and fail the test even though the measured time was still below the baseline.

The Baseline Under the Hood

Unlike other tests, performance tests rely on the specs of the machine that runs them.

So, you can conclude that different machines give you different results; therefore, the baseline has to correspond to the host machine.

And this is something you need to understand, especially if you run your tests in a continuous integration environment – Xcode saves the baseline values per combination of host machine (your Mac) and device (including simulators).

Although iOS simulators are not emulators, meaning there shouldn’t be any CPU difference, they can still give you different results.

For example, you might turn features on or off for different devices in your code. Also, the device resolution can have an impact on simulator performance (and again, this depends on the host machine as well).

Where Does Xcode Save the Baseline?

This is an important question, especially if you work in a big corporation and your app's integration process runs on different machines.

As of Xcode 12, the baseline values are saved inside your Xcode project file.

An Xcode project file (*.xcodeproj) is a package, meaning it's actually a folder that is displayed like a regular file.

To reveal the package content, right-click the package (xcodeproj) and select “Show Package Contents”.

Navigate to xcshareddata/xcbaselines/.

The first important file you see there is info.plist. This file contains the list of the “host machine+device” combinations. Xcode generates a unique UUID for each combination and saves it (see Figure 8-5).
../images/496602_1_En_8_Chapter/496602_1_En_8_Fig5_HTML.jpg
Figure 8-5

Info.plist file, containing the host machine details along with the target device information

If you look at the info.plist, you’ll see the generated UUID. For each UUID, Xcode creates another plist file in the same directory, containing the list of baselines for each test method (Figure 8-6).
../images/496602_1_En_8_Chapter/496602_1_En_8_Fig6_HTML.jpg
Figure 8-6

List of test methods and their baselines for each metric

How Xcode Pulls the Baseline from These Files

If you take a look again at Figure 8-5, you can see that Xcode doesn’t save the serial number of the machine, but rather its specs. This means that if you run your tests on a different machine but with the same specs, Xcode will pull the corresponding baselines for this test.

Why is this important? Because this is the way you can set the baselines for your CI environment – by adjusting the “combo” settings to match your remote machine.

Summary

You don't have to write performance tests for every method in your project. More than that, there are projects for which performance tests are useless.

Performance is all about the big numbers – if you have heavily loaded functions or pieces of code, this tool is a great way to optimize them and verify you don't have any regressions. Not only is testing the performance of small, unimportant functions useless; it's also a mistake that can make the maintenance of your tests difficult.

We are heading to the next chapter – a technique that can help you define the “expected result” easily.
