
8. Quality and Testing


Quality refers to the degree to which software does what it’s supposed to do, and is built in a way that is structurally sound. Testing refers to checking and giving feedback on software quality. Testing can assess functional quality: does it do what it’s supposed to do and not do things it shouldn’t? Testing can also assess structural quality: is it reliable, secure, maintainable, performant, and appropriately sized?

Some amount of everyone’s work and customizations will be thrown away or never used. If your work is lucky enough to be used in production, it will always be tested. Your work will either be tested intentionally prior to release, or it will be tested implicitly when it’s used or updated. When end users intentionally review and give feedback on functionality that is pending release, this is called “user acceptance testing.” But sometimes unsuspecting end users are exposed to functionality that has not been tested by any other means. In that case, the “testing” is inadvertent, uses real business data, and can lead to confusion, problems, and ill-will.

The purpose of this chapter is to explain key concepts of Salesforce software quality, share the many available testing mechanisms, and encourage practices that increase the reliability of your production systems. Although I frequently reference “code” quality, Salesforce customizations involve an interplay of code and non-code configuration. In this chapter, the term “code” is mostly used as concise shorthand for any Salesforce customization.

Understanding Code Quality

Quality is subjective, but needs to be considered from the perspective of both end users and the technical team that creates and maintains the entire system. End users are concerned mostly with whether code functions as it should, whereas it is the role of the technical team to ensure that the code is maintainable, secure, performant, testable, and so on. Although end users aren’t concerned with the underlying implementation, flaws in the code that lead to slow performance, data loss, and exploits can quickly become massive issues that capture the attention of the business, customers, and even the media.

Quality code can be seen as code that meets the current and potential needs of the customer. This is by definition a very challenging problem. Quality is not something you can perfect, but it is often something you can improve.

Functional, Structural, and Process Quality

As shown in Figure 8-1, software quality can be divided into functional quality, structural (or nonfunctional) quality, and process quality.

Functional quality refers to whether the software functions as it should and meets the stated needs of the customer. In agile terms, code that fulfills the purpose of the story and meets the acceptance criteria can be said to have functional quality. One of the benefits of iterative development is that the sooner you can put working software in front of end users, the sooner you can validate that it meets their actual needs. If end users are not satisfied, further development cycles can focus on increasing the functional quality in their eyes.

Structural quality, also known as nonfunctional quality, deals with the quality of the underlying implementation. Well-crafted software will function reliably, day in and day out, even as other aspects of the system evolve. The evolution of any IT system is inevitable—it is a living thing—and so the maintainability of code is also crucially important. It should be possible to make small changes to the system in a small amount of time, without undue effort spent deciphering the meaning of poorly named, poorly commented, rambling “spaghetti code.” And as the user base continues to grow, functionality continues to expand, and data accumulates, the application should perform consistently at scale. Software is designed with certain users and use cases in mind. But how will it behave when unintended users do things to the system that were never considered in its design? The IT world exists under the shadow of countless threats. Any system exposed to the Internet will immediately be exposed to thousands of automated attacks from bots built to identify and exploit common vulnerabilities. The security of your application is thus a critical hidden factor that must be considered.

In all of this, the size of your code, both its scope and the LOC (lines of code) count, has a great impact. For example, a design that served the application well when it was just beginning may be entirely unsuitable once the application has grown to enterprise levels.

Process quality refers to the quality of the development process: whether it supports the development of quality software in an efficient and reliable way. Much of the rest of this book focuses on process quality, such as the use of version control, continuous integration, and automated testing. Process quality also includes the process of gathering and tracking requirements and coordinating the work of the development team.

Figure 8-1. Code quality has many hidden levels

Understanding Structural Quality

It is important for development teams to understand structural quality in more detail, so they can understand how to improve and ensure it. Different sources use different terms to describe the aspects of software quality, but the following divisions cover most concerns.1

Reliability

Reliability refers to the ability of your code to perform consistently, every time it’s executed, in spite of variations in inputs, changes to other systems, and ongoing updates to functionality. A reliable system “just works”: no 500 errors, unhandled exceptions, and the like. This implies that the system has been tested with inputs including edge cases such as null or missing values, values approaching and exceeding size limits, and so on. Reliability implies that your design is not brittle and dependent on underlying data and systems that might change due to factors outside your control. If a key piece of configuration data is suddenly absent, your application should generate a meaningful error on the backend and fail gracefully for users.
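
To make that concrete, here is a minimal Apex sketch (the custom setting and field names are hypothetical) of failing fast with a meaningful error when expected configuration data is missing, rather than letting a null pointer exception surface to the user:

public with sharing class DiscountConfigService {
    public class ConfigurationException extends Exception {}

    // Fails fast with an actionable message if the expected
    // configuration record or field is missing.
    public static Decimal getDiscountThreshold() {
        Discount_Settings__c settings = Discount_Settings__c.getOrgDefaults();
        if (settings == null || settings.Threshold__c == null) {
            throw new ConfigurationException(
                'Discount_Settings__c is not configured; ask an administrator to set Threshold__c.');
        }
        return settings.Threshold__c;
    }
}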

Many DevOps practices such as “chaos engineering” are focused on building reliable distributed systems at scale. Reliability at scale implies that systems should continue to function even if individual nodes periodically go offline.

One key to ensuring reliable code is to practice test-driven development (TDD) or its offshoots such as behavior-driven development (BDD). Underpinning each acceptance criterion for your application with an automated test allows you to run regression tests after every significant change to your application. This provides quick visibility into any failures that might compromise functionality. Writing a rich set of automated tests depends on your code being testable. To be testable, code needs to be modular. And testing code that depends on external systems requires there to be a layer of abstraction so that a "test double" can be used instead of actually contacting the external system. In this way, the process of writing tests creates a natural pressure toward good coding practices.
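
As a sketch of how that layer of abstraction might look in Apex (the interface and class names are illustrative, and each top-level type would live in its own class file), the calling code depends on an interface so a test can inject a test double instead of making a real callout:

public interface ExchangeRateSource {
    Decimal getRate(String currencyCode);
}

public with sharing class PriceCalculator {
    private final ExchangeRateSource rateSource;

    // Production code injects the real callout implementation;
    // tests inject a stub instead.
    public PriceCalculator(ExchangeRateSource rateSource) {
        this.rateSource = rateSource;
    }

    public Decimal convertToUsd(Decimal amount, String currencyCode) {
        return amount * rateSource.getRate(currencyCode);
    }
}

@isTest
private class PriceCalculatorTest {
    // Test double: returns a fixed rate without contacting any external system.
    private class FixedRateSource implements ExchangeRateSource {
        public Decimal getRate(String currencyCode) { return 1.25; }
    }

    @isTest
    static void convertsUsingInjectedRate() {
        PriceCalculator calc = new PriceCalculator(new FixedRateSource());
        Decimal result = calc.convertToUsd(100, 'EUR');
        System.assert(result == 125, 'Expected 125 but got ' + result);
    }
}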

Maintainability

The people who maintain software are rarely the ones who originally wrote it. If a developer revisits code they wrote 6 months ago, it might look as unfamiliar as code written by another person. Maintainability means that software is written in such a way that the purpose of each method is easy to understand and that small changes can be made easily without huge effort and without fear of breaking the application. Clear naming of variables, classes, and methods is critical. If it takes you 5 minutes to figure out what a small method does, take 1 minute to give it a more helpful name that clearly describes its purpose. As Figure 8-2 indicates, naming things is hard. Invest time in choosing self-explanatory names.

Figure 8-2. Naming things is hard2

Another important aspect of maintainability is reducing the "cognitive complexity" of the code by reducing the number of if statements, loops, and other logic in a single block of code. A closely related measure is "cyclomatic complexity": the number of independent logical paths through a section of code.

Reducing the sizes of classes and methods is a key way to ensure maintainability. The “single responsibility principle” states that every piece of code should have just one purpose. A method that does two things should be divided into two methods with very clear names. Adding abstraction in this way makes code quick and easy to understand, without requiring maintainers to spend hours deciphering complex logic.
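
A small Apex sketch of this principle (the business logic here is purely illustrative): one method that both validated and updated an Opportunity in a single block is split into two clearly named methods, each with a single responsibility:

public with sharing class OpportunityCloser {
    public class ValidationException extends Exception {}

    public void closeAsWon(Opportunity opp) {
        validateReadyToClose(opp);
        markClosedWon(opp);
    }

    // Responsibility 1: validation only.
    private void validateReadyToClose(Opportunity opp) {
        if (opp.Amount == null || opp.Amount <= 0) {
            throw new ValidationException(
                'Cannot close an opportunity without a positive Amount.');
        }
    }

    // Responsibility 2: state change only.
    private void markClosedWon(Opportunity opp) {
        opp.StageName = 'Closed Won';
        update opp;
    }
}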

Code comments are very important to explain complex pieces of logic and to otherwise clarify the code. But wherever possible use simple logic, clear names, and short methods instead of writing comments. A standard to strive for is "self-documenting code," where the names are so clearly chosen and the methods so brief and well structured that even a nonprogrammer could understand the code's purpose just by reading it.

There are many other design principles that contribute to maintainable code. Larger codebases require enterprise design patterns to ensure maintainability. The book Clean Code by "Uncle" Bob Martin3 is a classic recommended for all programmers.

Performance

Performance implies that your code will function at an acceptable level even as the number of users, parallel processes, and data increase. Performance testing in a “production-like” environment is a key stage of software development. And capacity planning—anticipating both best case and worst case rates of growth—is important to ensure you don’t encounter a scaling crisis a year or two into production.

Salesforce takes care of the performance and scalability challenges of the underlying platform. But the performance of your customizations is the responsibility of your own team.

Security

Security vulnerabilities are arguably the scariest flaws in software. Nevertheless, security analysis and remediation are among the most neglected aspects of quality analysis. Developers may be oblivious to security threats or simply hopeful that they’re not an issue. Security analysis can also be a specialized skill, neglected during developers’ basic training. For these reasons, having automated tools that can identify security vulnerabilities and recommend resolutions is key. This is especially important for any system that faces the public Internet or carries sensitive information such as financial data. Web administrators can assure you that the Internet is a hostile place, with new servers receiving automated scans and attacks within minutes of going online.

Size

Size is a consideration in structural quality simply because the larger the application becomes (both in terms of code and functionality), the more challenging the preceding issues become. As the size of an application increases, its attack surface increases, meaning that there are more possible vulnerabilities. The principle of security asymmetry says that it’s always harder to defend than to attack, since an attacker only needs to find one way in, whereas the defender needs to defend all possible points of entry.

Understanding Process Quality

Whereas functional and structural quality relate to what is being built, process quality refers to how you are building it.

There are two main aspects of process quality: does your process lend itself to creating a quality product, and is the process itself safe, sustainable, efficient, and so on?

The main focus of this book is to suggest technical processes that lead to higher-quality work as well as a more sustainable and efficient process, so the entire book can be seen as falling into this topic. A comprehensive discussion of process quality would also touch on many business and human aspects and is beyond the scope of this book.

Testing to Ensure Quality

Many factors ensure quality. As the saying goes, “quality is everyone’s responsibility.” It’s also said that quality is far too important to be left to the QA team. Developer training and skill is of course important, as is a manual testing process. But between development and manual testing are automated systems that can be put in place to provide fast feedback to developers, early warning of issues, and reduced burden on manual testers.

Many aspects of this book deal with development process improvements such as the use of version control, continuous integration, and modular architecture. But what processes and systems can be put in place to help enforce and ensure quality in the development process?

Why Test?

The purpose of testing is to protect current and future users from unreliable systems. Another way of expressing this is that testing helps build quality into the work you’re delivering.

As shared in Chapter 3: DevOps, there are five metrics you can use as key performance indicators (KPIs) of your software delivery process. While the first two deal with speed, the latter three deal with system reliability: deployment failure rate, mean time to recover, and uptime. Historically, most teams assumed there is a tradeoff between speed and reliability, or between innovation and retaining users’ trust. But the State of DevOps Reports validate that high-performing DevOps teams excel in both speed and reliability.

The only way to deliver innovation to production more quickly without sacrificing reliability is to build quality into your delivery process. W. Edwards Deming, widely credited with transforming industry in Japan and later the United States, offered a 14-point synopsis of how businesses can improve their operations.

Cease dependence on inspection to achieve quality. Eliminate the need for inspection on a mass basis by building quality into the product in the first place.4

—Point 3 of Deming’s 14 points

Deming’s work is credited with instigating the Total Quality Management movement in business. The basic idea is simple: as shown in Figure 8-3, the earlier issues are caught, the less expensive they are to resolve. Finding bugs through manual QA is time-consuming and expensive compared to using automated methods. But that cost is far less than the cost of end users finding and reporting these bugs in a system they assumed was reliable.

Figure 8-3. The cost of fixing bugs increases the later in the process they are found

This also gives rise to the exhortation to "shift left." As shown in Figure 8-4, this means to put increasing emphasis on ensuring that the original design, architecture, and coding are of high quality, rather than attempting to catch most bugs in a manual testing phase. Manual QA will never be 100% reliable. In addition to the high costs of manual testing, if nothing is done to address the quality of the inputs (the original development), quality issues will continue to slip through no matter how much you adjust your manual testing methods. Deming often used the "Red Bead Experiment"5 to provide a visual demonstration of this concept. If an incoming collection of white beads is contaminated by some amount of red beads, those red beads will continue to pollute downstream processes despite the best efforts of workers and management. The only way to ensure that the final product is reliable is to ensure that the inputs are reliable, which points to the importance of giving developers fast, high-quality feedback.

Figure 8-4. "Shifting left" reduces overall costs by addressing quality issues earlier in the development lifecycle

What to Test?

In general, you need to test the business-critical customizations you make on the Salesforce platform. With extremely rare exceptions, you don’t need to test Salesforce itself; and you don’t need to test trivial customizations like page layouts. But you can and should test any places you use complex logic (including complex validation rules and workflow rules) to ensure that your own logic is correct. And you should have automated regression tests in place to ensure that critical business functionality remains correct, especially around the systems that help your company to make money.

If your employee idea board goes down for 8 hours, it may be an annoyance to employees, but it’s not likely to cost your company any money. If your sales people are unable to work for 8 hours, or your eCommerce site goes offline for that period, you will have sacrificed a portion of your annual sales. In short, not every system has equal value to the company, and thus not every test has equal protective value.

You should prioritize tests according to the production value of the system under test.

Testing Terminology

Different sources use different names to describe different kinds of tests. And just as with “different kinds of mammal” and “different kinds of emotion,” the distinctions are not always clear. Tests can be distinguished based on whether they test the behavior of the system or its characteristics (functional or nonfunctional tests), how they interact with the system (code tests, API tests, or UI tests), what level of the system they test (unit, component, or integration tests), how they are performed (manual, automated, or continuous), and when they should be run (commit stage tests or acceptance tests). This section briefly introduces those distinctions in hopes of demystifying them.

Don’t waste time feeling bad if you’re confused about testing terminology. The software testing community has created an amazing variety of terms to describe types of tests. Perhaps this is because they have been forced to do so much tedious, manual testing that their creativity needed to find an outlet. I’d be complicit in this linguistic crime if I shared them all, but I refer you to www.softwaretestinghelp.com if you want to satisfy your curiosity about how monkey testing differs from gorilla testing.6

Functional and Nonfunctional Tests

There are different ways to look at a system: what it does, and what it’s made of. Functional tests judge what something does. For example, if a bit of logic is supposed to apply price discounts for customers with an annual contract value above 1 million dollars, you can write functional tests to ensure that logic is correct. Nonfunctional tests look at other aspects of the system, like how long it takes to run, whether it suffers from security or coding style flaws, and so on. This is discussed in detail in the section “Functional, Structural, and Process Quality.”
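
For example, the discount rule above could be covered by functional tests like the following Apex sketch, assuming a hypothetical DiscountService.applyDiscount(annualContractValue, price) method that applies a 10% discount above the $1 million threshold:

@isTest
private class DiscountServiceTest {
    @isTest
    static void appliesDiscountAboveOneMillion() {
        Decimal discounted = DiscountService.applyDiscount(1500000, 100);
        System.assert(discounted == 90, 'Expected a 10% discount above the threshold');
    }

    @isTest
    static void noDiscountAtOrBelowOneMillion() {
        Decimal full = DiscountService.applyDiscount(1000000, 100);
        System.assert(full == 100, 'Expected no discount at or below the threshold');
    }
}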

Code, API, and UI Tests

The only built-in testing tool that Salesforce provides is the ability to run tests written in Apex. For this reason, when people refer to “Salesforce testing,” they are often referring to tests written in Apex that test the behavior of other Apex classes or triggers. It’s helpful to take a broader view of this though and distinguish that there are three ways that a test might be run: via code, via an API, or via UI tests.

Code-based tests include Apex tests as well as JavaScript tests of Lightning Components and the JavaScript used in Visualforce. API-based tests are those that use Salesforce's APIs to test behavior. API tests become quite important when exercising integrations with other systems. MuleSoft's MUnit capabilities,7 for example, can be used to validate integration behavior through an API. There's a gray area between code and API tests when tests are written in languages such as Ruby or Python to check the behavior of Salesforce, since those tests generally interact with Salesforce through its API.

UI tests are what you do when you log in to Salesforce to observe and check its behavior. Manual QA and UAT are almost always done in this way. UI testing can also be done in an automated way, using tools such as Selenium, Puppeteer, Provar, or Tosca.

Unit, Component, and Integration Tests

It’s very common to use a threefold distinction between unit, component, and integration tests. Unit tests are those that exercise only one unit of code, such as a single method. Component tests are those that exercise multiple units of code or configuration, such as a test on a Trigger that also executes Trigger handler classes, or other business logic. Integration tests are those that exercise an entire integrated system, cutting across multiple Salesforce components or maybe even across other systems. Apex can be used to write unit tests or component tests that also test workflow rules, processes, and flows. But it can't make HTTP callouts and has CPU, memory, and execution time limits that restrict your ability to write true integration tests in Apex.
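
As a sketch of the distinction, the first Apex test below exercises a single method in isolation (a unit test), while the second inserts a record so that the Account trigger and its handler class fire as part of the DML (a component test). The scoring class, trigger logic, and Score__c field are hypothetical:

@isTest
private class AccountScoringTests {
    // Unit test: calls one method directly; no DML, no trigger involved.
    @isTest
    static void scoreCalculationIsCorrect() {
        Decimal score = AccountScoring.calculateScore(500000, 12);
        System.assert(score > 0, 'Expected a positive score');
    }

    // Component test: the insert fires the Account trigger, which delegates
    // to its handler class and populates the score field.
    @isTest
    static void triggerPopulatesScoreOnInsert() {
        Account acct = new Account(Name = 'Test Account', AnnualRevenue = 500000);
        Test.startTest();
        insert acct;
        Test.stopTest();

        acct = [SELECT Score__c FROM Account WHERE Id = :acct.Id];
        System.assert(acct.Score__c != null, 'Expected the trigger to populate Score__c');
    }
}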

This threefold distinction is also the most common source of terminology confusion. “Integration test” can be used to mean different things. Salesforce developers often refer to any Apex-based test as a “unit test” even if it cuts across many components. I use the term “unit test” exclusively for code-based tests, but I’ve (unfortunately) heard people refer to narrowly scoped manual tests as unit tests. Google did a nice job of cutting through such confusion by dividing tests simply into small, medium, and large.8

Manual, Automated, and Continuous Testing

Central to the points made in this book is this threefold distinction in terms of how tests are performed. Manual tests are those where humans perform actions and check the output. Manual testing is the default behavior for traditional quality testers, for user acceptance tests, and even for programmers and tech leads wishing to validate a system. I’ve heard it said that you can’t automate something that you can’t do manually, so manual testing may also be a precursor to other kinds of automated testing.

The benefits of manual testing are that much of it can be done by relatively low-skilled workers, it doesn’t require any specialized tools, and it doesn’t require much planning or architectural consideration. The challenges with manual testing are that it’s relatively slow (minutes or hours instead of seconds), it’s error-prone, it’s tedious (if you have to perform regression tests repeatedly), it requires people to be available (which may cause delays), and it’s expensive (even offshore resources get paid more than computers). These challenges should drive you to reduce reliance on manual testing and to upskill your manual testers to help write automated tests.

Automated tests are those that can be done by computers. The benefits of automated tests are in speed, reliability, and cost. Those benefits change the economics of testing and open the door to performing automated tests far more frequently and across a diversity of platforms. Platforms like SauceLabs and BrowserStack revolutionized web testing by allowing companies to test their web sites or applications across a myriad of devices and browsers in parallel, uncovering issues in minutes that might otherwise have taken weeks to discover.

Continuous testing refers to running automated tests every time code is changed on a particular branch in version control. Continuous testing is in fact an aspect of continuous integration/delivery, since it provides a way to ensure the reliability of a system that is undergoing rapid change. Not all automated testing tools support continuous testing. For example, Tricentis Tosca provides a nice set of tools to build UI tests for Salesforce. But it can only run on Windows-based systems, and so it can be a struggle to use it for initiating automated tests from a Linux-based CI system.

Commit-Stage Tests and Acceptance Tests

Finally, tests can also be distinguished in terms of the point in the development lifecycle when they are run. This distinction is made in the Continuous Delivery book and is relevant when planning your approach to testing.

The basic idea is that some tests should run every time you make a commit in version control (continuous testing), but that it’s important that those tests can complete in 5 minutes or less. The reason is that unless developers get fast feedback from those tests, they are likely to ignore them. It also makes it harder to recognize if a test failure is associated with a particular change if several hours have elapsed since the change was made.

In addition to this fast-running subset of tests, you need a robust and comprehensive set of tests that can be run before code is finally released. Acceptance tests refer to the broader set of tests designed to ensure that your work meets requirements before it’s released. These tests are generally functional tests that ensure your code functions properly. They’re often accompanied by different nonfunctional tests that also help ensure quality.

Thus commit-stage tests represent a small and fast-running subset of acceptance tests. Commit-stage tests rarely include UI tests, since those tend to take longer to run. In the Apex world, it’s beneficial to create one or more test suites that contain certain critical test classes you want your developers to run regularly. The point of these commit-stage tests is to provide an early warning to developers if they break something and to provide it immediately to minimize time spent diagnosing the cause.

The remainder of this chapter is divided into two sections: “Fast Tests for Developers” and “Comprehensive Tests.” “Fast tests” is the simplest term I could think of to describe commit-stage testing, and hopefully the meaning is unambiguous. “Comprehensive tests” is the simplest term I could think of to describe acceptance tests, which will accumulate over time to represent the widest range of test cases, with much less regard for how long they take to run.

Yes, I just added two new names to the overflowing lexicon of testing.

Test Engines

The different types of test discussed later require different “test engines.” For Apex tests, the Apex test runner is a native capability of Salesforce. For manual tests, the “test engine” is the person performing that test. For other types of test, you’ll typically need a tool to help with this process.

In each of the following test types, you’ll find discussion about which tools can help support that process. Some tools like JMeter are specific to one type of testing, in this case performance testing. Other tools, like static analysis tools, can help across different stages of the test lifecycle.

Test Environments

Different types of testing bring different demands for test environments. The general strategy to support these demands is to “shift left” as much as possible and to try to batch multiple different demands into a small number of environments as illustrated in Figure 8-4. Details about the environments needed for the various types of tests are shared along with the description of those tests later.

As described in the following, some kinds of testing require long-lived, fully integrated sandboxes, whereas other types can be performed in scratch orgs or disposable developer sandboxes. Long-lived sandboxes that need to be integrated with external systems and require manual setup are probably best created manually from the Environments ➤ Sandboxes page in your production org’s Setup. If you find yourself regularly needing to refresh them, however, you may consider investing in automating their creation.

Short-lived testing environments should be created and destroyed automatically as part of your CI process, using the Salesforce CLI and the same set of scripts you use to provision developer environments. The Salesforce CLI now also allows you to create and log in to sandboxes, so this process can be used to automate the refresh of testing sandboxes.

Test Data Management

The vast majority of test types require data. “Given certain input data, do we receive the correct outputs?” Some types of test require minimal data; others require massive amounts of data. Managing test data is an integral part of your test setup. Details on providing appropriate data are integrated with the following descriptions of each test type.

Fast Tests for Developers

This section and the following “Comprehensive Tests” section constitute a twofold division of all the different types of tests. Fast tests refer to tests that should be run before and/or after every commit. The formal name for this is commit-stage testing, which consists of code analysis and unit tests.

Coding is hard. Managing a team of distributed programmers, enforcing style guidelines, and performing code reviews are even harder. Fortunately, there are automated tools that can help enforce and remind developers of standards and best practices for each language.

Code quality analysis is an automated system for providing code feedback and a helpful mechanism to improve code quality. It’s like spelling and grammar check for your code. Code quality maintenance and improvement requires attention and focus throughout a project’s lifecycle. Issues with code quality, such as poorly designed or poorly documented code, will accumulate easily if left unchecked. These issues are known as technical debt, and if left to grow, they make software maintenance increasingly difficult, time-consuming, and risky. In the same way that one might deal with financial debt, the key to mitigating technical debt is to acknowledge and address quality risks or concerns as early as possible in the development process—not to let them accumulate.

Throughout the remainder of the chapter, we introduce several types of test and how you can run them. In each case, we address the following questions:
  • What is this test? Why do you need it?

  • When are these tests triggered?

  • What environment(s) does the test run in?

  • What data do you need to run this test?

  • How do you create these tests?

  • What else should you consider regarding these tests?

  • What happens with the results of these tests?

Hopefully this will provide a consistent guide as you gradually layer in different types of testing.

Static Analysis—Linting

Static analysis is an automated analysis of the source code to identify possible structural faults in performance, style, or security. A simple example is measuring the length of each class or method and marking long classes or methods as possible maintainability problems. The benefit of static analysis is that it can be run quickly and inexpensively, as often as needed, and so can provide feedback faster than any other type of test.

Static analysis can be performed as one stage in a CI process, but it is most effective when it’s done in real time in the developer’s code editor. This kind of immediate feedback is called linting and is similar to spell checking or grammar checking. Because the feedback appears right inside the code editor as developers work, there’s no better method to ensure they see and act on it.

Linting is a great example of shifting left, since it’s done during the coding process itself as opposed to waiting for the CI/CD server. Whereas unit tests are used to confirm the specific behavior of your code, static analysis checks to ensure that it complies with general rules such as “methods should not have too many parameters.”

How to Run Linting

By definition, linting runs in the developer’s code editor in real time. Linting is language-specific. For JavaScript there are many linting tools available, but ESLint has come to be the dominant choice. ESLint provides general feedback on JavaScript style. It offers a set of recommended rules, but can also be configured to enable or disable additional rules.9 Additional ESLint rules have been written for both (Aura) Lightning Components10 and Lightning Web Components,11 making ESLint the clear choice for JavaScript linting in Salesforce.

I am aware of two linting solutions for Apex code: ApexPMD and SonarLint. PMD is the most popular static analysis tool for Apex (partly because it’s free). Chuck Jonas wrote a well-maintained PMD extension for VS Code.12 PMD is also integrated into the commercial IDEs, Illuminated Cloud and The Welkin Suite.

SonarLint is the linting component of SonarQube, a very popular static analysis tool. SonarLint can be run “online” or “offline.” When run “offline” it just uses a standard set of built-in rules. When run “online” it actually connects to a SonarQube instance to download a customized ruleset for your company or project and so ensures that the linting engine runs the same rules as the static analysis you run on the entire codebase. The largest set of static analysis rules for Apex can be found in CodeScan, a variant of SonarQube focused just on Salesforce.13 SonarQube Enterprise Edition now also supports static analysis rules for Apex and is working to make them increasingly robust.

When Does Linting Run?

As described, linting runs continually in the IDE as the developer works, similar to spell checkers or grammar checkers (Figure 8-5). Linters typically deal with just one code file at a time (see Figures 8-6 and 8-7), although most linting tools also allow developers to run those rules systematically across their local codebase.

Figure 8-5. Grammar checking using Grammarly

Figure 8-6. Instant feedback on JavaScript from ESLint

Figure 8-7. Instant feedback on Apex from ApexPMD

Where Does Linting Run?

Linting is performed in the IDE, directly on the codebase, and so does not require a Salesforce environment.

Data Needed for Linting

Linting analyzes the code itself and does not require any test data.

Linting Rules

Linting applies generic rules to the codebase, as opposed to testing for specific business scenarios or use cases. Therefore, unlike unit tests, which are generally unique to each Salesforce org, linting rules are not usually customized.

ESLint has an excellent reputation for ease of writing custom rules. So it’s certainly possible for you to write ESLint rules for your team, although with the same effort you could benefit all Salesforce teams by contributing those rules to the open source ESLint projects.

PMD has a reputation for being harder to write rules for. Fortunately, there is a graphical rule designer for PMD14 that makes it much easier to design rules. The designer parses the code and then allows you to write rule specifications in XPath or Java.

Some of the languages that SonarQube supports have open source rule definitions that you could contribute to, but both SonarApex and CodeScan use proprietary rules and so are not easy to augment with your own custom rules.

Again, while it’s possible to write custom rules, it’s rare that you would need or want to do that. What’s more common is simply to define a specific subset of rules that you want to run for your projects. All of these linting tools provide mechanisms to do this.

Considerations for Linting

Linting provides extremely fast, helpful feedback to address common malpractices in coding. The spread of ESLint, for example, has had a very beneficial impact for JavaScript developers. Don’t expect linting to catch every issue, but it can hopefully help provide guidelines for both new and experienced developers. Just as being free from spelling errors doesn’t imply good writing, being free from linting errors doesn’t imply good code. But linting can still help you identify and remove certain faults.

How to Act on Feedback from Linting

Just as with spell check and grammar check, this real-time feedback can either be acted on or ignored by developers. The intention is to provide developers with high-quality suggestions but not to force them to make changes. If you find that certain types of rules are not helpful and just add noise to the workspace, you should remove those rules.

Developers are under no obligation to act on the feedback from linting. If, however, you want to ensure that most or all such rules are obeyed, you can use the quality gate feature of many static analysis tools to enforce them.

Static Analysis—Quality Gates

Static analysis tools apply a standard set of automated analyses to code. There are three levels at which this analysis can be done: to provide real-time feedback (linting), to pass or fail a specific set of changes (quality gates), or to provide an assessment of the entire codebase. The rules may be the same at every level; the difference is in the scope of what’s assessed. Linting is done “on the ground,” as developers are working on a small block of code. Quality gates are typically applied at the level of a commit or a pull request and take a “1,000 foot view” of changes that might span many files. And a full codebase assessment gives a “30,000 foot view” of the state of the overall project.

As mentioned before, linting is not meant to enforce these analysis rules. By contrast, quality gates provide a pass/warn/fail assessment of a group of changes.

How to Establish Quality Gates

Any tool that gives you the ability to assess a specific set of changes, generate a pass/warn/fail status, and prevent those changes from being merged or deployed can be used as a quality gate.

Most static analysis tools such as SonarQube, Clayton, Codacy, or CodeClimate give you this ability, when used in conjunction with pull requests and a CI/CD tool. Those tools are discussed in more detail later.

Some of the commercial Salesforce release management tools also provide this as a native capability, with integrated PMD scans.

Copado Compliance Hub provides a similar capability that is targeted specifically at security-related changes to Salesforce. Compliance Hub works with metadata like profiles, permission sets, Settings, and Custom Objects to ensure that there are no unintentional or malicious changes that could impact the org’s security. For example, changing the Org-wide default sharing for an object from private to public could constitute a security breach. This is a very Salesforce-specific form of analysis.

When Do Quality Gates Run?

By definition, a quality gate either runs when a developer makes a commit in a shared version control system or when they make a pull request. Applying these rules to pull requests is the most common scenario. Pull requests aggregate information from static analysis, unit test execution, and check-only deployments and facilitate a formal code review armed with this additional insight.

Where Do Quality Gates Run?

Quality gates are run by the static analysis engine, typically as part of a CI process. As such, they don’t require a Salesforce environment for execution. As shown in Figure 8-8, this analysis can be applied as code comments, which are visible in pull requests. More importantly, an overall pass/fail status can also be shown in the pull request itself as shown in Figure 8-9.

Figure 8-8. SonarQube can be enabled to write back to a code repository with its feedback

Figure 8-9. SonarQube quality gate results shown in a pull request

Data Needed for Quality Gates

Static analysis does not require data for execution.

Determining Quality Gate Criteria

The criteria used in a quality gate should ideally be consistent with the criteria used in linting and in analysis of the complete codebase. This kind of consistency means that everyone on the team can be aligned about the project’s compliance with these standard rules. Project managers can take a high-level view of the state of the overall codebase, tech leads can review a group of changes at the level of a pull request, and developers can review linting feedback right within their IDE. This helps ensure that developers take linting feedback seriously and that unhelpful rules are removed. It gives the tech lead confidence that developers are not ignoring feedback from linters. And it gives a project manager a way to see upward or downward trends in technical debt and other issues.

Considerations for Quality Gates

Quality gates are not a substitute for other forms of quality analysis, but they can help ensure that certain kinds of faults are not introduced into the codebase. A passing quality gate does not guarantee that the code is free from any quality issues; static analysis doesn’t even guarantee that code will execute successfully. But it can be a useful indication.

The first project where I used quality gates involved extensive use of JavaScript inside Visualforce pages. We decided to implement JavaScript unit tests for this code, but the existing codebase had negligible code coverage. We implemented a quality gate in SonarQube that assessed the code coverage of recently changed JavaScript code (the “leak period”). Any time we added code or refactored existing code, the quality gate would assess the coverage of the modified lines and prevent us from deploying if the coverage was below 80%. We made no effort to systematically write coverage for old code; nevertheless, over several months we ratcheted up coverage across the entire codebase as shown in Figure 8-10.

Figure 8-10. Increase in code coverage after using quality gates to enforce coverage on any modified code

How to Act on Quality Gate Results

By definition, a passing quality gate allows you to proceed with subsequent steps in an automated process, typically a deployment. By contrast, a failing quality gate will cause your build to fail or your pull request to be rejected. Systems that provide a “warn” status may allow the build to proceed while nevertheless showing you that there may be issues that require attention.

Unit Testing

Unit tests are the best known type of automated test. Because the Apex test runner is part of Salesforce, and because Salesforce requires 75% test coverage on Apex classes and triggers, every Salesforce developer is familiar with Apex unit testing.

Technically, a unit test is a small, fast-running test to evaluate a particular unit of code. I’ve divided the discussion of Apex tests between this section and the later section “Code-Based Acceptance Tests” to indicate that not all tests written in Apex are truly “unit tests.”

While they are developing, your team needs to be able to quickly run a small number of tests that confirm their functionality is working and that they haven’t broken other code.

Whereas static analysis helps to enforce general good practices (coding standards), automated testing checks the specific behavior of the system to ensure it’s correct. Creating automated tests is an investment, so skill and thought are required to discern when to use automated tests, what type of test to employ, and how to write the test. But over the life of an application, the cost of letting failures pass undetected into production or repeating manual regression tests is far greater than the cost of building automated tests from the outset.

Unit testing encourages developers to structure their code into smaller, more modular functions. These types of functions are easier to test, but they are also easier to reuse. This means more readable code and less redundancy. Well-tested code mitigates the risk of future failure, and more modular code reduces complexity thus making code safer to refactor and easier to maintain.

How to Run Unit Tests

There are four coding technologies used on the Salesforce platform, and so there are differences in the test engines needed for each. The good news is that three of these technologies are built on top of JavaScript, and Salesforce is increasingly converging on Jest as a recommended test engine for JavaScript.

Apex

As mentioned, the Apex test runner is built into the platform. This test runner is proprietary, and you can’t run Apex or Apex tests outside of Salesforce.

JavaScript in Visualforce

Visualforce has long allowed embedded JavaScript as a way to provide more dynamic on-page functionality. My first introduction to unit testing JavaScript involved building business-critical JavaScript that was hosted in Visualforce pages. I have never seen a test framework that actually runs tests inside of a Visualforce page. Instead, we established a strict separation of our JavaScript code into static resources, leaving very little JavaScript in the Visualforce page itself.

We used Visualforce Remoting, and the only JavaScript we embedded into Visualforce was to assign controller-binding variable values to local JavaScript variables. That left us free to treat the code inside of the static resources as separate, independent functions.

You can use any JavaScript testing framework to test code that’s stored in static resources. If you don’t already have a strong opinion, I would recommend Jest as a very usable test engine.

(Aura) Lightning Components

The original Lightning Components are written in a JavaScript framework called Aura. Some teams within Salesforce created a Lightning Testing Service15 to allow these Aura components to be unit tested. The Lightning Testing Service supports Jasmine and Mocha, two JavaScript test engines. It’s possible that someone will port this over to Jest, but that has not happened so far. All of these JavaScript test frameworks are very similar, but they use slightly different syntaxes to define and run tests.

Unlike testing JavaScript in static resources or Lightning Web Components, these Aura tests have to run inside a Salesforce org, and so the Lightning Testing Service is installed as an unmanaged package in your org.

Lightning Web Components

Lightning Web Components bring many benefits such as being open source, standards-compliant, fast to execute, and executable outside of Salesforce. In addition, they have the first officially supported JavaScript testing framework for Salesforce.16 This testing framework uses Jest as the engine.

Just as with the convergence on ESLint as the JavaScript static analysis tool of choice, the JavaScript community seems to be converging around Jest as its preferred testing engine. Whereas other test engines require you to use different tools to support aspects of testing such as mocking, Jest bundles everything you need—the test engine, test syntax, mocking frameworks, and more—into one package.

When Do Unit Tests Run?

Unit tests have two purposes: to verify the code currently under development and to verify that current development hasn’t broken existing code. Each piece of code generally has a small number of unit tests that directly pertain to it. These should be run throughout the development process to ensure the code itself is functioning.

It’s also very helpful to identify any preexisting tests that pertain to related functionality. You can also run those throughout your development to ensure you haven’t broken someone else’s work.

Because Apex tests run more slowly than comparable Java or JavaScript tests, if teams are practicing continuous delivery it may not be practical to rerun every Apex test before every release. At a minimum, a subset of Apex unit tests should be rerun before each release. Tools like ApexUnit17 or the Developer Console query the ApexCodeCoverage table via the Tooling API to dynamically determine which test classes cover which pieces of code. The FlowTestCoverage object provides comparable information when testing processes and autolaunched flows. This information can be used to dynamically determine which tests to run based on which Apex code was changed. Running the full suite of tests as often as possible is still necessary to identify regression failures due to unintended side effects of a change.

One very helpful recent addition to the Salesforce extensions for VS Code was the ability to trigger and monitor Apex unit tests from inside the IDE. Tests can run in the sidebar of your IDE while you work on other changes to the codebase.

As mentioned later in the section on “Code-Based Acceptance Testing,” prior to releasing, the entire set of unit tests should be rerun to ensure that new development didn’t cause unexpected regression failures in other parts of the codebase.

Unit Testing Environments

Apex code and Apex unit tests can only run in a Salesforce org. Apex tests can also be used to test other kinds of business process automation such as complex process builder processes and autolaunched flows. During the development process, developers should use their development scratch orgs to run tests related to the work they’re developing. Scratch orgs can also be created dynamically by the CI/CD system and used to run unit tests as part of the delivery pipeline.

JavaScript code in a Salesforce org (including Lightning Components) can be tested using a JavaScript test framework. JavaScript code that runs on Visualforce pages might be testable directly on a developer’s workstation if it can be extracted into a separate static resource. In this case, those tests should be run locally as part of a pre-push hook or using a CI/CD process that runs those JavaScript unit tests. Lightning Components can be tested using the Lightning Testing Service,18 although these require a Salesforce org to run. As with Apex unit tests, when developers are working on a particular piece of functionality, they should repeatedly run the relevant tests in their development org. To identify any regression failures, the complete set of Lightning tests should be rerun in the same scratch org created for running all Apex unit tests. Another reason for running these in scratch orgs is that unlike Apex tests, JavaScript tests have no transaction management so they actually change data in the org, and those changes are not automatically rolled back.

One very nice characteristic of the new Lightning Web Components is that they can be tested outside of a browser, which allows those tests to be run very quickly.19 LWC testing uses Jest, which is arguably the easiest to use JavaScript testing framework.

Data Needed for Unit Tests

The purpose of unit tests is to test the underlying logic of the code. Therefore tests should create their own test data so that their success or failure is determined by the code being tested rather than the underlying data in the org.

When Apex tests were first introduced, they defaulted to being able to see all of the data in the org. This behavior was reversed in the Spring ’12 release of Salesforce, so that tests no longer have access to org data unless they use the @isTest(SeeAllData=true) annotation. There remain some rare exceptions where it may be necessary to allow a test to access data in the org, but this should generally be avoided. Tests that do not require access to org data are called “data silo tests,” and they help avoid many problems. Relying on data in the underlying org means that tests might pass in some orgs and not pass in others. It also means that a user in that org can inadvertently break tests if they change or delete data used by the test. Data silo tests also make it easier for Salesforce to detect and resolve problems in upcoming releases, since Apex tests are run as part of a regression testing process called the Apex Hammer.
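
A minimal sketch of a data silo test, assuming a hypothetical trigger or duplicate rule that blocks Contacts with the same email address: the test creates everything it needs inside its own transaction, so it behaves the same in any org and leaves no data behind.

@isTest
private class ContactDeduplicationTest {
    // No SeeAllData annotation, so this test cannot see existing org data.
    @isTest
    static void blocksDuplicateEmail() {
        insert new Contact(LastName = 'Existing', Email = 'dup@example.com');

        Contact duplicate = new Contact(LastName = 'Duplicate', Email = 'dup@example.com');
        Boolean blocked = false;
        try {
            insert duplicate;
        } catch (DmlException e) {
            blocked = true;
        }
        System.assert(blocked, 'Expected the duplicate insert to be blocked');
        // All records created here are rolled back when the test completes.
    }
}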

As mentioned later in the section “Code-Based Acceptance Testing,” it’s possible to specify test data using CSV files or using frameworks such as the open source Apex Domain Builder.20

Apex tests run inside a Salesforce database transaction. This means that they can create, modify, and query data, but that data is not persisted after the test finishes.

JavaScript tests don’t run in a transaction, which means that any data they create or modify in an org will remain in that org after the test completes. Lightning Web Components and JavaScript in static resources can be tested outside of a Salesforce org, using a framework like Jest to specify inputs and check outputs. These tests don’t use a database, so data is held in memory and not persisted after the tests finish. But when using the Lightning Testing Service to test (Aura) Lightning Components, you need to be very careful to segregate test data from any other data.

There are three ways to segregate test data used by the Lightning Testing Service from other data. First, you should not use the Lightning Testing Service in a production org; instead run it in scratch orgs or testing sandboxes. Second, each test should create its own data as described earlier and should ideally delete that data after finishing. Finally, any data that’s created should be clearly named to distinguish it from any other data.

Creating Unit Tests

Salesforce has prepared helpful guides for creating Apex,21 LWC,22 and Aura component tests.23 And you can find various third-party resources that explain how to test JavaScript in static resources such as the Dreamforce ’14 talk I gave entitled “JavaScript-heavy Salesforce Applications.”24

Considerations for Unit Testing

Salesforce establishes some minimum guidelines for automated testing. At least 75% of the lines of Apex in your org must be covered by automated tests (and every trigger must have some coverage) before you can deploy to production. If you deploy Flows and Processes to a production org as part of a CI/CD process, they also require code coverage if they are deployed as active. You may also enforce your own code coverage thresholds for JavaScript code using external quality gates like SonarQube.

There is a law of diminishing returns on test coverage. The Pareto principle dictates that for many kinds of system, 80% of the progress will require 20% of the effort, while the remaining 20% of the progress will require 80% of the effort. That remaining 20% will not necessarily bring much value, so you should never push hard for 100% test coverage. I have personally been guilty of doing code acrobatics to try to achieve 100% test coverage. As soon as you start noticing your Apex code becoming more complex and more difficult to read or filling up with Test.isRunningTest() checks, you have started to go too far.

How to Act on Unit Test Results

When you’re practicing test-driven development (TDD), you’ll generally iterate quickly, running tests every few minutes until you get your code to succeed. In those cases, test success or failure provides you feedback on the functionality of your code. In general, test failures provide a useful indication that you’ve broken some underlying functionality.

If you notice tests failing because of some harmless change that doesn’t impact non-test code, you should consider whether your tests are architected in a flexible way. For example, adding a new required field to an object can cause all the tests that create records of that object to fail. Using a central test factory to create objects allows you to add such required fields in a single place. Your tests can call the test factory to define the base objects and then modify their data as appropriate before creating them.
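
A minimal sketch of such a factory (the required fields shown are hypothetical): required fields are satisfied in one place, and individual tests override only the values that matter to their scenario.

@isTest
public class TestDataFactory {
    // If a new required field is added to Account, only this method changes.
    public static Account buildAccount(String name) {
        return new Account(
            Name = name,
            Industry = 'Technology',   // hypothetical required field
            Region__c = 'EMEA'         // hypothetical required custom field
        );
    }
}

A test then calls TestDataFactory.buildAccount('Big Deal Co'), overrides any fields relevant to its scenario, and inserts the record.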

Comprehensive Tests

Although there are many types of automated testing, and many names used to describe them, the book Continuous Delivery simplifies the various types of testing down into “commit-stage testing” and “acceptance testing.” The previous section discussed the use of static analysis and unit testing as commit-stage testing to provide fast feedback to developers. This section addresses acceptance testing: running comprehensive checks on the code prior to release.

Acceptance testing means checking to ensure that functional and nonfunctional requirements are being met. In effect, testers are asking “Does this do what it’s supposed to do?” or “Will end users accept what has been built?”. The agile convention is to write specifications in the form of user stories, each of which has associated acceptance criteria. Acceptance testing confirms that the acceptance criteria (and sometimes other expectations) are being met.

Acceptance testing can be done manually or in an automated way. Commit-stage tests should run quickly and frequently and can run in a simplified testing environment like a scratch org. Automated acceptance tests are intended to test end-to-end functionality, sometimes including external integrations. To provide realistic results, automated acceptance tests should run in a production-like environment and will generally take far longer to run. They are therefore run less frequently.

This section discusses both functional and nonfunctional testing. Functional tests, whether automated or manual, are intended to check that code and configuration functions as it should and doesn’t break other things. Nonfunctional tests look at other aspects of the system such as security, maintainability, reliability, and performance.

As it says in Continuous Delivery, “The majority of tests running during the acceptance test stage are functional acceptance tests, but not all. The goal of the acceptance test stage is to assert that the system delivers the value the customer is expecting and that it meets the acceptance criteria. The acceptance test stage also serves as a regression test suite, verifying that no bugs are introduced into existing behavior by new changes.”

We first discuss automated functional testing and nonfunctional testing, which both typically make use of automated tools. We conclude with a discussion on manual QA and user acceptance testing.

Automated Functional Testing

Automated functional testing builds on the unit tests described earlier, but may go further to include long-running code-based tests, as well as UI tests that simulate user interactions through a web browser.

Code-Based Acceptance Testing

There is no technological difference between code-based unit tests described earlier and code-based acceptance tests. I’ve divided this discussion between these two sections to emphasize that the same technologies can be used for two purposes. Whereas unit tests should run quickly and be narrowly focused on giving feedback to the developer, acceptance tests may take hours to run and function as confirmation that code continues to meet specifications and does not suffer from regression failures.

How to Run Acceptance Tests

Because the technology is the same, the same test engines described earlier in “How to Run Unit Tests” can be used to run acceptance tests.

When Do Acceptance Tests Run?

There are typically three occasions when code-based tests are run: during development, during deployments, and triggered by some external process. The preceding section discussed executing these tests during the development process, as unit tests. The same unit tests that help developers ensure their code’s logic is reliable gradually accumulate to become the acceptance test suite for the entire org.

Each test can be viewed as an executable specification for how the code should function: given certain inputs, when a particular action occurs, then assert “do we get the correct result?” If a test is written in this way, once added to the acceptance test suite, it provides an ongoing indicator that the specified behavior is still intact. For this reason, code-based acceptance tests provide a powerful built-in mechanism to protect the integrity of your org’s customizations.

Salesforce generally has an extremely enthusiastic developer and admin community. But this enthusiasm has been slow to spread to automated testing. Whereas some languages such as Ruby have cultivated a passionate testing community, 42% of the Salesforce orgs scanned by the Clayton analysis tool show a pattern of just using tests to achieve coverage.25 Appirio’s own CMC Metrics tool found an average of one assert for every 222 lines of code across the more than 6,000 orgs we’ve scanned.

This anemic approach to testing indicates that a large portion of Salesforce developers see code-based testing as an inconvenience and apply minimal effort beyond what’s required to get a deployment to succeed.

Salesforce provides the ability to run Apex tests during a deployment. This behavior is enforced when deploying to a production org, along with a 75% minimum code coverage threshold. Tests run during a deployment execute in the target org after the metadata has been deployed there, and so give a very good indication of whether the related Apex code will run successfully. If one of these tests fails, or if you are deploying in check-only mode, Salesforce rolls back the entire deployment. The ability to manage deployments and test execution as an atomic transaction (it either all succeeds or all fails) is one of the excellent capabilities of the platform.

JavaScript tests on Salesforce can’t run inside a Salesforce deployment transaction. This means that you can run tests on (Aura) Lightning Components in a target org, for example, but they can only run after the deployment has completed.

In addition to running tests as part of a deployment, you can trigger test execution at any time. The Salesforce CLI provides an sfdx force:apex:test:run command, which you can run as part of a CI process or on a schedule. This command lets you specify one or more test suites if you want to run a predefined set of tests. If desired, it’s also possible to use the Salesforce scheduler to run tests from within the org itself: a small block of scheduled Apex can query the relevant test classes and enqueue them for execution, as sketched below. See the section “Running Tests Using ApexTestQueueItem” in the Apex Developer Guide.26
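
The following is a minimal sketch of that approach, using the documented ApexTestQueueItem object. The class name, cron expression, and the convention that test classes end in “Test” are assumptions for illustration only.
  global class NightlyTestRunner implements Schedulable {
      global void execute(SchedulableContext sc) {
          // Find test classes by naming convention (assumes they end in "Test")
          List<ApexClass> testClasses = [SELECT Id FROM ApexClass WHERE Name LIKE '%Test'];
          List<ApexTestQueueItem> queueItems = new List<ApexTestQueueItem>();
          for (ApexClass testClass : testClasses) {
              queueItems.add(new ApexTestQueueItem(ApexClassId = testClass.Id));
          }
          // Inserting the queue items asks the platform to run those tests asynchronously
          insert queueItems;
      }
  }
You could then schedule this from Anonymous Apex with something like System.schedule('Nightly test run', '0 0 2 * * ?', new NightlyTestRunner()); and review the outcomes later by querying ApexTestResult.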

Acceptance Testing Environments

Code-based acceptance tests can generally be run in scratch orgs or testing sandboxes. Having a disposable environment is especially important for JavaScript and UI tests, since these make actual changes to the data in an org that are not rolled back after the tests complete. Scratch orgs allow you to create an org and specify test data for that org, so you can ensure that the testing environment starts clean every time.

Although these tests can be run in scratch orgs, it is important to run a comprehensive set of checks in a production-like environment. If there are concerns that your scratch orgs do not fully resemble your sandboxes or production orgs, run automated acceptance tests in a partial copy or full sandbox. This can be the same sandbox used for manual testing, as long as the data used for automated tests is clearly segregated from the data used for manual tests. You will also need a mechanism for resetting this test data in the sandbox.

Data Needed for Acceptance Testing

As with code-based unit tests, the data used for code-based acceptance tests should generally be stored within the test itself. Tests that rely on data in the org are necessarily brittle, behave inconsistently if that data is changed, and represent a form of tight coupling which makes it hard to refactor your codebase or make it modular.

Apex tests also allow larger volumes of data to be stored as static resources in CSV format. That data can be loaded with Test.loadData(), typically from a method annotated with @TestSetup so that it runs once, when the test class first begins to execute. Each test method has to query that data to access it, but the DML operation to load the data only runs once per test class. This reduces test execution time and governor limit usage.
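
As a minimal sketch, a test class using this pattern might look like the following; the static resource name testAccounts is an assumption and would need to exist in your project.
  @IsTest
  private class AccountReportingTest {
      @TestSetup
      static void loadTestData() {
          // Creates Account records from a CSV static resource named 'testAccounts'
          Test.loadData(Account.sObjectType, 'testAccounts');
      }
      @IsTest
      static void itShouldSeeTheLoadedAccounts() {
          // Each test method queries the records created during @TestSetup
          List<Account> accounts = [SELECT Id, Name FROM Account];
          System.assertNotEquals(0, accounts.size(), 'Expected accounts from the static resource');
      }
  }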

A new open source framework called the Apex Domain Builder ( https://github.com/rootstockmfg/domainbuilderframework ) offers a very performant and readable way of creating test data. Since Apex-based tests should run in isolation, it’s important that each test class create the data that it will need for its test methods. This can easily lead to lots of repetitive code and slower test execution and tends to clutter the test methods with data setup steps that are irrelevant to the test itself. Domain Builder uses a “fluent” syntax similar to that used in the FFLib modules mentioned in Chapter 5: Application Architecture and in fact uses some of the same underlying code. Listing 8-1 shows the elegance of such an approach. Required and boilerplate field values can be defined centrally in the Account_t class, so that inside the tests themselves you only need to define the field values that should be specific to that test.
  @IsTest
  private static void easyTestDataCreation() {
    // Given
    Contact_t jack = new Contact_t().first('Jack').last('Harris');
    new Account_t()
        .name('Acme Corp')
        .add( new Opportunity_t()
                    .amount(1000)
                    .closes(2019, 12)
                    .contact(jack))
        .persist();
    // When
    ...
    // Then
    ...
  }
Listing 8-1

The Apex Domain Builder syntax for creating test data

Creating Acceptance Tests

As mentioned before, you should write unit tests with the idea that they will each become part of the acceptance test suite for the application.

Unit tests are typically written just to test the behavior of a specific unit of code such as a method or to achieve code coverage so you can deploy. Seeing unit tests as contributing to the integrity of the entire package or org helps you approach them in a broader light, as acceptance tests.

To bridge these two goals, I’ve found it helpful to adopt the test-writing approach known as behavior-driven development (BDD) introduced by Dan North,27 sometimes also called acceptance test–driven development (ATDD).

The basic idea of BDD is simple: write each test as a specification of the behavior of the system under test. Sometimes these tests are even called “specs.” Each test method name should be a sentence describing the expected behavior, and each test should follow a consistent given-when-then format.
  @IsTest
  static void itShouldUpdateReservedSpotsOnInsert() {
    System.runAs(TestUtil.careerPlanner()) {
      // Given
      Workshop__c thisEvent = TestFactory.aWorkshopWithFreeSpaces();
      Integer initialAttendance = TestUtil.currentAttendance(thisEvent);
      final Integer PRIMARY_ATTENDEES = 3;
      final Integer NUMBER_EACH = 4;
      // When
      Test.startTest();
      TestFactory.insertAdditionalRegistrations(thisEvent, PRIMARY_ATTENDEES, NUMBER_EACH);
      Test.stopTest();
      // Then
      Integer expectedAttendance = initialAttendance + PRIMARY_ATTENDEES * NUMBER_EACH;
      System.assertEquals(expectedAttendance, TestUtil.currentAttendance(thisEvent),
        'The attendance was not updated correctly after an insert');
    }
  }
Listing 8-2

A sample BDD-style test in Apex

Listing 8-2 shows a method named itShouldUpdateReservedSpotsOnInsert() which clearly states the intended behavior of the class being tested. The BDD convention is for each test to begin with itShould.... The body of the test is grouped into three sections, indicated by the comments // Given, // When, and // Then. This threefold division provides a clear syntax and structure, and is equivalent to the older AAA (Arrange-Act-Assert) division. I advise everyone to use this same structure to write their tests.

In all other respects, this is a normal Apex test. Some languages provide test frameworks such as Cucumber that allow for a human-readable domain-specific test language (DSL) with inputs and expected outputs separated from the actual code. That’s not easy to achieve in Apex, but simply structuring your tests in this way provides many of the same benefits.

Considerations for Acceptance Testing

In addition to focusing on acceptance criteria, acceptance tests may also cover a broader scope than unit tests, testing multiple components. Because Apex tests are subject to the same governor limit restrictions as other Apex code, there’s a limit to how comprehensive these tests can be. Complex, multistep procedures, especially those involving multiple Triggers, Processes, or Flows, can easily time out or exceed CPU, DML, SOQL, or Heap size governor limits. UI tests might be a better choice for simulating complex test scenarios.

Acceptance test suites favor thoroughness over speed, but speed is still important. One of the most effective ways to speed up your tests is to run them in parallel. Jest runs JavaScript tests in parallel, but JavaScript tests are generally extremely fast anyway. The Apex test engine runs up to five tests in parallel by default, but there’s a checkbox in Setup ➤ Custom Code ➤ Apex Test Execution ➤ Options to “Disable Parallel Apex Testing.” Some teams have disabled parallel execution because they encounter UNABLE_TO_LOCK_ROW errors when Apex tests access the same records, but this can make test execution extremely slow. As an alternative, mark your Apex tests with the @IsTest(isParallel=true) annotation, and omit that annotation from the tests that are not parallel safe, as shown below.
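
Here is a minimal sketch of a test class opted into parallel execution; the class name and its contents are hypothetical. The key point is that a parallel-safe test only touches records it creates itself.
  @IsTest(isParallel=true)
  private class AccountNamingTest {
      @IsTest
      static void itShouldInsertAnAccountSafely() {
          // This test only touches a record it creates, so running it in parallel is safe
          Account acct = new Account(Name = 'Parallel-Safe Test Account');
          insert acct;
          System.assertNotEquals(null, acct.Id, 'The account should have been inserted');
      }
  }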

You will never succeed in testing every aspect of a system, nor should you try. Even with extensive collections of tests, failures will still sneak through your delivery pipeline. Production failures are expensive, at least in terms of stress, and sometimes monetarily. Production failures provide a valuable opportunity to do postmortems and to see whether systems or tests could be put in place to prevent such failures from happening again.

Consider your acceptance tests an investment in the overall reliability of your org. As mentioned earlier, you should prioritize the most business-critical aspects of your processes. By doing this, your investments in testing will be practical and deliver a return on investment by continually blocking deployments that trigger known failures.

How to Act on Acceptance Test Results

Apex tests that run as part of deployments will automatically cause the deployment to fail if any Apex test fails. Deployments to production will also fail if any Apex trigger has no coverage at all, or if the overall coverage is less than 75%. This coverage gate is well known and much dreaded by those deploying Apex to production. There’s no feeling quite like battling through hundreds of deployment errors, only to have your deployment fail with the error: Code coverage is insufficient. Current coverage is 74.92840025717459% but it must be at least 75% to deploy.

JavaScript test results are not built into deployments in the way Apex is, but can still be used to pass or fail a build, by using your CI system. If you have any JavaScript testing in your delivery process, you should run that as one stage in your delivery pipeline. Failing tests should block later stages of the pipeline.

Tracking Code Coverage

Code coverage reports provide some indication of whether you have written thorough tests. High code coverage doesn’t guarantee that you’ve written good tests, but low code coverage guarantees that you have not.

The Salesforce Developer Console and some IDEs can show your Apex code coverage including which lines are covered or not covered. Some code quality tools like SonarQube also allow you to ingest unit test code coverage reports. This allows you to track coverage over time, as shown in Figure 8-11, and to view coverage in a single UI alongside code quality feedback as shown in Figure 8-12. One benefit of such tools is that they can track coverage information for both Apex and JavaScript tests. Enabling this capability may take a bit of experimentation.
../images/482403_1_En_8_Chapter/482403_1_En_8_Fig11_HTML.jpg
Figure 8-11

A snippet of the code coverage graph from SonarQube

../images/482403_1_En_8_Chapter/482403_1_En_8_Fig12_HTML.jpg
Figure 8-12

Many static analysis tools like SonarQube also show line-level coverage results

Both CodeScan and SonarQube allow you to ingest Apex code coverage results generated from the Salesforce CLI. To do this, you first run your Apex tests using sfdx force:apex:test:run -c -d test-results -r json to generate the coverage results in a file called test-results/test-result-codecoverage.json. This file can then be uploaded at the same time you run your static analysis jobs. In CodeScan you add the parameter sf.testfile=test-results/test-result-codecoverage.json to your static analysis job, and in SonarQube you add the parameter sonar.apex.coverage.reportPath=test-results/test-result-codecoverage.json. Other quality analysis tools may offer similar capabilities with different syntax.
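
As a rough sketch, the two steps might be wired together in a CI job like this; the project key is a placeholder, and the authentication parameters for your SonarQube or CodeScan instance are omitted.
  # Run Apex tests with coverage, then upload the report during static analysis
  $ sfdx force:apex:test:run -c -d test-results -r json --wait 30
  $ sonar-scanner \
      -Dsonar.projectKey=my-salesforce-project \
      -Dsonar.apex.coverage.reportPath=test-results/test-result-codecoverage.json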

If you are using Jest to test your JavaScript, you can output coverage files that can then be ingested by SonarQube. Running npx jest --ci --coverage will create coverage files in a coverage/ directory, including a summary in coverage/lcov.info, as shown in Figure 8-13. You can then point your static analysis job at that file by adding the parameter sonar.javascript.lcov.reportPaths=coverage/lcov.info. This property ensures that coverage information is uploaded and tracked in the SonarQube user interface.
../images/482403_1_En_8_Chapter/482403_1_En_8_Fig13_HTML.jpg
Figure 8-13

Code coverage results output from Jest

UI Testing

Automated acceptance tests can also be done using UI automation tools. The most common UI testing technology is Selenium, although commercial testing tools like Provar and Tricentis Tosca are also available.

There are several benefits to using UI tests on Salesforce. First, they can be used to test complex sequences of actions that would otherwise exceed Salesforce governor limits if run using Apex. Second, they require significantly less expertise to create and understand, since they use the Salesforce user interface which is already familiar to manual testers and business users. Finally, they can be used to test things that can’t be tested just with Apex or JavaScript, including things that happen in the web browser but not directly inside of Salesforce.

Although they are easy to create, UI tests are generally significantly slower than Apex tests, which in turn are slower than JavaScript tests. UI tests are also notoriously brittle, since even small changes to the UI can cause these tests to break if they are not written carefully. Continuous Delivery makes the case for avoiding UI tests whenever alternatives exist, but on the Salesforce platform, there are few alternatives. Creating robust tests is likely to require careful thought, including involvement from developers and architects to ensure that the tests remain reliable and performant.

How to Run UI Tests

As their name implies, UI tests automate actions that are normally performed through a user interface, and can verify the output that would appear in that user interface. Salesforce does not provide a built-in engine for running UI tests. Therefore you’ll need to provide your own mechanism for running these tests.

UI tests can be run programmatically like other code-based tests, using either Selenium or Puppeteer. There are also several click-based tools that provide graphical interfaces for building these UI tests.

Selenium is the most popular framework for building UI-based tests. It’s free and lets you use a wide range of programming languages such as JavaScript, Ruby, Java, and Scala to drive interaction with a web browser. Selenium works with all major browsers on all major platforms. This capability allows companies like BrowserStack and Sauce Labs to offer UI testing as a service, so that you can test your web applications across a wide range of browsers and desktop or mobile platforms.

Puppeteer is a newer offering from the team that builds Google Chrome. It allows you to build UI tests using JavaScript and run them in a “headless” Chrome browser. Many UI tests run in a headless browser, meaning the page is loaded and its DOM is fully available, but nothing is displayed on screen. Puppeteer only supports JavaScript and Chrome, but it is quickly growing in popularity due to its speed and simplicity. It won’t help you catch browser-specific edge cases, but it handles 80% of UI testing needs with less overhead to manage than Selenium.

Be advised that building Salesforce UI tests programmatically is not for the faint of heart and brings significant challenges compared to testing home-grown web applications. The Salesforce user interface is subject to change at least three times per year, and the HTML underpinning it is dynamically generated, so you can’t rely on IDs or other elements to remain consistent. Repetitive actions like logging in and data entry will need to be put into centralized modules, so you don’t need to maintain many repetitive blocks of code.

For this reason, it’s almost certainly more cost-effective to use a prebuilt Salesforce UI testing solution. Both Copado and AutoRABIT offer the built-in ability to record user interactions as Selenium scripts and to replay those as tests. They have invested time and energy in handling repetitive tasks and can keep that up to date for all their customers as the Salesforce UI evolves.

There are two other commercial options for UI testing. Provar is a Salesforce-specific UI testing tool that allows you to set up and tear down test data, record and replay user interactions, and run tests interchangeably across Lightning and Classic. The entire company is focused on supporting Salesforce, so they stay up to speed on the latest releases and are able to handle complex Salesforce UIs such as the Service Console.

Another commercial UI testing tool that supports Salesforce is Tricentis Tosca. Tosca has a good reputation and strong capabilities. They may be a good fit if you want to test non-Salesforce user interfaces as well. But if your only concern is testing Salesforce, Provar is likely to be more robust.

When Do UI Tests Run?

Because UI tests typically take longer to run than unit tests, they’re not usually run by developers as part of the development process. They are best suited to be run as part of regression tests to confirm that deployments have not broken anything. Because they can take so long to run, you should parallelize them if possible.

UI Testing Environments

Unlike Apex or JavaScript tests, UI tests depend on a fully rendered Salesforce user interface to run. You have to actually deploy your customizations into a Salesforce environment and then point your test engine to that environment. If you’re able to supply a full set of test data and keep it up to date, you can run these UI tests in a scratch org. This is ideal since the tests modify data during the course of execution and so may make undesirable changes to long-running test sandboxes. Test sandboxes can be used, as long as your UI tests don’t make undesired changes to data that’s needed for manual testers.

Data Needed for UI Tests

One of the biggest challenges in maintaining automated UI tests is that test data needs to be maintained along with the tests themselves. As your Salesforce org evolves, you’re likely to need to add new kinds of configuration data, specify new fields on existing objects, and adjust to ongoing changes to things like user roles.

If possible, it’s ideal if your automated UI tests can use the same test data you use for your manual testers. Even code-based tests may depend on large groups of records stored in CSV files, and you can reuse this same data in your UI tests. Maintaining a single consistent set of test data that can be loaded into scratch orgs for manual testers or fed into the UI test engine will help ease the overall maintenance burden. Test data updates then only need to be done once.

As with code-based tests, you should be sure to supply each UI test with the data it needs and never make assumptions that underlying data will remain unchanged. A possible exception to this rule is when your tests depend on configuration that is stored as data in sandboxes. Sandbox refreshes let you benefit from cloning underlying configuration data (like CPQ data) from a production org, so that it remains up to date. As mentioned earlier, UI tests actually change data in the underlying org so take steps to ensure that you are never changing data used by manual testers.

Creating UI Tests

There are three ways of creating UI tests: programmatically, by using a test recorder, or by specifying a test data model. Programmatic tests work similarly to other code-based tests, but rely on libraries (either Selenium or Puppeteer) that allow them to navigate to URLs, modify input fields, check output, and so forth.

As mentioned earlier, the Salesforce UI is subject to change at least three times a year and is autogenerated with each page load. Standard HTML tags are dynamically generated, CSS classes are subject to change, and the underlying DOM is far more dynamic than that of a home-grown web app.

At Appirio, our QA team invested substantial time building a dynamic, Selenium-based testing engine. They wrote standard modules to handle login, data load, record page and list view navigation, and so forth. Unless you’re prepared to invest substantial time building and maintaining your own Salesforce testing architecture, I’d advise you to use a prebuilt tool for your UI testing. Whereas there are ample online resources about doing Salesforce deployments, you’ll find almost no assistance on the Internet if you decide to venture into automated testing of the Salesforce UI.

Provar and Tosca provide test recorders to record user interactions and supplement that with the ability to use a test data model to load and query test data using the API. That combination provides an excellent balance of reliability, speed, and usability. Just as code-based tests shouldn’t bother to test standard Salesforce functionality, your UI tests shouldn’t take time to input test data into standard Salesforce record detail pages. Test data should be loaded using the API, which is fast and reliable. Your tests can then focus on validating complex, custom aspects of the UI or multistep processes.

Copado and AutoRABIT both use variations on the open source Selenium test recorder to record user interactions.

The most important guideline for creating custom pages that are easy to test is to add Id attributes to all page elements. UI tests typically use HTML DOM Ids to uniquely identify the elements on a page. If you don’t specify an Id, Salesforce either omits it or autogenerates a value, and autogenerated Ids can easily break Selenium scripts that search for an Id that isn’t always present on the page. If defining an Id is not feasible, you can use the class or other data attributes to uniquely identify DOM elements.

It can be helpful to use a naming convention for these Id attributes, which leads to more readable and maintainable UI scripts. Appirio typically recommends PageName_FieldName_FieldType. For example, if you have a custom Case Summary page with a dropdown selector for owner, you can give it the Id CaseSummary_Owner_Picklist. Both tests and code need to be clear to be maintainable.

UI Testing Considerations

If you’ve used a test recorder to build your initial test, you can then modify the steps and the criteria used to select elements on the page. This is the step that requires care and sophistication. Any user can turn on a test recorder, make a few clicks, and enter a bit of data. But for your tests to be stable, you should think about which page elements are most likely to remain stable as your org evolves.

This is one of the reasons that the UI testing architecture needs to be robust and flexible. If a new required field is added to the Account object, you want to be able to update all of your UI tests at the same time. If your tests are simply recorded sets of clicks, you may be forced to rerecord them, unless you are comfortable editing the underlying scripts or have built a modular testing architecture.

With the arrival of the tools mentioned earlier, Salesforce UI testing is a more achievable goal than it was in 2016. UI testing provides unique opportunities for ensuring your delivery pipeline is reliable. But adoption is still in its early phases, and you should reserve this form of testing only for critical or complex processes that cannot be tested by any other method.
../images/482403_1_En_8_Chapter/482403_1_En_8_Fig14_HTML.jpg
Figure 8-14

The test pyramid, introduced by Mike Cohn

Figure 8-14 is known as the test pyramid, an often-cited recommendation for how many tests of different types to create.28 The diagram indicates that the foundation of your test strategy should be code-based unit tests, which run quickly and test isolated parts of your codebase. There should be more of these than any other test. At the top of the test pyramid are UI tests, which test the entire integrated system but are slower to run. You should have far fewer of these tests than unit tests. The middle layer refers to acceptance tests which cover larger chunks of functionality, but don’t require a fully rendered UI. You should have a middling number of these. As mentioned, such tests go by various names, such as “integration tests,” “component tests,” and “service tests.”

There are several reasons you should have fewer UI tests than other types of test. A single UI test can cover many underlying pieces of functionality, and thus fewer tests suffice. UI tests also require more time and care to maintain, since a change to one underlying piece of functionality can require a change to many UI tests, and so more tests are more expensive. Since UI tests cover broad segments of the org’s behavior, they have vastly more permutations of inputs and possible outputs, and so it might be tempting to use them to cover every edge case. The test pyramid advises restraint from such temptations. You should test the critical path and fragile scenarios that are frequent causes of regressions, but never try to cover every test case with UI tests.

How to Act on UI Test Results

As with other types of tests, UI tests should be integrated into your delivery pipeline so that you can run at least a subset of them every time you deploy code. UI tests can only run after a deployment is completed, but if a UI test fails after deployment to one org, your CI system can prevent that code from proceeding to subsequent orgs.

Perhaps more than other types of tests, UI tests can be brittle and subject to false positives. UI tests that are “flappers,” alternating unpredictably between success and failure, should be removed or fixed. Like the proverbial “boy who cried wolf,” tests that generate frequent false positives will lead your team to ignore test results altogether, thus undermining their effectiveness.

Nonfunctional Testing

Nonfunctional testing examines the structural characteristics of the code. As mentioned, there are five structural characteristics to software quality: reliability, maintainability, security, performance, and size.

There are different ways to evaluate those structural characteristics, both manual and automated. In this section, we again look at static code analysis, this time in the context of assessing the entire codebase, followed by discussions of security analysis, performance analysis, and code reviews.

Static Analysis—Full Codebase

Static analysis provides fast, automated feedback on code quality with no risks and little or no effort. For that reason, it’s now appearing for the third time in this chapter, in hopes that you will hear this message loudly and clearly. I might be accused of repeating the previous sections on linting and quality gates, but this section is introducing a third distinct use of static analysis: assessing the entire codebase and evaluating trends over time.

There are many tools which can help give insight into code quality by scanning code and flagging vulnerabilities. Tools such as PMD, SonarQube, and CheckMarx can identify many issues and track trends over time. This allows project teams to view the current and historic code health of their project.

When the main focus is simply getting code to work, code quality may not be a developer’s first priority ( https://xkcd.com/844/ provides a lighthearted depiction of this challenge). However, as professionals, improved code quality and efficiency should be one of the main concerns for the project team (and consequently the developer).

Code quality maintenance and improvement requires attention and focus throughout a project’s lifecycle. Issues with code quality, such as poorly designed or difficult to understand code, will accumulate easily if left unchecked. These issues are known as technical debt, and if left to grow they will make software maintenance increasingly difficult, time-consuming, and risky. In the same way that one might deal with financial debt, the key to mitigating technical debt is to acknowledge and address quality risks or concerns as early as possible in the development process and not to let them accumulate.

This section reviews recommended components and tools that project teams can use to monitor and improve code quality across the entire codebase.

How to Run Static Analysis

As indicated previously, static analysis tools can be used to give feedback on the code currently being edited (linting), on a collection of changes being staged for deployment (quality gates), or on the entire codebase. In some cases, the same engine can be used for all three of these purposes. For example, SonarLint is able to connect to a SonarQube instance to synchronize the ruleset between the two.

There are six well-established tools for performing static analysis on a Salesforce codebase: Clayton, SonarQube, CodeScan, PMD, Codacy, and CodeClimate. We’ll look at each of these in turn before discussing how to integrate static analysis with your workflow.

Clayton

Among these static analysis tools, Clayton is the only one designed exclusively for Salesforce. Clayton can connect directly to a Salesforce org, or more commonly to a code repository stored in GitHub, GitLab, or Bitbucket. Clayton provides a library of rules to select from, based on the experience and insights of Salesforce Certified Technical Architects, principally its founder, Lorenzo Frattini. Clayton analyzes your metadata based on the rules you select to identify security, maintainability, and performance flaws.

When used as a quality gate, Clayton can add code comments to a pull request or block that pull request until issues are addressed. When run on the entire codebase, Clayton generates a report that you can view through its user interface or export as a CSV.

In addition to offering a carefully thought-through set of rules, Clayton provides references to training materials on Trailhead and clear suggestions for remedying any errors that are detected.

SonarQube

SonarQube is an open-core product used to track quality metrics across multiple languages and projects. SonarQube scans are typically run as part of continuous integration jobs whenever changes are made to a codebase.

These scans identify issues ranging from excessive complexity to security flaws, and can also track unit test coverage and code duplication. SonarQube tracks issues over time, ranks them by severity, and attributes them to the developer who last touched that line of code. This allows your project team to see quality trends, take action on particular issues, and prevent code from proceeding through the continuous deployment process if it shows significant quality issues.

SonarQube examines and evaluates different aspects of your source code: from minor styling details, potential bugs, and code defects to lack of test code coverage and excessive complexity. SonarQube produces metrics and statistics that can reveal problematic source code that’s a candidate for inspection or improvement.

Appirio made extensive use of the free Enforce plugin for SonarQube29 to provide support for Salesforce Apex code analysis. That plugin only works for SonarQube version 6 and below and struggles to parse some kinds of Apex. Since 2019, SonarQube enterprise edition offers native Apex support. Appirio technical architect Pratz Joshi collaborated with SonarSource to write specifications for many of the Apex rules.

It’s also possible to use SonarCloud.io for a fully SaaS-based code analysis solution. SonarCloud now has feature parity with SonarQube, supporting more than 25 languages without requiring you to install your own server.

CodeScan

CodeScan is a static analysis tool for Salesforce that is based on SonarQube. They use SonarQube community edition as their underlying engine and user interface and have written a very extensive set of rules for Apex, Lightning, and Visualforce. CodeScan offers both a Cloud/SaaS edition and a self-hosted option.

Since it’s based on SonarQube, it has the same underlying capabilities, but has a completely separate set of rules from the ones provided by SonarSource or Enforce. CodeScan was among the first to market for Salesforce static analysis and offers the largest number of quality rules among its competitors.

ApexPMD

Because it’s open source, PMD ( https://pmd.github.io/ ) forms the underlying analysis engine for many other products and is thus the most widely used static analysis engine for Apex. Robert Sösemann did most of the foundational work for ApexPMD and remains its most prominent champion. PMD itself does not provide a graphical UI, but tools such as Codacy or CodeClimate add a UI layer on top of the PMD engine. Many commercial Salesforce release management tools such as AutoRABIT also include a built-in PMD scanner.

PMD can be run from the command line, or from within the ApexPMD extension for VS Code, and its results output in multiple formats such as CSV and HTML.

PMD finds common programming flaws like unused variables, empty catch blocks, unnecessary object creation, and so forth. It comes with a rich and highly configurable set of rules that developers can quickly set up, straight out of the box. It also includes CPD, the copy-paste-detector, to help identify duplicate code.

After installing the PMD extension for VS Code, you can use it to scan all the files in your current workspace by running Apex Static Analysis: Current Workspace in the VS Code Command Palette. Problems will appear in the Problems panel, as shown in Figure 8-15, and be indicated on the files themselves.
../images/482403_1_En_8_Chapter/482403_1_En_8_Fig15_HTML.jpg
Figure 8-15

A full code analysis using Chuck Jonas’ VS Code PMD extension

People often struggle to get started with using PMD on the command line. You can run this sample command inside a Salesforce project directory as an example to help you get started:
  $ pmd -d src -failOnViolation false -f text -language apex -R rulesets/apex/style.xml,rulesets/apex/complexity.xml,rulesets/apex/performance.xml
Breaking that down:
  • -d src—Which subdirectory do you want to scan? This assumes you’re running the scan in a directory that has a subdirectory called “src”.

  • -failOnViolation false—When running on the command line, set this to false. If you want to run this as part of a CI job and you want the scan to FAIL if PMD generates errors, you can set this flag to true.

  • -f text—The file output format. Other formats include csv, xml, and json.

  • -language apex—The main language being scanned.

  • -R rulesets/apex/style.xml,rulesets/apex/complexity.xml,rulesets/apex/performance.xml—This is the critical parameter, the three Apex rulesets. See the PMD documentation for a complete list of rulesets.

PMD also includes a utility called CPD that can identify duplicate code in your org. Again, here is a sample command you can use to get started:
  $ cpd --files src --language apex --minimum-tokens 100 --format csv
Breaking that down:
  • --files src—The subdirectory, same as earlier.

  • --language apex—The code language, same as earlier.

  • --minimum-tokens 100—The minimum number of “tokens” that need to match for two sections to be considered duplicates. Smaller values will find more matches, but more of them will be “false positives” or insignificant duplication.

  • --format csv—The output file format, same as earlier. Other options include text, xml, and json.

Codacy

Codacy provides a SaaS-based static analysis tool that connects to your codebase in GitHub or Bitbucket and provides static analysis across many languages. Codacy uses well-established analysis engines such as ESLint and PMD but provides a user interface, authentication, and other capabilities on top of that.

Codacy also offers an Enterprise Edition that companies can host themselves. The Enterprise Edition also supports GitLab.

CodeClimate

CodeClimate is similar to Codacy in using PMD and ESLint as its underlying analysis engines for Apex and Lightning, respectively. CodeClimate provides a user interface to allow you to track quality trends in your codebase over time.

CodeClimate offers two products: Velocity and Quality. CodeClimate Quality is the actual static analysis tool. CodeClimate Velocity analyzes developer activity to help you track DevOps metrics such as cycle time, as well as metrics of interest to engineering managers such as what activities are consuming most of the team’s time.

When Does Static Analysis Run?

Some of the analysis tools mentioned earlier, such as Clayton, connect directly to your code repository and run analysis when commits are pushed. Others such as SonarQube or CodeScan require you to run the analysis as part of a CI job, which is then uploaded to the tool.

Running regular scans builds a history that allows you to track trends over time.

Where Does Static Analysis Run?

Static analysis runs either in a CI job or in the backend of one of the SaaS analysis tools, and never inside of your Salesforce org.

Data Needed for Static Analysis

As mentioned earlier, static analysis does not require test data to run.

Creating Static Analysis Rules

Rather than creating static analysis rules, your focus is typically on selecting or deselecting which rules will run from among the options provided by each tool. It is very important to monitor your team’s response to various rules when you first roll them out, so you can ensure they are providing value and not just noise.

Creating static analysis rules is generally outside the skillset of most developers. One exception is configurable rules such as class naming conventions. Tools such as SonarQube allow you to use regular expressions to specify acceptable patterns for your class and method names.
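
For instance, SonarQube’s naming-convention rules take a regular expression as a parameter. The patterns below are purely illustrative; your team would choose its own conventions.
  # Hypothetical patterns for naming-convention rules
  ^[A-Z][a-zA-Z0-9]*$      # class names: UpperCamelCase
  ^[a-z][a-zA-Z0-9]*$      # method names: lowerCamelCase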

Considerations for Static Analysis

There are two common reactions to scan results from those new to static analysis. One reaction is cognitive overload, since these scans can expose thousands of issues in need of remediation. It was SonarQube that first introduced me to the term “kilodays” when describing the estimated time required to address issues on one codebase. The other reaction is exaggerated optimism in the tools’ ability to identify quality issues. Just because Clayton doesn’t find any issues with your code doesn’t mean that you’ve written good code. It’s entirely possible to write buggy and incoherent code that passes static analysis. Other forms of testing such as code reviews and unit tests are important complements to these scans.

How to Act on Static Analysis Results

Tracking trends provides a way to identify whether issues are accumulating over time and to see hotspots in your codebase. Figure 8-16 shows the analysis of a large codebase in SonarQube. One particular Apex class, MetadataService.cls, is five times larger than the next largest class and represents one quarter of the total issues in the codebase. Understanding or updating this class is likely to be a nightmare for future developers, so it should be prioritized for refactoring while the developers responsible for this abomination are still around.
../images/482403_1_En_8_Chapter/482403_1_En_8_Fig16_HTML.jpg
Figure 8-16

A bubble chart from SonarQube revealing that one Apex class is vastly more complex than anything else in the codebase

Security Analysis

Having thoroughly discussed static analysis, we can look at tools focused on security. There is some overlap between static analysis tools and security analysis tools. All of the static analysis tools discussed in the last section also include some security-focused rules, but there are other tools that explicitly focus on security testing for Salesforce. The points made in the static analysis section about how and when scans run generally apply here as well. The two tools introduced here, CheckMarx and Fortify, can scan and identify security vulnerabilities in Apex code. Salesforce also has a tool called Chimera available for ISV partners to perform security scans on non-Salesforce web sites.

IT security is a challenging area, made more challenging by a shortage of skilled workers and by the fact that it is often left as an afterthought. Salesforce provides a highly secure foundation for building applications and running your business, but its flexibility means that customizations can expose security vulnerabilities if they are not thought through carefully.

The security team at Salesforce has been trying for years to promote secure coding best practices on the platform. And https://trust.salesforce.com/en/security/security-resources/ provides a great collection of training materials that are important for developers to internalize.

The automated tools mentioned here and in the static analysis section can buttress the other security precautions you take in your development processes. There’s no substitute for thoughtful architecture, but there are many common and subtle vulnerabilities that can be readily caught by these automated tools.

CheckMarx

CheckMarx is far and away the dominant security analysis tool for Salesforce. Many years ago, CheckMarx partnered with Salesforce to provide security analysis for Apex. Salesforce maintains an instance of CheckMarx that they make freely available to customers to perform scans on their own codebase.30 There are limits to how often customers can run the free scans, so large Salesforce customers often procure their own instances of CheckMarx to enable ongoing scans of their codebase. Having a CheckMarx license also provides access to many features such as dynamically filtering and drilling into results, marking some as false positives, and so on.

Unlike the static analysis tools mentioned earlier, CheckMarx provides the ability to identify security issues that cross file boundaries. One of the most commonly surfaced issues is SOQL injection vulnerability, in which input from a text field could find its way, unfiltered, into a SOQL query and potentially expose private information. For example, CheckMarx can identify when text field input is passed unsanitized from Visualforce to an Apex controller to an Apex service class and finally into a query. In the same way, CheckMarx can detect stored XSS attacks in which unsanitized script data might be written into the database and possibly rendered in users’ browsers.
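
To make that pattern concrete, here is a minimal sketch of a SOQL injection vulnerability alongside two common remediations. The class and method names are hypothetical; the platform features used (Database.query, String.escapeSingleQuotes, and bind variables) are standard Apex.
  public with sharing class ContactSearchController {
      // VULNERABLE: user input is concatenated directly into a dynamic query
      public static List<Contact> searchUnsafe(String lastName) {
          return Database.query(
              'SELECT Id, Name FROM Contact WHERE LastName = \'' + lastName + '\''
          );
      }
      // SAFER: escape quotes before building the dynamic query...
      public static List<Contact> searchEscaped(String lastName) {
          String sanitized = String.escapeSingleQuotes(lastName);
          return Database.query(
              'SELECT Id, Name FROM Contact WHERE LastName = \'' + sanitized + '\''
          );
      }
      // ...or, better still, use a bind variable in a static query
      public static List<Contact> searchWithBind(String lastName) {
          return [SELECT Id, Name FROM Contact WHERE LastName = :lastName];
      }
  }
Tools like CheckMarx trace the lastName parameter from its source to the query and flag the first method, even when the concatenation happens several classes away from the user input.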

Micro Focus Fortify also provides Apex scanning capabilities, but CheckMarx is almost unchallenged because of the sophistication of their tool and their partnership with Salesforce. Their tool is expensive, and so not in the toolkit of most small or medium enterprises, but it provides an important complement to developer training, especially if you are building highly secure or Internet-facing applications.

The basic CheckMarx application can connect to a code repository to retrieve your metadata, or you can manually upload it as a ZIP file. If you purchase the add-on integration package, you can also use their command-line tool CxSAST or plugins for CI tools like Jenkins, Bamboo, or VSTS to run security analysis as part of your build process.

Micro Focus Fortify

Fortify is a security analysis tool that was long owned by HP and is now part of Micro Focus. It is much less well known in the Salesforce world, but does support scanning Apex. One benefit over CheckMarx is that Fortify also provides an on-demand SaaS-based scanner: CheckMarx must be installed on a server, whereas Fortify offers either cloud or on-premise options.

Performance Testing

Performance testing is very different from either static analysis or security analysis. It is an aspect of nonfunctional testing that evaluates how well your applications will perform under normal or higher-than-normal load. This is generally used to evaluate response time (delay) and throughput (how many transactions per second the system can handle). It also provides useful information on side effects, such as whether the response time changes for other applications during the tests.

Salesforce handles performance tuning for the underlying platform for you, and provides load balancing and many other mechanisms to ensure that your applications scale and remain performant. In general you don’t need to worry about how well Salesforce will handle large volumes of transactions as long as you’re using built-in capabilities of the platform.

But as you begin to create complex custom applications on the platform, you may encounter scalability issues that don’t surface until you are receiving large numbers of requests. Two of the most common scalability issues are large data volume (LDV) issues and large traffic issues.

LDV issues relate to the amount of data stored in the org, rather than current usage. They usually begin to arise when you are dealing with tens of millions of records on a single object or are making reports or SOQL queries that are unusually expensive. If a query or report is inefficient, that issue will arise whether you are receiving 1 request or 1 million requests. Thus LDV issues can be investigated relatively easily by developers and architects as long as there is an org with sufficient data. LDV issues are dealt with at length in Salesforce’s documentation.

Large traffic issues are much harder to monitor and assess, since they happen in real time and may not be visible to every user. For normal IT systems, monitoring traffic is a large focus of the operations teams. Salesforce orgs are mostly accessed by employees with user accounts, and so the number of users is typically predictable. But as you start to expose your org to customers through Sites or Communities, you may be exposed to more unpredictable levels of traffic.

This is just a superficial review of the topic, largely to introduce high-level performance testing concepts that might be useful to help with LDV or traffic issues. The following tools can be used to generate large volumes of randomized sample data in a testing org prior to go-live. They can also be used for actual performance testing: simulating real-time traffic to help determine whether performance issues begin to appear when the system is under load.

How to Run Performance Tests

A simple subset of performance testing can be done just using Chrome Dev Tools or Firebug from a developer’s machine. If your concern is simply with page load time, this may be sufficient. When Lightning Components were first launched, they were slow and buggy, and the Salesforce Lightning Inspector31 was extremely helpful to identify performance bottlenecks. The Developer Console also remains a helpful way to identify performance issues on the Salesforce backend. But none of these tools allow you to test the system under large loads, which is the true meaning of performance testing.

To run performance tests, you will need a commercial tool such as Micro Focus LoadRunner, an open source tool such as Apache JMeter, or a cloud-based testing tool like SendGrid’s Loader.io. JMeter is popular and open source, but will require a bit more scripting and experimentation to get working. Loader.io offers a free tier and a simple SaaS-based user interface and may be sufficient to help you get started. LoadRunner is the most well established of these but will require installation and high license fees. There are now many other options as well.
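
As a small illustration, once a JMeter test plan has been built and validated in the GUI, it is typically run from the command line in non-GUI mode during the scheduled test window. The test plan and output file names here are placeholders.
  # Run a prebuilt JMeter test plan in non-GUI mode and record results for later analysis
  $ jmeter -n -t salesforce-load-test.jmx -l results.jtl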

If your only goal is to generate large volumes of sample data, you may be able to use a data-focused tool such as Odaseva or OwnBackup which provide tools for loading large volumes of data.

Performance testing is normally divided into load testing and stress testing. Load testing means simulating expected volumes of transactions and possibly varying that load across the normal range. Stress testing simulates higher-than-normal numbers of transactions to determine how the system behaves as loads increase. Stress testing can either take the form of soak testing or spike testing. Soak testing applies a stable or steadily increasing load over an extended period of time; in traditional applications that’s useful to determine memory leaks, whereas in Salesforce it can give you an indication of how data or asynchronous jobs accumulate over time, and whether you see changes in the performance of the underlying Salesforce Pod. Spike testing applies sudden bursts of traffic to assess what happens in extreme circumstances.

Performance testing tools have several main capabilities. First, they allow you to design tests that can send data to Salesforce’s API or user interface. You can specify what data you’ll send, how often, in what volumes, and so on. Second, they include tools to actually orchestrate and generate that load. Like UI testing tools, they can record and replay user interactions in a browser, but in parallel at an extremely high rate; they can also apply large numbers of API requests in parallel. Finally, these tools record response times and error rates, summarizing this in time series graphs that allow you to monitor throughput and correlate load with response time. Those time series graphs can also reveal side effects like increasing response times on unrelated parts of Salesforce as your main test load increases.

When Do Performance Tests Run?

Unlike the other types of testing discussed in this chapter, performance tests should not be run on an ongoing basis as part of your development lifecycle. Salesforce prohibits large-scale performance testing in sandboxes except by prior authorization. Salesforce provides specific guidance on performance testing at https://help.salesforce.com/articleView?id=000335652&type=1 . Such tests need to be planned and scheduled at least 2 weeks in advance and must be done in conjunction with Salesforce support staff so that they can abort the tests if they cause adverse effects on other customers.

Performance Testing Environments

Salesforce is a multitenant system, and the “pods” that host sandboxes generally have lower spare processing capacity than production systems. The good news is that if something performs well in a sandbox, it will generally perform even better in a production org. Because this testing must be scheduled in advance and involves creating and changing large volumes of data, performance testing should be done in a full sandbox. It can also be done in a partial copy sandbox, but you’ll need to ensure you have enough space available to perform the tests.

Data Needed for Performance Tests

One of the main capabilities of performance testing tools is their ability to generate large volumes of randomized data. This requires time to set up and configure, since the data must be appropriate to the objects and fields under test. If your goal is to diagnose actual performance issues seen in production, then you will definitely want a freshly cloned full sandbox or at least a partial copy sandbox that includes data from all of the relevant objects.

Creating Performance Tests

As with any kind of testing, it’s important to start small and build your tests up gradually. How exactly each test is created is highly dependent on the tool. An older Dreamforce talk on performance testing by Raj Advani and Randy Case32 offered a generalized four-step process for building your performance tests: build a test plan, run a baseline test, identify your target load, and scale up your tests gradually.

After selecting and setting up your performance testing engine, your first step is to build your test plan. This is an essential preparation and is also something you’ll need to submit to Salesforce before you can schedule your test. Your test plan must identify the key business transactions you want to test. Your focus should be on testing your custom code and processes, not on testing Salesforce itself. Even if you discover some performance bottlenecks coming from Salesforce, you won’t be able to fix those, so focus your tests on areas under your own control like Apex code and Processes. Your test plan should assess what data will be needed, what data volumes and rates, and what APIs or UI endpoints you’ll be testing. The Salesforce help article at https://help.salesforce.com/articleView?id=000335652&type=1 can be your guide.

You can and should run multiple baseline tests before actually beginning performance testing. Baseline tests use your performance testing tools to execute small groups of transactions. This lets you validate your scripts and establish baseline expectations for performance.

Based on that initial information, and your expectations about the loads you need to test, you can then identify your target load. Your target number should be realistic. For example, if you estimate you may encounter 200 parallel requests, you do not need to test against 10,000 parallel transactions.

Once you’ve identified your target load, when it comes time to coordinate your load test with Salesforce, you should scale the test up gradually. Begin with half your estimated load, then move to 75%, before finally running the full load.

Performance Testing Considerations

Again, performance testing is prohibited on Salesforce except by prior arrangement and at a scheduled time. You can develop your performance tests by using these tools to load small amounts of data, but if you start generating unusually high volumes of data, you will violate Salesforce’s terms of use.

How to Act on Performance Test Results

The main purpose of performance testing is to gain confidence that your applications will be able to handle normal traffic and some level of surge traffic. If issues are identified, they become the starting point for analysis and remediation. Next steps depend on the issues uncovered. As mentioned, Salesforce provides excellent resources for addressing large data volume issues, from indexing fields and checking query plans to creating skinny tables and looking at archiving solutions.
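
For example, checking a query plan doesn't require special tooling: the REST API's query resource accepts an explain parameter that returns the plans Salesforce considers for a given SOQL query. A minimal sketch follows; the instance URL, API version, and custom field are illustrative, and you'll need a valid access token:

# Ask Salesforce how it plans to execute a query, to see whether an index will be used.
curl -G "https://yourInstance.my.salesforce.com/services/data/v46.0/query/" \
  --data-urlencode "explain=SELECT Id FROM Account WHERE Region__c = 'West'" \
  -H "Authorization: Bearer $ACCESS_TOKEN"

In the response, a plan with a relative cost below 1 generally indicates that Salesforce considers the query selective enough to use an index; higher values suggest a candidate for indexing, query rewriting, or skinny tables.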

Salesforce’s multitenant platform means that the performance of your org is not entirely under your control. And their governor limits mean that you can hit hard limits if you have not designed your applications in an efficient way. But performance testing can give early insights into these issues and make the difference between going live with confidence and experiencing unanticipated failures.

If your baseline tests indicate that a particular process takes 5 seconds to complete, you'll also want to examine the variability in that response time. The testing tools can help you determine which stages in each transaction take the most time or are subject to the most variance or errors. Identifying and remedying a small number of performance hotspots can make a huge difference in the eventual performance.

Code Reviews

Having extensively discussed various forms of automated tests, we now look at code reviews, a manual form of nonfunctional testing. Code reviews are one of the most powerful methods of ensuring consistent high-quality code, providing training to developers, and ensuring that more than one person is familiar with every line of code in the system. Code reviews can be performed by one or more senior members of the development team, or they can be peer reviews done by other members of the development team. An Extreme Programming (XP) version of the code review is “pair programming” where developers always work in pairs, taking turns having “hands on the keyboard,” but applying both of their minds to the problem at hand.

Coding standards are sets of rules adopted by development teams based on collective experience and wisdom. These standards are typically adopted by an organization, but may vary slightly from project to project. These standards may also differ for each programming language (Apex, Visualforce, JavaScript, Python, etc.) but typically concern metadata organization, indentation, commenting, declarations, statements, white space, naming conventions, programming practices, and the like. The main advantage of defining and holding true to standards is that every piece of code looks and feels familiar. Consistent organization makes code more readable and helps programmers understand code written by others more quickly.

If coding standards are followed consistently throughout a project and across an organization, code can be more easily extended, refactored, and debugged.

Using coding standards in the development process is important to programmers for a number of reasons:
  • Software is almost never maintained for its whole life by its original author.

  • Enforcing collective standards reduces the time and cost required for software maintenance.

  • Code conventions improve the readability of software, allowing programmers to understand unfamiliar code more quickly.

One easy way to ensure that code adheres to coding standards is to include (and effectively use) a code review step in the development process. This ensures that you always have at least two sets of experienced eyes on all of the code on your project.
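
Automated linting complements human review by catching mechanical violations of the standards before a reviewer ever looks at the code, leaving reviewers free to focus on design and logic. As a rough sketch, PMD's built-in Apex rule categories can be run from the command line (assuming PMD 6.x; the launcher name and project paths vary by installation):

# Check Apex classes against PMD's code style and best practice rules
# before requesting a code review. The launcher may be run.sh or pmd.bat
# depending on platform; paths and rule categories are illustrative.
pmd -d force-app/main/default/classes \
    -R category/apex/codestyle.xml,category/apex/bestpractices.xml \
    -f text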

How to Perform Code Reviews

Often, code reviews are the responsibility of a project’s Tech Lead or Dev Lead, but that may vary by project. Through the code review process, reviewers are able to coach developers, provide feedback on code quality, and ensure the delivery of high-quality code. Everyone learns and improves along the way.

Consider the following suggestions to improve code quality:
  • Follow the programming language style guide for the language(s) being developed in.

  • Give descriptive names to methods and variables.

  • Do not overdesign.

  • Use efficient data structures and algorithms.

  • Create proper test classes and modularize code.

  • Document any complex manual steps, provide scripts to simplify them if possible, and try to make your code self-documenting as much as possible.

  • Keep all elements of your project in a version control system.

By sticking to these points and using the code quality analysis tools suggested earlier, project teams can create more readable, reliable, and manageable code. Improved code quality helps development teams work quickly and safely, which benefits them and the businesses they support.

When Are Code Reviews Performed

As mentioned, code reviews can be done at the same time as development using pair programming, informally after the fact through peer review, or as part of a formal code review process, possibly using a pull request.

One of the most important recommendations to come out of the State of DevOps Reports was based on analyzing the relationship between team performance and how they managed code reviews and change approvals. As summarized in the book Accelerate:
  [The State of DevOps Survey] asked about four possible scenarios:

  1. All production changes must be approved by an external body (such as a manager or change advisory board).

  2. Only high-risk changes, such as database changes, require approval.

  3. We rely on peer review to manage changes.

  4. We have no change approval process.

  The results were surprising. We found that approval only for high-risk changes was not correlated with software delivery performance. Teams that reported no approval process or used peer review achieved higher software delivery performance. Finally, teams that required approval by an external body achieved lower performance.

  We investigated further the case of approval by an external body to see if this practice correlated with stability. We found that external approvals were negatively correlated with lead time, deployment frequency, and restore time, and had no correlation with change fail rate.33

Based on that analysis, they recommend using a lightweight change approval process. Furthermore, while pull requests are an ideal approach for reviewing untrusted contributions to open source projects, using them for code reviews within a team implies the use of feature branches instead of trunk-based development and thus can interfere with a team’s velocity and ability to refactor.

Such recommendations contrast sharply with the practices and expectations of many teams, especially those subject to regulatory compliance. If you're in doubt, I strongly recommend you read the three-page discussion on this topic in the book Accelerate. In brief, external reviewers who lack the context or time to evaluate a change thoroughly add little or no benefit compared with the team that has spent days creating and testing the feature. By contrast, pair programming and intrateam reviews provide a far more informed review.

Peer review together with ensuring that all changes are tracked and deployed through the delivery pipeline will satisfy both the letter and the spirit of “segregation of duties.” Review processes cannot guarantee that a change won’t lead to a failure, but being able to quickly deploy small changes reduces the risk of such deployments. It also allows for easier debugging, faster rollback, and an efficient feedback loop to the developers who most need to learn from any failures.

Code Review Environments

Code reviews are based directly on the source code and so don’t require a test environment. They are most effective when done as peer reviews or pair programming, but can also be done by reviewing pull requests in a version control system like GitHub.

Data Needed for Code Reviews

Code reviews are based directly on the source code and so don’t require any test data.

Performing Code Reviews

Code reviews provide an opportunity to give feedback on coding style, the efficiency of the logic, naming conventions, and many other aspects of coding. Importantly, code reviews also transform development from an isolated activity into a social and collaborative activity.

Part of the mythos surrounding programming casts it as a solitary activity done by socially awkward individuals who are more comfortable interacting with machines than with humans. In the United States, that stereotype layers in images of young, white males binging on junk food and working mostly at night. Those stereotypes are so strong that they have affected university enrollment in computer science programs for decades, leading to gender and racial imbalances across the IT industry.

In reality, programming never happens in isolation from the business or social needs it’s serving, and there are many social networks (in the original, human sense) that support people in building code and making wise decisions.

Code reviews provide a way to transfer knowledge organically throughout a team and so avoid knowledge getting overconcentrated in one person. They lead to standardization and improved code quality, as well as deep camaraderie between those involved.

It was W. Edwards Deming, the father of the modern quality movement, who debunked the myth that domineering and judgmental managers help ensure a quality product. Deming's 14 points for management emphasize the importance of driving out fear, breaking down barriers, eliminating performance reviews, and focusing on pride in workmanship and achieving a common goal.

Although this book is full of exhortations to automate processes, it’s in the spirit of freeing teams to focus on productive, valuable work. Software is a codification of shared knowledge and so must necessarily be a shared activity. Code reviews provide a perfect opportunity to carry that out.

Manual QA and Acceptance Testing

Having discussed both functional and nonfunctional testing, we finally look at manual testing. Whereas code reviews are part of structural analysis, looking “under the hood” at how applications are built, manual testing involves manually checking whether the application functions as it should. This can be done by specialized members of the QA team, or developers can take turns performing QA for one another’s work.

Manual testing is mentioned last, only after extensive discussion of many automated test methods, not because it is not important, but because the time and skills of testers will be far more valuable when used to supplement and extend the kinds of automated testing described earlier. Where automated tests are available, they are cheaper, faster, and easier to run than manual tests. Wherever possible, QA resources should focus on exploratory testing and testing one-off scenarios that don’t justify automation.

Manual testing is a critical aspect of the development process. But one of the highest value activities that testers can do is discern when a test should be automated and help to implement effective automated tests. By automating critical aspects of the system, and aspects that are brittle or require repeated testing, the skill and energy of testers can remain focused on exploratory testing and high-value analysis that cannot be automated.

There are typically two phases of manual acceptance testing. The first stage is performed by members of the development or QA team themselves, prior to making functionality available to potential users. The second stage involves user acceptance testing (UAT), getting feedback from actual subject matter experts.

Prior to UAT, the development or QA team should perform a round of manual exploratory testing as a sanity check to confirm that functionality seems to be working as specified. Catching issues at this stage is far preferable to exposing UAT testers to obvious bugs. Not only does this allow bugs to be caught earlier, but UAT testers are often performing this testing part-time, alongside their regular responsibilities, and will quickly grow weary of sending obviously defective work back to development teams.

How to Do Manual QA and UAT

There is a distinct difference between the skills and attitude of a developer and the skills and attitude of a tester. Developers focus on building things and moving on as quickly as possible; testers focus on breaking things and not moving on until they are confident something won’t break.

With increased emphasis on automated testing, test-driven development, and other methods of building in quality, the role of manual QA sometimes comes into question. Teams are still experimenting with variations on the role of traditional testers, to see what works best. Understandably there are tradeoffs with all approaches.

The leanest approach is to simply rely on developers to test their own work. While developers should certainly test as much as possible, this doesn’t tend to work well. Development takes a massive amount of mental energy and is sometimes done under significant time pressure. Exhaustion and wishful thinking can combine to make developers overoptimistic that they can quickly move on to their next task. Excessive familiarity causes developers to make assumptions about how users will behave and makes them stay close to the “happy path” of using the application the way it was designed.

Salesforce’s own IT teams are among the groups who have experimented with a variation on this, which is to have developers alternate between doing their own work and testing the work of others. Like peer reviews, this is a fantastic way of knowledge sharing and encouraging dialog around solutions. But even when testing the work of others, developers still display a bias toward moving things along rather than trying hard to break them.

Good testing is indeed a specialized skill, and although the role of QA testers is evolving quickly, it’s not going away any time soon. Testers require patience, a tolerance for repetitive behavior, and an eye for how applications might break when used in unanticipated ways. QA testers hold institutional memory of the most common failures that occur and can remain watchful to ensure developers don’t introduce regressions.

QA testers engage with developers in a dialectical way, representing the realistic viewpoint that whatever can break will break. I've underrepresented the role of QA, since I come from the development side, but suffice it to say that realism lies somewhere between the optimism of developers and the pessimism of testers. The two should continue to work together to deliver the best results.

When to Do Manual QA and UAT

As mentioned, manual testing should be done on work that has passed all automated tests and can thus be considered a candidate for release. This allows testers to use their time more effectively to focus on exploratory testing.

User acceptance testing (UAT) is done once a team’s internal testers are satisfied that work meets specifications and may be ready for use. User acceptance testers are generally subject matter experts (SMEs) from the business team that has requested or will use the application. On large transformation projects, there will typically be a UAT phase of the project when SMEs work full or part-time to evaluate the systems that have been built to ensure that they behave correctly under realistic conditions and represent an improvement over what’s currently in use.

QA and UAT Environments

QA testing is a great candidate for shifting left and being done in a scratch org. This can allow for very fast feedback, since QA can provide feedback in the same scratch org that the developer is using or in a “Review App”—a scratch org spun up directly from the CI system. This requires that you have test data stored in your code repository that is sufficient to support the testing needs of the QA team. Although maintaining test data in a code repository may be a new process for QA teams who are accustomed to testing in a long-lived sandbox, it provides a powerful method to curate and improve a reasonable set of test data that is seen by both developers and testers and which is regularly reset.
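
As a sketch of what this can look like with Salesforce DX (the org alias, duration, and data plan path are illustrative, and assume a sample data plan is already committed to the repository):

# Spin up a disposable "Review App" scratch org for QA
sfdx force:org:create -f config/project-scratch-def.json -a qa-review -s -d 7

# Deploy the work in progress and load the curated test data from version control
sfdx force:source:push -u qa-review
sfdx force:data:tree:import -p data/sample-data-plan.json -u qa-review

# Hand the org to the tester
sfdx force:org:open -u qa-review

Because the org is created from source and seeded from the repository, QA sees exactly what the developer built, and the whole environment can be thrown away once feedback has been given.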

A key concept in Lean software development is to enable every worker to pull raw materials to do their job whenever they have capacity. For QA testers, their raw materials are features or fixes under development. In the absence of automated deployments, QA teams are left waiting for deployments to happen, which is a massive source of waste. Automating deployments reduces this waste, and allowing QA testers to create their own scratch orgs to evaluate work in progress is an excellent example of workers being able to pull raw materials in.

UAT should be done in a production-like environment, such as a partial copy or full sandbox. This allows UAT testers to experiment with familiar data, including the complex scenarios they handle during their daily work. It also ensures that they are testing in an environment that is fully integrated with external systems. If functionality is automatically deployed to this production-like environment, and behaves properly in the face of realistic data and live integrations, then the same results can be expected in production.

Data Needed for QA and UAT

QA teams typically spend significant time creating, cloning, and updating collections of manual testing data that they can use in their tests. Traditionally, teams use a single QA sandbox, since this allows them to establish testing data once and share it across the testing team. There's usually opportunity to make the process of creating test data more efficient.

Data management tools like OwnBackup and Odaseva allow you to anonymize and import collections of data from production that can be used by testers. Salesforce DX also includes mechanisms like data:tree:export to export collections of data into version control so that it can then be loaded into scratch orgs for testing.
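
For example, a small, related set of records can be pulled from a sandbox into the repository and later replayed into a scratch org (the query, org aliases, and generated plan file name are illustrative):

# Export Accounts and their Contacts from a sandbox into JSON files plus an import plan
sfdx force:data:tree:export \
  -q "SELECT Name, Industry, (SELECT LastName, Email FROM Contacts) FROM Account LIMIT 20" \
  --plan -d data -u MyDevSandbox

# Commit the generated files, then load them into a scratch org when needed
sfdx force:data:tree:import -p data/Account-Contact-plan.json -u qa-review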

Effective practices for managing test data are still evolving, but wherever possible it’s important to export test data in a form that can be reused, so that QA teams are not shackled to a single org that never gets refreshed for fear of losing manually curated testing data.

UAT data should match the data from the actual production org, which is why partial and full copy sandboxes are the right place to perform such testing. Data management tools give you the flexibility to selectively migrate data into developer sandboxes, but it’s almost certainly more efficient to just use a sandbox that includes a copy of production data.

The most important data for UAT is actually the configuration data that determines business logic and essential information such as Products and Pricebooks. After that, it's critical to create key Accounts, Opportunities, and so forth that match the production system. UAT testers are uniquely able to exercise the edge cases that are most likely to fail, but to do this, they need familiar data from production.

QA and UAT Test Cases

For formal testing, it’s common to create test cases, which are sequences of steps needed to perform certain transactions. This is particularly important for QA testers since they may be unfamiliar with business needs. But even UAT testers can benefit from having explicit test cases generated by a senior member of their team so they can ensure they are testing all the features that are under development.

QA and UAT Considerations

There’s much more that can be said about these areas, and there are people who devote their careers to managing teams of testers and facilitating user acceptance testing. The reason for initiating this discussion here is to show where it fits in an increasingly automated process of delivery and testing.

How to Act on QA and UAT Feedback

The final result of testing is either approval and release or sending the work back to developers. In either case, the more time elapses from when features are first developed, the less efficiently developers will be able to implement any feedback. Passing UAT does not mean that features will actually be accepted and bug-free in the hands of large groups of users. Users always have the last word in testing, and they too need feedback mechanisms to express approval or to log issues.

Developers genuinely want to build the right things, and to build things right. There’s an enormous amount of creativity and effort that goes into building things, and developers are generally excited to share the results with users or to improve applications based on their feedback. But just as when giving feedback to pets or children, the more time elapses the less effective that feedback becomes.

I confessed earlier that developers just want to get work out the door. And this entire book focuses on helping teams get their work out the door more quickly. But that doesn’t mean that anyone benefits from shipping unreliable, half-baked functionality to production. The point of automating delivery is to get features to QA, UAT, and end users with the highest quality in the shortest time. Feedback from testers is the critical end result of expediting delivery and is the most effective way to improve the product and developers’ understanding of the real needs.

In the words of Jez Humble, “A key goal of continuous delivery is to change the economics of the software delivery process to make it economically viable to work in small batches. … Working in small batches … reduces the time it takes to get feedback on our work, makes it easier to triage and remediate problems, [and] increases efficiency and motivation.”34

Summary

Quality can be a moving target, challenging to define, and impossible to perfect. But by considering these various aspects of quality—functional, structural, and process—teams are enabled to be more effective in achieving a design that will meet both present and future needs. By keeping a focus on quality and adopting a discipline of continuous improvement, the goal of long-term user satisfaction becomes far easier to achieve.

Table 8-1 summarizes the different types of test described in this chapter.
Table 8-1. A summary of different types of test and their characteristics

| Type of test | Automated | Environment | Speed | Purpose | Technology |
|---|---|---|---|---|---|
| Fast Tests for Developers | | | | | |
| Linting | Yes | IDE | Real time | Coding style, common faults | PMD, SonarLint, ESLint |
| Quality gates | Yes | Desktop, CI engine, web application | Fast | Code issue overview, duplicate detection, trends | PMD, SonarQube, Clayton, Copado |
| Unit tests | Yes | Scratch org, Dev sandbox, or locally | < 5 min total | Fast feedback for developers | Apex, Jest/Mocha |
| Comprehensive Tests | | | | | |
| Code-based acceptance tests | Yes | Scratch org, test sandbox, CI job | Minutes to hours (run in parallel) | Comprehensive regression tests | Apex, Jest/Mocha |
| UI tests | Yes | Scratch org, test sandbox | Minutes to hours (run in parallel) | Regression testing critical parts of complex applications | Selenium, Provar, Puppeteer, Tosca |
| Static analysis | Yes | CI job, static analysis tool | Fast | Tracking trends and identifying quality hotspots | SonarQube, Clayton, CodeScan, PMD, Codacy, CodeClimate |
| Security analysis | Yes | Security analysis tool | Minutes | Sophisticated identification of security flaws | CheckMarx, Fortify |
| Performance testing | Yes | Full or partial sandbox | Minutes to hours (run in parallel) | Occasional and targeted performance analysis | JMeter, LoadRunner, Loader.io |
| Code reviews | No | In-person or using pull requests | Real time or later | Code quality analysis, shared learning, collaboration | Fellow developers |
| Manual QA and acceptance tests | No | Testing sandbox | Indefinite | Exploratory testing and getting feedback from users | Mouse, keyboard, monitor, human |

The purpose of testing is to ensure quality, and the process of testing is facilitated by promoting features and fixes to progressively higher environments for progressively more sophisticated testing. This process of promoting features is called deployment and is the heart of continuous delivery. In the next chapter, we discuss mechanisms and techniques to make your deployments as fast and painless as possible, thus allowing your testing and release process to proceed as smoothly as possible.
