Chapter 13. Visual Regression Testing

Tell me if this is a familiar scene: you’ve been working on your website contact form for the last few weeks, trying to tweak and nudge the form fields until they look exactly like the Photoshop mockup. You’ve meticulously compared every margin, padding, border, and line height. Your lead generation tool is now “pixel perfect,” and the product owner agrees: “This is the contact form to end all contact forms.” With this code securely committed, you move on to your next task and stop having recurring nightmares about browser drop-down rendering discrepancies.

Weeks later, you are surprised to find a familiar sight in your ticket queue: the contact form. Some designer, business analyst, or quality assurance engineer took a ruler to your design and compared it to the latest Photoshop comp, and found a list of discrepancies.

But why?! How?! Did someone break your code? Did someone change the design? Tracking down the culprit is a luxury you do not have time for. So you sit down with a list of tweaks and get to work, hoping that this is the last time you’ll have to touch this contact form, but resigned to the fact that you’ll probably see it a few more times before the site launches.

The Usual Suspects

My favorite sound in the world is the cry of a decision maker as they scream that “feature X is totally broken!” Translated into developer terms, this usually means that a few of the font styles are incorrect, or that some vertical rhythm needs fixing. It doesn’t matter that the feature had been signed off and agreed upon; there is a difference between what is live and the specific version of the Photoshop file the decision maker has been poring over for the past week.

Having had this happen to me over and over again, let me explore a few of the common reasons that this merry-go-round has so much trouble stopping and letting you off.

Unknowing Developers

Any code that you can write without defect can be broken by just a few errant lines from another developer. Someone else, working on some other form component, didn’t realize that the classes they were styling were shared with your contact form. These changes could have happened in the weeks since your code was committed, or they could have been written at the exact same time as you were working.

Small cosmetic changes to unrelated pages are often overlooked by the QA team. With dozens or even hundreds of pages to test, there is no way that they would catch a two-pixel change to a label’s font size.

Inconsistent Designs

Allow me to let you in on a dirty little secret about Photoshop. When a designer changes the font size of a form label in one file, it doesn’t magically change in all of the designer’s PSD files. Sadly, there isn’t a single sheet prescribing all of the element styles in a cascading fashion. Even if all of the designers communicate this font size change, this doesn’t magically update every PSD trapped in an email thread, Basecamp conversation, or Dropbox folder.

Depending on which designer, business analyst, or QA engineer is reviewing the contact form, and which version of whatever PSD they happen to be looking at, there is a 9 in 10 chance on any given day that your form has a defect (and therefore is totally broken). A new story is created to address these defects, and you can only hope that the changes you are making aren’t going to make even more work for you the next time a designer takes a peek at the contact form.

Waffling Decision Makers

According to the law of infinite probability, given enough features, pored over by enough decision makers, there is a 100% chance that someone will find something that they want to change.

Change is inevitable, and given the proper development model, it is completely acceptable. But when change masquerades as defects (or a distinction is never made), developers end up spending a ton of time building features that are nothing more than prototypes.

There is nothing wrong with prototyping a feature before releasing it to the public—in fact, it’s generally a really good practice! But prototypes need to consist of quickly iterated designs ending in a final, agreed-upon product. Asking a developer to create a single prototype every sprint cycle, and then revising it every other sprint, is not only a great way to hobble a developer’s productivity, but is a horribly inefficient way to prototype.

A Tested Solution

While each of these scenarios highlights some deeper, organizational issues, they can all be mitigated by a single thing: proper test coverage. For this type of coverage, we aren’t testing the valid response of a JavaScript function, but rather we’re capturing the visual appearance of an approved design system and validating that we have not deviated from that system. Capturing these visual regressions before they are committed is the key to maintaining a sustainable design system.

Visual regression testing allows us to make visual comparisons between the correct (baseline) versions of our site and versions in development or just about to be deployed (new). The process is nothing more than taking a picture of the baseline and comparing it to the new, looking for differences in the pixels.
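Stripped of screenshotting and reporting, that comparison step is simple enough to sketch in a few lines. The following is a hypothetical, minimal illustration of the core idea, comparing two same-sized RGBA pixel buffers and counting how many pixels differ; real tools add anti-aliasing tolerance, highlighted diff images, and screenshot capture on top of this:

```javascript
// Minimal sketch of the heart of a visual diff: walk two RGBA pixel
// buffers of equal size and count pixels whose channels differ by more
// than a tolerance. (Hypothetical helper, not taken from any one tool.)
function diffPixels(baseline, candidate, tolerance) {
  if (baseline.length !== candidate.length) {
    throw new Error('Images must have the same dimensions');
  }
  var changed = 0;
  for (var i = 0; i < baseline.length; i += 4) { // one RGBA pixel per step
    var delta =
      Math.abs(baseline[i]     - candidate[i])     + // red
      Math.abs(baseline[i + 1] - candidate[i + 1]) + // green
      Math.abs(baseline[i + 2] - candidate[i + 2]) + // blue
      Math.abs(baseline[i + 3] - candidate[i + 3]);  // alpha
    if (delta > tolerance) {
      changed++;
    }
  }
  return { changed: changed, total: baseline.length / 4 };
}

// Two tiny 2-pixel "images"; the second pixel's red channel differs.
var baseline  = [255, 0, 0, 255,  0, 0, 0, 255];
var candidate = [255, 0, 0, 255,  9, 0, 0, 255];

console.log(diffPixels(baseline, candidate, 0)); // { changed: 1, total: 2 }
```

A test fails when `changed` (or the percentage `changed / total`) exceeds some threshold, which is why most tools expose a tolerance or “mismatch” setting rather than demanding a zero-pixel difference.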

With these baseline images either committed to the repo, or marked approved in some testing database, we now have a record of the exact signed-off, agreed-upon pixels that make up any particular feature (in our case, a contact form). Before any line of code is committed back to the master branch, visual regression testing gives us a method to test every feature in our site, and make sure that nothing unexpected has visibly changed.

We will also be guarded against bug reports that are nothing more than inconsistencies from one PSD to another. With a signed-off baseline committed to our codebase, we can run our tests and confidently reply that our code is correct, and that the error must be in the Photoshop file. In the same way, we will be able to distinguish between actual bugs and YACR (yet another change request).

The Many Faces of Visual Regression Testing

Visual regression testing comes in many different flavors, using a variety of technologies and workflows. While new tools are being released into the open source community all the time, most of them combine features from a fairly small common set. Here are a few of the categories that most tools fall into:

Page-based diffing
Wraith is a good example of page-based diffing. It has a simple YAML setup file that makes it very easy to compare a large list of pages between two different sources. This approach is best used when you aren’t expecting any differences between the two sources, such as when you are comparing pages from your live site with the same pages in staging, just about to be deployed.
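To give a sense of how little setup this takes, here is a sketch of a Wraith capture configuration; the domain names and paths are hypothetical, and your Wraith version’s exact keys may vary:

```yaml
# capture.yaml — compare production against staging (hypothetical URLs)
browser: "phantomjs"
domains:
  production: "http://www.example.com"
  staging: "http://staging.example.com"
paths:
  home: /
  contact: /contact
screen_widths:
  - 320
  - 768
  - 1280
directory: "shots"
threshold: 5   # fail if more than 5% of pixels differ
```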
Component-based diffing
BackstopJS is a great tool for doing component- or selector-based diffing. Instead of comparing images of the entire page, a component-based tool allows you to capture individual sections of a web page and compare them. This creates more focused tests and removes the false positives that occur when something at the top of the page pushes everything else down and every test after it comes back as changed.
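A BackstopJS project is driven by a JSON file of scenarios; this sketch (the URL and selector are hypothetical, and available options vary by version) captures just the contact form at two viewport sizes:

```json
{
  "id": "contact_form",
  "viewports": [
    { "label": "phone",   "width": 320,  "height": 480 },
    { "label": "desktop", "width": 1280, "height": 800 }
  ],
  "scenarios": [
    {
      "label": "Contact form",
      "url": "http://localhost:3000/contact",
      "selectors": [".contact-form"],
      "misMatchThreshold": 0.1
    }
  ]
}
```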
CSS unit testing
Quixote is an example of a unique class of diffing tools that look for unit differences instead of visual differences. Quixote can be used to write TDD-style tests where the test describes the values that are expected (title font-size is 1em, sidebar margin is 2.5%) and checks the web page to see if those assertions are in fact true. This is a great approach for testing trouble areas such as the width of columns in a layout that keeps breaking. Or it can be used to assert that branding protocol has been followed and the logo is the correct size and distance away from other content.
Headless browser driven
Gemini is a comparison tool that can use PhantomJS, a headless browser, to load web pages before taking screenshots. PhantomJS is a headless WebKit browser that is scripted with JavaScript, which makes it incredibly fast and consistent across platforms.
Desktop browser driven
Gemini is unique in that it also supports running tests using traditional desktop browsers. To do so, Gemini uses a Selenium server to open and manipulate the OS’s installed browsers. This isn’t as fast as a headless browser, and is dependent on the version of the browser that happens to be installed, but it is closer to real-world results and can catch bugs that might have been introduced in just a single browser.
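Whichever browser runs the test, a Gemini suite itself is a small JavaScript file executed by the `gemini` command-line tool. A sketch along these lines (the path and selector are hypothetical) captures only the contact form component:

```javascript
// Run by the `gemini` CLI against a configured root URL;
// the path and selector below are hypothetical.
gemini.suite('contact-form', function (suite) {
  suite
    .setUrl('/contact')                  // page to open
    .setCaptureElements('.contact-form') // capture only this component
    .capture('plain');                   // baseline state, no interaction
});
```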
Includes scripting libraries
CasperJS is a navigation and scripting library that works with headless browsers like PhantomJS. It allows tools to interact with the pages opened in the browser. With CasperJS, you can move the mouse over a button, click on the button, wait for a modal dialog, fill out and submit a form, and finally, take a screenshot of the result. CasperJS even lets you execute JavaScript on the pages within PhantomJS. You can hide elements, turn off animation, or even replace always-changing content with consistent, mock content to avoid failures when the “newest blog post” gets updated.
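That interaction sequence looks something like the following sketch, run with the `casperjs` executable rather than Node.js (the URL, selectors, and field names are hypothetical):

```javascript
// Hypothetical CasperJS script: submit the contact form, wait for the
// confirmation modal, then screenshot the result. Run with: casperjs script.js
var casper = require('casper').create();

casper.start('http://localhost:3000/contact');

casper.then(function () {
  // Fill the form fields, but don't auto-submit (third argument is false)
  this.fill('form.contact', { email: 'test@example.com' }, false);
  this.click('button[type=submit]');
});

// Block until the modal appears in the DOM
casper.waitForSelector('.confirmation-modal');

casper.then(function () {
  this.capture('contact-confirmation.png');
});

casper.run();
```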
GUI-based comparison and change approval
Projects like Diffux store test history, and provide test feedback inside of a web-based graphical user interface. Baseline images are stored in a database, and any changes to those baselines must be approved or rejected inside of the app. These types of tools are great when you have nontechnical stakeholders needing to make the final decision on whether the changes are correct or not.
Command-line comparison and change approval
PhantomCSS is a component-based diffing tool, using PhantomJS and CasperJS, that runs solely in the command line. Test runs are initiated via a terminal command, and the results, passing or failing, are also reported in the terminal. These types of tools work especially well with task runners like Grunt or Gulp, and their output is well suited for automation environments like Jenkins or Travis CI.
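As a small preview of what the next chapter covers in depth, a PhantomCSS test is just a CasperJS script that calls `phantomcss.screenshot()` for each component and then compares the results against the committed baselines. This sketch assumes a hypothetical local URL and selector, and an npm-style install path:

```javascript
// Hypothetical PhantomCSS test; run via the casperjs test runner.
var phantomcss = require('./node_modules/phantomcss/phantomcss.js');

phantomcss.init({ libraryRoot: './node_modules/phantomcss' });

casper.start('http://localhost:3000/contact');

casper.then(function () {
  // Shoot (or compare against the baseline of) just the contact form
  phantomcss.screenshot('.contact-form', 'contact form');
});

casper.then(function () {
  phantomcss.compareAll(); // pass/fail is reported in the terminal
});

casper.run(function () {
  phantom.exit(phantomcss.getExitStatus());
});
```

Because everything happens in the terminal, the same command slots directly into a Grunt or Gulp task or a CI job.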

We’ll cover PhantomCSS and how to integrate it into your project in the next chapter, which takes a look at Red Hat’s approach to testing.
