Chapter 4. The Test Engineering Manager

As test engineers (TEs) and software engineers in test (SETs) labor to support the user and developer, respectively, there is one role that ties them together: the test engineering manager (TEM). The TEM is an engineering colleague taking on important work as an individual contributor and as a single point of contact through which all the support teams (development, product management, release engineers, document writers, and so on) liaise. The TEM is probably the most challenging position in all of Google, requiring the skills of both the TE and SET. He also needs management skills to support direct reports in career development.

The Life of a TEM

The cadence of a Google testing project doesn’t just happen with every TE and SET doing his job as we’ve described in this book. The TEM has a leadership and coordination role over the TEs and SETs who report to him. In general, TEMs report to the director of test, who may have many such reports.1 Test directors all report to Patrick Copeland.

The technical, leadership, and coordination roles of the TEM are generally something that Googlers grow into, and most of the TEMs at Google come from its own ranks as opposed to outside hires. External people are generally (but not always) hired into individual contributor roles. Even James Whittaker, who was hired as a director, didn’t get any direct reports for almost three months.

Among current TEMs, well over half came from the TE ranks, which isn’t surprising given the broad focus of that role. TEs manage projects and keep a projectwide focus, so it is a smaller step into managing the people who work on a project. TEs understand a broad swath of the functional space of their application and come into contact with far more of the engineers working on it than the average SET. However, there is no correlation between success as a TEM and success in either the TE or SET role. At Google, success is a collective matter; we work hard to choose the right managers and to ensure they are successful.

The first piece of advice toward this end is to know your product. Given any question about how to use the product, the TEM should be an expert. If you are the TEM for Chrome, I should be able to ask you how to install an extension, change the browser’s skin, set up a sync relationship, modify proxy settings, view the DOM, find where cookies are stored, find out how and when it gets updated to a new version, and so on. These answers should come so quickly to a TEM’s lips that whatever thought it takes to retrieve them from memory appears instantaneous. From the UI to the guts of the backend data center implementation, a TEM should know his product cold.

I recall once asking the Gmail TEM why my mail was loading slowly and getting an explanation of how the Gmail server components work and of a problem that had occurred in a remote data center that weekend. It was way more detailed than I wanted. But it was clear the guy knew how Gmail worked and he was up to date with the latest field information that affected its performance. This is what we expect from all our TEMs at Google: product expertise beyond everyone else associated with the project.

A related second piece of advice is to know your people. As a manager, a Google TEM is a product expert and understands the work that needs to get done but plays only a small role in actually performing that work. It’s the SETs and TEs who make up the TEM’s report structure who get the work done. Knowing these people and their skills as individuals is crucial in making sure this work gets done quickly and efficiently.

Googlers might be smart, but they are not plentiful. Every TEM we’ve ever hired from outside of Google makes the comment that their project is understaffed. Our response is a collective smile. We know and we’re not going to fix it. It’s through knowing your people and their skills well that a TEM can take a small team and make them perform like a larger team.

Scarcity of resources brings clarity to execution and it creates a strong sense of ownership by those on a project. Imagine raising a child with a large staff of help: one person for feeding, one for diapering, one for entertainment, and so on. None of these people is as vested in the child’s life as a single, overworked parent. It is the scarcity of the parenting resource that brings clarity and efficiency to the process of raising children. When resources are scarce, you are forced to optimize. You are quick to see process inefficiencies and not repeat them. You create a feeding schedule and stick to it. You place the various diapering implements in close proximity to streamline steps in the process.

It’s the same concept for software-testing projects at Google. Because you can’t simply throw people at a problem, the tool chain gets streamlined. Automation that serves no real purpose gets deprecated. Tests that find no regressions aren’t written. Developers who demand certain types of activity from testers have to participate in it. There are no make-work tasks. There is no busy work in an attempt to add value where you are not needed.

It is the job of the TEM to optimize this equation. If a TEM knows his product well enough, then he can identify the highest-priority work and make sure those parts get appropriate coverage. If a TEM knows her people well enough, then she is able to apply the right skill sets to the parts of the testing problem that need it the most. Obviously, there will be parts that don’t get done. If a TEM is doing her job right, those will be the parts that are the lowest priority or are straightforward enough that they can be contracted out or left to crowd-sourced and dogfood testers.

Obviously, a TEM can be wrong about any of these things and because the role is so crucial, such mistakes can be costly. Fortunately, the community of TEMs is a fairly close-knit set of people who know each other well (another benefit of scarcity is small enough numbers of people so that knowing each other on a first name basis and meeting regularly are actually possible) and share experiences that improve the collective.

Getting Projects and People

Project mobility is a hallmark of being a Google engineer. A general rule of thumb is that a Googler is free to change projects every 18 months, give or take a quarter. Of course, there is no mandate to do so. If an engineer loves mobile operating systems, putting her on YouTube is unwise. Mobility enables engineers to sample various projects. Many choose to stay on a project for years or for their entire careers, whereas others revel in getting to know a little about everything Google does.

There are a number of benefits in this culture that a TEM can take advantage of, namely a good supply of broadly experienced Googlers available for recruiting at any given time. Imagine being a TEM for Google Maps and being able to draw from the Chrome and Google Docs talent pool! There is a wealth of engineers with relevant experience and a fresh perspective to add to your team at any given time.

Of course, the downside of this is losing an experienced Googler to another team. It puts the requirement on the TEM to avoid creating dependencies on people. A TEM cannot afford to simply take advantage of a rock star tester. Whatever it is that makes that tester a rock star needs to be embodied in a tool or packaged in a way that can set other testers on a path to similar superstardom.

TEMs manage a process at Google that we call allocation. Allocation is supported by a web app to which any TEM can post job openings and in which any TE or SET can browse for new opportunities. No formal permission from either a current or future manager must be obtained before doing so, and any engineer with an 18-month longevity on a project is free to leave. Obviously, there are agreements about how fast such a transition occurs so as to not adversely affect a ship date or crucial project milestone, but we’ve never seen any controversy associated with a re-allocation in our experience.2

Nooglers are also allocated with the same web app and process. TEMs can view the resume and interview scores of a Noogler and place a “bid” on the allocation. In heavy hiring seasons, there are generally multiple candidates to be allocated to multiple projects. Competing bids are the norm, and TEMs get to argue their case during an allocation meeting attended by a quorum of test directors who vote, with Patrick Copeland or someone he designates breaking a tie. The priorities for allocation are, in general:

• A good skill set match between the Noogler and the project. We want to set the employee up for success.

• The Noogler’s wishes. If we can give someone her dream job, she is liable to be a far happier and more productive engineer.

• Project needs. Projects of strategic or economic importance are sometimes given priority.

• Past allocations. If a project hasn’t had any new allocations recently, then it might be due.

Allocation is not something to fret over too much. If a TEM doesn’t get a Noogler one week, he’s likely to get one the next. On the other hand, there isn’t a lot of downside for the Noogler if he is misallocated because transfers are easy.

Getting new projects is also something a TEM must consider. As a TEM’s experience and reputation grow, that person can be groomed for a directorship by being given multiple large efforts to manage, with not only individual engineers as reports but also more junior TEMs.

Such recruitment is often done by a development organization, which schedules a meeting with a reputable TEM and pitches its project in the hope that the TEM will sign on to build up a test team for it. Of course, as the uber Director, Patrick Copeland has the charter to simply assign such work on projects that executives deem strategically important.

The general rule when a TEM can choose whether to take a project is simple: avoid toxic projects. Teams that are unwilling to be equal partners in quality should be left to do their own testing. Teams unwilling to commit to writing small tests and getting good unit-level coverage should be left to dig their graves in peace.

Projects that seem like pet projects—that is, those that have a poor chance of success or are simple enough that the developers can act as their own testers—must be left alone. There is no good reason from Google’s perspective, our users’ perspectives, or the career of the TEM in question to staff such projects.

Impact

Perhaps the one thing that sets Google apart from other software development companies is the insistence on impact. At Google, you hear the word impact all the time. An engineer is expected to make an impact on the team and his work should be impactful for the product. A test team as a whole is expected to be impactful. The collective work of the team should stand out and the team and product should be better off for it.

The goal for any individual engineer should be to create impact. The goal of a test team should be to create impact. The person responsible for ensuring that a test team is impactful is the TEM.

Promotion decisions revolve around the impact an employee has had on a project. During annual reviews, managers are encouraged to describe their direct reports’ contribution in terms of their overall impact. As engineers progress through the higher levels of their careers, the expectation is that their impact will grow. Because TEMs manage these engineers, they are responsible for their growth and that means ensuring that they can measure their impact.

Managing impact for a team of TEs and SETs is the job of the TEM.

It’s important to note that we are tasking a test team only with impact. We are specifically not asking the TEM and her team to ensure that the product is high quality. We are not asking the TEM and her team to make the product ship on time. We are not going to blame testing if the product is unsuccessful or unloved by users. At Google, no one team is responsible for these things. However, each team is responsible for understanding the project’s goals and schedule and ensuring members do their part to impact all of these things positively. There is no greater compliment at Google than to be called an impactful engineer or, in the case of a TEM, to be known as someone who builds impactful teams.

Impact is a major topic during annual reviews and in promotion decisions. Junior engineers are expected to show impact on their part of the product, more senior engineers are expected to show impact team-wide and product-wide, and to reach the higher levels of the job ladder, it is important to have impact across Google (more about that later).

It is the job of the TEM to build a team capable of such impact and to ensure that every engineer is creating impact appropriate to his or her job level and skill set. Google managers are not meant to micromanage every aspect of the testing process. They are not present during every moment as an ACC is built. They are not going to review every line of code in the test infrastructure. They are going to ensure that each of these artifacts is in the hands of capable engineers who understand how to build them, that they are being used correctly, and that the development team understands their purpose and takes their results seriously. A TEM builds his team and assigns the work as he sees fit, then steps back and lets those engineers do their jobs. When the work starts, it’s the job of the TEM to ensure that it is impactful.

Imagine a test team where every engineer is capable of doing impactful work. Imagine a testing process that has been streamlined to the point where every unit of work performed is purposeful and impactful. Imagine a development team that understands this body of testing work and participates in seeing it through. The job of the TEM is simply to take this imaginary world and make it real.

There is one final job of a TEM that deals with cross-team collaboration. A good TEM, especially an experienced one, is not so insular that he cannot see beyond the boundaries of his own product. Google is a company of some dozens of products all undergoing simultaneous development, testing, and use. Each of these products has one or more TEMs (depending on their size and complexity) and each tries hard to ensure its teams are impactful. As a collective, TEMs must be diligent about identifying the best practices happening in their organizations and be aggressive about communicating those practices to their peers. There is no truer way to show that a practice or tool is impactful than to see it applied successfully to multiple products.

Google is known as an innovative company and test teams at Google certainly bear that out. The number of test practices and tools that have been created and used inside Google (and many of them shipped to the outside world) shows that it is this innovative spirit that helps tie our testing teams together. TEMs don’t collaborate because of some corporate mandate. They don’t collaborate simply because there is a monthly meeting on their calendar. They collaborate because they don’t want to miss out on the chance to adopt some awesome innovation that has emerged from another team. Who wants to be the only tester not using some impactful new tool? Who wouldn’t want to work smarter?

Of course, the same can be said of the exporter of innovation. As good as it feels to have some innovation on your team work for your product, it feels even better to see another team adopt it and then another until eventually the innovation becomes part of the testing fabric of the entire company. Cross-team collaboration must be built on innovation or it will not stand the test of time.

An Interview with Gmail TEM Ankit Mehta

Ankit Mehta is a TEM who rose through the ranks as a TE doing mostly SET-oriented work. He spent his early years knee deep in code and test automation and his first big assignment as a manager was none other than Gmail.

Gmail is not an assignment for the weak at heart. It is a big product with lots of moving parts. It has integration with other Google properties including Buzz, Docs, Calendar, and so on. It also has to interoperate with mail formats from any number of well-established competitors. There is a large backend data center component; remember that Gmail lives completely in the cloud, with a UI served through any of the major web browsers. As if any additional complexity were necessary, Gmail has an installed base of many hundreds of millions of users who expect it to work when they fire up their browser. They expect fast, reliable, and secure service without a great deal of spam to deal with. Adding new features requires respecting the legacy, and this complicates testing a great deal. When Gmail fails, the world learns of it quickly. That is a lot of egg to go on a lot of Googlers’ faces and none are quite so culpable as the primary, in-the-trenches test manager.

The authors sat down with Ankit to get the scoop on how Gmail is tested.

HGTS: Tell us how you approach a new testing project. What are the first tasks you perform and the first questions you ask?

Ankit: When I join a project, the first few weeks are spent listening and not talking. It’s crucial to get an understanding of what’s going on and to learn the architecture of the product and the dynamics of the team. I won’t work with a doctor who simply prescribes me antibiotics in the first five minutes of an office visit and I don’t expect a test team to work with me if I just go in with solutions straightaway. You really have to learn before you are in a position to prescribe.

HGTS: We’ve worked together before and don’t take you for the quiet type so I imagine that once you start talking you have a lot to say!

Ankit: Oh yeah! But there is a pattern to it. Over the years, I have learned that the most powerful question you can ask is, “Why?” and that’s pretty much where I go from the listening phase. Why are you running these tests? Why did you write that specific test? Why did you choose to automate this task over this other one? Why are we investing in this tool?

I think people get set in their ways and do things just because they saw someone else doing them or they test a particular feature for no other reason than that they know how. If you don’t ask them why, they won’t bother questioning themselves because what they are doing has simply become a habit.

HGTS: So what answers to why do you accept as good ones?

Ankit: Two answers: 1) that it helps improve the quality of the product or 2) that it helps improve the productivity of the engineers building the product. Anything else just isn’t as high a priority.

HGTS: The Gmail team is known for its focus and productivity so I guess that’s where it comes from. But beyond limiting their focus to quality and productivity, what other advice do you have for test managers to build a robust work culture?

Ankit: Well, team dynamics are crucial. I sincerely believe the quality of a product is tied to the quality of the testing team. You have to have people with the right skills and the right attitude and you have to get these people doing the right things. This is especially critical for the senior people on the team as so much of the culture and momentum comes from these people. For Gmail, it took me three to six months to build the team I actually wanted, a team that was cohesive and understood each other’s role. Once you get a good team, it’s resistant to one or two personalities who don’t quite fit.

An important part of this team dynamic is the relationship between the dev and test teams. This was not so good when I joined; the teams operated completely separately and the value proposition of the test team was pretty much lost on the dev team. It wasn’t healthy.

HGTS: Let’s talk about this because it’s clearly something you solved. Can you walk us through what you did to fix this culture issue?

Ankit: When I first joined Gmail, the test team was fixated on a bunch of WebDriver tests that ran with each build. They’d watch the tests run and change from green (passed) to red (failed) and made this herculean effort to fix the tests so they all ran green. The development team really didn’t question this behavior too much because the tests usually found a few important problems that justified their existence. But there were weeks when a lot of code changed and the tests couldn’t be fixed fast enough. The whole process was flaky and not resilient to changes to Gmail. It was an overinvestment in something that was too much work for its ultimate impact.

Because I was new to the project, I could see things that perhaps the others couldn’t and it seemed to me that latency was the biggest issue for Gmail. Seriously, the primary attribute of Gmail from a customer’s point of view is its speed. I figured that if we could solve this problem for the SWE team, it would earn us enough respect that we could begin to establish ourselves as equals.

It was a hard problem. We had to measure Gmail’s speed across old and new versions and notice when a new build got slower. Then we’d have to sift through all the code changes in the new version to figure out which one caused the latency and then drive a fix. It was painful and time-consuming and lots of trial and error.

I worked with one of the SETs on the team to figure out a way to slow down Gmail so we could better observe and control the communication between the frontend and the data center and find regressions that affected performance. We ended up collecting a bunch of old machines from anywhere we could get them and stripping out any high-end components until we were left with a roomful of boxes with 512 MB of RAM, 40 GB disks, and low-speed CPUs. We slowed Gmail down enough that we could distinguish the signal from the noise and then started long-running tests that would further stress the system. The first couple of months were tough. We had few results and some false positives. We were busy setting up the infrastructure and not producing any impact. But then the regressions started rolling in. We were able to measure regressions on the order of milliseconds and back them up with data. SWEs were able to find latency regressions in hours instead of weeks and debug them while they were still fresh as opposed to weeks later. This earned immense respect for the test team and when we embarked on the next couple of high-priority tasks (fixing end-to-end tests and getting an effective load test infrastructure running), the SWEs actually volunteered to help. The whole team saw the value of effective tests, and Gmail releases went from quarterly to weekly and then to daily releases for a subset of our production customers.
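
The latency lab Ankit describes was internal tooling we cannot reproduce here, but the core idea is simple enough to sketch. The following Python fragment, entirely ours and purely illustrative, shows the shape of such a check: gather repeated latency samples from a baseline build and a candidate build under identical conditions, and flag a regression only when the slowdown clears both a chosen threshold and the baseline’s own run-to-run noise. The sample values and the 5-millisecond threshold are invented for the example.

import statistics

def is_latency_regression(baseline_ms, candidate_ms, threshold_ms=5.0):
    """Flag the candidate build as a regression if it is meaningfully slower.

    baseline_ms, candidate_ms: repeated latency samples (in milliseconds)
    taken from the two builds under identical conditions.
    threshold_ms: the smallest median slowdown we are willing to call real.
    """
    slowdown = statistics.median(candidate_ms) - statistics.median(baseline_ms)
    # Require the slowdown to clear the baseline's own run-to-run spread so
    # that ordinary measurement noise does not raise false alarms.
    noise_floor = statistics.pstdev(baseline_ms)
    return slowdown > max(threshold_ms, noise_floor)

if __name__ == "__main__":
    baseline = [112, 115, 110, 113, 111, 114]   # yesterday's build
    candidate = [124, 127, 122, 125, 126, 123]  # today's build
    print(is_latency_regression(baseline, candidate))  # True: roughly 12 ms slower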

HGTS: So that’s the lesson: Pick something hard and solve it to get respect. I like it. But then how do you follow up after that?

Ankit: Well, there are always hard problems to solve! But yeah, the general idea is to keep the focus on what is important. We identified the top Gmail problem and we came together and solved it. Then we went down the list and once we came together as a team, nothing was too hard. But I am a big believer in keeping things focused. Whenever I see a team trying to do too much, say working on five things and doing only 80 percent of all five, I have them step back and prioritize. Drop the number of things you are doing to two or three and nail those 100 percent. This gives the team a sense of accomplishment without unfinished work hanging over their heads and if all this work ends up impacting the quality of the product, then I can feel pretty good about it, too.

HGTS: Google managers have a notoriously high number of direct reports and are still expected to contribute technically. How do you balance this? Can you tell us about your own engineer work?

Ankit: All those directs and all that time spent coordinating others’ work makes for a lot of distractions. There are really two things I have done to stay technical and contribute at the level of an engineer.

First, as I brought the SWE and SET teams together, there was always enough work to go around that I could save a chunk for myself. I was actively involved in the design phase of my vision and stayed with the project long enough to write tests myself.

Second, and really this is the crucial part, if you are going to do technical work, you have to get away from the management distractions. At first, I took a couple of days a week to work on things myself. One of my projects was to integrate Google Feedback into Gmail and this gave me a great SWE perspective on testing. Whenever I ran into a flaky test or some part of the test infrastructure that slowed me down, I understood how the full-time SWEs viewed our work. But still, whenever I was in Mountain View, people would manage to find me so I took to visiting the Gmail team in Zurich. There is a lot of peace and quiet to be found nine time zones away and as I was no one’s manager there, I could blend into the engineering team without a lot of notice. I got a lot of work done in Zurich!

HGTS: Do you have any advice on staffing a testing project? What’s a good developer to test ratio? What about SET versus TE?

Ankit: Well, on hiring, it’s easy. Never compromise. Hiring the wrong person just to get a headcount is always worse than waiting for a better fit. Hire only the best. Period. Google doesn’t release ratio numbers, but early on we were more densely populated with testers and had more than the norm. But after we solved a lot of the initial problems and got the SWE participation up, we fell back to numbers around the Google standard. In terms of the mix of skills, what worked for Gmail was about 20 percent of the testers doing exploratory testing. Any product where user experience is important must undergo exploratory testing. Another 30 percent were TEs who focused more holistically on the product and worked with the SETs to ensure maximum impact of their work, and 50 percent were SETs working on adding relevant automation and tools that would keep the codebase healthy and improve developer productivity. I can’t say that will be the formula I’ll use on my next Google assignment, but it worked in Gmail.

HGTS: We understand you’ve taken the reins for testing Google+. What are the lessons from Gmail that you’re finding are the most valuable for your new product?

Ankit: Well first, don’t put all your energy into the frontend. Gmail has what might be the largest distributed backend out there and there are some juicy testing problems I didn’t get to. Other than that, there are a lot of lessons:

• Write tests in the same language the application is written in.

• Make sure the person writing the feature is the person responsible for ensuring the tests for that feature get executed. Tests that get ignored become a liability.

• Focus on infrastructure that makes writing tests seamless. Tests need to be easier to write than they are to skip.

• 20 percent of the use cases account for 80 percent of the usage (give or take!). Automate the 20 percent and don’t bother with the rest. Leave them for the manual test passes.

• This is Google; speed matters. If users expect one thing from us, it’s speed. Make sure the product is fast. Profile it so you can prove it to everyone else.

• Collaboration with development is crucial. If you don’t have it, you are nothing more than damage control and that is not a good place to be.

• Innovation is part of Google’s DNA. The test team has to be seen as innovators, too, seeing important problems and devising innovative solutions.

HGTS: Have you noticed any traps that engineering teams fall into?

Ankit: Yes, assuming we know what users want and rolling out massive changes or new features without pushing them as small experiments first. What good is a bunch of great test infrastructure for a feature that is withdrawn from lack of user interest? Put out a version to a few users and get some feedback before investing in a lot of test automation.

Also, trying to build the perfect solution so that you take too long and the market moves past you. Iterate quickly and show incremental progress.

Finally, you have to find the sweet spot for writing tests. Write them too early and the architecture changes under you, negating all your work. Wait too long and you might delay a launch for lack of proper testing. TDD is the way to go.

HGTS: And what about individuals? Have you noticed any traps that young TEs or SETs fall into on new projects?

Ankit: Yes, they get in over their heads. They write lots of tests without really thinking about their full purpose or how they fit into the overall testing process. They often don’t realize that by writing those tests, they have signed up to maintain them. SETs need to always remember that testing is the job of the developer and they should concentrate on getting testing into the workflow of the developer. We write tools to support this so that it is the developers who maintain the tests as they maintain the code. SETs can then focus on making the tests run faster and produce better diagnostics.

TEs sometimes fall into the same trap and act like SETs. We want TEs to take a more holistic approach. Get the entire product under control. Focus on tests from a user’s perspective and help SETs and SWEs ensure that all tests and test infrastructure are actively and effectively used. Tooling and diagnostics written by TEs should impact the entire product.

HGTS: Besides the latency automation you spoke about earlier, what are some other general testing achievements that paid huge dividends for Gmail?

Ankit: JavaScript automation. We baked an automation servlet into Gmail itself, which enabled developers to write end-to-end tests in the same language the frontend was written in. Because it used a lot of the same methods and libraries as the product code, developers were familiar with how to write tests. No learning curve. They could easily write tests to see whether their new feature broke Gmail functionality and could also better protect their own features from being broken by any other developer. Every feature of Gmail now comes with at least one test written for this servlet. The nice part of this is that I am using the same approach in my new job in Social. We’ve already seen some 20k automated tests!

Another is load testing. You simply can’t get by without it at Google as all our applications are heavily used and our data centers can be busy places. We basically had to mirror a typical production setup complete with typical user traffic loads. We spent months analyzing production usage to build a representative user model. Next, we reduced variance in the data by running the load test on exactly the kind of data center machines Gmail runs on. We then launched two parallel test runs: test and control. We monitored them to watch for differences. We detected a lot of regressions and narrowed down the issues for developers.
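
Again, Gmail’s production load harness is not something we can print, but a toy version conveys the approach: derive a weighted user model from production logs, replay it against a test deployment and a control deployment, and compare the aggregate metrics. In this sketch the action names and weights are hypothetical, and the two runs happen sequentially rather than truly in parallel, which is a simplification of ours.

import random
import time

# Hypothetical user model: (action, relative weight), as might be derived
# from analyzing production logs. The action names are illustrative only.
USER_MODEL = [
    ("open_inbox", 50),
    ("read_message", 30),
    ("send_message", 15),
    ("search", 5),
]

def replay_user_model(send_request, duration_s):
    """Drive one deployment with weighted random actions for duration_s seconds.

    send_request(action) performs the request and returns its latency in ms.
    Returns the mean latency observed over the run.
    """
    actions, weights = zip(*USER_MODEL)
    latencies = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        action = random.choices(actions, weights=weights, k=1)[0]
        latencies.append(send_request(action))
    return sum(latencies) / len(latencies)

def test_minus_control(send_to_test, send_to_control, duration_s=60):
    """Difference in mean latency between the test and control deployments."""
    return (replay_user_model(send_to_test, duration_s) -
            replay_user_model(send_to_control, duration_s))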

Finally, the focus on bug prevention versus bug detection paid big dividends. We pushed automated tests early in the presubmit process and kept a lot of poor code from the build. This keeps the test team ahead of the curve and working on builds that are high enough quality to present a good challenge for our exploratory testers.

HGTS: So now that you’ve done this a couple times and are moving into Social, what will you look for in new hires to that test team?

Ankit: I want to find people who don’t get overwhelmed with complexity and when faced with a hard problem can translate it into concrete steps that need to be solved. And then, of course, solving them! I need people who can execute and are energized by aggressive timelines rather than overwhelmed by them. I want people who can achieve the right balance between innovation and quality and have ideas beyond simply finding more bugs. And on top of all this, I want to see passion. I want people who really want to be a tester.

HGTS: And that brings us to our last question. What drives your passion for the testing space?

Ankit: I like the challenge of fast iterations and high quality, two seemingly conflicting goals of equal importance. It’s a classic struggle and one that drives me to optimize them both without breaking either myself or my team! Building products is easy. Building them fast and of high quality is the grand challenge that makes my professional life challenging and fun.

An Interview with Android TEM Hung Dang

Hung Dang is an engineering manager at Google who was hired into his position as a manager with the express purpose of leading the Android test team. Prior to joining Google, he rose through the ranks as an engineer at Apple and TiVo.

The strategic importance of Android is hard to overstate. In many ways, it is as big a hit as our Search and Ads business, another one of the Google collection of product home runs. As such, it gets significant executive attention and draws a lot of top-class talent. It’s also a codebase that is growing fast, and it’s nothing for a single build to have hundreds of new code changes in a single day. It’s an operating system and a platform, runs thousands upon thousands of apps, and has a thriving developer ecosystem. The number of devices that run Android is increasing quickly and includes phone handsets and tablets from any number of manufacturers. Everything from device compatibility testing to power management to checking that the apps work falls to Hung and his team.

The authors sat down with Hung to get the scoop on how Android is tested.

HGTS: Tell us about the origins of Android testing. It must have been a lot easier in the early days before handset and app proliferation!

Hung: Unfortunately, that’s not really the case. When I took over leadership of Android, the team was new and many of them hadn’t tested an operating system, much less a mobile device, and task number one was building the team and test infrastructure. I find the earliest days of a testing project the hardest. No matter how complicated and diverse a product might ultimately get, if you have the right team and the right process and infrastructure, then you make it easy. It’s before all that happens that you find the hardest work!

HGTS: So let’s talk about that because a lot of test managers out there struggle with this project startup and hard work right now. Can you walk us through what you did in the early days of Android?

Hung: Big challenges in those days. The first was really to get our head around the product. One thing I need all my testers to be is product experts. Everyone on my team must know every aspect of the product stack. Period. Once you get that level of knowledge, you understand what the hard testing problems are and you can start building your team around those needs. Hire people (at Google, this means drawing great talent from other teams as well) who understand the hardest problems. Android has such a deep stack from the hardware through the OS, to the framework, to the app model, and to the market. There are a lot of moving parts that demand specialized testing expertise. The first thing I do is figure that out and build a team that can take on those hard problems.

After the team is assembled, I give them simple marching orders: add value! Preferably, add value in a repeatable way. From dev to product management, testing must be seen as an enabler. Anything less and you are in the way. In the early days, adding value meant helping get the product to a successful launch. I am sure we did things back then that were out of scope for testers, but we added value and we shipped a good initial product. We enabled daily builds and got everyone on the team synchronized on the same build. No one was working at cross-purposes and the workflow was organized. Everyone was trained to report, triage, and manage bugs the same way. It really was about the right people doing the right things and working as a team.

HGTS: Okay, this is sounding pretty intense. How did the team handle it?

Hung: Well, I lost a lot of the old team, perhaps as much as 80 percent. But the cool thing is that the people who stayed became the tech leads who provided leadership for the new talent we attracted. Not everyone is fit for every project, and Google is such a big company that if Android isn’t the right project, well, there is probably another one that is. Google has turned this bounty of projects into a pretty healthy testing ecosystem.

HGTS: And the positive?

Hung: The positive is definitely the focus on value. Everything we did had to have a purpose. We questioned everything: every test case, every piece of automation. Much of what we were doing didn’t stand up to this kind of scrutiny. If the automation didn’t provide clear value, we abandoned it. Everything was value-driven and the organization was there. If I gave advice to new test managers, that’s what I would tell them: Add value in everything you do and make your process for adding value repeatable.

HGTS: This is clearly not easy, even though you make it sound that way! But give us more detail about this organization you talk about. We get the bug-reporting process and workflow, but surely there is more to it here. How do you organize the work itself?

Hung: Well, I like to think about Android in terms of “pillars.” I know you called them different things when you tested Chrome OS, but I like the idea of testing pillars and all my testers identify with testing one of the pillars. For Android, we’ve organized around four pillars: the system pillar (the kernel, media, and so on), the framework pillar, the apps pillar, and the market pillar. Ask any of my testers what they are testing and they will name one of these pillars.

HGTS: That actually makes a lot of sense. I like the way the pillars line up with the skill set of testers. You have some good at the low-level stuff and some at the high-level stuff and the pillars reflect these strengths. Very nice. So what about automation? Where does that fit into your mission?

Hung: I’ve learned to be skeptical of automation. Testers can get so passionate about some grand vision of automation for a product and spend months creating it only to have the product or platform change and negate everything they did. There is no greater waste of resources than writing automation that doesn’t stand the test of time. In my mind, it needs to be written quickly, executed quickly, and solve a very specific problem. If I don’t understand the purpose of an automated test immediately, then it is too complicated. Make it simple, contain its scope, and above all, make sure it adds value. For example, we have a suite of automated tests to validate a build to see if it is good enough to move from the Canary channel to the Droid channel.

HGTS: Whoa, whoa, whoa! What’s Droid channel?

Hung: Oh, sorry. Like I told you before we started, in Android we do things a bit differently! Our channels are not the ones you use in Chrome. We have Canary, Droid-food (think dogfood except with just the Android team), Experimental, Google-food, and Street-food, which is a build we send out externally. It’s the same idea as for other teams; we just call them something different.

HGTS: Okay, so you have an automated suite to identify bugs that prevent a Canary build from being qualified into the Droid channel. I get it, small in scope. Do you have dedicated SETs for this automation?

Hung: Not on your life. None of my testers specialize. Specifically, everyone does manual testing and I mean everyone. Exploratory testing is the best way of digging into the product and learning it well. The last thing I want is an SET to be a framework writer. I want them to be more involved with the product and know how to use it. Every tester must empathize with the user. They must be expert users and know the ins and outs of the whole product. On my team, we leave things like stability testing, power management, performance, stress, and quick checks on third-party apps to automation. For example, no human is going to find a memory leak in the camera or verify a single feature across multiple platforms—these things require automation. Things that are too repetitive for a manual test or require machine precision that humans aren’t good at, that’s the place you want to automate.

HGTS: So then are you staffed with more TEs than SETs?

Hung: No, just the opposite. I have about twice as many SETs as TEs. It’s just that every SET is expected to act like a TE when necessary and vice versa. I don’t pay much attention to titles. It’s all about adding value.

HGTS: Okay, then let’s talk about manual testing because you clearly take it seriously (and I admire you for that!).

Hung: I am a big believer in focused manual testing. Sitting down hacking away is unproductive. We take a close look at the daily build we’ll be testing and analyze what’s in it. What has changed? How many CLs have been added or modified? What new or changed functionality are we dealing with? Which developers submitted CLs? How extensive were the changes from yesterday’s build? This helps us focus and instead of exploring the entire product ad hoc, we are able to focus on the changes from day to day and make ourselves far more productive.
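
Android’s internal build-analysis tools are not public, so take the following only as an illustration of the kind of “what changed since yesterday” summary Hung is describing; the changelist record format and field names here are hypothetical.

from collections import Counter

def summarize_daily_changes(changelists):
    """Summarize a day's changelists so exploratory testers know where to focus.

    changelists: a list of dicts such as
        {"author": "alice", "area": "media", "files_touched": 7}
    pulled from the source-control system for the changes in today's build.
    """
    by_area = Counter()
    by_author = Counter()
    files = 0
    for cl in changelists:
        by_area[cl["area"]] += 1
        by_author[cl["author"]] += 1
        files += cl["files_touched"]
    return {
        "total_cls": len(changelists),
        "files_touched": files,
        "hot_areas": by_area.most_common(3),       # where exploratory effort goes first
        "busy_authors": by_author.most_common(3),  # whom to ask about risky changes
    }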

This also means coordination among the team is important. Because I insist that everyone be an exploratory tester, it’s important to minimize overlap. We have daily coordination meetings to identify things that are important to test and ensure someone is covering them. Manual testing, to me, is about focus and coordination. Give me those two things and I will be satisfied that the effort spent exploring is worthwhile and valuable.

HGTS: Do you create any documentation for your manual testing or is it all just exploratory?

Hung: It’s exploratory and we create documentation. There are two cases where we document manual test cases. The first is where we have a general use case that applies to every build or is part of every major test pass. We write these down and put them in GTCM, so they are available to every tester or test vendor to pull out and use. The second case is that we document the testing guidelines on a per-feature basis. Each feature has its own unique properties, and manual testers document these as guidelines for other testers who might pick that feature as a target in some future build. So we have general, system-level use cases and feature-specific guidance that we take the time to document.

HGTS: Tell me about your demands on developers. Do you require specs? Do you make them do TDD? What about unit tests?

Hung: I suppose there is some fairytale world where every line of code is preceded by a test, which is preceded by a specification. Maybe that world exists. I don’t know. But in the innovative and fast-paced world I live in, you get what you get. Spec? Great! Thank you very much, I will put it to good use. But being realistic, you have to find a way to work within the system that exists. Demanding a spec won’t get you one. Insisting on unit tests won’t make those unit tests valuable. Nothing a spec writer or a unit test can do (besides finding an obvious regression bug) will help us find a problem that a real user will encounter. This is my world, a tester’s world. You get what you get and you use it to provide value to the product and the team.

In my experience, all engineers have good intentions. None of them want to create a buggy product. But innovation cannot be planned so precisely. Schedules and competitive pressures aren’t subject to my whining about quality. I can whine or I can provide value. I choose the latter.

I live in a world where daily CLs number in the many hundreds. Yes, I said daily. Brilliant and innovative developers must be met with an equal amount of brilliant and innovative testers. There isn’t time for demands or whining, just value.

HGTS: Hung, we are glad you are on our side! Now, a couple of quick answers if you are up for it. Given an extra day to test an Android release, what would you do?

Hung: If I had an extra day, I’d be given an extra build! There is no such thing as an extra day!

HGTS: Touché! Okay, what about regrets? Can you describe a bug that got through in a release channel that caused some user grief?

Hung: Well, first of all, every tester on the planet has experienced that. No software leaves any company without some bugs. It’s the nature of the beast. But for me, all of them hurt.

HGTS: Come on, you’re not getting off that easy! Name one!

Hung: Okay, there was this one live wallpaper bug a few releases ago. In certain circumstances, this wallpaper crashed on launch. The fix was easy and we pushed it quickly so that hardly any users were affected. But it still doesn’t make it right. We wrote some tests for that one, believe me, we wrote some tests for that one!

An Interview with Chrome TEM Joel Hynoski

Joel Hynoski is a TEM who has been with Google from the early days of the Seattle-Kirkland office and has run a variety of groups over the years. He currently is in charge of all client products including Chrome and Chrome OS. Joel is known around the office as the cranky Aussie, a moniker at peace with his tester soul but at odds with his high ratings as a manager.

The authors recently sat down with Joel for a chat about his ideas on testing and his experience on Chrome and Chrome OS.

HGTS: Quick, tell us what kind of computer you use!

Joel: [Holding up a laptop] Chromebook baby!

HGTS: Can we search your backpack for other hardware you might be packing?

Joel: Ha! I have a cell in my pocket and a tablet around here somewhere, but I believe in using what I test and I use this Chromebook for everything and when I find something it won’t do, I file a bug.

HGTS: So you are one test manager here who manages a range of products from toolbars to installers to Chrome to Chrome OS and everything else we build that runs as or on a client operating system. That’s a lot of dev teams to manage. How do you balance all these things?

Joel: Well testing itself is a balancing act. On one hand, we have to get the product out the door and test each release and do the required checks to make sure it’s ready. On another hand, we need to build good automation and invest in our frameworks and infrastructure for automation. On yet another hand, we need to plan and put structure around the development-build-test-release process. On yet another hand, there are testing gurus everywhere telling the world about the newest way they have developed to get testing done and if you don’t experiment with the new stuff you feel like you’re stagnating.

HGTS: So are you a juggler or an octopus? Seriously, where do you stand on the ideal mix of all these things?

Joel: I try to be practical about it. I have software to ship and there are some things that have to give to make that happen. There are always tradeoffs. We may be an agile team, but we still implement a last-mile validation. We may do exploratory testing, but we still have to track the multiple release and platform streams. I don’t believe in absolutes.

Truth is, there’s no one model that works for any project team. Even in a single company, you get variation. Case in point: My Chrome team uses different processes than my Chrome OS team and they sit in the same building! Is one better than the other? It depends and all I can do is help the two teams communicate about what is working and what makes our job as Test more productive. Bottom line is my test team has to be ready for anything and we have to be cognizant of what’s working and when something doesn’t work, we have to be ready to abandon it quickly. Until I have all this totally figured out, I am going to favor a hybrid approach, a mix of developer testing, scripted testing, exploratory testing, risk-based testing, and functional automation.

HGTS: Uh, oh, it sounds like another Google testing book is on the horizon.

Joel: Yeah, give me a year and then we’ll compare sales or Amazon ratings, or heck, we’re Google, we’ll measure relevancy!

HGTS: Okay, give us the dish on Chrome and Chrome OS testing. We went out of our way in this book to discuss the common test infrastructure Google has for web app teams, but you are in the client space and none of that applies right?

Joel: Correct, and that’s what makes it such a challenge. Client is not mainstream Google. We’re a web company and we know how to build and test web apps so the challenge with all my client products is to translate that expertise and tools back to the client machine. It’s a real challenge when the Google infrastructure isn’t there to help.

Chrome itself grew up as a small experiment; a few SWEs got together, decided they could build a better browser, and put it out for the world to use (and because it’s open source, some modify it). In the early days, testing was done by the developers and a few hardcore testers who were early adopters. But with tens of millions of users, you better have a damn good test team in place.

HGTS: We know those guys; that’s a good description. So now that you have a team in place, what’s the biggest challenge?

Joel: The Web! Seriously, it’s always changing and Chrome can’t break it. There is a constant stream of add-ons, extensions, apps, HTML versions, Flash, and so on. The number of variables is mind boggling and all of them have to work. If we release a browser that won’t render your favorite website or run your favorite web app, you don’t have to look far for an alternative browser. Yes we have to run on a lot of operating systems, too, but they are far fewer in number and easier to test with our virtualization infrastructure here. But for my ulcers, it’s the Web that I worry about.

HGTS: Yeah, variety sucks for testers. So we realize you’re going to write your own book so we won’t steal all your thunder, but give us two technologies or solutions that help you tame the Web.

Joel: Two? Hmm. Okay, I’ll talk about application compatibility and UI automation because those have both been big wins. Everything else I will save for the next book, you know, the more relevant one?

Application compatibility is a big deal for browsers. We’re trying to answer the question, “Is Chrome compatible with the applications and sites on the Web?” In other words, does Chrome render pages correctly and run web apps correctly? Obviously this is impossible to validate in its entirety because we can’t possibly render every page and run every app. And even if we could, what do we compare them to? Well the way we answer this is by testing the most popular sites (we’re Google, that information is easy to determine!) against reference versions of Chrome and even competing browsers. We have automation that renders thousands of sites and compares them for points of similarity. We do this for every daily build, so we catch regressions quickly. Any site that renders differently gets some human attention to figure out what is wrong.
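
The real application compatibility pipeline is considerably richer than anything we can show here. As a rough sketch of just the comparison step: render the same site in a reference browser and in the candidate build, capture screenshots, score their similarity, and send anything below a threshold to a human. The pixel-difference scoring and the 0.98 threshold below are our own simplifications, and the sketch assumes the third-party Pillow imaging library.

from PIL import Image, ImageChops  # assumes the Pillow imaging library is installed

def similarity(reference_png, candidate_png):
    """Fraction of pixels that match closely between two renderings of a site."""
    ref = Image.open(reference_png).convert("RGB")
    cand = Image.open(candidate_png).convert("RGB").resize(ref.size)
    # Grayscale difference image: each pixel holds how far the renderings disagree.
    diff = ImageChops.difference(ref, cand).convert("L")
    histogram = diff.histogram()        # 256 buckets of per-pixel difference values
    close_pixels = sum(histogram[:16])  # pixels whose difference is under 16/255
    return close_pixels / float(ref.size[0] * ref.size[1])

def flag_for_human_review(pairs, threshold=0.98):
    """pairs: iterable of (site, reference_png, candidate_png) screenshot paths."""
    return [site for site, ref, cand in pairs if similarity(ref, cand) < threshold]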

But that’s only part of it. We still have to be able to drive the sites and apps, and we do this with what we call UI Automation. Chrome has an API called the automation proxy that you can use to launch the browser, navigate to a URL, query the browser state, get tab and window information, and so on. We put a Python interface on this so you can script the browser with Python (a language a lot of testers at Google are proficient in). It makes for powerful functional automation, and we’ve built up a large library of “PyAuto3” tests written by developers and testers alike.
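
A PyAuto test reads like an ordinary Python unit test that happens to drive a live browser through the automation proxy. The tiny example below follows the pattern of the public pyautolib tests as best we recall it; treat the module and method names as approximate and check them against the Chromium source rather than against us.

import pyauto_functional  # must be imported before pyauto; sets up the test harness
import pyauto

class ExampleNavigationTest(pyauto.PyUITest):
    """Launch the browser, load a page, and check basic browser state."""

    def testLoadsPageInSingleTab(self):
        self.NavigateToURL('http://www.example.com/')
        self.assertEqual(1, self.GetTabCount())

if __name__ == '__main__':
    pyauto_functional.Main()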

HGTS: Okay, so you kick butt on Chrome and because Chrome OS is just Chrome melted into a laptop, testing it is easy I would imagine?

Joel: As in I tested Safari so Mac OS is also tested? I tested IE so Windows is good? Yeah right! Just the fact that there is such a thing as Chrome OS means that Chrome is harder to test because I have yet another platform to add to my app compat automation!

But I’ll give you this: We do own everything on this stack, which is a nice place to be. Google is in control of everything on the system, from board components all the way up to the UI. Now from the UI down, everything looks good—lots of overlap with Chrome testing. I can use PyAuto and build some really nice automation suites with significant reuse from the Chrome team. However, there’s this firmware and then there’s this kernel, and uh, oh yeah, a GPU, a network adapter, a wireless modem, 3G ... we can fit a lot into these little boxes these days! These are all places where automation is hard to squeeze in. They are labor-intensive testing tasks that are fundamentally incompatible with the high Google dev-to-test ratio. We were getting systems from prototype stage onward and having to mount circuit boards on cardboard boxes.

We have a system where none of our pre-existing test tools work. Chrome OS is open source and sits outside of the normal Google development systems. We’ve had to (almost literally) reinvent the wheel on a lot of the testing tools, redefine processes to account for the ways that the tools work, and provide these tools to external developers who are contributing code. We release the OS on up to five different platforms on three channels (dev, beta, stable) on a six-week schedule. Good thing I am Australian or I would go insane!

So we had to get creative. Where can we push back on developers to write tools? How much testing can we demand from the partners and manufacturers? How do we school our test team on how to test hardware effectively? What tools and equipment can we create that will help us reduce the manual test load? And how does that translate into running on actual devices?

HGTS: The fact that you are stating these as questions is making us worry that we are going to have to wait for your book to get answers!

Joel: Well you are going to have to wait because I don’t even have all the answers yet and it’s my job to get them! We are still on the first release and we’ve had to come up with an effective approach that marries manual testing and automation together. Autotest,4 an open-source test tool designed initially for testing the Linux kernel, was repurposed to drive a comprehensive suite of automated tests on real Chrome OS hardware. The team working on that had to extend it a lot to deal with the issues of the platform and made a ton of contributions, which were all made open source. Autotest runs our pre-flight test queue, the smoke suite, and build verification tests for the system on both real hardware and virtual machines. And then, of course, we use PyAuto extensively to drive automation through the Chrome browser running on Chrome OS.
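
Autotest, unlike most of the tooling discussed in this chapter, is open source, so a sketch can be a little more concrete. A client-side test is a small Python class with a run_once method, paired with a control file that tells the scheduler how to run it on a device. The layout below follows the standard Autotest conventions as we understand them; the test name and the trivial check are our own inventions.

# client/site_tests/example_KernelSmoke/example_KernelSmoke.py (hypothetical test)
from autotest_lib.client.bin import test, utils
from autotest_lib.client.common_lib import error

class example_KernelSmoke(test.test):
    version = 1

    def run_once(self):
        # A trivial placeholder check; real Chrome OS tests exercise the kernel,
        # drivers, firmware, power management, or the browser on the device.
        kernel_version = utils.system_output('uname -r')
        if not kernel_version:
            raise error.TestFail('could not read the kernel version')

# The accompanying control file is what the scheduler actually invokes:
#   NAME = 'example_KernelSmoke'
#   TEST_TYPE = 'client'
#   job.run_test('example_KernelSmoke')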

HGTS: You and James are well known throughout Google as the go-to guys for test hiring. You two trained our sourcers and recruiters and we know for a fact that candidates who were not sure they wanted to be testers (or testers at Google) were sent to you guys for the final sell. What’s the magic here?

Joel: James and I shared an office for quite a while and we are both passionate about the testing discipline. So it is natural that we teamed up on this. But James is the loud voice, conference-speaking guy who just seems to know everyone. He succeeds because of his network. I succeed because I get people excited about the discipline. It’s completely different, you know like the difference between luck and skill!

I’m kidding of course but I am really passionate about testing, and another aspect of this is about hiring the right people to be Google engineers. Chrome, specifically, has been a challenge to hire for because the tendency is to throw people at the problem. Have an issue with validating the CR-48, Samsung Chromebook, and new experimental platforms over three channels in a week? Hey, throw 30 vendors at the problem! You need to validate a Chrome OS stable build in 24 hours? With 18 manual testers, that’ll be a breeze!

I don’t want to manage a team of testers who just blindly follow test scripts. That’s boring. I want to manage a team that is doing ground-breaking test development, creating new and innovative tools, and using a lot of creativity in their day-to-day work. So as we add people to the mix, I want to make sure that the technical level remains very high. This is the challenge with hiring. How do you find people with the technical skill to get into Google and help them find their inner passion for testing? Well, James knows a lot of those people, but eventually his network will run dry. I take a more comprehensive approach and dig into what makes testing a genuinely interesting and challenging problem. You’d be surprised how many people want to be developers and when they see what testing is all about, they are intrigued enough to give testing a try. Once they find out what a challenge testing is and how much fun it is, you have yourself a great tester.

HGTS: Okay, give us the pitch. Why choose a career in test?

Joel: Test is the last frontier of engineering. We’ve solved a lot of the problems of how to develop software effectively, but we still have a green field of opportunity to attack the really meaty problems of testing a product, from how to organize all the technical work that must get done to how we automate effectively, responsively, and with agility without being too reactive. It’s the most interesting area of software engineering today, and the career opportunities are amazing. You’re not just banging on a piece of software any more, you’re testing the GPU acceleration of your HTML5 site, you’re making sure that you’re optimizing your CPU’s cores to get the best performance, and you’re ensuring that your sandbox is secure. That to me is exciting and invigorating, and why I’m really pleased that I’m in the test organization at Google and working on one of the hardest problems we have.

The Test Engineering Director

Google test engineering directors are creatures all their own. It would be difficult to write a “Life of” chapter because each one is granted full autonomy and most of them take that to heart. There are only a few things they have in common. They all report to Patrick Copeland. They use common Google infrastructure and they meet weekly to discuss their respective domains, but unlike the engineering managers from the last section (who report to directors), directors have carte blanche to guide their various product teams the way they see fit.

Directors approve hires and transfers and generally control all aspects of test staffing. They own significant budget for things like morale events, off-sites, and the buying of “schwag” (Google-branded gear, backpacks, t-shirts, jackets, and such). It’s an accepted practice to compete when ordering the coolest gear for your testing troops; it’s also a courtesy to order enough to share. The Google test brand is strong in the company and such schwag helps sustain that. Indeed, sometimes schwag goes viral. Whittaker’s team ordered t-shirts with the edgy logo “The Web Works (you’re welcome)” on them and they were so popular that even developers were wearing them around Google campuses.

There is no attempt made with schwag or otherwise to force the teams to be in complete synchronization. There is no attempt made to minimize rework across the domains. Innovation is expected from all the teams and competition to develop automation or tools makes teams strong. However, there are special rewards in the vein of spot bonuses and peer bonuses that encourage collaboration, and a tester’s 20 percent time is often spent working with a completely different team under a director different from the one she reports through. In fact, 20 percent time is the way most directors manage a tester who wishes to transfer to a new team: 20 percent with the new group for a few weeks followed by 20 percent with the old group for a few weeks after that.

If there is one thing about Google that preserves this collaborative and whole-company spirit despite the natural competitive human tendency, it is the open transfer process. Google engineers are encouraged to move teams every 18 months or so. Notice this is “encouraged” and not “required.” Directors have to keep good relationships with other directors because we work with a shared labor pool and the ebb and flow of employees eventually favors us all.

A director’s job is leadership. They must build strong teams and keep them focused on the goal of shipping high-quality, useful, and industry-changing software. They must be technical enough to gain the respect of their engineers, innovative enough to keep pace with the fast-paced Google work style, and good enough at management to keep people productive. They must be a master of Google tools and infrastructure so that no time is wasted when working through what it takes to ship every day.

What does it take to accomplish all this and be successful at Google as a test engineering director? We decided that the best way to explain it is to hear from the people who actually do it.

An Interview with Search and Geo Test Director Shelton Mar

Shelton Mar is a test director, a title equivalent to a vice president at many companies. He’s also one of the longest-running Google testers in the company and he pre-dates Patrick Copeland’s arrival, back in the days before Engineering Productivity, when the group was called Testing Services. Shelton was promoted through the ranks at Google from test manager of small teams to a directorship over search, infrastructure, and maps. Shelton is now the test executive for a product area Google calls Local and Commerce, which includes all location-based products including Google Earth and Maps.

The authors sat down with Shelton to get some of the inside scoop on Google’s past and to catch up on what he has done to test Google Search.

HGTS: Shelton, you’ve been around Google long enough to remember the Testing Services days that Patrick wrote about in the preface. Tell us what testing was like in the early days pre-Patrick!

Shelton: Things were certainly different back then and a lot has changed in a short period of time, but one thing has always been consistent: Google has always been able to run at a very fast pace. But in the early days we were lucky. The Internet was simpler, our apps were smaller, and lots of smart people giving their best effort was good enough. We suffered a lot of last-minute fire drills, but the problem was still manageable enough that a few heroes could pull it off. Products took dependencies on system-level and end-to-end testing, which was both manual and scripted. The more we grew, the more those dependencies caused problems.

I am not saying anything bad about this kind of testing; it’s a necessary part of validation and ensuring an integrated system works correctly, but over-dependence on late-cycle testing makes debugging a lot more difficult when things don’t go well.

We were struggling with this problem around the time Pat appeared.

HGTS: I guess it was worse with backend systems where “end-to-end” is harder to identify.

Shelton: Exactly! We often couldn’t release backend systems as fast as we would have liked because it was so hard to be sure about quality. Backend systems have to be right because they affect so many different product verticals. You get BigTable wrong, for example, and lots of apps suffer. It’s the ripple effect from updating a backend system because of issues that couldn’t be found using just end-to-end tests.

HGTS: So you went from an overinvestment in end-to-end validation to hard core validation of the backend server infrastructure. Tell us about the path to get there.

Shelton: Well we started by changing the composition of the team. We redefined the SET role and then focused on hiring technically strong candidates. Once we had the skill set in place, we got to work on building a better backend testing solution. We focused our work on component-level automation. Imagine a bunch of smart engineers with development and testing skills swarming our backend infrastructure and you are getting close to what happened.

HGTS: Any specific key to your success?

Shelton: Yes, it was crucial to have SWE support and buy-in. Our SETs worked closely with the development partners (notice I use the terminology “partner” as this really was a collaborative effort that test cannot assume full credit for) to build better testing practices at the developer level. Whatever we could have done on our own was amplified by this close partnership. At times, we would realize something wasn’t possible at the component level and work with them to solve it at the unit level. This collaboration transformed the dynamic in the team so that the entire project team (dev + test) owned quality at the component level with test engineering focusing its time on improving process, framework, tooling, and integration testing.

HGTS: You made some pretty tough calls, such as hiring SWE quality people to test. What prompted that? Any regrets? What impact did it have on the testing culture?

Shelton: That’s probably the most important transformation we’ve done for Google. We realized that there were three things we needed to change at Google early on:

• Push testing upstream so the entire team (dev + test) owns the quality of the deliverables.

• Test engineering is an embedded part of the project team. Thus, we need strong engineers who can understand the challenges and the technology involved.

• Apply computer science and engineering rigor to testing.

You can’t accomplish these things without having smart and strong software engineers who “get” (or at least can be taught) test engineering. Once we started looking at the challenge differently, we realized we can hire and motivate the best engineers to solve difficult problems in testing. The fact that we’ve built up such a large team of them shows that they enjoy the work.

HGTS: During your tenure at Google, you’ve worked on a lot of products including Search, which is Google’s mainstay. What’s the hardest part of testing Search?

Shelton: Deciding on what to focus on! Often when engineers start looking at testing Search, they start talking about what Google returns from the search queries. While that’s definitely an interesting area to explore, search quality requires a lot more than that. To provide consistent, reliable, and fast response to our users, we need to validate the complex and sophisticated distributed software system that actually returns results. You have to understand indexing and search algorithms. You have to understand the plumbing and how the system is built so you can validate all these actions when and where they actually occur. We focus on all this from the beginning. In reality, what we did was separate search quality from the actual plumbing of the solution. We focused on the latter and left search quality to the search quality experts on the product team. We validated the infrastructure and system that are used to handle the processing, update, and serving of Google search results, and they made sure the results Google was producing were the best results.

HGTS: When you get a new project, what is your general approach? What are the first things you usually do? From a team-building perspective? From a technical infrastructure perspective? From a testing process perspective?

Shelton: Talking in general terms, one of the first things I ask my team to evaluate is, “What’s really critical for the system under test?” Performance is important for search, freshness is big for news, and comprehensiveness is critical for maps. Every app has its important attributes. Similarly for system infrastructure, data integrity is critical for storage, the ability to scale is important for networking, and utilization is key for job management systems. Once you’ve identified what’s critical for the specific product you’re testing, then focus the majority of your energy on validating that the core capabilities of the system satisfy those attributes. Worry about the easy stuff (UI tweaks and bells and whistles) only after the important stuff is right. Also focus on areas (such as design for performance) that are core attributes and difficult to change, and spend less time on things that can be updated easily. So if you start logging bugs about fonts too early, I am going to worry about your priorities.

HGTS: There is a historical tension between manual and automated testing and it seems like Google swung from lots of manual testing to lots of automated testing. Where do you stand now? What is the right split? How do you know when you have gone too far in one direction or the other?

Shelton: I feel that you automate as much as possible. We have the concept of continuous builds where manual testing just gets in the way. Validation at the component level and integration tests that execute around the clock play a role that manual testing cannot. However, automation is not resilient to change and it requires maintenance. As technology continues to evolve and change at a rapid rate, automation has to be redeveloped to keep pace.

Having said that, there are some things manual testing can do that automation cannot. An example is mobile apps where there’s an explosion of hardware, displays, form factors, and device drivers that cause variance in rendering and display. In such cases, we have to resort to some level of manual testing to help in doing validation. The key is still to automate that process as much as possible. Have a machine do 90 percent of the work and apply human intelligence only to the last 10 percent (“the last mile” we call it) of the validation cycle. Automation can do things like capture all the screens from devices, allowing us to do rapid side-by-side human inspection after our comparison algorithm filters out variance in display. Human intelligence is too valuable to waste on something a computer can do and should be applied where intelligence and human judgment are needed; a repetitive task is definitely not one of those areas.
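
The “machine does 90 percent” idea Shelton describes can be illustrated with a small pre-filtering step: pairs of screenshots whose pixel difference stays under a tolerance are dropped automatically, and only the rest are queued for human side-by-side inspection. This sketch uses the Pillow imaging library; the tolerance value and file paths are assumptions for the example, not Google’s actual comparison algorithm.

# Sketch: filter device screenshots so humans review only meaningful differences.
# Uses the Pillow imaging library; the 2 percent tolerance is an arbitrary example value.
from PIL import Image, ImageChops


def needs_human_review(baseline_path, candidate_path, tolerance=0.02):
    """Return True if the two screenshots differ by more than `tolerance`."""
    baseline = Image.open(baseline_path).convert('RGB')
    candidate = Image.open(candidate_path).convert('RGB')
    if baseline.size != candidate.size:
        return True  # different resolutions always get a human look

    diff = ImageChops.difference(baseline, candidate)
    width, height = diff.size
    # Fraction of pixels that changed at all, regardless of how much.
    changed = sum(1 for pixel in diff.getdata() if pixel != (0, 0, 0))
    return changed / float(width * height) > tolerance


# Hypothetical file layout: one reference image and one capture per device.
pairs = [('ref/home.png', 'device42/home.png')]
for reference, capture in pairs:
    if needs_human_review(reference, capture):
        print('queue for side-by-side inspection:', capture)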

HGTS: Describe a bug that got away, one that embarrassed you after you shipped.

Shelton: Oh, you had to ask about that didn’t you? Has anyone ever shipped a perfect product I wonder? Not me, unfortunately! The bugs I regret most in production were around our failure to thoroughly test data center configuration changes. In one case, a new version was pushed out into production without going through any kind of validation. That configuration change affected the quality of the search results served to end users in a negative way. We learned just how important configuration changes are to search quality. Since then, we have included configuration change as part of the qualification process and we have a set of automated tests we run before data or configuration changes are pushed to production.

HGTS: How did you define these automated configuration tests?

Shelton: By being watchful! Every time we found a configuration that negatively affected search results, we wrote tests for those configurations and any variations that might produce similarly poor results. It wasn’t long before we had a good set of problematic environments added to our test suite. We then used automation to generate as diverse a set of data as we could to test those environments against. These kinds of bugs are definitely less common now because of this practice. This is definitely automation that gives us a lot more confidence when we push changes into production.

An Interview with Engineering Tools Director Ashish Kumar

Google lives and dies by its tools, and the person in charge of those tools is Ashish Kumar. He is responsible for everything from the IDEs developers use to code review systems, build systems, source control, static analysis, common test infrastructure, and so on. Even Selenium and WebDriver teams report up through him.

The authors recently sat down with Ashish to get his take on this particular piece of Google magic.

HGTS: There is a general mystique about automation at Google, perpetuated by the popularity of GTAC, and you are the man behind it. Can you describe the general set of capabilities your tool set provides to Google engineers?

Ashish: My team is called the Engineering Tools team. We are responsible for building 90 percent of the tools that developers use on a day-to-day basis in order to write, build, and release quality software at Google. It’s 90 percent because we don’t support some of our open-source product teams yet, but we have plans in the works to support them as well.

Google is unique in that there is a significant focus on providing very capable (and scalable) infrastructure for our developers. People outside are generally familiar with technologies like MapReduce and BigTable that are used regularly by Google developers, but our developer tools infrastructure is also a significant part of that investment.

HGTS: How about some specifics?

Ashish: Okay, you asked for it! The toolset includes:

Source Tools: Tools to make it easier to create workspaces, submit code changes, and enforce style guidelines. Tools to browse hundreds of millions of lines of code and easily discover code to prevent duplication. Tools to enable indexing and refactoring at scale in the cloud.

Development Tools: Plugins for IDEs that allow those tools to scale to Google code and connect with services in the cloud. Tools to allow fast and high-quality reviews of code by embedding relevant signals at code review time.

Build Infrastructure: This system allows us to shard builds for cross-language code across tens of thousands of CPUs and so much memory and storage that it makes my brain hurt just thinking about it! This build system works for both interactive and automated use, providing results in seconds where it would otherwise have taken hours in many cases.

Test Infrastructure: Continuous integration at scale. This means running millions of test suites every day in response to every check-in by every developer. Our goal is to provide instant (or as close to instant as possible) feedback to the developer. The flip side of this scale is web testing at scale. This means running hundreds of thousands of browser sessions testing various browser-platform combinations against every Google product every day.

Localization Tools: Continuous translation of strings created by developers so that localized versions of our products are ready at the same time as the English versions.

Metrics, Visibility, and Reporting: Managing bugs for all Google products, keeping track of all development (code, test, and release) metrics for all developers in a central repository and providing actionable feedback to teams on how to improve.

HGTS: Okay, that’s a lot of balls in the air. Surely it took a lot of innovation to get where you are today. But how do you balance keeping all this work going and pursuing new development at the same time? Your team isn’t that big!

Ashish: We don’t try to do it all is the simple answer. My team is a central Engineering Tools team that serves all of Google. Often product teams work on tools that are specific to their needs. Sometimes those tools become generally usable, and we evaluate whether it makes sense to make them central (and hence available for all of Google to use). Other times, an engineer on my team comes up with an idea he thinks is cool. This is Google and we try our best to foster these types of organically grown startups. Our criteria for including a tool in the central team are twofold: it has to have the potential for high impact on productivity, and it has to be applicable to a large number of Google developers. After all, we are a centralized tools team so the big impact, broad audience tools are where we play. If a tool applies to only one product team, then that team should own it.

But we do start a lot of experiments as well. In order for us to get the big wins next year, there should always be a few experiments going on with one- or two-person teams working on them. Many of these start out as 20 percent efforts and I won’t say no to anything done on a 20 percent basis. Individual engineers own their 20 percent time, not me. Some of these efforts fail, but the ones that succeed generally make up for all the failures put together. Dream big, fail fast, and keep trying!

Some tool projects are enablers, so it’s difficult to measure their impact on productivity directly, but all tool projects are required to move the needle significantly in terms of productivity gains for Google.

HGTS: Is there a tool idea you thought wouldn’t succeed but then did?

Ashish: Yes! Continuous integration at scale. It’s such a large problem that on the surface it looks completely intractable. We used to have thousands of machines running these continuous integration loops. Someone on the team suggested building infrastructure that would centralize this for all projects at Google and that this infrastructure would just poll source control for changes, manage a massive cross-language dependency graph for every CL in memory, and automatically build and run affected tests. I wasn’t the only one voicing concern that this was too massive and that resource utilization would put our servers in the red. The skeptics were right about that; resources used are very high. But one by one our engineers overcame these technical hurdles and the system is running now and it works. We handled this project the same way we handle others like it: start small, prove incremental value, and then double down as the project begins showing value.
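
The heart of the system Ashish describes is mapping a change back to the tests it can possibly affect. The sketch below shows that idea in miniature, with a tiny hand-written dependency graph standing in for the massive in-memory, cross-language graph he mentions. It is an illustration of the technique only, not Google’s implementation; the target names are invented.

# Sketch: pick only the tests affected by a change, using reverse dependencies.
# The graph here is hand-written; in a real system it is derived from build rules.
from collections import deque

# target -> targets that depend on it (reverse edges)
reverse_deps = {
    'search/parser': ['search/frontend', 'search/parser_test'],
    'search/frontend': ['search/frontend_test'],
    'ads/billing': ['ads/billing_test'],
}


def affected_tests(changed_target):
    """Walk reverse dependencies and return every reachable *_test target."""
    seen, queue, tests = set(), deque([changed_target]), set()
    while queue:
        target = queue.popleft()
        for dependent in reverse_deps.get(target, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
                if dependent.endswith('_test'):
                    tests.add(dependent)
    return sorted(tests)


# A change to the parser triggers its own test and the frontend's test,
# but leaves the unrelated ads tests alone.
print(affected_tests('search/parser'))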

HGTS: Is there a tool you built that you thought would succeed but didn’t?

Ashish: Yes again! Remote pair programming. Google has a highly distributed development setup; many teams follow pair programming and other agile development techniques. Many times the code that you are working on might have been authored by someone in another office, and if you have a quick question, there’s a delay involved that affects productivity.

One of the experiments we started was to build a “remote pair programming” plugin for our IDEs. The goal was to have tight integration with Google Talk (and video) so that when a developer has a question about modifying some code, he or she can start a chat with the author through functionality directly embedded in their development environment. The other developer would then get to see the workspace and be able to perform the edit as a pair while watching each other over video. Pair programming without the body odor!

We launched an early version (with just the collaborative editor and no Google Talk integration), but when we didn’t get the early adopter usage metrics we hoped to see, we discontinued the experiment. Developers just didn’t seem interested. Perhaps we weren’t aligned with an important enough pain point for them.

HGTS: What advice would you give to a company trying to build an automation pipeline? What tools do you start with?

Ashish: Focusing on the development environment that a newbie developer would have to work with on your team is really critical. Make it dead easy to get started: check out code, edit code, test code, run code, debug code, and then deploy code. Take away the pain from each of these steps and your developers will be productive and you will produce high-quality software on time.

In order to make this happen, it is really important to define dependencies cleanly, make them explicit, and set up a continuous integration system “that just works” and provides feedback quickly to developers. If the feedback takes more than a few minutes, add more machines. CPU hours are cheaper than developer context switches and idle time. Make it as easy as typing a command to run and debug code, as well as to deploy it. If you are a web company, make it easy to launch partial deployments.

HGTS: What do you look for in an engineer for your team? It seems like just any old developer won’t really cut it as a tools guy.

Ashish: Tools development requires a special love for computer science fundamentals, things like language development, compilers, systems programming, and the satisfaction of seeing your work used by other very smart developers who in turn produce greater value for the company. Really it is about finding people who want developers as their customers.

HGTS: Speaking of customers, how do you convince people to adopt your tools?

Ashish: Googlers are a unique bunch. They generally don’t need a lot of selling. We hold what we call Engineering Productivity Reviews weekly and demo our tools. Engineers come, they ask questions, and if the tool solves a real problem they have, then they try it out. In general, you get good adoption for the tools that solve real problems and poor adoption for those that do not. The secret is in avoiding the latter completely, or at least being willing to cancel a project that fails to show value (and doing so as early as possible).

HGTS: Have you ever seen a tool that just got in the way or did more harm than good?

Ashish: Yes, but I try not to stare at them for too long. Those are projects that we very quickly disinvest in. The purpose of tooling is to automate the process and make it easier. Sometimes tool projects automate bad behavior. If developers are doing the wrong thing manually, why make it easier with a tool? The creators should have taken a step back and decided if they should be doing something else altogether rather than just automating what people were doing today.

HGTS: What’s on your plate now? What are the new tools your team is cooking up?

Ashish: Well, first off, let me say that there is a lot of “keeping up” that needs to be done. The Web is changing so fast that all our tools around its technology are in constant development mode. Sometimes the changes make us do rewrites of tools and sometimes the changes allow completely new functionality. It’s a constant challenge and a constant opportunity. A lot of what we do is internal, though, so I can’t talk about it here, but think scale, scale, and more scale and you are getting pretty close!

An Interview with Google India Test Director Sujay Sahni

An important aspect of the Google testing culture is leveraging talent in various distributed offices by creating regional and global hubs. Hyderabad, India, was the first global hub for test engineering, established because of the talent available in India. The Googlers at this center work on key Google products and helped fuel the needed change of direction from manual testing (the Testing Services era) to a test engineering organization. Sujay Sahni is the test director who founded and runs the India Engineering Productivity team.

HGTS: India is a long way from Mountain View. How did Engineering Productivity get over there and evolve it into a critical path engineering office?

Sujay: Engineering Productivity teams follow a model similar to the overall Google engineering teams, placing teams in centers around the world where we can find the appropriate talent. India was not a cost-based decision; it was based solely on our ability to hire exceptional talent. We placed teams in India that were big enough to have critical mass. India is actually one of a number of regional centers in which developers and testers are co-located; the others include London, New York, Kirkland, Bangalore, and some smaller offices.

We also have regional hubs that cater to a particular region like Europe. Europe’s regional center was established in Zurich, Asia Pacific was Hyderabad, and the East Coast was New York. These centers worked on efforts across the region and brought together engineering efforts in the smaller Google Engineering offices in that region. This allowed for better time and talent management.

But Hyderabad is also what we refer to as a global hub: a source of talent for all of Google and of engineering solutions for the testing teams. In the early years of testing at Google, Hyderabad was the biggest center for SET talent and we worked on any number of strategic projects. The Googlers at this center worked on key Google products that helped fuel the needed change of direction and engagement from Testing Services to Engineering Productivity.

HGTS: What role did India play in the evolution of testing at Google?

Sujay: The Hyderabad Center, which we abbreviate as HYD, was the first among the regional centers to be established. While we set up a center in Bangalore to be co-located with the engineering teams there, the Hyderabad Center quickly became the global hub for test engineering. In the early days, the HYD center was a mix of test engineers, software engineers in test, and a large contingent of temps and vendors. They worked on numerous important and recognizable Google products (like Search, Ads, Mobile, Gmail, and Toolbar to name a few) in various capacities and roles. Primarily, the Googlers worked on developing the critical testing infrastructure and frameworks that enabled engineering teams to automate their tests for faster releases. In 2006–2007, the HYD team constituted roughly half the SET pool at Google. One interesting anecdote: It is said that the SET role was created as a result of the efforts of the first test engineer hired in HYD! Whether you give us this much credit or not, we at least indirectly paved the way for Testing Services to become Engineering Productivity.

By late 2007, we made a change in leadership with the goal of developing the team into new strategic areas, reducing fragmentation, and building a more senior talent pool to help lead the growing number of young engineers. By early 2008, we had started establishing more leadership and regional hubs that allowed engineering teams to have local (or closer proximity) test teams, and this created an opportunity for HYD to focus more on areas where Google test teams had not yet ventured, like advanced latency detection tools; backend cloud performance and stability tools; regression detection mechanisms; and client-testing tools.

Another change that began around this time was to start investing in the cloud testing and engineering tools infrastructure. This included aspects like Cloud Code Coverage infrastructure, developer IDEs, Cloud Scalable Test Infrastructure, Google Toolbox, and other experimentation efforts, most of which led to production tools. The team provided critical services and tools not only to the global engineering teams within Google, but also core infrastructure that is shared with the developers in the open-source community. Google engineers from HYD contributed to App Engine, Selenium, Eclipse, and IntelliJ plugins and various other code to the open-source community.

HGTS: These are some good and important projects. Can you give an example of one done completely in HYD?

Sujay: Yes. The Google Diagnostic Utility was developed solely out of the Hyderabad Engineering Productivity team. It allows our support teams to work with customers to diagnose issues they face with Google products by helping them identify technical specifications and configurations of their systems and computers.

There are others too. The HYD Engineering Productivity team focuses on developing engineering infrastructure, tools, and tests for all of Google. This includes developer tools like IDEs, core engineering infrastructure in the cloud for code compilation, developer testing, code complexity, code coverage, and various kinds of static analysis. For testing tools, the HYD team owns the development of test infrastructure for load and performance analysis for various Google cloud applications as well as test tools and testing for core products like Search, Enterprise, Gmail, and Ads.

HGTS: Okay, I want to follow up on some of these tools. Even the names sound interesting. Can you tell me about the code coverage tool you mentioned? Code coverage always gets a lot of attention on the Google Testing blog!

Sujay: Code coverage is a globally accepted metric to measure the effectiveness of tests for a given code base. In a traditional paradigm, each team would set up dedicated resources (engineering, hardware, and software) to measure code coverage for its project codebase. However, at Google there is a central team based in India that ensures that all Google Engineering efforts get code coverage metrics seamlessly. To get this, teams need to follow a few simple steps to enable the functionality, a one-time investment of less than five minutes. After it is set up, teams get coverage metrics for all their projects and builds with a central reporting location to view and analyze the reports.

Coverage is supported for thousands of projects, covering all major programming languages and millions of source code files. The coverage infrastructure is tightly integrated into Google’s cloud infrastructure to compile and build code and scales to the massive complexity of constant code changes (measured per minute!) and tens of thousands of builds each day. It is set up to scale with a fast-growing Google codebase.

We also have supporting infrastructure that provides smart test prioritization, detecting the relevant tests that need to be run based on the specific code changes. This provides targeted test coverage, better confidence in code quality, and faster feedback, saving Google a huge amount of engineering resources.
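
To make the prioritization idea concrete, here is a minimal sketch of selecting tests from a per-test coverage map: each test is associated with the source files it exercised on its last run, and a new change triggers only the tests whose coverage intersects the changed files. The file and test names are invented for the example; this is an illustration of the technique, not the HYD team’s infrastructure.

# Sketch: choose which tests to run based on recorded per-test coverage.
# The coverage map is invented for illustration.
coverage_map = {
    'gmail/compose_test': {'gmail/compose.py', 'gmail/draft_store.py'},
    'gmail/search_test': {'gmail/search.py', 'index/tokenizer.py'},
    'index/tokenizer_test': {'index/tokenizer.py'},
}


def tests_for_change(changed_files):
    """Return tests whose last-run coverage touches any changed file."""
    changed = set(changed_files)
    return sorted(test for test, covered in coverage_map.items()
                  if covered & changed)


# A change to the tokenizer runs the tokenizer test and the Gmail search test,
# skipping the compose test entirely.
print(tests_for_change(['index/tokenizer.py']))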

HGTS: Sounds like code coverage done right. Now tell me a little about the Diagnostic Utility you mentioned.

Sujay: The Diagnostic Utility was conceived and built through 20 percent efforts of Hyderabad’s Engineering Productivity SETs. It bridges the gap between the technical knowledge of an average computer user and a Google developer’s need for technical data when debugging a user issue.

To understand the reports submitted by Google users, sometimes technical data about the state of Google software is needed. This might range from data as trivial as the OS and locale to more complicated details such as application versions and configurations. Gathering this information in an easy and expedited fashion can be a challenge because the user might not be aware of such details.

The Diagnostic Utility helps simplify this. Now when Google receives an issue where additional data is needed, our support team simply creates a new configuration for this tool, outlining what specific information needs to be gathered. Users can then be contacted via email or pointed to a unique link off of google.com support where they download a small (less than 300KB) Google-signed executable. This executable can diagnose a user’s machine, gather only the configured data that is specific to that request, and display to the user a preview of the data that he can then opt in to send to Google. Obviously the executable cleans up and deletes itself on exit, and we take extreme care to ensure user privacy is maintained and the data gathered is reviewed by the users and submitted with their consent.

Internally, that data is routed to the appropriate developer to accelerate debugging and resolution. This utility is used by Google customer support and is particularly useful to teams like Google Chrome, Google Toolbar, and other client applications. Additionally, it makes getting help from Google much easier for our users.
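
The gather-only-what-was-requested behavior Sujay describes can be sketched in a few lines of standard-library Python: the support team’s configuration names the fields it needs, the utility collects just those, and the user sees the report before anything is sent. The field names and config format here are assumptions for illustration; the real utility is a small signed native executable, not a Python script.

# Sketch: collect only the diagnostic fields named in a support request,
# then show the user exactly what would be sent. Field names are illustrative.
import json
import locale
import platform


def gather_diagnostics(requested_fields):
    available = {
        'os': platform.system(),
        'os_version': platform.version(),
        'machine': platform.machine(),
        'locale': locale.getdefaultlocale()[0],
    }
    return {field: available[field]
            for field in requested_fields if field in available}


report = gather_diagnostics(['os', 'locale'])
print('Preview of data to be sent to Google:')
print(json.dumps(report, indent=2))
# The user opts in (or not) only after seeing this preview.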

HGTS: You mentioned performance and load testing a few times. What’s the story here? I understand you guys were deeply involved in the Gmail perf testing?

Sujay: Google has a wide range of web applications, so ensuring a low latency user experience is an important goal. As such, performance testing (focusing on the speed of JavaScript execution and page rendering) is one of the key verifications done before any product release. Historically, latency issues could take days or weeks to be identified and get resolved. The India Engineering Productivity team developed Gmail’s frontend performance-testing infrastructure to cover critical user actions and ensure that the things users did the most were the subjects of intensive performance testing. This is deployed and tested using instrumented server binaries, and tests are run in a controlled deployment environment that helps identify regressions while maintaining a low variance.

There are three parts to this solution:

Submit queues: Allow engineers to run tests (and gather performance latency metrics) before submitting their code change. This allows faster feedback and prevention of bugs being checked into the codebase.

Continuous builds: Sync test servers to the latest code changes and run relevant tests continuously to detect and trap any regressions. This enables the team to reduce regression detection from days or weeks down to hours or minutes.

Production performance latency detection: Used to identify the particular code change that caused the production latency regression. This is done by bisecting the range of changes and running tests at various checkpoints.

This solution has helped identify many critical bugs before our products are released and has driven quality upstream as developers can easily launch these tests themselves.
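
The production latency detection Sujay mentions boils down to a binary search over a range of changes. The sketch below shows that search in isolation: given an ordered list of changelists and a predicate that reports whether the regression is present at a given checkpoint, it isolates the first bad change in a logarithmic number of test runs. The changelist numbers and the predicate are invented for the example.

# Sketch: bisect an ordered range of changelists to find the one that
# introduced a latency regression. The predicate is a stand-in for
# deploying that checkpoint and running the latency tests against it.
def first_bad_change(changelists, is_regressed):
    """Return the earliest changelist for which is_regressed() is True.

    Assumes the last changelist is regressed and the state never flips back.
    """
    lo, hi = 0, len(changelists) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_regressed(changelists[mid]):
            hi = mid          # regression already present at or before mid
        else:
            lo = mid + 1      # regression introduced after mid
    return changelists[lo]


# Example: pretend the regression appeared at CL 31415.
cls = [31410, 31411, 31412, 31415, 31420, 31427]
print(first_bad_change(cls, lambda cl: cl >= 31415))  # -> 31415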

HGTS: Any new innovative efforts (technical and nontechnical) that you experimented with? What did you learn from them?

Sujay: Some experiments we are working on include feedback-driven development tools that focus on gathering the right metrics and providing data to our engineering teams to improve their productivity. This also includes code visualization tools, code complexity metrics, and some other things a little more out there. Another area is advanced development environments, which includes improving the way engineering teams use IDEs and metrics to improve code quality and their release velocity. Other tools being developed include a Google-wide post-mortem tool that unifies release data and makes it actionable.

HGTS: Any final words about global test engineering based on your experience working from India for a globally distributed software company?

Sujay: It’s not easy, but we’re proof it can work. The key lessons for me have been:

• The “follow the sun” model works well if you choose the right teams and the right projects. We have been able to work through various challenges as a globally distributed team but not without a few missteps. Having a good model for work hand-offs from one time zone to another is key. Also, pick your team and projects carefully. You have to have people who are passionate about the products you build and people who are good collaborators.

• Crowd testing is another thing that works well for us. We leverage the vast pool of talented vendors in the test community in India and take advantage of the time differences using the crowdsource model.

Hiring great talent and putting that talent to work on critical projects is the bottom line. Google didn’t just go after low-cost countries; there are cheaper places than India. We tried to keep the quality of hires and the opportunities that the Googlers work on extremely high. We make a big impact at Google and our TEs and SETs have fulfilling careers. We all win.

An Interview with Engineering Manager Brad Green

“Shoeless” Brad Green has been a test engineering manager for many products at Google including Gmail, Google Docs, and Google+, and he is now a development manager for Google Feedback and also experiments with Web Development Frameworks in a project called Angular. He is known around the office as the guy with big ideas and no shoes!

HGTS: We understand that you come from a development background at Apple. What made you switch to Google and why Engineering Productivity?

Brad: Linus Upson helped bring me in. He and I worked together at NeXT and he seemed really energized by Google. But somewhere in the six-month interview process, Patrick Copeland tricked me into joining Engineering Productivity. It was a good trick. I’ve learned so much in the role and now that I am back being a development manager again, I am the better for it. Perhaps every developer should do a tour of duty in test!

HGTS: What was the most surprising thing about the testing culture at Google when you arrived?

Brad: This was early 2007 and the transformation Patrick spoke about in the preface wasn’t fully baked. I was amazed at how much testing know-how was here. Every time I met with a new group, I found some test expert who surprised me. The problem was that this expertise was so unevenly applied. In the same way that Eskimos have hundreds of words for snow,5 Google had as many terms for styles of tests. I felt like I had to learn a new set of testing terms whenever I dug into another team’s practices. Some teams had great rigor, some did not. It was clear we had to change.

HGTS: What has changed the most in testing since you arrived at Google?

Brad: Two things. First, the average developer has become much more involved in the testing process and in writing automation. They know about unit, API, integration, and full-system tests. They instinctively create more tests when they see problems that have slipped through the continuous build. Most of the time you don’t even have to remind them! This has led to much higher initial quality and the ability to ship quickly. Second, we’ve been able to attract hundreds of top-notch engineers into test-specific roles. I like to think that the two are related. It’s a culture where testing matters and doing testing is something you get appreciated for.

HGTS: Let’s talk about what it is like being a manager at Google. What are the most difficult, easiest, or interesting aspects of being a manager at Google?

Brad: Google hires extraordinarily self-motivated folks. Getting things done “because I said so” might work once, but these smart folks will go off and do what they think is best after too much of that. I’ve been most successful by helping to guide, provide insight, and open doors for my folks to do what they’re most passionate about. If I have to give a direct order, I feel that I’ve failed to equip them with the skills they need to make good decisions. The word manager is in my title but I do as little of it as possible. It’s really a job that requires the leadership of wicked smart and very passionate engineers. Managers beware.

HGTS: Your team has done a lot of work on metrics for developer testing. What metrics work? What data are you tracking? How has it impacted quality?

Brad: Lots of motion, little forward progress to be honest. I’ve been failing in this space for four years now, so I feel I’ve learned a lot! I say failing because we’ve poured tremendous work into finding magic metrics of code and test quality that teams can use as an absolute guide. But metrics are hard to generalize because context is very important. Yes, more tests are better, except when they’re slow or unreliable. Yes, small tests are better than large ones, unless you really need that system test to validate how the whole thing is wired. There are useful measurements, but test implementation is nuanced and is as much an art form as the application code itself.

What I have learned is that the social aspects of testing are much harder than the computer science aspects. Everyone knows they need good tests, but for most of us, it’s hard to get our teams to execute on writing them. The best tool I know is competition. You want more small tests? Compare your team against another. Hey, look at Team X; they have 84 percent unit test coverage and their full test suite runs in under five minutes! How can we let them beat us? Measure everything you can, but put your trust in people. Still, you have to be skeptical: no matter what testing you accomplish in-house, you still have to release to users, and there is a lot about that environment you cannot predict or replicate.

HGTS: So that explains your involvement in Google Feedback! Can you tell us a little about the purpose of Google Feedback? What problem was it built to solve?

Brad: Feedback allows users to report problems they see with Google’s products. Easy, you say. Just slap a submission form on the page and let users go at it, right? Well, many teams had tried this only to find that they couldn’t handle the volume of reports—in many cases, thousands per day. They also often had problems debugging issues because the information users would submit was always incomplete and sometimes inaccurate. These are the problems Feedback addresses.

HGTS: Can you tell us a bit about how Google Feedback works?

Brad: Feedback starts by gathering everything it can from a user while still protecting their privacy. Browser, operating system, plugins, and other environmental information is easy to collect and essential for debugging. The real trick is in getting a screenshot. For security reasons, browsers can’t grab an image of their contents. We’ve re-implemented the browser’s rendering engine in JavaScript in a secure way. We take the screenshot and let the user highlight the area on the page where they’re experiencing problems. We then ask users to describe their problem in text. It’s impossible to train users on how to submit good bug reports. These screenshots have made ambiguous descriptions clear time after time.

HGTS: But with all the users out there, don’t you get the same bug reported over and over? It seems like a volume problem.

Brad: To solve the volume problem, we do automated clustering to group similar reports. If a thousand users all report the same issue, we put them in a single bucket. Finding these thousand issues and grouping them by hand would be an impossible task. We then rank the groups by volume to find which issues affect the most users, giving a clear signal as to what we need to address most urgently.
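
Brad’s clustering step can be illustrated with a toy version: reports are tokenized, compared with a Jaccard similarity over their word sets, and greedily grouped when they are similar enough, then the buckets are ranked by size. The threshold and similarity measure are assumptions for the example and are far simpler than whatever Feedback actually uses.

# Toy sketch of grouping similar feedback reports and ranking buckets by volume.
# Jaccard similarity over word sets; the 0.5 threshold is an arbitrary example value.
def similarity(a, b):
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b) if words_a | words_b else 0.0


def cluster(reports, threshold=0.5):
    buckets = []  # each bucket is a list of similar reports
    for report in reports:
        for bucket in buckets:
            if similarity(report, bucket[0]) >= threshold:
                bucket.append(report)
                break
        else:
            buckets.append([report])
    return sorted(buckets, key=len, reverse=True)  # biggest problems first


reports = [
    'compose button missing in gmail',
    'gmail compose button is missing',
    'calendar event wrong time zone',
]
for bucket in cluster(reports):
    print(len(bucket), bucket[0])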

HGTS: How big is the Google Feedback team?

Brad: The team has 12 developers and three project managers. This is a large number of project managers for a typical team at Google, but we’ve found it necessary with all the horizontal coordination across products.

HGTS: What was the biggest challenge, technically or otherwise, in delivering Google Feedback?

Brad: On the technical side, creating the screenshot was certainly a monumental undertaking. Many folks thought we were crazy for attempting it. It works amazingly well now. The automated issue clustering was and still is a challenge. We do pretty well on issues created in different languages, but we still have a way to go.

HGTS: What is the future of Google Feedback? Any chance it could be available for non-Google websites someday?

Brad: Our goal is to give our users a way to have a conversation with us about the issues they find in our products. Right now, the conversation is one-way. I think the future for us is in completing the conversation loop. We don’t currently have plans to release this for external products, but that sounds like a good idea to me.

HGTS: What do you see as the next big leap in software testing in general?

Brad: I’d like to see a development environment with testing as a first-class feature, rather than one bolted on later. What if the language, libraries, frameworks, and tools all knew that you wanted tests with your feature code and helped you write it? That would be hot. As it is today, we have to cobble test frameworks together. Tests are hard to write, hard to maintain, and unstable when they run. I think there’s a lot of value to be gained by baking in “test” at the lowest levels.

HGTS: Do you have any dirt on Dr. James Whittaker that the world should know about?

Brad: Other than the unfortunate incident in the Little Bo Peep outfit, what happens among the leadership ranks stays in the leadership ranks!

An Interview with James Whittaker

by Jason Arbon and Jeff Carollo

We turned the tables on James and cornered him in his office for an interview. James came to Google amidst a great deal of fanfare and is among our more recognizable test personalities. His posts dominate our blog and his appearances at GTAC bring in the crowds, and there is that whole expert witness, keynote circuit thing. His enormous personality certainly dominates the Seattle and Kirkland offices and reaches across the company like no one else, except, perhaps, Patrick Copeland. Pat might be the boss, but if Google has an intellectual leader in test, that leader is James.

HGTS: You came to Google in 2009 from Microsoft, which you announced on your Microsoft blog without naming the company you were going to. Care to explain why? Were you just creating mystery?

James: A hard ball as the first question? Gads! You promised this would be easy!

HGTS: And you promised you’d answer our questions, so get to it!

James: Well, mostly I used the MSDN blog to broadcast my departure to the largest number of people all at once. I was being kind of a coward about quitting Microsoft and because no one was on Twitter then, I decided to quit on the mass market forum of the day. I wanted to tell as many people as I could without having to walk the halls for a bunch of face-to-face “I’m leaving” meetings. Turns out more Microsofties read my blog than bothered with my emails! It was the best way to get the word out.

The people who knew I was leaving spent a lot of time trying to talk me out of it and it’s hard leaving a company that you enjoy working for. It’s hard leaving the people I spent years working with. I like Microsoft and I respect the engineers who work there. It felt awful to leave and it was bad enough second-guessing myself about the decision. Truthfully, I might have been talked out of it if I had allowed more of those guys the opportunity to try. I really wanted to work for Google, though, and didn’t want to give them the chance to stop that.

HGTS: Why? What was it about Google that drew you to us?

James: You know, it’s weird. I spent my early career as a consultant and professor and I founded a startup and did everything except the big company thing. When I plucked up the courage to work for a big company, I wanted big! The bigger, the better. The more sexy the work, the better. The more users I can reach with my work, the better. I wanted to see how successful I could be in industry, so why not at the top companies? That’s what attracted me to Microsoft long ago and it attracted me to Google later. I want big companies. I might as well work at the best big companies.

But really, on top of all that, it was Google’s emergence as the coolest testing company to work for that got me. Microsoft held that position for a long time and I credit Patrick Copeland for wresting that title from them. Google just seemed to be a cooler place to be a tester than anywhere else.

It was my interviews that sealed the deal. Patrick Copeland, Alberto Savoia, Brad Green, Shelton Mar, Mark Striebeck (and more who aren’t mentioned)... all these people were on my interview loop and the conversations we had were amazing. Alberto and I filled up a whiteboard during my “interview.” He even remarked afterward that he forgot to actually ask me any questions. Shelton and I actually disagreed about things we discussed, but he was so open-minded to my opinions, even though they were different, that it impressed me a lot. We agree more now than we did in the interview! And Brad? Well he’s just cool. I mean he had no shoes on during the interview (in February). Even his attitude was kind of a bare feet sort of vibe. Mark spent most of the interview trying to convince me to come to Google. Ideas were flying in all these interviews. It was a rush, like a drug.

I was exhausted afterward. I remember falling into my cab and thinking I had found the best company to work for but worried I didn’t have the energy to actually work there. I was really worried about being able to contribute. It was intimidating; it was outside my comfort zone. I like a challenge and the idea that my success would not come easy topped it all off. Who wants an easy job?

HGTS: And have we lived up to that reputation?

James: Well, yeah, the job is not easy at all! But I think you mean the passion part. I’ll be honest, there is test passion and smart people at Microsoft too, in quantity. The difference at Google is that it is a lot easier to follow those passions. Alberto and I were never on the same team, but we could work together through 20 percent time, and Brad and I are still collaborating on IDEs and things like automated bug reporting (Brad through Google Feedback and me through BITE). Google is good about providing outlets for such collaboration and you can actually get credit for it as part of your day job.

HGTS: We work with you in Kirkland and we’ve seen a big difference in morale and the pace at which the whole team here gets work done. What’s your secret?

James: I will concede that Kirkland is a much better place now than before I joined, but I will stop short of taking credit for it. Part of it had to do with critical mass. My arrival preceded a massive hiring surge and we brought in some amazingly talented people. I think in the first few quarters, we more than quadrupled our testing staff. This allowed me to build bigger teams and co-locate people doing similar work even if they were working on different products. Instead of lone testers embedded with their development teams, we had groups of testers sitting together and feeding on each other’s ideas. It worked wonders for productivity and morale.

The additional people also allowed me to move the veterans like you two off of your current projects and give you bigger challenges. Jeff was writing a presubmit queue for Google Toolbar. Ga! What a mismatch in talent. Jason, you were testing Google Desktop. I tell you the secret to being a good manager is simply matching the person to the project. You get that right and your job is mostly done. The people are happier and the projects are better off. But it was critical mass that allowed me that luxury.

The other thing critical mass did was give us some breathing room in how we spent our 20 percent time, and it meant I had extra people for experiments. I was able to staff some risky experimental projects. We started doing tools and experiments that had nothing to do with shipping software and more to do with following our passions. I’ve found that there is nothing like tool development to spur creativity and improve the morale of testers. Frankly, I think it is the most satisfying part of the job. Maybe I am more of a tools guy than I am a testing guy at heart.

HGTS: What is it that you like most about the organizational structure at Google?

James: Softball! I actually explain this to candidates when I need to sell them on Google: Testers report to testers and testers determine their own fate. These are the two things I like best about Google. Testers aren’t beholden to anyone. Test has our own hiring committee, our own review committee, our own promotion committee. The U.N. recognizes Test as a separate country!

We don’t have a subservient culture here. And if you want another reason, it would be scarcity of the test role. No dev team is granted testing head count without having to earn it by putting their own skin in the quality game. And testers have to be clever. We are so understaffed that we have to get good at prioritizing. We have to get good at automating. We have to get good at negotiating with developers. Scarcity is the precursor to optimization. Pat did a lot of things right, but scarcity, I think, is the one thing that forced a lot of the culture change.

HGTS: How long did it take you to acclimate to the Google culture given that you came from Microsoft, which doesn’t have a centralized test organization?

James: Pat Copeland gave me two pieces of advice when I started. The first was to take some time to just learn. That was crucial. It takes a while to learn the ropes at a big company and being effective at Google was a different skill set than being effective at Microsoft. I took the first couple of months to do as Pat suggested—listen instead of speak, ask instead of try, and so on. I took him up on it. In fact, I actually think I stretched it for a couple of additional weeks!

HGTS: You said two pieces...

James: Oh yeah, sorry, got carried away. Occasionally Pat says wise things. I guess I was savoring one of them! The second I didn’t like so much, but it was even better advice as it turns out. He took me aside and basically said “Look dude, I know you have a reputation outside Google, but inside, you haven’t accomplished anything yet.” Pat is rarely subtle; you don’t have to waste a lot of time looking for nonvisual cues from him. His messages are always clear and this time, the message was that Google didn’t care what I did before I got here. I had to succeed at Google as a Googler or none of the other stuff mattered. You can’t get ahead at Google by walking around and proselytizing. His suggestion was that I take on something big and ship it, with some distinction if possible. I chose Chrome and Chrome OS and did just that. I was the first test manager ever for Chrome OS and after it shipped, I passed it on to one of my directs. Pat was right. It is easier to get things done after you’ve done something significant. My resume got me in the door, but success inside the walls of Google was crucial. The fact that I did it and contributed to a product people cared about made me someone to listen to. If I ever change jobs again, I’ll reuse this formula: Learn first, build street cred second, and then start looking for ways to innovate.

HGTS: So beyond product testing, were there any other big areas Pat asked you to pick up?

James: Yeah, he asked me to own the TE discipline. The SET role had been around for a while and was pretty well understood so we knew the job ladder pretty well. We understood expectations and knew how to manage SETs for review and promotion. But for TEs, we were still trying to figure it out. Pat timed the renewal of focus on the TE role for my arrival. I think he knew he was going to use me for this even before I started. I actually suspect that he thought the pendulum had swung too far toward the SET and wanted to breathe life back into the TE role. Mind you, he never told me that, it’s just my feeling.

HGTS: So what did you do for TEs?

James: Pat and I started a TE workgroup, which is actually still in existence. We met every other week, initially for two hours, and eventually down to monthly for an hour. Pat attended a few meetings and then left it to me to run it. The group was made up of about 12 TEs who were all hand-picked by Pat. I was too new to know any of them. For the first meeting, we started two lists: what rocked about the role and what sucked about it. Just making these lists was cathartic for many of the attendees. They were complaining about the same things and agreeing about what rocked. I was struck with how honest they were with Pat there. No one was sugar-coating anything! How many meetings did I attend during my career where everyone waited for the most important person in the room to talk and held their tongue because that person was there? Not at Google. No one cared what Pat thought, the meeting was about them, and if Pat couldn’t take it, then that was his problem. It was incredibly refreshing.

That workgroup did a lot to define the TE role and rewrote the job ladder used for TE promotion guidelines. We put the new ladder to an open vote among the entire TE community and it was approved. It was cool; I spot-bonused the whole workgroup to celebrate. It was a real grassroots effort and it succeeded. We also wrote a bunch of interview guides that were sent around to SETs and SWEs to teach them how to interview proper testers. The same information went to recruiters as well. I think it is safe to say that the TE role is as well defined as the SET role now.

HGTS: You’ve been at Google long enough to learn its inner workings. Can you boil Google’s secret sauce down for us? What are the key ingredients to our testing magic?

James: Tester skill set (including computer science degrees), scarcity of testing resources as a forcing function to get developers to help and to get testers to optimize, automation first (so humans do only what computers aren’t good at), and our ability to iterate and rapidly integrate user feedback. Any company looking to mimic what we have done should start with these four things: skill, scarcity, automation, and iterate-integrate. That’s your recipe. Go cook up some Google testing stew!

HGTS: What about another book? Do you have another testing book in you?

James: I don’t know. I never plan my books in advance. My first book started its life as course notes that I used at Florida Tech to teach testing to my students. I didn’t plan on making it a book until I presented it at STAR and some lady came up to me and asked if I had ever considered writing a book. It was a loaded question. She was a book publisher and that’s where How to Break Software came from. I wrote every word of that book and it was a long and cumbersome process that burned me out on book writing. My next two books both had coauthors. Hugh Thompson wrote How to Break Software Security and I helped. Mike Andrews wrote How to Break Web Software and again my role was to help. Those are really their books. I was there as a writer and thinker and manager of getting it done. I do love to write and neither Hugh nor Mike would begrudge me saying that I am better at it than they are. Would you two? I didn’t think so. Neither of them would have written those books (although Hugh did go on to write another, still my conjecture holds!) if it hadn’t been for me. Ultimately, whatever is going on in my professional career ends up as a book and the people around me end up as coauthors. Deny it, I dare you.

HGTS: Uh, well, readers are kind of holding the proof of what you say in their hands! We decline denial!

James: I’m not completely incapable of writing solo and the Exploratory Testing book was the next one I did alone. That one also came out of presentations I was giving at conferences. I did a stint of tutorials and the material built up until it was enough for a book. I am not sure I would have written this book if you two hadn’t agreed to help. This was the first time the book was a total collaboration, though. I think the three of us contributed pretty equally.

HGTS: We did and we are both really happy to be involved. We may be able to code circles around you (and Jeff can code laps around you and most other people on the planet), but we’ll grant you the readability in the English language! But tell us, what is your favorite part of the book?

James: All of it. This book was really fun to write. It wasn’t so much about conjuring the material because the material was all there. All we had to do was document it. I guess if I had to pick, it would be the interviews. They were fun to give and fun to write up. I actually wish I had done these interviews when I first started. Hung Dang’s was a real highlight. He and I spent time touring his Android lab and debating testing philosophy, and the interview was intense. I hadn’t taken notes that fast since I was in grad school. It was the most quality time I ever spent with him. I learned a lot about the people and processes here, much of which I really didn’t know until I had to write it down. I guess that’s the thing about being a journalist, you really get to know your subjects.

HGTS: What would you do if you weren’t into testing?

James: Well, in the technology field I’d work in developer tools and developer evangelism. I’d like to make writing software easier. Not everyone is as good at it as Jeff Carollo! I can’t believe we are still hand-coding applications. The developer skills I learned back in college in the 80s are still applicable. That’s crazy. The entire world of technology has changed and we’re still writing in C++. Why hasn’t software development gotten any easier? Why is it that writing crappy, insecure code is still the default case? Writing good code should be easier than writing bad code. That’s the problem I would work on.

The evangelism part is important too. I love public speaking and I love talking about technology to technologists. The idea that someone would hire me to interact with developers as my day job is even more exciting than testing. Let me know if you find any gig like that.

HGTS: And what if you got out of the technology field?

James: Well that’s harder because I don’t see my next career. I am still passionate about this one. But I think I’d like to teach management courses. You guys keep telling me I am a decent manager and one of these days I am going to sit down and think about why I don’t suck at it. Maybe that will be my next book, How Not to Suck as a Manager. I’d also like to do work for the environment. I like this planet and think it’s worth keeping nice.

Oh, and I like my beer. I very much like my beer. I can see doing the whole Norm from Cheers thing one of these days. I can picture it now: I walk into my local and everyone shouts “James!” and whoever is on my bar stool makes way.

Now that’s what I call a job well done. I’d peer bonus Norm out of sheer respect.
