Chapter 10. Code Review for Security

We’ve looked at how to deal with security in planning, in requirements, and in design. Now it’s time to deal with security at the code level.

At least half of all security vulnerabilities are introduced during coding: by developers making simple programming mistakes, being careless, missing a requirement, or misunderstanding or misusing the language, libraries, and frameworks.

There are two different basic approaches for catching problems, including security issues, in code:

Testing

Whether automated or manual, including black-box scanning for security vulnerabilities.

Code reviews

Including pair programming and peer reviews, code audits, and automated code scanning.

We’ll look at the strengths and weaknesses of both approaches in the next two chapters. Let’s start by understanding how code reviews fit into Agile development, and how they can be used to find important problems in development.

Why Do We Need to Review Code?

Code reviews are done for many different reasons:

Governance

Peer reviews can play an important role in change control by ensuring that at least one other person is aware of, and has implicitly or explicitly approved, the code change.

Transparency

Code reviews provide team members with information about what is happening in the project, creating awareness of how the system works and how it is changing. By shining a light on every change, reviews also minimize the threat of a malicious insider planting a logic bomb or back door or trying to commit fraud.

Compliance

Code reviews are required by compliance regulations such as PCI DSS.

Consistency

Code reviews help reinforce coding conventions, style, and patterns across the team.

Learning

Code reviews provide an opportunity for less experienced developers to learn about good practices and tricks of the trade. Both authors and reviewers can learn from each other.

Sharing

Reviewing each other’s code creates a shared sense of ownership. If reviews are done regularly, developers become less protective about their code, and more open to change and feedback.

Accountability

Knowing that somebody else is going to look closely and critically at your work encourages developers to be more careful and thoughtful and take fewer shortcuts, which means better code.

But what’s most important for our purposes is that code reviews are an effective way to find bugs, including security vulnerabilities, as long as they are done properly.

Types of Code Reviews

There are many different ways to review code:

  • Formal inspections

  • Rubber ducking (self-review)

  • Pair programming

  • Peer code reviews

  • External code audits

  • Automated code reviews

Let’s look at the strengths, weaknesses, and costs of each approach.

Formal Inspections

Formal code inspections are done in meetings by a review team (involving the author of the code, a code reader who walks through the code, one or more reviewers, and a moderator or a coach) sitting around a table carefully looking at code printouts or code projected onto the wall. They are still done by some teams, especially in high-risk, life-critical systems development. But they are an expensive and inefficient way to find problems in code.

Several studies have shown that setting up and holding formal code inspection meetings significantly adds to development delays and costs without adding much value. While it can take hours to do all the planning, paperwork, and follow-up, and weeks to schedule a code inspection meeting, less than 5% of defects are actually found in the meeting itself. The rest are found by reviewers looking through the code on their own while preparing for the meeting.1

Rubber Ducking or Desk Checking

The term “rubber ducking” is based on the idea that if you don’t have anyone else to review your code, walking through the code and explaining it to an imaginary other person (or a rubber duck) is better than not reviewing the code at all.

Self-reviews don’t meet governance, compliance, transparency, and information sharing requirements. But if they are done in a disciplined way, self-reviews can still be an effective way to find defects, including security vulnerabilities.

In one study on code reviews at Cisco Systems, developers who double-checked their work found half of the defects that other reviewers found, without any help.

Take a Break Before Reviewing Your Own Code

If you can afford to wait a few days between coding and reviewing your own code, you will be more likely to see your mistakes.

Pair Programming (and Mob Programming)

Pair programming, where two developers write code together, one at the keyboard driving, and the other navigating and assisting (like a couple on a road trip), is a fundamental practice in Extreme Programming (XP). Pair programming provides immediate, continuous code reviews: developers work together closely, share ideas, look at problems, and help each other write better code. Organizations such as Pivotal Labs are famous for their success with pair programming.

Pair programming is about joint problem-solving, driving, and navigating toward a solution. It’s a great way to bring a new team member up to speed or to debug a tricky problem. Pairing tends to result in cleaner, tighter code, with clearer abstractions, and fewer logical and functional errors. But unless you are pairing developers with a security expert or a technically strong experienced developer, pair programming won’t necessarily result in more secure code.

Pair programming also comes with costs. While two heads are better than one, it’s obviously more expensive to have two developers working on the same piece of code together. Pairing is also an intensive, highly disciplined practice, which many people find socially straining, mentally exhausting, and difficult to sustain. Few developers pair up more than a few hours a day or a few days in a week.

A more extreme version of pair programming is mob programming, where an entire team works together on the same piece of code, using a single computer. This approach emphasizes collaboration, team problem-solving, and learning. Only a small number of organizations have had success working this way, but the benefits should include complete transparency and even better quality code.

Peer Code Reviews

Lightweight, informal peer code reviews are common practice in many Agile and DevOps teams. Organizations like Google, Microsoft, Facebook, and Etsy all encourage, or insist on, peer reviews before code can be released to production. Code reviews are also a key part of the workflow for making changes in most large open source projects, such as the Linux kernel, Apache, Mozilla, and Chromium.

Peer reviews are either done by people sitting side by side (in over-the-shoulder reviews), or by requesting a code review through email, through a Git pull request, or using a collaborative code review tool like Phabricator, Gerrit, Review Board, Code Collaborator, or Crucible. Using these tools, reviewers—even in distributed teams—can share feedback with the author and with each other, comment or annotate code that should be changed, and open discussion threads. This automatically creates records that can be used for governance and compliance.

Where peer code reviews are already part of the engineering culture and practices of a team, you can take advantage of this in your security program by teaching the team to include checking for secure coding practices and security-related risks and issues.

Code Audits

While pair programming and peer reviews are generally done as part of the day-to-day job of writing code, code audits are separate and outside of development. In a code audit, an expert security reviewer (or a small team of reviewers) from outside of the team examines the code base for security vulnerabilities, or at least as much of the code base as they can within the time allowed. These reviews are often required for compliance reasons or to manage important risks.

Code audits usually take several days of reviewer time, as well as time from the team to help the reviewer(s) understand the design and context of the system and the structure of the code, and then more time from the team to understand what the reviewers found (or think that they found) and to put together a plan to address the findings.

Code auditors bring specialized security knowledge or other expertise that most developers won’t have. But because this work can be so mentally fatiguing, and because reviewers usually have limited time and limited familiarity with the code, they may miss important issues. A successful audit depends on the reviewer’s experience, understanding of the language and technology platform, ability to come up to speed with what the system does and how it does it, and mental stamina.

We’ll look more at code audits in Chapter 12, External Reviews, Testing, and Advice.

Automated Code Reviews

Code scanning tools can be used to automatically review code for bad coding practices, and common coding mistakes and vulnerabilities. At a minimum, automated reviews using code scanning tools act as a backstop to manual reviewers, by catching careless and often subtle errors that are hard to see in the code.

We will look at the strengths and weaknesses of automated code reviews in more detail later in this chapter.

What Kind of Review Approach Works Best for Your Team?

To be effective and sustainable in a fast-moving environment, code reviews need to be Agile: lightweight, practical, inexpensive, and fast.

Formal inspections are rigorous, but they are expensive and too slow to be practical for most Agile teams.

As we’ve seen, getting developers to carefully review their own work can catch bugs early. But self-reviews don’t meet compliance or governance requirements, and they don’t provide increased transparency into changes or help with sharing information and ideas across the team, so this approach will not “do,” even if it is better than not doing a review at all. Unless, of course, there’s only one of you.

Code audits sit outside of the development cycle: they are a point-in-time risk assessment or compliance activity. You can’t rely on them as part of day-to-day development.

Pair programming isn’t for everyone. People who like it, like it a lot. People who don’t like it, don’t like it one bit. While pair programming is a great way for developers to share ideas and solve hard problems together, mentor new team members, and refactor code on the fly, it is not necessarily an effective way to find and prevent security issues.

This leaves peer reviews. Practiced properly, asking team members to review each other’s code is probably the most effective approach for Agile environments. This kind of review can be done often, in small bites, without adding significant delays or costs. We’ll focus on how to incorporate peer reviews into development, and how to include security checking into peer reviews, in this chapter.

When Should You Review Code?

There are different points in development where you can and should review code: before the code is committed to the mainline, before code changes are released, and after problems are found.

Before Code Changes Are Committed

The most natural and the most valuable time for code reviews is before the change is committed to the code mainline. This is how many Agile teams and open source teams, especially teams using collaborative code review tools like Gerrit, work today.

Modern source code management systems such as Git make this easy to do. In Git, engineers create a pull request when they want to push code changes to a source repo. The pull request tells the rest of the team about the change and gives them an opportunity to review and discuss the change before it can be merged. Repo administrators can enforce rules to require approvals and set other contributing guidelines.

In many environments, enforcing code reviews up front is the only way to ensure that reviews get done at all: it can be difficult to convince developers to make code changes after they have already checked code in and moved on to another piece of work.

This is also a good place to add some automated checking. For example, ThoughtWorks has built a pre-commit tool for teams using Git, called Talisman, which looks for suspicious things in code, like secrets, and blocks the code from being checked in. You could extend this tool (or build your own) to implement other checks that are important in your environment.
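To make the idea concrete, here is a minimal sketch, in Python, of the kind of secret-scanning check a pre-commit tool like Talisman performs. The patterns are illustrative assumptions only; a real tool ships a much larger, tuned rule set, with entropy checks and suppression support.

```python
import re

# Illustrative patterns only: a real tool such as Talisman uses a much
# larger, tuned rule set with entropy checks and per-repo suppressions.
SUSPICIOUS_PATTERNS = [
    (re.compile(r"(?i)(api[_-]?key|secret|password)\s*[:=]\s*['\"][^'\"]+['\"]"),
     "possible hardcoded credential"),
    (re.compile(r"-----BEGIN (RSA|EC|DSA|OPENSSH) PRIVATE KEY-----"),
     "private key material"),
]

def scan(text):
    """Return (line_number, message) findings for one file's contents."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, message in SUSPICIOUS_PATTERNS:
            if pattern.search(line):
                findings.append((lineno, message))
    return findings
```

Wired into a repo as a pre-commit hook, a script like this would run `scan()` over each staged file (for example, the paths listed by `git diff --cached --name-only`) and exit with a nonzero status to block the commit when anything suspicious turns up.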

Gated Checks Before Release

Security and compliance can insist that code reviews are completed at least for certain stories, or for high-risk fixes, before the code can be considered done. This means that the team cannot move forward and can’t deploy changes until these reviews have been completed and any issues found during the reviews are resolved.

Your goal should be to work with the team and keep it moving in a safe way. Instead of leaving these checks to the end as a final check before release, build steps into the team’s workflow so that the security team is immediately notified when these reviews are required and can do them as early as possible. Force the team to stop work only when necessary to keep risks under control.

Postmortem and Investigation

Another important time to do code reviews is in a postmortem situation, after a security breach or outage, or if nasty surprises are found in an external audit or a pen test. We will look more at postmortems and how to do them properly in Chapter 13, Operations and OpSec.

A postmortem code review is usually done by senior members of the team and may include outside experts, depending on how bad the problem was. The goals of these reviews are to ensure that you understand what went wrong, that you know how to fix it, and to help with root cause analysis—to dig deep into why the problem happened and find a way to prevent problems like this in the future.

There will also be some kind of paperwork required, to prove to somebody—senior management, auditors, regulators—that the code review was done properly, and to document follow-up actions. This can be expensive, serious work. Make sure that people learn as much as they can from the “opportunity” that a bad problem presents.

How to Review Code

The team, including security and compliance, the Product Owner, and management, needs to agree on how code reviews are done.

This includes rules of conduct that guide the team on how to give and accept feedback constructively without hurting people and pulling the team down.

Giving critical feedback isn’t easy for developers. Accepting critical feedback can be even harder. Even if you are careful not to tell somebody they are stupid or did something stupid, they might take your feedback that way. Enforce a “no asshole rule”: be critical, but polite. Encourage reviewers to ask questions rather than pass judgments whenever possible. Remember, you are reviewing the code, not the person. Instead of saying, “You should not do X,” try, “The code should do Y instead of X.”

Another important rule of conduct is that reviewers must commit to responding to review requests in a reasonable time frame. This is important in scheduling and to ensure that the team doesn’t lose momentum.

Take Advantage of Coding Guidelines

Coding guidelines are important in Agile teams to support collective code ownership, the principle that anybody on the team should be able to take on any coding assignment or fix any bug (within his or her ability), and everyone can refactor each other’s code. In order to do this effectively, the code has to be consistent and follow a common style and conventions.

The team’s code guidelines should prescribe surface issues like element naming and code indentation and formatting, as well as more important underlying issues: proper use of frameworks and libraries, auditing and logging patterns that need to be followed, and banned practices and banned function calls.

There are some good, freely available coding guidelines that you can use.

Even if your team and your code are wildly outside compliance with these guides, it’s worth picking one as a target and defining a plan to iterate toward that target. This will be easier than trying to create your own guide from scratch.

By following one of these guides, you should end up with cleaner, more consistent code that is easier to understand and review, and code that is more secure by default.

Using Code Review Checklists

Code review checklists are an important tool in reviews. But these checklists can be, and should be, short and focused.

You don’t need a checklist item to tell the reviewer to check whether the code is understandable, or that it actually does what it is supposed to do. Basic rules of programming or language-specific rules don’t need to be on a checklist. Automated static analysis tooling, including checkers in the developer’s IDE, should catch these issues.

Note

Checklists are used by airline pilots and ICU nurses and surgeons to remind them about small, but important things that are easy to forget while they are trying to focus on the main task.

For more on checklists, read Dr. Atul Gawande’s excellent book, The Checklist Manifesto (Metropolitan Books).

Use checklists to remind people to look for things that aren’t obvious. Remind reviewers to look out for things beyond code clarity and correctness—for things that are important, but easy to overlook and forget, and things that tools won’t find for you. As we’ll see in this chapter, this should include things like correct error handling, looking out for secrets in code and configuration, tracing the use of private or sensitive data, and watching for debugging code or test code left in by accident. Build your checklists up from issues that you find in production or in pen testing so that you don’t repeat the mistakes of the past.

Don’t Make These Mistakes

There are some common mistakes and anti-patterns that you need to avoid when it comes to code reviews:

  1. Believing that senior people don’t need their code reviewed. Everyone on the team should follow the same rules, regardless of seniority. In many cases, senior team members are the ones who take on the most difficult coding tasks and problem-solving work, and this is the most important code to review.

  2. Not reviewing legacy code. A study of the honeymoon effect in software development found that there is a honeymoon period after new software features and changes are deployed, before attackers have had a chance to understand them and attack them. Most successful attacks are made against code that has been out long enough for bad guys to identify vulnerabilities and find a way to exploit them. This is especially the case for popular third-party and open source libraries and platform code. Once attackers have found a vulnerability in this code and learned to fingerprint it, the risk of compromise increases significantly.2

    So while you need to review changes as they are being made, because it is much easier and cheaper to fix the mistakes that you find right away, it is still important to look at older code, especially if you have code written before the team got training on secure development.

  3. Relying only on automated code reviews. Automated code scanning tools can help you to find problems in code and identify where code needs to be cleaned up, but as we’ll show later, they are not a substitute for manual code reviews. This is a case of AND, not OR.

Avoiding these fundamental mistakes will help you improve the security and quality of your code.

Review Code a Little Bit at a Time

Another common mistake is forcing team members to review large change sets. Experience, backed up by research, shows that the effectiveness of code reviews is negatively correlated with the amount of code reviewed. The more files and lines of code a reviewer has to look at, the more tired he will get, and the less he will be able to find.

A reviewer who is forced to look at 1,000 lines of code changes might comment on how hard it is to understand or how something might be made simpler. But she won’t be able to see all the mistakes that are made. A reviewer who only has to look at 50 or 100 lines of code will have a much easier job of finding bugs, especially subtle mistakes. And he will be able to do this much faster.

In fact, research shows that the effectiveness of code reviews starts to drop off after around 200 lines of code, and that the number of real defects that reviewers find falls off sharply when they are asked to review code for more than an hour (see the “Modern Code Review” chapter of Making Software [O’Reilly]).

This is an unavoidable problem in code audits, where auditors need to scan through thousands of lines of code each day, often for several days at a time. But it should be much less of a concern for Agile teams, especially teams following continuous integration, because they tend to make small, iterative changes, which can be reviewed immediately on each check-in.

What Code Needs to Be Reviewed?

The team needs to agree on what code has to be reviewed, and who needs to be involved in code reviews.

In a perfect world, all code changes should be reviewed for maintainability, for correctness, and for security. This demands a high level of engineering discipline and management commitment.

If you aren’t ready for this yet, or can’t convince your management or your customer to give you the time to review all code changes across the board, you can still get a lot of benefit by taking a pragmatic, risk-based approach. Most changes in continuous integration are small and incremental, and carry limited security risk, especially code that will be thrown away quickly when the team is experimenting and iterating through design alternatives.

For many of these changes, you can lean on automated code scanning tools in the developer’s IDE and in continuous integration to catch common coding mistakes and bad coding practices. As we’ll see later in this chapter, there are limitations to what these tools can find, but this might be good enough to contain your risks and help you to meet your compliance requirements (regulations like PCI DSS require all changes to be reviewed, but also allow for automating code reviews).

But some code, such as the following, needs to be looked at more closely and carefully, because the risk of making and missing a mistake is too high:

  • Security features like authentication and access control, crypto functions, and code that uses crypto functions

  • Code that deals with money, or with private or confidential data

  • APIs with other systems that deal with money or with private or confidential data

  • Web and mobile UIs (i.e., big changes or new workflows)

  • Public network-facing APIs and file imports from external sources (i.e., code that is frequently exposed to attack)

  • Framework code and other plumbing

  • The first code using a new framework or design approach, until the team gets comfortable with how to do this properly

  • Code written by new team members (at least if they aren’t pairing up with a more experienced team member) to make sure that they follow the team’s coding patterns and guidelines

Changes to frameworks or security features or public APIs should also be reviewed in design, as part of threat modeling, which is discussed in Chapter 8, Threat Assessments and Understanding Attacks.

High-Risk Code and the 80:20 Rule

The 80:20 rule reminds us that most problems in a system tend to cluster in specific parts of the code:

80% of bugs are found in 20% of the code

Some studies have found that half of your code might not have any bugs at all, while most bugs will be found in only 10–20% of the code, likely the 10–20% of the code that is changing most often.3

Tracking bugs can help you to understand where to focus your reviews and testing. Code with a history of bugs should also be a strong candidate for careful refactoring or rewriting.

Although it will usually be obvious to the team what code changes carry risk, here are some steps to identify and track code that should be reviewed carefully:

  • Tagging user stories for security features or business workflows which handle money or sensitive data.

  • Grepping source code for calls to dangerous function calls like crypto functions.

  • Scanning code review comments (if you are using a collaborative code review tool like Gerrit).

  • Tracking code check-in to identify code that is changed often: code with a high rate of churn tends to have more defects.

  • Reviewing bug reports and static analysis to identify problem areas in code: code with a history of bugs, or code that has high complexity and low automated test coverage.

  • Looking out for code that has recently undergone large-scale “root canal” refactoring. While day-to-day, in-phase refactoring can do a lot to simplify code and make it easier to understand and safer to change, major refactoring or redesign work can accidentally change the trust model of an application and introduce regressions.
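The churn-tracking step above can be sketched with a short script that parses `git log` output. This is a rough heuristic under stated assumptions: commit counts per file stand in for churn, and the time window and top-20 cutoff are arbitrary starting points.

```python
import subprocess
from collections import Counter

def rank_churn(log_output, top=20):
    """Count how often each file appears in `git log --name-only` output.

    High-churn files are a rough proxy for defect-prone code and good
    candidates for closer review.
    """
    counts = Counter(line.strip()
                     for line in log_output.splitlines() if line.strip())
    return counts.most_common(top)

def churn_hotspots(repo_dir=".", since="6 months ago", top=20):
    # `--pretty=format:` suppresses the commit headers, leaving only the
    # file paths touched by each commit (one per line).
    out = subprocess.run(
        ["git", "log", "--since", since, "--name-only", "--pretty=format:"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return rank_churn(out, top=top)
```

Cross-referencing the resulting hotspot list against bug reports and test coverage will tell you where reviews (and refactoring) are likely to pay off most.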

Netflix has an interesting way to identify high-risk code. It wrote a tool called Penguin Shortbread, which maps out call sequences for microservices. Services that are called by many other services or that fan out to many other services are automatically tagged as high-risk dependencies that need to be reviewed.

At Etsy, as soon as high-risk code is identified through code reviews or scanning, its developers hash the code and create a unit test that automatically alerts the security team when the code hash value has been changed through a check-in.
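A minimal sketch of this kind of hash check might look like the following. The idea is simply to record a fingerprint of high-risk code when it is reviewed, and fail a test (alerting the security team) when the fingerprint changes; the function names here are ours, not Etsy’s.

```python
import hashlib

def file_fingerprint(contents: bytes) -> str:
    """SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(contents).hexdigest()

def check_reviewed(contents: bytes, reviewed_hash: str) -> bool:
    """True if the file still matches the fingerprint recorded at its
    last security review. A failing check in CI signals that high-risk
    code has changed and needs another look."""
    return file_fingerprint(contents) == reviewed_hash
```

A unit test would read the high-risk source file, compare it against the stored baseline hash, and fail loudly on any mismatch until the code is re-reviewed and the baseline updated.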

Finally, you can make the team responsible for watching out for risks in code, and encourage developers to ask for a code review whenever they think they need help.

Who Needs to Review Code?

Once you agree on what code needs to be reviewed and when reviews should be done, you need to decide who needs to be involved in code reviews, and how many reviewers need to be involved.

Can anyone on the team act as a reviewer, or do you need to include somebody who has worked on the code before, or a subject matter expert? When do you need to get more than one person to review the code?

How Many Reviewers?

Reviews (and pair programming) are based on the reasonable principle that two people will find more problems than one person can on her own. So if two heads are better than one, why not have more reviewers, and get even more eyes on the code?

At Google and Microsoft, where they’ve been doing code reviews successfully for a long time, experience shows that two reviewers seems to be the magic number. Most teams in these organizations require two reviews, although there are times when an author may ask for more input, especially when the reviewers don’t agree with each other.

Some teams at Microsoft specifically ask for two different kinds of reviews to get maximum value from each of the reviewers:

  1. A review before the code is checked in, focused more on readability and clarity and maintainability

  2. A second review (done before or after check-in) to look for risks and bugs

Studies have found that a second reviewer will only find half as many new problems as the first reviewer. Beyond this point, you are wasting time and money. One study showed no difference in the number of problems found by teams of three, four, or five individuals, while another showed that two reviewers actually did a better job than four.4

This is partly because of overlap and redundancy. More reviewers means more people looking for, and finding, the same problems (and more people coming up with false positive findings that the author has to waste time sifting through). You also encounter a “social loafing” problem. Complacency and a false sense of security set in as you add more reviewers: because each reviewer knows that somebody else is looking at the same code, they are under less pressure to find problems on their own.

But what’s even more important than getting the right number of reviewers is getting the right people to review your code.

What Experience Do Reviewers Need?

A new team member will learn a lot by having an experienced team member review his code, reinforcing his understanding of how the system and the team work. By reviewing code for other people on the team, a new developer will gain exposure to more of the code base and learn something about the system. But this is an inefficient way to learn. And it misses the main point of asking for a code review, which is to find and correct as many problems with the code as early as possible.

Research backs up what should be obvious: effective code reviews depend heavily on the reviewer’s skill and familiarity with the problem domain and platform, and her ability to understand the code. A study on code reviews at Microsoft found that reviewers from outside of the team, or who were new to the team and didn’t know the code or the problem area, could only do a superficial job of finding formatting issues or simple logic bugs. Like other areas in software development, the range of individual performance can be huge: top performers are 10 times more effective in finding problems and providing valuable feedback.

This means that your best, most experienced developers will spend a lot of time reviewing code, and they should. You need reviewers who are good at reading code and good at debugging, and who know the language, frameworks, and problem area. They will do a much better job of finding problems and can provide much more valuable feedback, including suggestions on how to solve the problem in a simpler or more efficient way, or how to make better use of the language and frameworks. And they can do all of this much faster.

Note

Regulations may also dictate how much experience reviewers require. For example, PCI DSS 6.3.2 requires that reviewers must be “knowledgeable in code review techniques and secure coding practices.”

If you want new developers to learn about the code and coding conventions and architecture, it will be much more effective to partner them up with an experienced team member in pair programming or pair debugging exercises than asking them to review somebody else’s code. If you have to get inexperienced developers to review code, lower your expectations. Recognize that you will need to depend a lot more on automated static analysis tools and testing to find real problems in the code.

Automated Code Reviews

Automated static analysis scanning tools should be part of your code review program for the following reasons:

  • Automated code scanning is the only practical way to ensure coverage on a large legacy code base and to provide a level of insight into the security and quality of this code.

  • Static scanning can be done on a continuous basis against all of your code. The tools never get sick or tired of looking for the same kinds of problems and are much more consistent than manual reviewers.

  • Unlike human reviewers, good static analysis tools won’t get held up by bad element naming or indentation or other cosmetic issues.

  • While as we’ll see, automated code scanners may miss finding many important vulnerabilities, they are good at finding certain bugs that are important for reliability and security. This is especially valuable if you don’t have strong security skills in-house to do effective manual reviews.

  • Automated code review tools are an accepted alternative to manual code reviews in regulations such as PCI DSS. For many teams this is a practical and cost-effective way to meet compliance requirements.

Some static analysis tools scan byte code or binaries. These are the easiest to set up and get started with, because you can point the scanner at a deployed package and run it, or upload binaries to a scanning service like Veracode and let them scan the code for you. This approach is especially useful for checking code that you don’t have source for.

Other tools scan source code directly, which means that you don’t need to wait to compile the code before you scan it, and you can scan individual pieces of code or change sets. But in order to get a complete scan of the system, you need to understand the application’s library structures and all the code dependencies.

Automated tools will usually point you to the specific line of code where problems are found (and often show you how this statement was reached). Some tools do simple pattern matching or linting, while other tools build up an abstract model of the application, map out data flows, and walk code execution paths to find vulnerabilities like SQL injection and cross-site scripting. Because this can take a long time for large code bases, some analysis engines will save the abstract model, scan only the code that has been changed, and update the model incrementally.
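To make the data-flow idea concrete, here is a minimal Python sketch (using the standard library’s sqlite3 module; the table and payload are invented for illustration) of the source-to-sink flow that these analyzers trace. The unsafe version is exactly the tainted-data pattern a scanner flags:

```python
import sqlite3

def find_user_unsafe(conn, name):
    # Tainted data flows straight into the SQL text: a data-flow analyzer
    # that tracks 'name' from its source to the execute() sink flags this
    # as a potential SQL injection.
    return conn.execute("SELECT id FROM users WHERE name = '%s'" % name).fetchall()

def find_user_safe(conn, name):
    # Parameterized query: the input is bound as data, never spliced into
    # the SQL text, so the taint never reaches the sink.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

payload = "' OR '1'='1"                 # classic injection input
print(find_user_unsafe(conn, payload))  # [(1,)]: the OR clause leaks every row
print(find_user_safe(conn, payload))    # []: the payload is treated as a literal
```

A reviewer or scanner looking at the unsafe version doesn’t need to know the payload; the string formatting between source and sink is enough to raise the flag.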

Code analysis tools can catch common but important coding mistakes and can help enforce good coding practices, so that reviewers can focus on other important problems. They are generally good at finding the following issues:

  • Sloppy code, including code that is poorly structured, dead or unreachable code, copy-and-pasted code sections, and code that violates recognized good coding practices.

  • Subtle coding mistakes that compilers should catch but don’t: the kind of mistakes that are hard to find in testing and in manual code reviews, like errors in conditional logic, buffer overflows and memory corruption problems in C/C++, and null pointer dereferences in Java.

  • Missed data validation and injection vulnerabilities, where programmers fail to check the length of input strings or pass unsanitized “tainted data” on to an interpreter such as a SQL database engine or a browser.

  • Common mistakes in security functions like applied crypto (weak ciphers/hash functions, weak keys, weak random number generators).

  • Common configuration problems such as insecure HTTP headers or cookie flags.

These tools should be a part of your code review program, but they shouldn’t be the only part of your code review program. To understand why, let’s start by looking at the different types of tools and what they are good for.

Different Tools Find Different Problems

For most environments you have a choice of static analysis tools, designed to look for different problems.

Compiler warnings

Static code analysis should start with checking for compiler warnings. The compiler writers put these warnings in for a reason. You don’t need to buy a tool to tell you something that your compiler will already find for you up front. Turn the warning levels up, carefully review the findings, and clean them up.

Code style and code smells

Tools that check for code consistency, maintainability, and clarity (PMD and Checkstyle for Java, ruby-lint for Ruby) help developers to write code that is easier to understand, easier to review, and easier and safer to change. They can help to make sure that all of your code is consistent. They will also point out areas of potentially bad code, including code that doesn’t follow safe conventions, and copy-and-paste or merge mistakes that could be serious.

But unless you are following a recognized style guide from the beginning, you will need to customize some of the coding style rules to match your team’s coding conventions.

Bug patterns

Tools that look for common coding bugs and bug patterns (tools like FindBugs and RuboCop) will catch subtle logic mistakes and errors that could lead to runtime failures or security vulnerabilities.

Security vulnerabilities (SAST)

Tools that identify security vulnerabilities through control flow and data flow analysis, heuristics, pattern analysis, and other techniques (Find Security Bugs, Brakeman, Fortify) can find common security issues such as mistakes in using crypto functions, configuration errors, and potential injection vulnerabilities.

These tools are sometimes referred to as “SAST” tools, for “Static Application Security Testing,” a classification popularized by Gartner to differentiate them from black-box “DAST” (Dynamic Application Security Testing) tools like ZAP or Burp, which are used to dynamically scan a running application. We will look at DAST scanning tools in the next chapter.

Custom greps and detectors

Simple, homemade tools that grep through code looking for hardcoded credentials, unsafe or banned function or library calls (such as gets and strcpy and memcpy in C/C++, or eval in PHP and JavaScript), calls to crypto libraries, and other things that the security team or the development team want to watch out for.

You can also write your own custom checks using extensions provided by other tools, for example, writing your own bug detector in FindBugs, or your own PMD coding rule, although only a few teams actually do this.
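To show how little code such a homemade detector needs, here is a hedged Python sketch; the banned-function list and the credential pattern are illustrative examples, not a vetted rule set:

```python
import re

# Example patterns only: tune these to your own codebase and policies.
BANNED_CALLS = re.compile(r"\b(gets|strcpy|memcpy|eval)\s*\(")
HARDCODED_SECRET = re.compile(
    r"(password|secret|api_key)\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE)

def scan_source(text, filename="<input>"):
    """Return (filename, line number, finding) tuples for suspicious lines."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if BANNED_CALLS.search(line):
            findings.append((filename, lineno, "banned function call"))
        if HARDCODED_SECRET.search(line):
            findings.append((filename, lineno, "possible hardcoded credential"))
    return findings

sample = 'password = "hunter2"\nresult = eval(user_input)\n'
for finding in scan_source(sample, "app.py"):
    print(finding)
```

A script like this can run as a pre-commit hook or a build step; the value is less in its sophistication than in the fact that it never forgets to look.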

Catching mistakes as you are coding

You can catch some issues automatically in a developer’s IDE, using plug-ins for Eclipse or Visual Studio or IntelliJ, or the built-in code checkers and code completion features that come with most modern development toolsets.

These tools can’t do deep data flow and control flow checking, but they can highlight common mistakes and questionable code as you are working on it, acting like a security spell-checker.

Free IDE plug-ins are available for many of these tools.

Other IDE plug-ins for tools like HPE Fortify take results from batch scans done previously and present them to the developer in the IDE as they are working on those areas of code. This makes it easier for developers to see existing problems, but they won’t immediately catch any new mistakes.

Vulnerable dependencies

Tools like OWASP’s Dependency-Check, Bundler-Audit for Ruby projects, or Retire.JS for JavaScript will inventory your build dependencies and check to make sure that they do not contain any known vulnerabilities.

You can decide to automatically fail the build if the checks find a serious security vulnerability or other issue. We reviewed these tools in more detail in Chapter 6, Agile Vulnerability Management.
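As an illustration of what these checkers do under the hood, here is a toy Python sketch; the advisory entry is invented for the example, where real tools pull from curated vulnerability databases:

```python
# Invented advisory data for illustration; real tools query curated
# vulnerability databases rather than a hardcoded dictionary.
KNOWN_VULNERABLE = {
    ("exampleframework", "1.2.0"): "remote code execution (illustrative)",
}

def parse_requirements(text):
    """Parse simple 'name==version' pins from a requirements-style file."""
    deps = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "==" in line:
            name, version = line.split("==", 1)
            deps.append((name.strip().lower(), version.strip()))
    return deps

def audit(text):
    """Return the vulnerable (name, version, advisory) triples found."""
    return [(n, v, KNOWN_VULNERABLE[(n, v)])
            for n, v in parse_requirements(text)
            if (n, v) in KNOWN_VULNERABLE]

reqs = "exampleframework==1.2.0\nsafelib==2.0\n"
problems = audit(reqs)
print(problems)
# In CI, fail the build on findings: sys.exit(1 if problems else 0)
```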

Code complexity analysis and technical debt metrics

Other tools can be used to report on metrics such as code complexity or other measurements of technical debt, identifying problem areas in the code (hot spots and clusters) or trends. Mapping code complexity against automated test coverage, for example, is a way to identify potential risks in the code base.

SonarQube, a popular code quality and security analysis platform, includes a technical debt cost calculator as well as other code quality and security measurements in its dashboard. It calculates technical debt by assigning weightings to different static analysis findings (coding best practices, dead code, code complexity, bugs, and security vulnerabilities) and gaps in test coverage, and calculates the cost of remediating all of these issues in dollars. Even if you don’t agree with SonarQube’s costing model, the dashboard is useful for tracking technical debt over time.
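To make the complexity idea concrete, here is a crude Python sketch of a cyclomatic-complexity-style count. Real tools like SonarQube compute this over a full analysis of the code, but the principle is the same: one path, plus one for each branch point.

```python
import ast

# Crude approximation: one for the straight-line path, plus one for each
# branching construct found in the syntax tree.
BRANCHES = (ast.If, ast.For, ast.While, ast.IfExp,
            ast.ExceptHandler, ast.And, ast.Or)

def cyclomatic_complexity(source):
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCHES) for node in ast.walk(tree))

code = """
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    for _ in range(n):
        pass
    return "positive"
"""
print(cyclomatic_complexity(code))  # 4: if + elif + for, plus the base path
```

Mapping a number like this against test coverage per module is one simple way to find the hot spots the text describes.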

What Tools Are Good For, and What They’re Not Good For

Make sure that you and the team understand what you are getting out of static analysis tools and how much you can rely on them.

Some tools are much better at finding some problems than others, and this can depend a lot on your use of language, frameworks, and design patterns.

There are several good open source and commercial static analysis tools available today for mainstream languages like Java, C/C++, and C#, using common frameworks like Struts and Spring and .NET, and for other popular development environments like Ruby on Rails.

But it’s difficult to find good tool support for hot new languages such as Golang or Swift or F# or Jolie, and it’s especially difficult to find tools that can catch real problems in dynamically typed scripting languages like JavaScript, PHP, and Python, which is where you need the best checking. Most code analyzers (at least the open source ones) for these languages are still limited to linting and basic checking for good practices, which helps to make for better code, but won’t ensure that your code is secure or even that it will run without crashing.

IAST or RASP: Alternatives to Static Code Analysis

IAST (Interactive or Instrumented Application Security Testing) and RASP (Runtime Application Self-Protection) are new technologies that offer an alternative to static code analysis. These tools instrument the runtime environment (for example, the Java JVM) and build a model of the application as it is executing, inspecting the call stack and variables to identify vulnerabilities in running code.

However, like static code analysis tools, language and platform support varies widely, and so does the effectiveness of the rules. The quality of coverage from these tools also depends on how thoroughly you exercise the code in your tests.

A static analysis tool can tell you when code makes unsafe library calls, but it can’t tell you if somebody forgot to make a call that he should have. Tools can tell you if you made a basic mistake in applied crypto, but can’t tell you if you forgot to encrypt or tokenize sensitive data, or forgot an ACL check, or if the code accidentally exposes sensitive data in exception handling. Only an experienced human reviewer can do this.

There is important research that can help us to understand the effectiveness, and limits, of automated static analysis tools.

In one study, researchers ran two different commercial static analysis tools against a large application that contained 15 known security vulnerabilities (found earlier in a structured manual audit done by security experts). Together, the tools found less than half of the known security bugs: only the simplest problems, the ones that didn’t require a deep understanding of the code or the design.5

The tools also reported thousands of other issues that needed to be reviewed and qualified, or thrown away as false positives. Some of these findings included runtime correctness problems, null pointers, and resource leaks that probably needed to be fixed, and code quality issues (dead code, unused variables), but they did not uncover any other security vulnerabilities.

NIST’s SAMATE project has run a series of benchmarks, the Static Analysis Tool Expositions (SATE), to assess the effectiveness of static analysis tools. In the most recent exposition (2014), NIST tested 14 different static analysis tools on C/C++ and Java code with known vulnerabilities. This is what it found:

  1. Over half of the vulnerabilities were not detected by any of the tools.

  2. As the complexity of code increased, the ability of tools to find problems decreased significantly. Many tools simply gave up. This was also shown in the case of the Heartbleed bug in OpenSSL, which could not be found by any of the automated code analysis tools available, partly because the code was too complex.

  3. NIST also found a significant and disappointing lack of overlap between tools: less than 1% of the vulnerabilities were found by all the tools.6

More recently, OWASP’s Benchmark Project was started to create a test suite with known security vulnerabilities, designed to evaluate the effectiveness of different static analysis and application scanners. This project scores tools by subtracting the false-positive rate from the true-positive rate. The average score for the commercial SAST tools that it evaluated was only 26%.

While tools continue to get better—more accurate, faster, easier to understand, and with better support for more languages and frameworks—they can’t replace reviews done by smart people. But they can act as an effective backstop to manual reviews by catching common mistakes; and, by enforcing good coding practices and consistency, they can make manual reviews easier and more effective.

Getting Developers to Use Automated Code Reviews

You want to reach a point where engineering teams treat static analysis findings like they do unit tests: when they check in a change, and a tool reports something wrong, they will fix it immediately, because they have learned to trust the tool and know that they can rely on it to find important problems for them.

It should be easier to introduce static analysis checking for security vulnerabilities to teams that have already had good experience with static analysis tools for code quality and have learned to depend on them. Start by getting the team to use a good bug-finding tool; once the team has accepted it and successfully made it part of its day-to-day work, introduce security checking.

Dropping off a report on a developer’s desk with thousands of static analysis warnings isn’t going to get buy-in from the team. Try taking an incremental, low-friction approach.

Take some time to understand how the tool works and how to configure the rules and checkers properly. Many teams rely on the default setup, which is almost never the right thing to do. Some tools come with conservative defaults, which means that they don’t apply rules and checkers that could be important for your application. Many other tools want to be as thorough as possible, enforcing checks that aren’t relevant and flooding developers with false positives.

Install the tool and run a set of scans with different rules and settings. Measure how long the scans take, and how much CPU and memory the tool uses. Review the results, and look for the sweet spot between maximizing true positive findings and minimizing false positives. Find the rules and checkers that have the highest confidence. Turn the other rules off, at least to start.

Check the Rules In

Make sure to check in the rules and settings that you are using to a repo so that you can review the rules that were in effect at any point in time. You may need to show this to an auditor to prove that you aren’t just performing security theater, by trying to get away with checks that are too easy on developers and that miss too many real problems in the code.

Establish a baseline. Run a complete scan, go over the results with the team, or maybe just a couple of the strongest developers to start, to explain what the tool can do and show that it works. Then mark everything that the tool found so that these findings won’t show up in future scans by default. This is security debt that you will pay off later.
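The “mark the baseline” step is often implemented by fingerprinting each finding. Here is a sketch of the idea; the fingerprint scheme is one plausible choice, not any particular tool’s:

```python
import hashlib

def fingerprint(finding):
    # Fingerprint on file, rule, and code snippet rather than line number,
    # which shifts every time unrelated code moves.
    key = "|".join([finding["file"], finding["rule"], finding["snippet"]])
    return hashlib.sha256(key.encode()).hexdigest()

def new_findings(scan_results, baseline):
    """Report only findings whose fingerprint is not in the baseline."""
    return [f for f in scan_results if fingerprint(f) not in baseline]

old = {"file": "billing.py", "rule": "sql-injection", "snippet": "cursor.execute(q)"}
new = {"file": "login.py", "rule": "hardcoded-password", "snippet": "pw = 'x'"}

baseline = {fingerprint(old)}              # everything found in the first scan
print(new_findings([old, new], baseline))  # only the new finding surfaces
```

The stored baseline is the security debt: as old findings get fixed, their fingerprints can be retired from the set.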

Get the team to agree to take a zero-bug-tolerance approach to any findings moving forward: from this point on, if the tool finds any serious problems, the team will review them and fix all true positives.

Instead of having to sift through pages of review results, team members will see only a handful of findings each day, or each time that they check in code—however often you decide to run the scanner. This shouldn’t add much to their work, and eventually will become just another part of their day-to-day workflow.

After the team has gained confidence in the tool and learned how to work effectively with the results, it can schedule time to go back and review and fix the problems found in the baseline scan.

Rinse and Repeat

Because of the limited overlap between what each tool finds, you’ll have to use more than one static analysis tool to get effective coverage. This means you’ll need to go through all of these steps more than once. While it should get easier each time, be prepared for the costs.

Of course if you take this approach, you will be leaving some bugs in the system and vulnerabilities open for attack, at least for a while. But now at least you know what the problems are and where they are, and you can work with the Product Owner and the team and management on a plan to reduce the risks.

Self-Service Scanning

In many large enterprises, code scanning is done by the security team, in what Dr. Gary McGraw at Cigital calls a “scanning factory” model. The security team schedules and runs scans across different projects, reviews and triages the results, and reports the results back to development.

Working this way introduces unnecessary delays in feedback to the team. It can take days to complete the scans, review the results, and prepare reports. By the time that developers learn about problems, they may have already moved on to other features, which means that they have to stop what they are working on, switch back to their earlier work, restore the context, find the problem, fix it, test it, build it, and then switch back again. This is the kind of waste that Agile and Lean methods are intended to avoid.

Another disadvantage of this model is that it makes scanning “somebody else’s problem,” taking away any sense of ownership from the developers.

A more effective, more scalable, and more Agile way of working is to make code scanning available to developers on a self-service basis while they are working, in ways that make natural sense to them.

Instead of choosing standardized tools that are convenient for the security team to use across multiple projects, let the team choose tools that work best for its specific needs and that fit into its workflows.

The first place to include self-service scanning is in each developer’s IDE, using plug-ins or the IDE’s built-in rules to check for problems as they are coding, or when they save changes or compile code in their IDE. It’s especially important to have high-confidence, high-impact rules in place here, rules that highlight real problems; otherwise developers will quickly learn to ignore all alerts and highlighting.

But keep in mind that at this point the rest of the team has no visibility into what problems were found, and there is no way to prove that developers are fixing these problems up front. So you will still need to add incremental code analysis and fast code scanning each time the code is built, to catch any problems that slip past.

Most scanning tools have APIs that you can use to push results directly back into each developer’s IDE or automatically create bug reports in backlog tracking tools like Jira or VersionOne. Or you can use one of the application vulnerability management tools that we looked at in Chapter 6 to help with this.

Provide High-Fidelity Feedback

If you want developers to take the results of scanning seriously, you have to provide them with actionable feedback, highlighting real problems that they need to fix. Any time that a developer spends chasing down false positives or findings that aren’t relevant to her project is time wasted. If too much time gets used up this way, you will lose her cooperation, and her manager’s support.

Be ruthless when making trade-offs between completeness and the speed and accuracy of results. Keeping the feedback loop tight and high fidelity is generally more important than completeness in continuous integration or continuous delivery.

As part of the feedback loop, make it simple for developers to report back when they see false positives or noise. Twitter provides a “bullshit” button which allows developers to report and suppress false positive findings.

Self-service, fast-feedback scanning requires tools that run quickly, and that provide clear context for each finding so that developers don’t need a security expert’s help to understand what the problem is, how serious it is, where it was found, how it was found, or how to fix it.

If you have the resources, you can still run deep, full code scans out of band in your security scanning factory, review and qualify the results, and feed them into the development backlog to get fixed. While this could take hours or days to run on large code bases, these scans should only pick up a small number of important exceptions that make it through the earlier, faster, high-certainty checking in development. So the cost of reviewing these issues and the impact on the team’s velocity to come back and deal with them should be minimal, and the risk of missing an important problem should be low.

Reviewing Infrastructure Code

Today’s systems are built using automated programmable configuration management tools like Ansible, Chef, Docker, and Puppet. Chef recipes, Puppet manifests, Ansible playbooks, Dockerfiles, and AWS CloudFormation templates should follow the same life cycle as your other code: check-in, manual review, automated build checks, lint and other static analysis checks, and automated unit, integration, and acceptance testing.

Code reviews for recipes and playbooks are done for many of the same reasons as application code:

  • Ensure that all changes to system configuration have been checked by at least one person (governance).

  • Check for maintainability and compliance with code conventions.

  • Check for correctness.

  • Make sure that test configurations are not accidentally pushed to production.

  • Enforce operational and security policies.

Reviewers can use the same code review methods, workflows, and code review tools that application developers use, such as Gerrit or Review Board. They should pay attention to style and structure problems, proper separation of code and data, reuse, and readability. Reviewers should also check to make sure that there are good tests written for each module. But most important, they should be on the lookout for configuration mistakes and oversights that could leave the system vulnerable to attack.

Engineers shouldn’t lean too heavily on static analysis tools for this type of code. Tools like Foodcritic for Chef or puppet-lint for Puppet do basic syntax checks and look for common coding mistakes, which is important in these scripting languages, where you want to find problems before runtime. But out of the box they won’t find serious security issues for you. You will need to write your own custom rules to do this.7
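As an example of what such a custom rule might look like, here is a hedged Python sketch of a few Dockerfile checks; the rules are illustrative samples of policy, not a complete or authoritative set:

```python
def check_dockerfile(text):
    """Apply a few example policy checks to Dockerfile text."""
    findings = []
    lines = [line.strip() for line in text.splitlines()]
    for lineno, line in enumerate(lines, start=1):
        parts = line.split()
        # Unpinned images make builds unrepeatable and pull in surprises.
        if line.startswith("FROM") and (":latest" in line or ":" not in line):
            findings.append((lineno, "unpinned base image tag"))
        # ADD from a URL fetches unverified content at build time.
        if line.startswith("ADD") and len(parts) > 1 and parts[1].startswith("http"):
            findings.append((lineno, "ADD from a remote URL"))
    if not any(line.startswith("USER") for line in lines):
        findings.append((0, "no USER instruction: container runs as root"))
    return findings

dockerfile = "FROM ubuntu:latest\nADD http://example.com/pkg.tar.gz /tmp/\nRUN make\n"
for finding in check_dockerfile(dockerfile):
    print(finding)
```

A check like this can sit next to the lint step in the pipeline, so configuration code gets the same automated gate as application code.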

We’ll look more at how and where security comes in when working with these tools in Chapter 13, Operations and OpSec.

Code Review Challenges and Limitations

There are a lot of coding problems that a good reviewer will catch, if she has the skills and time. A good reviewer will find logical mistakes and oversights that automated code scanning tools and testing will often miss, such as the following:

  • Inconsistencies (the author changed a, b, and d, but forgot to change c).

  • Common coding mixups, like using < instead of <= (or sometimes even >) in comparisons.

  • Off-by-one errors.

  • Using the wrong variables in calculations or comparisons (buyer when they should have used seller).
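As a trivial (hypothetical) illustration of the boundary mixup above, the kind a reviewer catches only by reading the code against the spec:

```python
def free_shipping_buggy(order_total, threshold):
    # Spec: orders AT OR ABOVE the threshold ship free. The '>' here
    # silently excludes orders exactly at the threshold, a mistake no
    # pattern matcher will flag because the code is syntactically fine.
    return order_total > threshold

def free_shipping_fixed(order_total, threshold):
    return order_total >= threshold

print(free_shipping_buggy(50, 50))  # False: customer at the threshold wrongly charged
print(free_shipping_fixed(50, 50))  # True
```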

Code reviewers can also find mistakes or weaknesses in design as they read through the code—if they know the system design well. But as we discussed in Chapter 8, Threat Assessments and Understanding Attacks, separate reviews should be done at the design level to examine trust assumptions, threats, and protection against threats.

It’s easy for code reviewers to find test or debugging code left in by accident, and they are likely to trip on redundant code and checks that don’t seem to be necessary, and other signs of copy-and-paste or merge mistakes.

Code reviews are probably the best way to find concurrency and timing problems (threading, locking, time of check/time of use), and mistakes in cryptography. As we’ll see later, they’re probably the only way to find back doors and time bombs.

But experience and research show that instead of finding bugs, reviewers end up spending most of their time finding and reporting problems that make the code harder to understand and harder to maintain. They get held up on things like poor element naming, misleading or missing comments, poorly structured code, and dead and unused code.8

There are some good reasons that code reviews don’t find as many bugs as they should:

  • Reviews take time to do properly.

  • Understanding somebody else’s code is hard (especially in complex or diverse technical environments, or large code bases).

  • Finding security vulnerabilities in somebody else’s code is even harder.

Let’s look at each of these issues in some more detail.

Reviews Take Time

First, it takes time to review code properly. The team, including and especially the Product Owner, as well as management, all need to understand that code reviews will take up a significant part of each developer’s day, and that waiting for code reviews will slow down delivery.

At an organization like Microsoft, developers spend on average between two and six hours a week reviewing code changes. Developers wait a day on average for code review feedback. But in some cases, reviewers may take days, or even weeks, to get back with their findings. These code changes can’t be checked in to the mainline, can’t be tested, and can’t be factored into delivery timelines.

So it is critical that teams build up the discipline and a collaborative and supportive culture, ensuring that everyone is prepared to step up and help other team members by getting their reviews done. And it is even more important that they have the discipline to take reviews seriously, and are willing to put the necessary work in to do a good job. Because, as we’ll see, doing code reviews properly isn’t easy.

Understanding Somebody Else’s Code Is Hard

One of the main reasons that reviewers don’t find as many bugs as they should or could is because in order to find problems, they need to understand what the code does and how and why it was changed.

Understanding all of this takes most of a reviewer’s time, which is why most review comments are about readability (naming, commenting, formatting) and how to make code simpler or clearer, instead of about more fundamental problems.

Tip

This is where code analysis tools that focus on enforcing coding style and conventions can help, by making code cleaner and more consistent.

If reviewers aren’t familiar with the code, they will also need to read more code to establish context, which means that they may run out of time before they can find anything materially wrong.

This is another reason why it’s important to have code reviews done by experienced team members. People who have worked with the code before—changed it themselves or reviewed it previously—have a clear edge over someone who has never seen the code before. They can work much faster, and provide more effective feedback.9

Finding Security Vulnerabilities Is Even Harder

Finding functional or logical defects in somebody else’s code is hard, as we’ve seen. A reviewer has to understand the purpose of the change, the context, and understand the code well enough to find mistakes in implementation or logic errors.

Finding security vulnerabilities is even harder. You start off with the same challenges—the need to understand the purpose and context, the structure and the coding style, and so on—and the reviewer also needs to understand secure coding and what to look out for.

Let’s look at some more research to see how hard reviewing for security vulnerabilities really is.

In a 2013 study, 30 PHP developers (including some security experts) were hired to do independent manual security code reviews of a small web app with 6 known vulnerabilities. The reviewers were given 12 hours to complete the reviews, and were not allowed to use static analysis tools.

None of the reviewers found all the known bugs (although several found a new XSS vulnerability that the researchers hadn’t known about), and 20% of the reviewers didn’t find any vulnerabilities at all. Reviewers with more coding experience didn’t necessarily find more security bugs—because even experienced developers don’t always understand what to look for in a security code review.10

We can see how difficult, and how important, code reviews are in security by looking closely at one of the most high-profile security vulnerabilities of the past few years: the OpenSSL Heartbleed vulnerability. This critical weakness in SSL handshaking was caused by not checking for a buffer over-read in C, a simple, low-level coding mistake.

What was surprising was that even though the original change was reviewed by someone who had worked on the OpenSSL code before, and the code was open source and available to anyone to read, it took more than two years before a security team at Google found the bug in a code audit.11

The author of the change, the original reviewer, several static analysis tools, and almost everyone in the open source community who downloaded and used the code all missed the bug because the code was simply too complicated to understand, which also meant that it was too complicated to change safely. This should never be the case for security-critical code.

Understanding code, making it simpler and cleaner and clearer, emphasizing coding guidelines and continuous refactoring, isn’t just about making code easier to change for the next person. It’s also about making the code more secure.

Adopting Secure Code Reviews

What should you expect from secure code reviews? How can you make them effective?

Introduce secure code reviews into a team in a low-overhead, incremental way. Work inside of Agile practices, and work in an Agile way. Start with small steps and keep reinforcing good ways of working. Continuously learn and improve. It takes time to change the way that people think and work—but this is what Agile and Lean development is all about.

Build on What the Team Is Doing, or Should Be Doing

If the team is already doing pair programming or peer reviews, help it understand how to include security in reviews, using the guidance in this book. Work with managers and the team’s Product Owner and Scrum Master to make sure that the team members are given enough time to learn, and time to learn about security as they are working.

Make sure developers are trained in the fundamentals of secure coding, so that they will write safer code, and so that they know what kinds of problems to look for when they are reviewing code. Take advantage of SAFECode’s free training courses and public information from OWASP. Be practical when it comes to training. You don’t need to train everyone on the team on the low-level details of how to do encryption properly. But everyone needs to understand when and how to correctly call crypto functions.

Don’t rely on reviews by inexperienced team members. Insist on experienced, qualified reviewers, especially for high-risk code.

If the team isn’t doing pair programming or peer reviews:

  • Find people that the team trusts and respects, senior developers who care about writing good code and who like to take on challenges.

  • Do everything you can to convince these people (and their managers) that it is an important and valuable use of their time to review code, starting with high-risk framework code and security features.

  • Make sure that they have the information and training to know what problems to look for.

  • If you work in a regulated industry, use compliance as a stick, if you have to.

Make code reviews as easy as possible

Introduce static analysis into the team, using some of your security budget, to help developers write better code. If you don’t have a budget for tools, take advantage of the open source alternatives mentioned earlier. Follow the guidelines that we’ve already laid out on how to get engineers to use these tools effectively.

If the team is using a collaborative code review platform like Gerrit or Review Board, take advantage of the team’s workflows and the data that the platform collects to:

  • Ensure that high-risk code is being reviewed.

  • Spot-check reviews to ensure that they are being done responsibly.

  • Use the comment information as an audit trail for compliance.

  • Feed static analysis results back into code reviews as comments, to help reviewers find more problems.

Build on collective code ownership in Agile

If the code is open to anyone on the team to review or work on, make your security people part of the team. This means that they shouldn’t need a reason to look at code to find problems. And, if they know what they are doing and are trusted by the rest of the team, they shouldn’t need permission to fix vulnerabilities either, as long as they follow the team’s conventions and workflows.

Note

Make sure that reviewers also look at tests, especially for high-risk code. Code without tests is dangerous code that could be broken. If you care about the reliability and security of code, you need to check on the quality and coverage of automated tests, and ensure that negative tests are being written. We’ll look at this more in Chapter 11, Agile Security Testing.

And if you care about the long-term sustainability of the system, you should also review tests to make sure that they are reasonably well-written, understandable, and maintainable.
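A negative test, in the sense above, deliberately feeds hostile or malformed input and asserts that the code fails safely. The sketch below uses a hypothetical `transfer()` function to show the shape of such tests; the function and its limits are illustrative, not from any real system.

```python
# Negative tests assert that bad input is rejected, not processed.
# transfer() is a hypothetical example of high-risk money-handling code.
def transfer(balance, amount):
    """Debit amount from balance; reject invalid amounts outright."""
    if amount <= 0 or amount > balance:
        raise ValueError("invalid transfer amount")
    return balance - amount

# Positive test: the happy path works.
assert transfer(100, 30) == 70

# Negative tests: zero, negative, and oversized amounts must all fail.
for bad_amount in (0, -50, 1_000_000):
    try:
        transfer(100, bad_amount)
        raise AssertionError("expected ValueError for %r" % bad_amount)
    except ValueError:
        pass

print("all negative tests passed")
```

When reviewers check test coverage for high-risk code, these are the cases to look for: it is the absence of tests like the loop above that leaves error-handling paths unexercised.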

Refactoring: Keeping Code Simple and Secure

Code that is clean and clear and consistent is less likely to have bugs (including security bugs) and easier and safer to change and fix. And much easier to review: if reviewers can’t understand the code, they will waste time trying to figure out what is going on. And they will miss the opportunity to find bugs.

This is why, for example, post-Heartbleed, the OpenSSL team spent months reformatting the code and deleting unused code and doing other simple cleanup work. The team needed to do this before it could take on more important, and more fundamental improvements.12

Fortunately, many Agile teams, especially teams following Extreme Programming, understand the importance of writing clean code and enforcing consistent coding guidelines. If you are working on one of these teams, your job as a reviewer will be much easier—and more valuable.

If not, you should encourage the team to take advantage of refactoring, another common Agile practice. Refactoring—a well-defined, consistent set of patterns and practices for cleaning up and restructuring code in small, incremental steps—is built in to most modern IDEs. Developers can rename fields and methods and classes, extract methods and classes, change method signatures, and make other, more fundamental structural changes safely and predictably—especially if they have a good regression test safety net in place, something we will look at more in the next chapter.

Learning More About Refactoring

For more on writing clean code and how to do refactoring, there are a few important books that all developers should read:

  1. Clean Code, a Handbook of Agile Software Craftsmanship by Bob Martin (Prentice Hall)

  2. Refactoring: Improving the Design of Existing Code by Martin Fowler, with Kent Beck (Addison-Wesley Professional)

  3. Working Effectively with Legacy Code by Michael Feathers (Prentice Hall)

Using refactoring tools in their IDEs, reviewers can do their own quick-and-dirty, throwaway “scratch refactoring” until they understand the code better, suggest refactorings in review comments, or submit refactoring changes in a separate patch back to the author. But if reviewers need to reformat code in their IDE or refactor the code before they can review it, the team is doing something wrong.

Fundamentals Will Take You a Long Way to Secure, Safe Code

If you can get past the point of understanding, either because the code is clean, or because you cleaned it up yourself, or because you have worked with it enough that you can make sense of it, then you can start to review what the code actually does—or doesn’t do.

As part of code reviews, even without a strong understanding of secure coding, reviewers can also look out for:

  1. Hardcoded passwords or other credentials, hardcoded paths, hardcoded magic numbers, and other hardcoded things which could be dangerous.

  2. Test code or debugging code left in by accident.

  3. Ensuring that sensitive data is encrypted or tokenized or otherwise handled safely—whether encryption is done properly is a security specialist problem, but whether it is done at all is a business problem that can be handled by anyone on the team who is familiar with the requirements.

  4. Mistakes in error handling—good error handling is another part of defensive programming, and because error handling is hard to test and rarely tested, it is important to check that errors will be handled properly during code reviews.

  5. Access control checks, making sure that they are applied and maintained correctly and consistently. Who can do what or see what in the system are business rules, not a security problem for expensive pointy-hat security wizards.

  6. Consistent auditing for add/change/delete actions.

  7. Consistent use of approved frameworks and standard libraries—especially when dealing with security functions like crypto or output encoding.

  8. Thread safety, including time-of-check/time-of-use and deadlocks: these problems are hard to find in testing, so it is particularly important to look out for concurrency and timing and locking problems in code reviews.

None of the things in the list requires specialized security knowledge. You don’t need to bring in a security auditor to make sure that your team is writing clean, solid, and safe defensive code.
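For instance, the first item in the list, spotting hardcoded credentials, can even be partly automated. The sketch below is a crude, regex-based review aid, not a real secret scanner; the pattern and variable names are illustrative, and real tools (and human reviewers) catch far more.

```python
import re

# A crude review aid: flag lines that look like hardcoded credentials.
# The pattern is deliberately simple and illustrative only.
SECRET_PATTERN = re.compile(
    r"(password|passwd|secret|api_key|token)\s*=\s*['\"][^'\"]+['\"]",
    re.IGNORECASE,
)

def flag_hardcoded_secrets(source):
    """Return line numbers whose content matches the credential pattern."""
    return [i for i, line in enumerate(source.splitlines(), start=1)
            if SECRET_PATTERN.search(line)]

sample = 'db_password = "hunter2"\ntimeout = 30\nAPI_KEY = "abc123"\n'
print(flag_hardcoded_secrets(sample))  # [1, 3]
```

A check like this makes a useful pre-review gate in continuous integration, leaving the reviewer free to concentrate on the harder items in the list.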

All Input Is Evil

When it comes to security, the most important thing that reviewers need to look out for is making sure that data is safe to be used. Michael Howard at Microsoft, coauthor of Writing Secure Code (Microsoft Press), said that if there was one rule that developers should follow, it is to understand that “all input is evil.”

Checking that data is validated for type, length, and range of values is part of defensive programming. It's tedious, but every developer should have learned how important it is, either in school or through the painful experience of troubleshooting a crash or a breach. Luckily, this is one of the areas where static analysis tools can help: tools that do taint analysis and data-flow checking can catch missing validation checks.
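The type/length/range checks can be sketched in a few lines. The field and the limits below are hypothetical; the point is the order of the checks, with each one failing fast before the data is used.

```python
# A minimal sketch of defensive input validation: check type, length,
# format, and range before untrusted data is used. The quantity field
# and its limits are illustrative examples, not from a real system.
def validate_quantity(raw):
    """Validate an order quantity supplied as untrusted text."""
    if not isinstance(raw, str):          # type: expect a string from the request
        raise ValueError("quantity must be a string")
    if len(raw) > 6:                      # length: cap before parsing
        raise ValueError("quantity too long")
    if not raw.isdigit():                 # format: digits only
        raise ValueError("quantity must be numeric")
    value = int(raw)
    if not 1 <= value <= 10_000:          # range: enforce business limits
        raise ValueError("quantity out of range")
    return value

print(validate_quantity("250"))  # 250
```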

But data validation isn't enough to make modern web or mobile applications secure, because the fundamental technologies that we rely on to get things done (web browsers, database engines, and XML interpreters) have trouble clearly differentiating between instructions and data. This is something that attackers have learned to take advantage of in cross-site scripting, SQL injection, and other dangerous attacks.

In addition to validating data, you also need to encode or escape it, or otherwise template it, to make it safe for a browser or before writing it to a database. Using prepared statements in SQL prevents SQL injection by clearly separating the commands from the data. Modern web frameworks like Angular.js, Ember, React, Rails, and Play provide some built-in protection against XSS and other injection attacks, as long as you use them properly, and as long as you keep the frameworks up to date when vulnerabilities are reported.
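The difference between a prepared statement and string concatenation is easy to demonstrate. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table and the payload are illustrative.

```python
import sqlite3

# Parameterized (prepared) statements keep SQL commands and data separate:
# the driver binds the input as a value, never as part of the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

hostile = "alice' OR '1'='1"  # a classic injection payload

# Safe: the payload is bound as data, so it matches no row.
safe = conn.execute(
    "SELECT role FROM users WHERE name = ?", (hostile,)).fetchall()
print(safe)    # []

# Unsafe (never do this): concatenation lets the payload rewrite the query.
unsafe = conn.execute(
    "SELECT role FROM users WHERE name = '" + hostile + "'").fetchall()
print(unsafe)  # [('admin',)]
```

The same query, fed the same input, returns nothing when the data is bound as a parameter and leaks a row when it is spliced into the SQL string.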

For defense-in-depth protection, you can also take advantage of secure HTTP headers such as Content Security Policy (CSP). Check out the Secure Headers that Twitter’s engineering team has contributed at: https://github.com/twitter/secureheaders.
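As a sketch of what those headers look like in practice, here is a small framework-neutral helper. The specific CSP policy and header set below are illustrative only; a real policy has to be tailored to what your site actually loads.

```python
# Defense-in-depth response headers an app or middleware might set.
# The values are illustrative; tune the CSP policy to your own site.
SECURITY_HEADERS = {
    "Content-Security-Policy": "default-src 'self'",  # only load content from our origin
    "X-Content-Type-Options": "nosniff",              # don't MIME-sniff responses
    "X-Frame-Options": "DENY",                        # block clickjacking via framing
    "Strict-Transport-Security": "max-age=31536000",  # force HTTPS for a year
}

def add_security_headers(response_headers):
    """Merge security headers into a response without overwriting existing ones."""
    merged = dict(SECURITY_HEADERS)
    merged.update(response_headers)
    return merged

print(add_security_headers({"Content-Type": "text/html"})["X-Frame-Options"])  # DENY
```

Libraries like the secure headers project above do this for you, with sensible defaults; the value of a sketch like this is that a reviewer can check in one place that the headers are being set at all.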

Reviewing Security Features and Controls

Reviewing security features and controls is much harder than reviewing the rest of the application code. To find problems, reviewers need to understand the nuts and bolts and screws of security, as well as how to read the code. Luckily, this is code that isn't changed much, and if you are doing things properly, most of this should be done using the built-in security capabilities of your web framework or mobile framework, or special-purpose security libraries like Apache Shiro or Google's Keyczar crypto toolkit.

Probably the best reference for a security code review checklist is OWASP’s ASVS project. Although ASVS is designed as an auditing tool, you can pull checks from it to ensure that reviewers—and coders—have covered all of the important bases, especially for security controls. Skip the boring parts up front about auditing yadda yadda and go straight to the checklists.

Let’s look at how to use the ASVS to guide code reviews for important security functions and considerations. In the authentication section, ASVS lists a number of things to look out for, including the following:

  • All pages and other resources require authentication by default, except for those made public to anonymous users.

  • Password fields do not echo the user’s password when entered.

  • All authentication controls are enforced on the server, not just on the client.

  • Authentication code fails securely to ensure that attackers cannot log in if there is an error (i.e., that access is denied by default, and only granted if all the steps pass).

  • All suspicious authentication decisions are logged.
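The "fails securely" check in the list is worth spelling out in code. The sketch below is a simplified stand-in (the user store is a plain dict and the password comparison stands in for a real hash check); the point is the shape: access is denied by default, granted only when every step passes, and denied again on any unexpected error.

```python
# "Fail securely": deny by default, grant only if all checks pass,
# and treat any unexpected error as a denial. The dict-based store and
# plaintext comparison are simplified stand-ins for real components.
def authenticate(store, username, password):
    """Return True only if every check passes; deny on any error."""
    authenticated = False              # deny by default
    try:
        user = store.get(username)
        if user is not None and not user["locked"]:
            if user["password"] == password:   # stand-in for a real hash check
                authenticated = True
    except Exception:
        authenticated = False          # an error must never grant access
    return authenticated

store = {"alice": {"password": "s3cret", "locked": False}}
print(authenticate(store, "alice", "s3cret"))  # True
print(authenticate(store, "alice", "wrong"))   # False
print(authenticate(store, "mallory", "x"))     # False
```

What a reviewer looks for is the inverse pattern: code that starts with access granted and tries to revoke it on failure, or an exception handler that falls through to success.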

ASVS covers session management, access control, handling of malicious input, cryptography at rest, error handling and logging, data protection, communications security, and more.

Reviewing Code for Insider Threats

The threat of a malicious insider planting a time bomb or a Trojan or some other malcode into your system, or tampering with application logic to commit fraud, is relatively low, but it is still real.

Fortunately, reviewing code to prevent honest mistakes can also help you to catch and contain many insider threats. Whether it is accidental and foolish, or deliberate and evil, you look for many of the same things, what Brenton Kohler at Cigital calls “red flags”:13

  1. Small, accidental (or accidental-seeming) mistakes in security functions, including authentication and session management, access control, or in crypto or secrets handling.

    As Bruce Schneier points out, trivial, but highly damaging mistakes, like Apple’s “goto fail” bug in SSL, could be a cause for concern:

    “Was this done on purpose? I have no idea. But if I wanted to do something like this on purpose, this is exactly how I would do it.”14

  2. Support back doors (or things that could be used as back doors), such as hardcoded URLs or IPs or other addresses, hardcoded user IDs and passwords or password hashes or keys in the code or in configuration. Hidden admin commands, hidden parameters, and hidden runtime options.

    While code like this is often intended for production support and troubleshooting purposes (or left in accidentally after debugging and testing), it could also be used for malicious purposes. In any case, back doors are potentially dangerous holes that could be exploited by attackers.

  3. Test code or debugging code or diagnostics.

  4. Embedded shell commands.

  5. Logic errors in handling money (for example, penny shaving), or mistakes in risk limits or managing credit card details, or in command and control functions, or critical network-facing code.

  6. Mistakes in error handling or exception handling that could leave the system open.

  7. Missing logging or missing audit functions, and gaps in sequence handling.

  8. Code that is overly tricky, or that just doesn’t seem to make sense, especially if it involves time-based logic, crypto, or any high-risk functions. A malicious insider is likely to take steps to obfuscate what they are trying to do. It should be obvious by now that even if code like this isn’t intentionally malicious, you don’t want it in your system.

  9. Self-modifying code—for the same reasons as above.

If you are concerned about potential collusion between developers, you can regularly rotate reviewers, or assign them randomly, and spot-check reviews to make sure that the team is taking this seriously. If the stakes are high enough, you could hire eyes from outside to audit your code, like the Linux Foundation Core Infrastructure Initiative is doing, paying experts to do a detailed audit of OpenSSL, NTP, and OpenSSH.

You also need to manage all the steps from check-in through build and test to deployment, to ensure that you are deploying exactly what developers checked in and built and tested, and that nothing has been tampered with along the way. Take the steps that we outlined in Chapter 13, Operations and OpSec to lock down your repos and build pipelines. Carefully manage secrets and keys. Use checksums or digital signatures and change detection tools like OSSEC to watch out for unexpected or unauthorized changes to important configs and code.
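The checksum idea can be sketched in a few lines. The sketch below uses the standard `hashlib` module; the filename and contents are illustrative. Tools like OSSEC do this (and much more) across a whole host, but the core mechanic is just this: record a digest for each important file, and alarm when the digest changes.

```python
import hashlib

# A minimal sketch of change detection: record a SHA-256 digest for each
# important file or config, and flag any content that no longer matches.
def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Baseline recorded at deploy time (filename and contents are examples).
baseline = {"app.conf": sha256_of(b"debug = false\n")}

def detect_tampering(name, current_contents, baseline):
    """Return True if the file no longer matches its recorded checksum."""
    return sha256_of(current_contents) != baseline[name]

print(detect_tampering("app.conf", b"debug = false\n", baseline))  # False
print(detect_tampering("app.conf", b"debug = true\n", baseline))   # True
```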

But you should be doing all of this anyway. These controls will minimize the risk of insider attacks, and will also help you to catch attackers who somehow compromised your network and your development or build environment.

Key Takeaways

Code reviews can be a powerful tool for catching security vulnerabilities early, as long as they are done properly:

  • Code reviews need to be done in ways that don’t hold the team up. Forcing reviews through the security team will create a bottleneck that developers will try to work around.

  • Peer reviews before code is checked in to the mainline, using Git pull requests or a similar workflow, are probably the most effective way to ensure that code reviews actually get done.

    The security team only needs to get involved in reviewing high-risk security functions and framework code—code that should rarely change.

  • Code review tools like Gerrit or Review Board or Phabricator can automatically enforce consistent workflows, and make reviews easier and more transparent, especially in distributed teams. Team members can see and build on each other’s feedback, and the electronic record that these tools create will make auditors happy.

  • Reviewing code takes time and is mentally exhausting. Reviews are more effective if they are done in small, frequent bites—luckily, this is the way that most changes are made by Agile teams.

  • Train developers in secure coding and provide brief checklists so that they understand what to avoid in coding, and what to look for in code reviews.

  • Developers need to feel safe in reviews: safe to ask for feedback, safe to provide feedback, safe to ask questions when they don’t understand. Make sure reviewers focus on the code, not the coder.

    Code reviews are a powerful opportunity for continuous learning and improvement. Use feedback from code reviews to improve the team’s coding guidelines and templates, and share what they learn in code reviews during team retrospectives.

  • Everyone’s code should be reviewed—including, and especially, senior team members, because they most often take on the harder, and riskier, coding problems.

    Help the team to identify high-risk code (tag high-risk stories in Jira or Pivotal Tracker or other story tracking tools), and make sure that this code is carefully reviewed by one or more experienced developers.

    Code needs to be reviewed even if it was developed through pair programming, because reviews tend to find different problems than pairing.

  • Automated static code analysis tools are not a substitute for manual reviews. This is a case of AND, not OR. Take advantage of static analysis tools to enforce good coding guidelines, and to catch insecure dependencies and dangerous functions and common coding mistakes. Running these checks before reviewers see the code will make code reviews easier, faster, and much more effective.

  • While automated static analysis tools look like a cheap and easy way to ensure that code gets reviewed (at least if you are using open source tools), they are not plug and play. You need to implement them carefully and correctly, and work with developers so that they understand what the tool does and buy in to using it consistently.

    Because these tools all look for different problems or work in different ways, to get good coverage you will need to run more than one tool against the code.

  • Developers hate false positives. Take time to understand and tune static analysis tools so that they provide efficient and reliable feedback.

  • Make static analysis checking available to developers on a self-service basis. Provide static analysis on the desktop so that developers can catch problems in the IDE as they are coding, and wire static analysis checking into CI/CD build pipelines.

  • Don’t forget to review configuration directives and tests: this is an important part of your code base, and needs to be treated like other code.

Code reviews and static analysis checking need to be part of the team’s Definition of Done: the contract between team members that determines when features or fixes are complete before they can move on to the next piece of work. The team needs to agree on what code will be reviewed (all code changes, or only high-risk code), how many reviewers need to be involved, when code reviews are done (before the code is checked in or after), what automated code review tools will be run in continuous integration, and how to deal with the findings.

1 See the article by Lawrence G. Votta, Jr., “Does every inspection need a meeting?”, 1993.

2 See the report by Rebecca Gelles, “The Unpredictable Attacker: Does the ‘Honeymoon Effect’ Hold for Exploits?”, February 6, 2012.

3 See the report by Forrest Shull, et al., “What We Have Learned About Fighting Defects”.

4 Chris Sauer, D. Ross Jeffery, Lesley Land, and Philip Yetton, “The Effectiveness of Software Development Technical Reviews: A Behaviorally Motivated Program of Research”, IEEE Transactions on Software Engineering, Vol 26, Issue 1, January 2000: 1-14.

5 James A. Kupsch and Barton P. Miller, “Manual vs. Automated Vulnerability Assessment: A Case Study” (2009).

6 Aurélien Delaitre, Bertrand Stivalet, Elizabeth Fong, and Vadim Okun, “Evaluating Bug Finders — Test and Measurement of Static Code Analyzers”, 2015 IEEE/ACM 1st International Workshop on Complex Faults and Failures in Large Software Systems (COUFLESS), 2015: 14-20.

7 For an example of security checks that you could add, see this plug-in for puppet-lint: https://github.com/floek/puppet-lint-security-plugins.

8 Mika V. Mantyla and Casper Lassenius, “What Types of Defects Are Really Discovered in Code Reviews?”, IEEE Transactions on Software Engineering, Volume 35, Issue 3, May 2009: 430-448.

9 Amiangshu Bosu, Michaela Greiler, and Christian Bird, “Characteristics of Useful Code Reviews: An Empirical Study at Microsoft”, Proceedings of the International Conference on Mining Software Repositories, May 1, 2015.

10 Anne Edmundson, et al., “An Empirical Study on the Effectiveness of Security Code Review”, ESSoS (2013).

11 David A. Wheeler, “How to prevent the next Heartbleed”, January 29, 2017.

12 Matt Caswell, “Code Reformat Finished”, OpenSSL Blog, Feb 11th, 2015.

13 Brenton Kohler, “How to eliminate malicious code within your software supply chain”, Synopsys, March 9, 2015.

14 Bruce Schneier, “Was the iOS SSL Flaw Deliberate?”, Schneier on Security, February 27, 2014.
