Chapter 6. Software Penetration Testing[1]

 

You can’t make an omelet without breaking eggs.

 
 --ANONYMOUS

Quality assurance and testing organizations are tasked with the broad objective of ensuring that a software application fulfills its functional business requirements. Such testing most often involves running a series of dynamic functional tests late in the lifecycle to ensure that the application’s features have been properly implemented. Sometimes use cases and requirements drive this testing. But no matter what does the driving, the result is the same—a strong emphasis on features and functions. Because security is not a feature or even a set of features, security testing (especially penetration testing) does not fit directly into this paradigm.

Security testing poses a unique problem. A majority of security defects and vulnerabilities in software are not directly related to security functionality. Instead, security issues involve often unexpected but intentional misuses of an application discovered by an attacker. If we characterize functional testing as “testing for positives”—as in verifying that a feature properly performs a specific task—then penetration testing is in some sense “testing for negatives.” That is, a security tester must probe directly and deeply into security risks (possibly driven by abuse cases and architectural risks) in order to determine how the system behaves under attack. (Chapter 7 discusses how these tests can be driven by attack patterns.)

At its heart, security testing needs to make use of both white hat and black hat concepts and approaches: ensuring that the security features work as advertised (a white hat undertaking) and that intentional attacks can’t easily compromise the system (a black hat undertaking). That said, almost all penetration testing relies on black hat methods over white hat methods.[2] In other words, thinking like a bad guy is so essential to good penetration testing that leaving it out leaves penetration testing impotent.

In any case, testing for a negative poses a much greater challenge than verifying a positive. A set of successfully executed, plausible positive tests usually yields a high degree of confidence that a software component will perform functionally as desired. However, enumerating actions with the intention to produce a fault and reporting whether and under which circumstances the fault occurs is not a sound approach to proving the negative outcome does not exist. Got that? What I’m saying is that it’s really easy to test whether a feature works or not, but it is very difficult to show whether or not a system is secure enough under malicious attack. How many tests do you do before you give up and declare “secure enough”?

If negative tests do not uncover any faults, this only offers proof that no faults occur under particular test conditions; this by no means proves that no faults exist. When applied to penetration testing where lack of security vulnerability is the negative we’re interested in, this means that “passing” a software penetration test provides very little assurance that an application is immune to attack. One of the main problems with today’s most common approach to penetration testing is a misunderstanding of this subtle point.

As a result of this problem in testing philosophy, penetration testing often devolves into a feel-good exercise in pretend security. Things go something like this. A set of reformed hackers is hired to carry out a penetration test. We know all is well because they’re reformed. (Well, they told us they were reformed, anyway.) The hackers work a while until they discover a problem or two in the software, usually relating to vulnerabilities in base technology such as an application framework or a basic misconfiguration problem. Sometimes this discovery activity is as simple as trawling BugTraq to look up known security issues associated with an essential technology linked into the system.[3]

The hackers report their findings. They look great because they found a “major security problem.” The software team looks pretty good because they graciously agreed to have their baby analyzed and broken, and they even know how to fix the problem. The VP of Yadda Yadda looks great because the security box is checked. Everybody wins!

The problem? No clue about security risk. No idea whether the most critical security risks have been uncovered, how much risk remains in the system, and how many bugs are lurking in the zillions of lines of code. Finding a security problem and fixing it is fine. But what about the rest of the system?

Imagine if we did normal software testing like this! The software is declared “code complete” and thrown over the wall to testing. The testers begin work right away, focusing on very basic edge conditions and known failure modes from previous testing on version one. They find one or two problems right off the bat. Then they stop, the problems get fixed, and the testing box gets checked off. Has the software been properly tested? Run this scenario by any professional tester, and once the tester is done laughing, think about whether penetration testing as practiced by most organizations works.

Penetration Testing Today

Penetration testing is the most frequently and commonly applied of all software security best practices. This is not necessarily a good thing. Often penetration testing is foisted on software development teams by overzealous security guys and everyone ends up angry. Plus the focus tends to be too much driven by an outside→in approach. Better to adopt and implement the first two touchpoints (code review and architectural risk analysis) than to start with number three!

One reason for the prevalence of penetration testing is that it appears to be attractive as a late-lifecycle activity and can be carried out in an outside→in manner. Operations people not involved in the earlier parts of the development lifecycle can impose it on the software (but only when it’s done). Once an application is finished, it is subjected to penetration testing as part of the final preoperations acceptance regimen. The testing is carried out by the infosec division. Because of time constraints, most assessments like this are performed in a “time-boxed” manner as a final security checklist item at the end of the lifecycle.

One major limitation of this approach is that it almost always represents a too-little-too-late attempt to tackle security at the end of the development cycle. As we have seen, software security is an emergent property of the system, and attaining it involves applying a series of touchpoints throughout the software lifecycle (see Chapter 3). Organizations that fail to integrate security throughout the development process are often unpleasantly surprised to find that their software suffers from systemic faults both at the design level and in the implementation. In other words, the system has zillions of security flaws and security bugs. In a late-lifecycle penetration testing paradigm, inside-the-code problems are uncovered too late, and options for remedy are severely constrained by both time and budget.

Fixing things at this stage is, more often than not, prohibitively expensive and almost always involves Band-Aids instead of cures. Post-penetration-test security fixes tend to be particularly reactive and defensive in nature—adjusting the firewall ruleset, for example. Though these short-notice kludges may fix up inside problems temporarily, they can be likened to putting a Band-Aid on a laceration. Tracking down the source of the problem and fixing things there is much more effective.

The real value of penetration testing comes from probing a system in its final operating environment. Uncovering environment and configuration problems and concerns is the best result of any penetration test. This is mostly because such problems can actually be fixed late in the lifecycle. Knowing whether or not your WebSphere application server is properly set up and your firewall plays nicely with it is just as important to final security posture as is building solid code. Penetration testing gets to the heart of these environment and configuration issues quickly. (In fact, its weakness lies in not being able to get beyond these kinds of issues very effectively.)

The success of an ad hoc software penetration test is dependent on many factors, few of which lend themselves to metrics and standardization. The first and most obvious variable is the skill, knowledge, and experience of the tester(s). Software security penetration tests (sometimes called application penetration tests) do not currently follow a standard process of any sort and therefore are not particularly amenable to a consistent application of knowledge (think checklists and boilerplate techniques). The upshot is that only skilled and experienced testers can successfully carry out penetration testing. For an example of what happens when not enough attention is paid during a penetration test, see the next box, An Example: Scrubbed to Protect the Guilty.

Use of security requirements, abuse cases, security risk knowledge, and attack patterns in application design, analysis, and testing is rare in current practice. As a result, security findings are not repeatable across different teams and vary widely depending on the skill and experience of the tester(s). Furthermore, any test regimen can be structured in such a way as to influence the findings. If test parameters are determined by individuals motivated (consciously or not) not to find any security issues, it is very likely that penetration testing will result in a self-congratulatory exercise in futility.[4]

Results interpretation is also an issue. Typically, results take the form of a list of flaws, bugs, and vulnerabilities identified during the penetration testing. Software development organizations tend to regard these results as complete bug reports—comprehensive lists of issues to be addressed in order to make the system secure. Unfortunately, this perception does not factor in the time-boxed (or otherwise incomplete) nature of late-lifecycle assessments. In practice, a penetration test can identify only a small representative sample of all of the possible security risks in a system (especially those problems that are environmental or involve operational configuration). If a software development organization focuses solely on a small (and limited) list of issues, it will end up mitigating only a subset of the security risks present (and possibly not even those that present the greatest risk).

All of these issues pale in comparison to the problem that penetration testing is often used as an excuse to declare security victory and “go home.” Don’t forget, when a penetration test concentrates on finding and removing a handful of issues (and even does so successfully), everyone looks good. Unfortunately, penetration testing done without any basis in security risk analysis leads to the “pretend security” problem with alarming consistency.

One big benefit of penetration testing that is well worth mentioning is its adherence to a critical (even cynical) black hat stance. By taking on a system in its real production environment, penetration testers can get a better feel for operational and configuration issues often overlooked in software development. That’s why penetration testing needs to be adjusted, not abandoned. For more on black box testing and why it is useful as an attacker technique, see Chapter 3 of Exploiting Software [Hoglund and McGraw 2004].

Software Penetration Testing—a Better Approach

All is not lost—security penetration testing can be used effectively. The best approach bases penetration testing activities on security findings discovered and tracked from the beginning of the software lifecycle: during requirements analysis, architectural risk analysis, and so on. To do this, a penetration test must be structured according to perceived risk and offer some kind of metric relating the security posture of the software at the time of the test to risk measurement. Results are less likely to be misconstrued and used to declare pretend security victory if they are related to business impact through proper risk management. (See Chapter 2, which describes a risk management framework amenable to feeding security testing.)

Penetration testing is about testing a system in its final production environment. For this reason, penetration testing is best suited to probing configuration problems and other environmental factors that deeply impact software security. Driving tests that concentrate on these factors with some knowledge of risk analysis results is the most effective approach. Outside→in testing is great as long as it is not the only testing you do. The modern approach that I describe throughout the remainder of this chapter is much more closely aligned with risk-based security testing (see Chapter 7) than it is with application penetration testing as practiced by most consulting shops today. Be careful what you ask for!

Make Use of Tools

Tools (including the static analysis tools discussed in Chapter 4) should definitely be used in penetration testing. Tools are well suited to finding known security vulnerabilities with little effort. Static analysis tools can vet software code, either in source or binary form, in an attempt to identify common implementation-level bugs such as buffer overflows. Dynamic analysis tools can observe a system as it executes. These tools can submit malformed, malicious, and random data to a system’s entry points in an attempt to uncover faults—a process commonly referred to as fuzzing [Miller et al. 1995]. Faults are then reported to the tester for further analysis. When possible, use of these tools should be guided by risk analysis results and attack patterns. (See the following box, Tools for Penetration Testing.)
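To make the fuzzing idea concrete, here is a minimal sketch in C++. The `parse_record` target and its rejection behavior are invented for illustration; a real dynamic analysis tool would aim random and malformed data at the actual entry points of the system under test.

```cpp
#include <random>
#include <stdexcept>
#include <string>

// Hypothetical target: a parser that rejects malformed input by throwing
// a documented exception. Stands in for a real system entry point.
bool parse_record(const std::string& input) {
    if (input.empty() || input.size() > 64)
        throw std::invalid_argument("bad length");
    for (unsigned char c : input)
        if (c < 0x20)
            throw std::invalid_argument("control character");
    return true;
}

// Minimal fuzz loop: submit random byte strings to the entry point and
// count failure modes other than the documented rejection.
int fuzz(int iterations, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> len(0, 128), byte(0, 255);
    int faults = 0;
    for (int i = 0; i < iterations; ++i) {
        std::string input;
        int n = len(rng);
        for (int j = 0; j < n; ++j)
            input.push_back(static_cast<char>(byte(rng)));
        try {
            parse_record(input);
        } catch (const std::invalid_argument&) {
            // Documented rejection of malformed input: not a fault.
        } catch (...) {
            ++faults;  // Unexpected failure mode: report to the tester.
        }
    }
    return faults;
}
```

Each fault a real fuzzer finds would be logged with its triggering input so the tester can reproduce and analyze it.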

Tool use carries two major benefits. First, when used effectively, tools can carry out a majority of the grunt work needed for basic software penetration testing (at the level of a fielded system). Of course, a tool-driven approach can’t be used as a replacement for review by a skilled security analyst (especially since today’s tools are by their nature not applicable at the design level), but a tool-based approach does help relieve the work burden of a reviewer and can thus drive down cost. Second, tool output lends itself readily to metrics. Software development teams can use these metrics to track progress over time as they move toward a security goal. Simple metrics in common use today do not offer a complete picture of the security posture of a system. Thus it is important to emphasize that a clean bill of health from an analysis tool does not mean that a system is defect free (recall the discussion of badness-ometers from Chapter 1). The value lies in relative comparison: If the current run of the tools reveals fewer defects than a previous run, progress has likely been made.

Test More Than Once

As it stands today, automated review is best suited to identifying the most basic of implementation defects. Human review is necessary to reveal flaws in the design or more complicated implementation-level vulnerabilities (of the sort that attackers can and will exploit). However, review by an expert is costly and, for reasons just described, can be ineffective if the “expert” is not. By leveraging the seven software security touchpoints described in this book, software penetration tests can be structured in such a way as to be cost effective and give a reasonable estimation of the security posture of the system.

Penetration testing can benefit greatly from knowledge of the security risks built into a system. No design or implementation is perfect, and carrying risk is usually acceptable. Penetration testing can help you find out what this means to your fielded system. In fact, penetration testing in some sense collapses the “risk probability wave” into something much more tangible when testing clarifies ways that a risk can be exploited. That is, if you know what your likely risks are in the design, you can use penetration testing to figure out what impact this has on an actual fielded system.

As noted earlier, static and dynamic analysis tools should be uniformly applied; this holds true at the subsystem level too. In most cases, no customization of basic static analysis tools is necessary for component-level tests. However, dynamic analysis tools will likely need to be written or modified for the target component. Such tools often involve data-driven tests that operate at the API level. Any tool should include data sets known to cause problems, such as long strings, strange encodings, and control characters [Hoglund and McGraw 2004]. Furthermore, the design of the tool should reflect the security test’s goal—to misuse the component’s assets, to violate intercomponent assumptions, or to probe risks. Customizations are almost always necessary.
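As a sketch of what a data-driven, API-level test might look like, the following fragment drives a hypothetical component with the kinds of problem inputs named above: long strings, strange encodings, and control characters. The `component_accepts` function and the specific inputs are illustrative assumptions, not a definitive test suite.

```cpp
#include <string>
#include <vector>

// Data sets known to cause problems: long strings, strange encodings,
// and control characters. All entries here are illustrative.
std::vector<std::string> problem_inputs() {
    return {
        std::string(10000, 'A'),         // very long string
        std::string("\x00\x01\x02", 3),  // embedded NUL and control chars
        "%s%s%s%n",                      // format-string metacharacters
        "\xC0\xAF",                      // overlong UTF-8 encoding of '/'
        "../../etc/passwd",              // path traversal attempt
    };
}

// Hypothetical component API under test: should refuse hostile input
// rather than crash or corrupt state.
bool component_accepts(const std::string& input) {
    if (input.size() > 1024) return false;
    for (unsigned char c : input)
        if (c < 0x20 || c > 0x7E) return false;
    return true;
}

// Drive the component with each hostile input; count how many it accepts,
// flagging cases that deserve deeper manual analysis.
int count_accepted(const std::vector<std::string>& inputs) {
    int accepted = 0;
    for (const auto& in : inputs)
        if (component_accepts(in)) ++accepted;
    return accepted;
}
```

A customized version of this harness would encode the component's actual intercomponent assumptions and assets as its test oracle.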

Penetration testing should focus at the system level and should be directed at properties of the integrated software system. For efficiency’s sake, testing should be structured in such a way as to avoid repeating unit-level testing (as described in Chapter 7), and should therefore be focused on aspects of the system that could not be probed during unit testing.

In order to be defined as penetration tests, system-level tests should analyze the system in its deployed environment. Such analysis may be targeted to ensure that suggested deployment practices are effective and reasonable, and that assumptions external to the system cannot be violated.

Incorporating Findings Back into Development

Perhaps the most common failure of the software penetration testing process is failure to identify lessons learned and propagate them back into the organization. As mentioned, it is tempting to view the results of a penetration test as a complete and final list of issues to be fixed rather than as a representative sample of faults in the system. Of course, even in this case, the existence of such a list does not do anything to actually fix the system. One of the major barriers to software security success is getting organizations to get around to fixing the problems found every day by security consultants. Don’t for a minute believe that penetration testing results make you any more secure; only acting on them does.

Mitigation strategy is thus a critical aspect of any penetration test. Rather than simply fixing only those issues identified, developers should carry out a root-cause analysis of the identified vulnerabilities. For example, if a majority of vulnerabilities are buffer overflows, the development organization should determine just how these bugs made it into the code base. In such a scenario, lack of developer training, misapplication (or nonexistence) of standard coding practices, poor choice of languages and libraries, intense schedule pressure, failure to use a source code analysis tool, or any combination thereof may ultimately represent an important cause.

Once a root cause has been identified, mitigation strategies should be devised to address the identified vulnerabilities and any similar vulnerabilities in the software. Furthermore, best practices should be developed and implemented to address such vulnerabilities proactively in the future. (See Chapter 10 for a discussion of how this idea relates to a large-scale software security program.)

Going back to the buffer overflow example, an organization may decide to train its developers and eliminate the use of potentially dangerous functions, such as strcpy(), in favor of safer string-handling facilities such as those found in the C++ Standard Template Library (STL). Perhaps a static analysis tool can be used to enforce this decision.
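A minimal sketch of the kind of replacement such a coding standard might enforce, using an invented greeting function in both the risky and the safer style:

```cpp
#include <cstring>
#include <string>

// Risky idiom: a fixed buffer plus strcpy/strcat with no bounds check.
// If name is longer than 8 characters, strcat writes past the end of
// buf -- the classic stack buffer overflow.
void greet_unsafe(const char* name) {
    char buf[16];
    std::strcpy(buf, "Hello, ");  // 8 bytes including the NUL
    std::strcat(buf, name);       // overflows for names over 8 chars
}

// Safer idiom: std::string manages its own storage and grows as needed,
// so no input length can overrun a fixed-size buffer.
std::string greet_safe(const std::string& name) {
    return "Hello, " + name;
}
```

A static analysis rule banning strcpy() and strcat() would flag the first function automatically, which is exactly the kind of enforcement the text describes.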

A good last step involves using test result information to measure progress against a goal. Where possible, tests for a mitigated vulnerability should be added to automated test suites (which can be used in regression testing). If the vulnerability resurfaces in the code base at some point in the future, any measures taken to prevent the vulnerability should be revisited and improved. As time passes, iterative penetration tests should reveal fewer and less severe defects in the system. If a penetration test reveals serious problems, the "representative sample" view of the results should give the development organization real reservations about deploying the system.
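One way to capture a mitigated vulnerability as a permanent regression test is to encode the attack inputs from past findings alongside the expected safe outcome. The `sanitize_username` function and the particular cases below are hypothetical stand-ins for whatever fix the penetration test actually drove:

```cpp
#include <string>
#include <vector>

// Hypothetical fixed code path: the mitigation rejects the overlong and
// injection-laden inputs that earlier penetration tests exposed.
bool sanitize_username(const std::string& input, std::string& out) {
    if (input.size() > 32) return false;         // fix for overflow finding
    for (char c : input)
        if (c == '\'' || c == ';') return false; // fix for injection finding
    out = input;
    return true;
}

// Regression suite: each entry is an input from a past finding, paired
// with the expected (safe) outcome.
struct RegressionCase {
    std::string input;
    bool shouldAccept;
};

bool run_regressions(const std::vector<RegressionCase>& cases) {
    for (const auto& c : cases) {
        std::string out;
        if (sanitize_username(c.input, out) != c.shouldAccept)
            return false;  // the vulnerability has resurfaced
    }
    return true;
}
```

Run as part of the automated suite, a failure here signals exactly the resurfacing the text warns about.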

Using Penetration Tests to Assess the Application Landscape

One of the major problems facing large organizations that have been creating software for years is the unmanageable pile of software they have created. How do you get started when you have over 1000 applications and nobody thought about software security until just recently?

Penetration testing can help. One idea is to run a uniform, fixed-length, standardized penetration test against all of the apps and then rank them according to results. This would best be enhanced by a very basic risk analysis to pin down the business context (see Chapter 5). In this way, a very rough cut at ranking the application pile by security posture is possible. An approach like this results in a plan of attack that makes sense. No reason to work on the most secure application first.
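A rough sketch of such a ranking, combining standardized test findings with a basic business-criticality score; the record layout and weighting are illustrative assumptions, not a prescribed formula:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative record for one application's fixed-length test results.
struct AppResult {
    std::string name;
    int highSeverityFindings;  // from the standardized penetration test
    int businessCriticality;   // 1-5, from a basic risk analysis
};

// Rough security-posture ranking, worst first: weight findings by
// business criticality so the riskiest applications float to the top
// of the remediation plan.
std::vector<std::string> rank_worst_first(std::vector<AppResult> apps) {
    std::sort(apps.begin(), apps.end(),
              [](const AppResult& a, const AppResult& b) {
                  return a.highSeverityFindings * a.businessCriticality
                       > b.highSeverityFindings * b.businessCriticality;
              });
    std::vector<std::string> order;
    for (const auto& a : apps) order.push_back(a.name);
    return order;
}
```

The point is not the particular weighting but the output: a defensible order of attack across a pile of 1000 applications.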

This idea can be expanded to cover sets of common components and libraries and their intersection with the application pile. The move toward Web Services and Service Oriented Architecture (SOA) means that much more attention must be paid to shared services. Put bluntly, shared services are also potential shared vulnerabilities and/or common points of failure. Getting things like state, messaging, and authentication right in the brave new world of SOA is a real challenge.

Proper Penetration Testing Is Good

Penetration testing is the most commonly applied mechanism used to inject security into the SDLC. Unfortunately, it is the most commonly misapplied mechanism as well. By adjusting penetration testing to account for results uncovered during testing at the unit level, driving outside→in test creation from risk analysis, and driving the results back into an organization’s SDLC, many common pitfalls can be avoided. Note that the approach described here is extremely useful and important, but also not very common. Ask lots of hard questions about any particular approach to penetration testing before you put too much credence in it, especially if security consultants are involved.

Don’t forget that the real value of penetration testing comes from its central role in vetting configuration and other essential environmental factors. Use penetration testing as a “last check” before code goes live instead of as a “first check” of security posture.

As a measurement tool, penetration testing is most powerful when fully integrated into the development process in such a way that early-lifecycle findings are used to inform testing and that results find their way back into development and deployment practices.



[1] Parts of this chapter appeared in original form in IEEE Security & Privacy magazine co-authored with Brad Arkin and Scott Stender [Arkin, Stender, and McGraw 2005]

[2] One critical exception to this rule occurs when a penetration tester finds out that security functionality does not work as specified and uses this as the basis for attack. The upshot is that a security tester must ensure that the application not only does not do what it is not supposed to do but also does do what it is supposed to do (with regard to security features).

[3] See <http://www.securityfocus.com> for the BugTraq archive.

[4] Put in more basic terms, don’t let the fox guard the chicken house. If you do, don’t be surprised if the fox finds absolutely no problems with the major hole in the northwest corner of the chicken yard.
