Chapter 5
Just because software architects design software with a security mindset and developers implement security by writing secure code, it does not necessarily mean that the software is secure. It is imperative to validate and verify the functionality and security of software, and this can be accomplished by quality assurance (QA) testing, which should include testing for security functionality and security testing. Security testing is an integral process in the secure software development life cycle (SDLC). The results of security testing have a direct bearing on the quality of the software. Software that has undergone and passed validation of its security through testing is said to be of relatively higher quality than software that has not.
In this chapter, what to test, who is to test, and how to test for software security issues will be covered. The different types of functional and security testing that must be performed will be highlighted, and criteria that can be used to determine the type of security tests to be performed will be discussed. Security testing is necessary and must be performed in addition to functional testing. Testing standards such as the ISO 9126 and methodologies such as the Open Source Security Testing Methodology Manual (OSSTMM) and Systems Security Engineering Capability Maturity Model® (SSE-CMM) that were covered in chapter 1 can be leveraged when security testing is performed.
As a CSSLP, you are expected to
This chapter will cover each of these objectives in detail. It is imperative that you fully understand the objectives and be familiar with how to apply them in the software that your organization builds or procures. The CSSLP is not expected to know all the tools that are used for software testing, but must be familiar with what tests need to be performed and how they can be performed. In the last section of this chapter, we will cover some common tools for security testing, but this is primarily for informational purposes only. Appendix E describes several common tools that can be used for security testing, more particularly application security tests.
In many organizations, the software testing teams are rightfully referred to as QA teams. QA of software can be achieved by testing its reliability (functionality), recoverability, resiliency (security), interoperability, and privacy. Figure 5.1 illustrates the categorization of the different types of software QA testing.
Reliability implies that the software is functioning as expected by the business or customer. Since software is generally complex, the likelihood that all functionality and code paths will be tested is low, and untested paths can leave the software open to attack. Resiliency is the measure of how strong the software is against attacks attempting to compromise it. Nonintentional and accidental user errors can cause downtime. Software attacks can also cause unavailability of the software. Software that is not highly resilient will be susceptible to compromises such as injection attacks, denial of service (DoS), data theft, and memory corruption, and when a compromise occurs, the ability of the software to recover its operations should also be tested. Recoverability is the software’s ability to restore itself to an operational state after downtime, whether caused accidentally or intentionally. Interoperability testing validates the ability of the software to function in disparate environments. Privacy testing is conducted to check that personally identifiable information (PII), personal health information (PHI), personal financial information (PFI), and any information that is exclusive to the owner of the information is assured confidentiality without intrusion.
The results of these various types of testing can provide insight into the quality of software. However, as established in Chapter 1, software that is of high quality may not necessarily mean that it is secure. Software that performs efficiently to specifications may not have adequate levels of security controls in place. This is why security testing (covered later) is necessary, and since security is another attribute of quality, as is privacy and reliability, software that is secure can be considered as being of relatively higher quality. Testing can validate this.
The following section will cover the different types of testing for software QA. It is important that you are familiar with the definition of these tests and what they are used for.
Software testing is performed primarily to attest to the functionality of the software as expected by the business or customer. Functional testing is also referred to as reliability testing. We test to check if the software is reliable (is functioning as it is supposed to) according to the requirements specified by the business owner.
Although unit tests are conducted not by software testers but by the developers themselves, unit testing is the first process to ensure that the software is functioning properly, according to specifications. It is performed during the implementation (coding) phase of the SDLC by breaking the functionality of the software into smaller parts and testing each part in isolation for build and compilation errors as well as functional logic. If the software is architected with modular programming in mind, conducting unit tests is easier because each feature is already isolated as a discrete unit (high cohesion) with few dependencies (loose coupling) on other units.
In addition to functionality validation, unit testing can be used to find quality of code (QoC) issues as well. By stepping through the units of code methodically, one can uncover inefficiencies, cyclomatic complexities, and vulnerabilities in code. Common examples of inefficient code include dangling code, code in which objects are instantiated but never destroyed, and infinite loop constructs that cause resource exhaustion and eventually DoS. Within each module, code whose logic is complex, with circular dependencies on other code modules (i.e., not linearly independent), not only violates the least common mechanisms design principle, but is also considered cyclomatically complex (covered in Chapter 4). Unit testing is useful for finding cyclomatic complexities in code. Unit testing can also help uncover common coding vulnerabilities, such as hard-coded values and sensitive information (e.g., passwords and cryptographic keys) embedded inline in code.
Unit testing can start as soon as the developer completes coding a feature. However, software development is not done in a silo, and there are usually many developers working together on a single project. This is especially true with current-day agile programming methodologies such as extreme programming (XP) or Scrum. Additionally, a single feature that the business wants may be split into multiple modules and assigned to different developers. In such a situation, unit testing can be a challenge. For example, the feature to calculate the total price could be split into different modules: one to get the shipping rate (getShippingRate()), one to calculate the tax (calcTax()), another to get the conversion rate for international orders (getCurrencyConversionRate()), and one to compute any discounts offered (calcDiscount()). Each of these modules can be assigned to a different developer, and some modules may be dependent on others. In our example, getShippingRate() is dependent on the completion of getCurrencyConversionRate(): before its operation can complete, it must invoke the getCurrencyConversionRate() method and use that method’s output as input into its own operation. In such situations, unit testing one module that is related to or dependent on other modules can be a challenge, particularly when the method being invoked has not yet been coded. The developer assigned to code the getShippingRate() method has to wait on the developer assigned getCurrencyConversionRate() for the unit test of getShippingRate() to be completed. This is where drivers and stubs come in handy. Implementing drivers and stubs is a very common approach to unit testing. Drivers simulate the calling unit, whereas stubs simulate the called unit. In our case, the getShippingRate() method will be the driver because it calls the getCurrencyConversionRate() method, which will be the stub.
Drivers and stubs are akin to mock objects that alleviate unit testing dependencies. They also mitigate a very common coding problem: the hard coding of values inline in code. By calling the stub, the developer of the driver has no need to hard code values within the implementation code of the driver method. This helps with the integrity (reliability) of the code. Additionally, programming with drivers and stubs eases development against third-party components when the external dependencies are not completely understood or known ahead of time.
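A minimal sketch of the driver/stub arrangement described above, assuming hypothetical rates, signatures, and currency codes (the book names the methods getShippingRate() and getCurrencyConversionRate(); everything else here is illustrative):

```python
# Hypothetical stub for the not-yet-implemented getCurrencyConversionRate():
# it returns fixed, known conversion rates so the driver can be tested in isolation.
def get_currency_conversion_rate_stub(currency):
    rates = {"EUR": 1.10, "GBP": 1.27}  # assumed test fixture values
    return rates[currency]

def get_shipping_rate(base_rate_usd, currency, conversion_rate_fn):
    """Driver under test: converts a USD base rate into the buyer's currency.
    The conversion function is injected so a stub can stand in for the real unit."""
    return round(base_rate_usd * conversion_rate_fn(currency), 2)

# Unit test of the driver in isolation, with the stub replacing the called unit.
assert get_shipping_rate(10.0, "EUR", get_currency_conversion_rate_stub) == 11.00
assert get_shipping_rate(10.0, "GBP", get_currency_conversion_rate_stub) == 12.70
```

Because the stub returns predictable values, no rate needs to be hard coded inside the driver itself, and the test can run before the real conversion method exists.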
Unit testing also facilitates collective code ownership in agile development methodologies. With the accelerated development efforts and multiple software teams collectively responsible for the code that is released, unit testing can help identify any potential issues raised by a programmer on the shared code base before it is released.
Unit testing provides many benefits, including the ability to:
Just because unit testing results indicate that the code tested is functional (reliable), resilient (secure), and recoverable, it does not necessarily mean that the system itself will be secure. The security of the sum of all parts should also be tested. When individual units of code are aggregated and tested, it is referred to as integration testing. Integration testing is the logical next step after unit testing to validate the software’s functionality, performance, and security. It helps to identify problems that occur when units of code are combined. If individual code units have successfully passed unit testing, but fail when they are integrated, then it is a clear-cut indication of software problems upon integration. This is why integration testing is necessary.
Logic testing validates the accuracy of the software’s processing logic. Most developers are very intelligent, and good ones tend to automate recurring tasks by leveraging and reusing existing code. In this effort, they tend to copy code from other libraries or code sets that they have written. When this is done, it is critically important to validate the implementation details of the copied code for functionality and logic. For example, if code that performs the addition of two numbers is copied to multiply two numbers, the copied code needs to be validated to make sure that the operator within the code that multiplies the two numbers is changed from “+” to “×,” as shown in Figure 5.2. Line-by-line manual validation of logic or step-by-step debugging (which is a means of unit testing) ensures that the code is not only functioning reliably, but also provides the benefit of extended test coverage to uncover any potential issues with the code.
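The copy-and-adapt scenario above can be sketched in a few lines (hypothetical functions, not from the text):

```python
def add(a, b):
    return a + b

# Copied from add() to implement multiplication; the operator must be changed
# from "+" to "*", or the copied code silently computes the wrong result.
def multiply(a, b):
    return a * b

# Logic tests catch the error: if the "+" were left in place,
# multiply(2, 3) would return 5 instead of 6.
assert add(2, 3) == 5
assert multiply(2, 3) == 6
```

The point is that copied code can compile and run cleanly while still being logically wrong, which is exactly what logic testing is meant to catch.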
Logic testing also includes the testing of predicates. A predicate is something that is affirmed or denied of the subject in a proposition in logic. Software that has a high measure of cyclomatic complexity must undergo logic testing before being shipped or released, especially if the processing logic of the software is dependent on user input.
Boolean predicates return true or false depending on whether the condition in the software logic is met. Logic testing is usually performed by negating or mutating (varying) the intended functionality. Variations in logic can be created by applying operators (e.g., AND, OR, NOT EQUAL TO, EQUAL TO) to Boolean predicates. The source of Boolean predicates can be one or more of the following:
Testing for blind SQL Injection is an example of logic testing in addition to being a test for error and exception handling.
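To illustrate why blind SQL injection is a logic test, here is a minimal sketch that simulates a deliberately vulnerable lookup over an in-memory SQLite database (the table, values, and probe strings are hypothetical). The tester never sees data directly, only whether a true or false injected predicate changes the found/not-found behavior:

```python
import sqlite3

# Hypothetical data store for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT, name TEXT)")
conn.execute("INSERT INTO users VALUES ('101', 'alice')")

def lookup(user_id):
    # Deliberately vulnerable: the user-supplied id is concatenated into the
    # SQL string, so injected Boolean predicates alter the query logic.
    rows = conn.execute(
        "SELECT name FROM users WHERE id = '" + user_id + "'").fetchall()
    return len(rows) > 0   # "blind": the tester only observes found / not found

# Blind SQLi probes: a true predicate returns the record, a false one does not.
true_probe  = lookup("101' AND '1'='1")   # logically TRUE  -> record found
false_probe = lookup("101' AND '1'='2")   # logically FALSE -> record not found
assert true_probe is True and false_probe is False
```

The differing responses to logically equivalent-looking inputs confirm both a logic flaw and missing input validation; a parameterized query would make both probes fail identically.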
Software is not static. Business requirements change, and newer functionality is added to the code as newer versions are developed. Whenever code or data are modified, there is a likelihood for those changes to break something that was previously functional. Regression testing is performed to validate that the software did not break previous functionality or security and regress to a nonfunctional or insecure state. It is also known as verification testing.
Regression testing is primarily focused on implementation issues over design flaws. A regression test must be written and conducted for each fixed bug or database modification. It is performed to ensure that:
Not only does functionality need to be tested, but also the security of the software. Sometimes the implementation of security itself can deny existing functionality to valid users. An example is a menu option previously available to all users that is no longer available once role-based access control of menu options is implemented. Without proper regression testing, legitimate users will be denied functionality. It is also important to recognize that data changes and database modifications can have side effects, reverting functionality or reducing the security of the software, and so these need to be tested for regression as well.
Adequate time needs to be allocated for regression testing. It is recommended that a library of tests be developed that includes a predefined set of tests to be conducted before the release of any new version. The challenge with this approach is determining which tests should be part of the predefined set. At a bare minimum, tests that involve boundary conditions and timing should be included. Comparing the relative attack surface quotient (RASQ) of newer versions of software with the RASQ values of the software before it was modified can be used as a measure to determine the need for regression testing and the tests that need to be run.
Usually, software QA teams perform regression testing, but since the changes that regression tests uncover are often code-related, they can be costly to make and can affect project timelines. It is therefore advisable that regression testing be performed by developers after integration testing for code-related changes, and also in the testing phase before release.
In addition to testing for reliability, software testing must be performed to assure the recoverability of software. Such tests check if the software will be available when required and that it has the appropriate replication, load balancing, and disaster recovery (DR) mechanisms functioning properly. Recoverability testing validates that the software meets the customer’s maximum tolerable downtime (MTD) and recovery time objective (RTO) levels.
Performance testing (load testing, stress testing) and scalability testing are examples of common recoverability testing, which are covered in the following section.
Testing should be conducted to ensure that the software is performing to the service level agreement (SLA) and the expectations of the business. The implementation of secure features can have a significant impact on performance, and this must be taken into account. Smaller cache windows, complete mediation, and data replication are examples of security design and implementation features that can adversely impact performance. However, performance testing is not performed with the intent of finding vulnerabilities (bugs or flaws), but with the goal of determining bottlenecks in the software. It is also used to establish a baseline for future regression tests (covered earlier in this chapter). The results of a performance test can be used to tune the software to organizational or established industry benchmarks. Bottlenecks can be reduced by tuning the software, which is performed to optimize resource allocation; you can tune the software code and configuration, the operating system, or the hardware. Examples of configuration tuning include setting connection pooling limits in a database server, setting the maximum number of users allowed in a Web server, and setting time limits for sliding cache windows.
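A minimal sketch of establishing a performance baseline with the standard library (the operation being timed is a hypothetical stand-in; real baselines would time actual software transactions under realistic load):

```python
import timeit

# Hypothetical operation to baseline: serializing a moderately sized payload.
def build_payload():
    return ",".join(str(i) for i in range(1000))

# Establish a baseline: median of several repeated timed runs, which is more
# stable than a single measurement.
runs = timeit.repeat(build_payload, number=100, repeat=5)
baseline_seconds = sorted(runs)[len(runs) // 2]

# A later regression test comparing against the stored baseline would flag a
# significant slowdown as a bottleneck introduced by new code or configuration.
assert baseline_seconds > 0
print(f"baseline (100 iterations): {baseline_seconds:.6f}s")
```

Recording the baseline alongside each release is what makes subsequent performance regressions detectable rather than anecdotal.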
The two common means to test for performance are load testing and stress testing the software.
Stress testing is performed to determine the ability of the software to handle loads beyond its maximum capabilities, and it is primarily performed with two objectives. The first is to find out whether the software can recover gracefully after it breaks under load. The second is to assure that the software operates according to the design principle of failing securely. For example, if the maximum number of allowed authentication attempts has been exceeded, then the user must be notified of invalid login attempts with a nonverbose error message, and at the same time that user’s account needs to be locked out, as opposed to the user’s being granted access automatically, even if it is only low-privileged guest access. Stress testing can also be used to find timing and synchronization issues, race conditions, resource exhaustion trigger events, and deadlocks.
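The fail-secure lockout behavior described above can be sketched as follows (the account data, policy threshold, and messages are hypothetical; a real implementation would also hash passwords and persist state):

```python
MAX_ATTEMPTS = 3  # assumed lockout policy

# Hypothetical in-memory account state for the sketch.
accounts = {"jdoe": {"password": "s3cret", "failed": 0, "locked": False}}

def authenticate(username, password):
    """Fail securely: repeated failures lock the account rather than fall open."""
    acct = accounts.get(username)
    if acct is None or acct["locked"]:
        return "Invalid login."            # nonverbose: reveal nothing specific
    if password == acct["password"]:
        acct["failed"] = 0
        return "OK"
    acct["failed"] += 1
    if acct["failed"] >= MAX_ATTEMPTS:
        acct["locked"] = True              # lock out; never grant guest access
    return "Invalid login."                # same message for every failure mode

# A stress/security test would drive the threshold and verify the lockout:
for _ in range(MAX_ATTEMPTS):
    authenticate("jdoe", "wrong-password")
assert accounts["jdoe"]["locked"] is True
assert authenticate("jdoe", "s3cret") == "Invalid login."  # locked, even with the right password
```

Note that the function returns the identical nonverbose message for an unknown user, a wrong password, and a locked account, so the failure mode leaks nothing an attacker can use.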
Scalability testing augments performance testing. It is a logical next step after performance testing the software. Its main objectives are to identify the loads (obtained from load testing) and to mitigate any bottlenecks that will hinder the ability of the software to scale to handle more load or changes in business processes or technology. For example, if order_id, the unique identifier in the ORDER table, is set to a 16-bit integer type (Int16), then with the growth of the business there is a high likelihood that orders placed after order_id has reached the maximum value supported by a 16-bit integer (32,767 signed; 65,535 unsigned) will fail. It may be wiser to set the datatype for order_id to a 32-bit integer (Int32) so that the software can scale with ease and without failure. Performance test baseline results are usually used in testing for the effectiveness of scalability. Degraded performance upon scaling implies the presence of some bottleneck that needs to be addressed (tuned or eliminated).
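A scalability test for the order_id example can be sketched as a boundary check (the guard function and its error message are hypothetical; Python integers do not themselves overflow, so the 16-bit limit is modeled explicitly):

```python
INT16_MAX = 32767    # maximum value of a signed 16-bit integer
UINT16_MAX = 65535   # maximum value of an unsigned 16-bit integer

def next_order_id(current_id, max_value=INT16_MAX):
    """Sketch of a scalability guard: fail loudly before the id space is exhausted,
    rather than silently wrapping or corrupting new orders."""
    if current_id >= max_value:
        raise OverflowError("order_id space exhausted; widen the column (e.g., Int32)")
    return current_id + 1

# Normal growth is fine...
assert next_order_id(100) == 101

# ...but a boundary-condition test exposes the scaling limit of the Int16 choice.
try:
    next_order_id(INT16_MAX)
    overflowed = False
except OverflowError:
    overflowed = True
assert overflowed
```

This is exactly the kind of boundary-condition test that the regression test library discussed earlier should include by default.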
While functional testing is done to make sure that the software does not fail during operations or under pressure, security testing is performed with the intent of trying to make the software fail. Security testing is a test of the resiliency of software: testing performed to see if the software can be broken. It differs from stress testing (covered earlier) in the sense that stress testing is primarily performed to determine the software’s recoverability, whereas security testing is conducted to attest to the presence and effectiveness of the security controls that are designed and implemented in the software. It is to be performed with a hostile user (attacker or blackhat) mindset. Good security testers are focused on one thing and one thing only: breaking the software by circumventing its protection mechanisms. Attackers typically think outside the box and are usually very creative, finding new ways to attack the software while learning from each experience and improving their knowledge and expertise. Security testing begins with creating a test strategy that addresses high-risk items first, followed by low-risk items. The threat model from the design phase, updated during the implementation phase, can be used to determine critical sections of code and software features.
In the physical security world, for an attack to be successful there needs to be a confluence of three aspects: motive, opportunity, and means. The same is true in the information security space, as depicted in Figure 5.3. For cyber crime to be proven in a court of law, the same three aspects of crime must be determined. The motive of the attacker is usually tied to something that the attacker seeks to gain; this could range from recognition (fame) among peers, to revenge by a disgruntled employee, to money. The opportunity for an attacker is directly related to the connectivity of the software and the vulnerabilities that exist in it. The expertise of and tools available to the attacker are the means by which the software can be exploited. Security testing can address two of the three aspects of crime: it can do little about the motive of an attacker, but the opportunities and means by which an attacker can exploit the software can be determined by security testing.
It is also important to distinguish between the testing of security functionality and security testing. Security testing is testing with an attacker perspective to validate the ability of the software to withstand attack (resiliency), while testing security functionality (e.g., authentication mechanisms, auditing capabilities, error handling) in software is meant to assure that the functionality of protection mechanisms is working properly. Testing security functionality is not necessarily the same as security testing.
Although security testing is aimed at validating software resiliency, it can also be performed to attest to the reliability and recoverability of software. Since integrity of data and systems is a measure of reliability, security testing that validates data and system integrity attests to software reliability. Security testing can also validate that controls such as fail-secure mechanisms and proper error and exception handling are in place and working properly, so that the software can resume functional operations within the customer’s MTD and RTO, which are measures of the software’s recoverability. Security testing is also indicative of the due diligence and due care measures that the organization takes to develop and release secure software for its customers.
Security testing should be part of the overall SDLC process, and engaging the testers to be part of the process early on is recommended. They should be allowed to assist in threat modeling exercises and be participants in the review of the threat model. This gives the software developer team an opportunity to discover and address prospective threats and gives the software testing team an advantage to start developing test scripts early on in the process. Architectural and design issues, weaknesses in logic, insecure coding, effectiveness of safeguards and countermeasures, and operational security issues can all be uncovered by security testing.
Security testing can be accomplished using several different methodologies.
Also known by other names such as glass box or clear box testing, white box testing is a security testing methodology based on the knowledge of how the software is designed and implemented. It is broadly known as full knowledge assessment because the tester has complete knowledge of the software. It can be used to test both the use case (intended behavior) as well as the misuse case (unintended behavior) of the software and can be conducted at any time after the development of code, although it is best advised to do so while conducting unit tests. In order to perform white box security testing, it is imperative first to understand the scope, context, and intended functionality of the software so that the inverse of that can be tested with an attacker’s perspective.
Inputs to the white box testing method include architectural and design documents, source code, configuration information and files, use and misuse cases, test data, test environments, and security specifications. White box testing of code requires access to the source code. This makes it possible to detect embedded code issues implanted by insiders, such as Trojans, logic bombs, impersonation code, spyware, and backdoors. These inputs are structurally analyzed to ensure that the implementation of the code follows design specifications and to determine whether security protection mechanisms are in place or vulnerabilities exist. White box testing is also known as structural analysis. Data/information flow, control flow, interfaces, trust boundaries (entry and exit points), configuration, and error handling are methodically and structurally analyzed for security. Source code analyzers can be used to automate some of the source code testing. The output of a white box test is the white box test report, which includes defects (or incidents), flaws and deviations from design specifications, change requests, and recommendations to address security issues. The white box security testing process is depicted in Figure 5.4.
If white box testing is full knowledge assessment, black box testing is its opposite. It is broadly known as zero knowledge assessment, because the tester has very limited to no knowledge of the internal workings of the software being tested. Architectural or design documents, configuration information or files, use and misuse cases, and the source code of the software are not available to or known by the testing team conducting black box testing. The software is essentially viewed as a “black box” that is tested for its resiliency by determining how it responds (outputs) to the tester’s input, as illustrated in Figure 5.5. White box testing is structural analysis of the software’s security, whereas black box testing is behavioral analysis of the software’s security.
Black box testing can be performed before deployment (predeployment) or periodically once the software is deployed (postdeployment). Depending on when black box testing is conducted, its objectives differ. Predeployment black box testing is used to identify and address security vulnerabilities proactively, so that the risk of the software’s getting hacked is minimized. Postdeployment black box testing helps to find vulnerabilities that exist in the deployed production (actual runtime) environment and can also be used to attest to the presence and effectiveness of the software security controls and protection mechanisms. Because identifying and fixing security issues early in the life cycle is less expensive, it is advisable to conduct black box testing predeployment. But doing so will not give insight into actual runtime environment issues, so when predeployment black box tests are conducted, an environment that mirrors or simulates the deployed production environment should be used.
Also known as fuzz testing or fault injection testing, fuzzing is a brute-force type of testing in which faults (random and pseudo-random input data) are injected into the software and its behavior observed. It is a test whose results indicate the extent and effectiveness of input validation. Fuzzing can be used to test not only applications and their application programming interfaces (APIs), but also protocols and file formats. It is used to find coding defects and security bugs, such as buffer overflows that can result in remote code execution, unhandled exceptions and hanging threads that cause DoS, state machine logic faults, and buffer boundary checking defects. The data used for fuzzing are commonly referred to as fuzz data or fuzzing oracles. Fuzz data can be either synthesized or mutated. Synthesized fuzz data are generated from scratch without being based on previous input, whereas mutated data are valid data that are corrupted so that their format (data structure) is not what the application expects.
Although fuzzing is a very common methodology of black box testing, not all fuzz tests are necessarily black box tests. Fuzzing can be performed as a white box test as well. Black box fuzzing is the sending of malformed data without any verification of which code paths were covered and which were not. White box fuzzing is the sending of malformed data with verification of the code paths that were covered. When there is zero knowledge of the software and debugging the software to determine weaknesses is not an option, black box fuzzing is used; when information about the makeup of the software (e.g., target code paths, configuration) is known, white box fuzzing is performed.
The two types of fuzzing prevalent today are dumb fuzzing and smart fuzzing. When truly random data without any consideration for the data structure are used, it is known as dumb fuzzing. This can be dangerous and lead to DoS, destruction, and complete disruption of the software’s operations. Smart fuzzing is preferred because the data structure, including encoding and relations (checksums, bit flags, and offsets), is known, and the input data are variations of these data structures.
Black box dumb fuzzing covers the least amount of code and finds the fewest bugs, whereas white box smart fuzzing has maximum code coverage and can detect the most bugs or weaknesses in code.
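A minimal sketch of mutation-based fuzzing against a hypothetical target (the "KEY=VALUE" parser, the mutation strategy, and the seed inputs are all illustrative assumptions; real fuzzers such as coverage-guided tools are far more sophisticated):

```python
import random

def mutate(valid_input, seed=0):
    """Mutated fuzz data: corrupt one byte of a known-valid input."""
    rng = random.Random(seed)
    data = bytearray(valid_input)
    i = rng.randrange(len(data))
    data[i] ^= rng.randrange(1, 256)   # nonzero XOR guarantees the byte changes
    return bytes(data)

# Hypothetical target: a parser that expects "KEY=VALUE" records.
def parse_record(raw):
    key, _, value = raw.decode("utf-8", errors="replace").partition("=")
    if not key or not value:
        raise ValueError("malformed record")
    return key, value

valid = b"user=alice"
rejected = 0
for seed in range(100):
    try:
        parse_record(mutate(valid, seed=seed))
    except ValueError:
        rejected += 1   # a clean rejection; any OTHER exception here would be a finding

assert mutate(valid) != valid   # the mutator always alters the valid input
```

Because each fuzz input is a corrupted variant of valid data rather than pure noise, this leans toward smart fuzzing: the inputs stay close enough to the expected structure to reach deeper parsing logic.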
I once asked one of my students, “Why do we need to scan our networks and software for vulnerabilities?” and his response, although amusing, was profound. He said, “If we don’t, someone else will.” When there is very limited or no prior knowledge about the software makeup, its internal workings, or the computing ecosystem in which it operates, black box testing can start by scanning the network as well as the software for vulnerabilities. Network scans are performed with the goal of mapping out the computing ecosystem. Wireless access points and wireless infrastructure scans must also be performed. These scans help determine the devices, operating system software (applications), services (daemons), ports, protocols, application and infrastructural interfaces, and Web server versions that make up the environment in which the software will run. They can also be used to find out vulnerabilities that exist in the network or in the software that an attacker can exploit.
The process of determining an operating system version is known as OS fingerprinting. OS fingerprinting is possible because each operating system responds in a unique way to packets that hit its TCP/IP stack. An example of OS fingerprinting using the Nmap (Network Mapper) scanner is illustrated in Figure 5.6.
There are two main means by which an OS can be fingerprinted: active and passive. Active OS fingerprinting involves the sending of crafted, abnormal packets to the remote host and analyzing the responses from the remote host. If the remote host network is monitored and protected using intrusion detection systems, active fingerprinting can be detected. The popular Nmap tool uses active fingerprinting to detect OS versions. Unlike active fingerprinting, passive OS fingerprinting does not contact the remote host. It captures traffic originating from a host on the network and analyzes the packets. In passive fingerprinting, the remote host is not aware that it is being fingerprinted. Tools such as Siphon, which was developed by the HoneyNet project, and P0f use passive fingerprinting to detect OS versions. Active fingerprinting is fast and useful when there are a large number of hosts to scan, but it can be detected by IDS and IPS. Passive fingerprinting is relatively slower and best used for single host systems, especially if there is historical data. Passive fingerprinting can also go undetected since there is no active probe of the remote host being fingerprinted.
A scanning technique that can be used to enumerate systems and determine server versions is known as banner grabbing. Banner grabbing can be used for legitimate purposes, such as inventorying the systems and services on the network, but an attacker can use banner grabbing to identify network hosts that have vulnerable services running on them. The File Transfer Protocol (FTP) port 21, Simple Mail Transfer Protocol (SMTP) port 25, and Hypertext Transfer Protocol (HTTP) port 80 are examples of common ports used in banner grabbing. By looking at the Server field in an HTTP response header, one can determine the Web server and its version. This is a very common Web server fingerprinting exercise when black box testing Web applications. Tools such as Netcat or Telnet are used in banner grabbing. Figure 5.7 depicts banner grabbing a Web server version using Telnet.
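The same banner-grabbing exchange can be sketched over a raw socket. To keep the sketch self-contained and lawful to run, it spins up a local stand-in server first; against real systems, grab banners only with authorization, and note that the local server's banner string is whatever that server chooses to send:

```python
import http.server
import socket
import threading

# Local stand-in target so the sketch has something to connect to.
server = http.server.HTTPServer(("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Banner grab, as one would do manually with Netcat or Telnet:
# send a minimal HEAD request and read the response headers.
with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
    sock.sendall(b"HEAD / HTTP/1.0\r\nHost: 127.0.0.1\r\n\r\n")
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)
banner = b"".join(chunks).decode("latin-1")
server.shutdown()

# The Server field discloses the web server software and version.
print([line for line in banner.splitlines() if line.startswith("Server:")])
assert "Server:" in banner
```

Hardening guidance often recommends suppressing or genericizing this Server header precisely because it hands an attacker the version information needed to look up known vulnerabilities.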
In addition to scanning the network (wired and wireless), software can be scanned. Software scanning can be either static or dynamic. Static scanning includes scanning the source code for vulnerabilities; dynamic scanning means that the software is scanned at runtime. Static scanning using source code analyzers is usually performed during the code review process in the development phase, whereas dynamic scanning is performed using crawlers and spidering tools during the testing phase of the SDLC.
Compliance with the Payment Card Industry Data Security Standard (PCI DSS) requires that organizations must periodically test their security systems and processes by scanning for network, host, and application vulnerabilities in the card holder data environment. The scan report should not only describe the type of vulnerability, but also provide risk ratings and recommendations on how to fix the vulnerabilities. Figure 5.8 is an illustration of a sample vulnerability scan report for PCI compliance.
Scanning can therefore be used to
Vulnerability scanning is passive in nature, meaning we use it to detect the presence of weaknesses and loopholes that can be exploited by an attacker. Penetration testing, on the other hand, is active in nature, because it goes one step further than vulnerability scanning. The main objective of penetration testing is to see if the network and software assets can be compromised by exploiting the vulnerabilities that were determined by the scans. The subtle difference between vulnerability scans and penetration testing (commonly referred to as pen-testing) is that a vulnerability scan identifies issues that can be attacked, whereas penetration testing measures the resiliency of the network or software by executing real attacks against those vulnerabilities. In penetration testing, attempts are made to emulate the actions of a potential threat agent (e.g., hacker or malware). In most cases, pen-testing is done after the software has been deployed, but this need not be the case. It is advisable to perform black box assessments using penetration testing techniques before deployment, to check for the presence of security controls, and after deployment, to ensure that those controls are working effectively to withstand attacks.
When penetration testing is performed after deployment, it is important that rules of engagement are followed and that the penetration test itself is methodically conducted. The rules of engagement should explicitly define the scope of the penetration test for the testing team, regardless of whether it is an internal team or an external security service provider. Definition of scope includes restricting the IP addresses and software interfaces that can be tested. Most importantly, the environment, data, infrastructural, and application interfaces that are not within scope must be identified before the test and communicated to the pen-testing team, and during the test, monitoring must be in place to assure that the pen-testing team does not go beyond the scope of the test. The Technical Guide to Information Security Testing and Assessment, published as Special Publication (SP) 800-115 by the National Institute of Standards and Technology (NIST), provides guidance on conducting penetration testing. The OSSTMM (covered in Chapter 1) is known for its prescriptive guidance on the activities that need to be performed before, during, and after a penetration test, including the measurement of results. When conducted postdeployment, penetration testing can be used as a mechanism to certify the software (or system) as part of the validation and verification (V&V) activity inside of certification and accreditation (C&A). V&V and C&A are covered in Chapter 7.
Generically, the pen-testing process includes the following steps:
The penetration test report has many uses as listed below. It can be used
As we have seen, security testing can be accomplished using either a white box approach or a black box approach. Each methodology has its merits and challenges. White box testing can be performed early in the SDLC processes, thereby making it possible to build security into the software. It can help developers to write hack-resilient code, as vulnerabilities that are detected can be identified precisely, in some cases to the exact line of code. However, white box testing may not cover code dependencies (e.g., services or libraries) or third party components. It provides little insight into the exploitability of the vulnerability itself and so may not present an accurate risk picture. Just because the vulnerability is present, it does not mean that it will be exploited, as compensating controls may be in place that white box testing may not uncover. On the other hand, black box testing can attest to the exploitability of weaknesses in both simulated and actual production systems. The other benefit of black box testing is that there is no need for source code, and the test can be conducted both before (pre-) and after (post-) deployment. The limitations of black box testing are that the exact cause of vulnerability may not be easily detectable, and the test coverage itself can be limited to the scope of the assessment.
The following section covers different criteria that can be used to determine the type of approach to take when validating software security. These include:
Table 5.1 tabulates the comparison between the white box and black box security testing methodologies.
Table 5.1 Comparison between White Box and Black Box Security Testing

| | White Box | Black Box |
|---|---|---|
| Also known as | Full knowledge assessment | Zero knowledge assessment |
| Assesses the software's | Structure | Behavior |
| Root cause identification | Can identify the exact line of code or design issue causing the vulnerability | Can analyze only the symptoms of the problem and not necessarily the cause |
| Extent of code coverage possible | Greater; the source code is available for analysis | Limited; not all code paths may be analyzed |
| Number of false positives and false negatives | Low; contextual information is available | High; since normal behavior is unknown, expected behavior can also be falsely identified as anomalous |
| Logical flaws detection | High; design and architectural documents are available for review | Low; limited to no design and architectural documentation is available for review |
| Deployment issues identification | Limited; assessment is performed in predeployment environments | Greater; assessment can be performed in pre- as well as postdeployment production or production-like simulated environments |
So then, what kind of testing is “best” to assure the reliability, resiliency, and recoverability of software? The answer is, “It depends.” For determining syntactic issues early in the SDLC, white box testing is appropriate. If the source code is not available and testing needs to be performed from a truly hostile user perspective, then black box testing is the choice. In reality, however, a hybrid of the two approaches, also referred to as gray box or translucent box testing, is usually performed to validate the security protection mechanisms in place. For a comprehensive security assurance assessment, the hybrid gray box approach is recommended, in which white box testing is conducted predeployment in development and test environments, and black box testing is performed both pre- and postdeployment, in production-like and actual production environments.
We have covered so far the various types of software testing for QA and the different methodologies for security testing. In the following section, security testing as it is pertinent to software security issues will be covered. We will learn about the different types of tests and how they can be performed to attest to the security of code that is developed in the development phase of the SDLC.
Before we start testing for software security issues in code, one of the first questions to ask is whether the software being tested is new or a version release. If it is a version release, we must check to ensure that its state of security has not regressed to a more insecure state than that of its previous version. This can be accomplished by conducting regression tests (covered earlier) for security issues. We should specifically test for the introduction of any new side effects that impact security and for the continued use of banned or unsafe APIs that were present in previous versions.
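A simple static check for banned APIs can be automated as part of security regression testing. The following sketch scans C source text for calls to a small, illustrative subset of banned functions (the full banned list, such as the one in Microsoft's banned.h, is much longer); in practice a source code analyzer would do this more robustly.

```python
import re

# Illustrative subset of banned/unsafe C APIs (see, e.g., Microsoft's banned.h)
BANNED_APIS = {"strcpy", "strcat", "sprintf", "gets", "scanf"}

def find_banned_calls(source: str) -> list:
    """Return (line_number, api_name) for every banned API call found."""
    pattern = re.compile(r"\b(" + "|".join(sorted(BANNED_APIS)) + r")\s*\(")
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in pattern.finditer(line):
            findings.append((lineno, match.group(1)))
    return findings
```

The word-boundary anchor ensures that safe variants (e.g., `fgets`) do not trigger a false positive on the embedded name `gets`.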
For software revisions, regression testing must be conducted, and for all versions, new or revisions, the following security tests must be performed, if applicable, to validate the strength of the security controls. Using a categorized list of threats as a template of security testing is effective in ensuring comprehensive coverage of the varied threats to software. The National Security Agency (NSA) identity and access management (IAM) threat list and STRIDE threat lists are examples of categorized threat lists that can be used in security testing. Ideally, the same threat list that was used when threat modeling the software will be used for conducting security tests. This way, security testing can be used to validate the threat model.
Most software security vulnerabilities can be mitigated by input validation. Such hostile actions as buffer overflows, injection flaws, and scripting attacks can be effectively reduced if the software performs validation of input before accepting it for processing.
In a client/server environment, input validation tests should be performed for both the client and the server. Client-side input validation tests are more a test of performance and user experience than of security. If you have the time or resources to perform input validation tests on only the client or the server, choose to ensure that validation of input happens on the server side, since client-side validation can be bypassed by an attacker.
Attributes of the input, such as its range, format, data type, and values, must all be tested. When these attributes are known, input validation tests can be conducted using pattern-matching expressions and/or fuzzing techniques (covered earlier). Regular expressions (RegEx) can be used for pattern-matching input validation. Some common examples of RegEx patterns are tabulated in Table 5.2. Tests must be conducted to ensure that the whitelist (acceptable list) of input is allowed while the blacklist (dangerous or unacceptable list) of input is denied. Not only must the test include the validation of the whitelists and blacklists, but it must also include the anti-tampering protection of these lists. Since canonicalization can be used to bypass input filters, both the normal and canonical representations of input should be tested. When the input format is known, smart fuzzing can be used; otherwise, dumb fuzzing using random and pseudo-random input values can be used to attest to the effectiveness of input validation.
Table 5.2 Commonly Used Regular Expressions (RegEx)

| Regular Expression | Validates | Description | Example |
|---|---|---|---|
| `^[a-zA-Z''-'\s]{1,20}$` | Name | Allows up to 20 uppercase and lowercase characters and some special characters that are common to some names | John Doe, O'Hanley, Johnson-Paul |
| `^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$` | Email | Validates an email address | |
| `^(ht\|f)tp(s?)://[0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*(:(0-9)*)*(/?)([a-zA-Z0-9\-\.\?\,\'/\\\+&%\$#_]*)?$` | URL | Validates a uniform resource locator (URL) | |
| `(?!^[0-9]*$)(?!^[a-zA-Z]*$)^([a-zA-Z0-9]{8,15})$` | Password | Validates a strong password; it must be between eight and 15 characters, contain at least one numeric value and one alphabetic character, and must not contain special characters | |
| `^(-)?\d+(\.\d\d)?$` | Currency | Validates currency format; if there is a decimal point, it requires two numeric characters after the decimal point | 289.00 |
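Patterns like those in Table 5.2 can be exercised with whitelist-style tests. The following is a minimal sketch using Python's `re` module, taking the currency and password patterns as examples (written here with explicit escape characters):

```python
import re

# Currency: optional sign, digits, optional two-digit decimal part
CURRENCY_RE = re.compile(r"(-)?\d+(\.\d\d)?")

# Strong password: 8-15 alphanumerics, not all digits, not all letters
PASSWORD_RE = re.compile(r"(?!^[0-9]*$)(?!^[a-zA-Z]*$)^([a-zA-Z0-9]{8,15})$")

def is_valid_currency(value: str) -> bool:
    # fullmatch is stricter than match with $: it also rejects a trailing newline
    return CURRENCY_RE.fullmatch(value) is not None

def is_strong_password(value: str) -> bool:
    return PASSWORD_RE.match(value) is not None
```

Whitelist validation of this kind accepts only input that matches the expected format and rejects everything else, which is preferable to trying to enumerate every dangerous input in a blacklist.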
Since injection attacks take the user-supplied input and treat it as a command or part of a command, input validation is an effective defensive safeguard against injection flaws. In order to perform input validation tests, it is first important to determine the sources of input and the events in which the software will connect to the backend store or command environment. These sources can include authentication forms, search input fields, hidden fields in Web pages, querystrings in the URL address bar, and more. Once these sources are determined, input validation tests can be used to ensure that the software will not be susceptible to injection attacks.
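Once the input sources are identified, the dumb fuzzing mentioned earlier can be turned against them. The sketch below feeds pseudo-random printable strings to a validation routine and records inputs that raise unexpected exceptions; the `validator` callable is a stand-in for whatever input-handling interface is under test.

```python
import random
import string

def dumb_fuzz(validator, iterations: int = 1000, max_len: int = 64, seed: int = 1):
    """Feed pseudo-random strings to a validator; collect inputs that raise
    exceptions, a symptom of missing or incomplete input handling."""
    rng = random.Random(seed)       # seeded so findings are reproducible
    alphabet = string.printable     # includes quotes, <, >, ;, %, etc.
    crashes = []
    for _ in range(iterations):
        candidate = "".join(rng.choice(alphabet)
                            for _ in range(rng.randint(0, max_len)))
        try:
            validator(candidate)
        except Exception:
            crashes.append(candidate)
    return crashes
```

A routine that tolerates arbitrary input should return an empty crash list; any recorded input is a concrete, replayable test case for the development team.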
There are other tests that need to be performed as well. These include tests to ensure that
The issue of nonrepudiation is enforceable by proper session management and auditing. Test cases should validate that audit trails can accurately determine the actor and their actions. They must also ensure that misuse cases generate auditable trails appropriately. If the code is written to perform auditing automatically, then tests to assure that an attacker cannot exploit this section of the code should be performed. Security testing should not fail to validate that user activity is unique, protected, and traceable. Test cases should also include verifying the protection and management of the audit trail and the integrity of audit logs. NIST Special Publication 800-92 provides guidance on the protection of audit trails and the management of security logs. The confidentiality of the audited information and its retention for the required period of time should be checked as well.
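One way audit-log integrity can be made testable is with a hash-chained log, where each entry incorporates the digest of its predecessor so that tampering with any entry breaks verification. The following is a minimal sketch of the idea (it illustrates the integrity test only and is not a substitute for the log protection controls in NIST SP 800-92):

```python
import hashlib
import json

def append_entry(log: list, actor: str, action: str) -> None:
    """Append an audit entry chained to the previous entry's digest."""
    prev_digest = log[-1]["digest"] if log else "0" * 64
    payload = json.dumps({"actor": actor, "action": action,
                          "prev": prev_digest}, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    log.append({"actor": actor, "action": action,
                "prev": prev_digest, "digest": digest})

def verify_log(log: list) -> bool:
    """Recompute the chain; any tampered or reordered entry fails verification."""
    prev_digest = "0" * 64
    for entry in log:
        payload = json.dumps({"actor": entry["actor"],
                              "action": entry["action"],
                              "prev": prev_digest}, sort_keys=True)
        expected = hashlib.sha256(payload.encode()).hexdigest()
        if entry["prev"] != prev_digest or entry["digest"] != expected:
            return False
        prev_digest = entry["digest"]
    return True
```

A security test case would modify a recorded entry and assert that `verify_log` detects the tampering.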
Both network and software spoofing test cases need to be executed. Network spoofing attacks include address resolution protocol (ARP) poisoning, IP address spoofing, and media access control (MAC) address spoofing. On the software side, user and certificate spoofing tests, phishing tests, and verification of code that allows impersonation of other identities, as depicted in Figure 5.9, need to be performed. Testing the spoofability of the user and/or certificate along with verifying the presence of transport layer security can ensure secure communication and protection against man-in-the-middle (MITM) attacks. Cookie expiration testing must also be conducted, along with verifying that authentication cookies are encrypted.
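Part of the cookie verification above can be automated by inspecting Set-Cookie headers for the Secure and HttpOnly attributes and for an expiration. A minimal sketch that parses a raw header value:

```python
def check_cookie_flags(set_cookie_header: str) -> dict:
    """Report which protective attributes a Set-Cookie header carries."""
    attrs = [part.strip().lower() for part in set_cookie_header.split(";")]
    return {
        "secure": "secure" in attrs,        # cookie sent only over TLS
        "httponly": "httponly" in attrs,    # cookie not readable by script
        "has_expiry": any(a.startswith(("expires=", "max-age="))
                          for a in attrs),  # cookie is not effectively permanent
    }
```

A test case would fail the build if an authentication cookie is observed without the Secure flag, since such a cookie can be captured by a MITM attacker on a plaintext connection.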
The best way to check for defense against phishing attacks is to test users for awareness of social engineering techniques and attacks.
Software is prone to failure due to accidental user error or intentional attack. Not only should software be tested for QA so that it does not fail in its functionality, but failure testing for security must be performed. Requirement gaps, omitted design, and coding errors can all result in defects that cause the software to fail. Testing to determine if the failure is a result of multiple defects or if a single defect yields multiple failures must be performed.
Software security failure testing includes the verification of the following security principles:
If the errors and exceptions are logged and only a reference identifier for that issue is displayed to the end user, as depicted in Figure 5.10, then tests to assure that the reference identifier mapping to the actual error or exception is protected need to be performed as well.
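The reference-identifier pattern of Figure 5.10 can be sketched as follows: the detailed exception is logged and stored server-side, while the user sees only an opaque ID, and the mapping back to details is itself access-controlled. All names here are illustrative.

```python
import logging
import uuid

logger = logging.getLogger("app.errors")
_error_store = {}  # in production, this mapping itself must be access-controlled

def report_error(exc: Exception) -> str:
    """Log full details internally; return only an opaque reference to the user."""
    ref_id = uuid.uuid4().hex[:12]
    _error_store[ref_id] = repr(exc)           # detailed record, kept server-side
    logger.error("ref=%s detail=%r", ref_id, exc)
    return f"An error occurred. Reference ID: {ref_id}"

def lookup_error(ref_id: str, is_admin: bool) -> str:
    """Only privileged users may map a reference ID back to the details."""
    if not is_admin:
        raise PermissionError("insufficient privilege")
    return _error_store.get(ref_id, "unknown reference")
```

A security test would assert both that the user-facing message leaks no internal detail and that an unprivileged lookup of the reference is denied.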
Cryptographic validation includes the following attestation:
Since the consequences of buffer overflow vulnerabilities are extremely serious, testing to ensure defense against buffer overflow weaknesses must be conducted. Buffer overflow defense tests can be both black box and white box in nature. Black box testing for overflow defense can be performed using fuzzing techniques. White box testing includes verifying
Testing for elevated privileges or privilege escalation is conducted to verify that a user or process cannot gain access to more resources or functionality than they are allowed. Privilege escalation can be vertical, horizontal, or both. Vertical escalation is the condition wherein a subject (user or process) with lower rights gets access to resources that are to be restricted to subjects with higher rights. An example of vertical escalation is a nonadministrator gaining access to administrator or super user functionality. Horizontal escalation is the condition wherein a subject gets access to resources that are to be restricted to other subjects at the same privilege level. An example of horizontal escalation is an online banking user being able to view the bank accounts of other online banking users.
Insecure direct object reference design flaws and coding bugs that fail to enforce complete mediation can lead to privilege escalation, so parameter manipulation checks need to be conducted to verify that privileges cannot be escalated. In Web applications, both POST (form) and GET (querystring) parameters need to be checked.
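A horizontal-escalation test can be expressed against an authorization check like the following sketch, where account access is mediated by ownership on every request rather than trusted to a client-supplied identifier (account IDs and users here are illustrative):

```python
# Sample ownership data; in a real system this comes from the data store
ACCOUNT_OWNERS = {"acct-100": "alice", "acct-200": "bob"}

def get_account(requesting_user: str, account_id: str) -> str:
    """Complete mediation: verify ownership on every access, so a tampered
    account_id parameter cannot expose another user's data."""
    owner = ACCOUNT_OWNERS.get(account_id)
    if owner != requesting_user:
        raise PermissionError("access denied")
    return f"statement for {account_id}"
```

The parameter manipulation test is then simply: request another user's account ID while authenticated as a lower- or same-privilege user, and assert that access is denied.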
Testing for anti-reversing protection is particularly important for shrink wrap, commercial off-the-shelf (COTS) software, but even in business applications, tests to assure anti-reversing should be conducted. The following are some of the recommended tests.
Another important aspect of software security assurance testing includes the testing of the security of the environment itself in which the software will operate. Environment testing needs to verify the integrity not just of the configuration of the environment, but also that of the data. Trust boundaries demarcate one environment from another, and end-to-end scenarios need to be tested. With the adoption of Web 2.0 technologies, the line between the client and server is thinning, and in cases where content is aggregated from various sources (environments), as in the case of mashups, testing must be thorough to assure that the end user is not subject to risk. Interoperability testing, simulation testing, and DR testing are important verification exercises that must be performed to attest to the security aspects of software.
When software operates in disparate environments, it is imperative to verify the resiliency of the interfaces that exist between the environments. This is particularly important if credentials are shared for authentication purposes between these environments, as is the case with single sign-on. The following is a list of interoperability testing that can be performed to verify that
It is also necessary to check the software’s upstream and downstream dependency interfaces. For example, it is important to verify that there is secure access to the key by which a downstream application can decrypt data that were encrypted by an application upstream in the chain of dependent applications. Furthermore, tests to verify that the connections between dependent applications are secure need to be conducted.
The effectiveness of least privilege implementation and configuration mismatches can be uncovered using simulation testing. A common issue faced by software teams is that the software functions as desired in the development and test environments, but fails in the production environment. A familiar and dangerous response to this situation is that the software is configured to run with administrative or elevated privileges. The most probable root cause for such varied behavior is that the configuration settings in these environments differ. When production systems cannot be mirrored, assurance can still be achieved by simulation testing. By simulating the configuration settings between these environments, configuration mismatch issues can be determined. Additionally, the need to run the software in elevated privileges in the production environment can be determined and appropriate least privilege implementation measures can be taken.
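Configuration mismatches between environments can be surfaced with a straightforward diff of the settings. A sketch, assuming the settings of each environment are available as key/value mappings:

```python
def diff_configs(test_env: dict, prod_env: dict) -> dict:
    """Report keys whose values differ between environments, or that
    exist in only one of them."""
    mismatches = {}
    for key in sorted(set(test_env) | set(prod_env)):
        t = test_env.get(key, "<missing>")
        p = prod_env.get(key, "<missing>")
        if t != p:
            mismatches[key] = {"test": t, "production": p}
    return mismatches
```

A finding such as a service account in test but an administrative account in production is exactly the kind of least privilege violation that simulation testing is meant to catch.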
It is crucially important to test data issues as well, but this can be a challenge. It may be necessary to test cascading relationships, but data to support those relationships may not be available in the test environment. Usually, production data are migrated to the testing environments, and this is a serious threat to confidentiality. It can also have compliance and legal ramifications. Production data must never be ported to and processed in test environments. For example, payroll data of employees or credit card data of real customers should never be available in the test environments. It is advisable to use dummy data created from scratch in the test or simulated environment. In cases where production data need to be migrated to maintain referential integrity between sets of data, only nonconfidential information must be migrated, or the data must be obfuscated or masked. Testing must verify that data in test environments do not violate security, and simulation testing for data issues must be controlled.
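The masking mentioned above can be sketched as follows for card numbers, exposing only the last four digits. This is one common masking scheme among several (tokenization and format-preserving encryption are alternatives), and the length check is an illustrative sanity guard:

```python
def mask_pan(pan: str) -> str:
    """Mask a primary account number, keeping only the last four digits."""
    digits = [c for c in pan if c.isdigit()]   # tolerate spaces and dashes
    if len(digits) < 13:                       # not a plausible card number
        raise ValueError("unexpected PAN length")
    return "*" * (len(digits) - 4) + "".join(digits[-4:])
```

Data migrated into a test environment would pass through such a routine, so that referential integrity is preserved while the confidential value itself is never present outside production.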
An important aspect of environment testing is the ability of the software to restore its operation after a disaster happens. DR testing verifies the recoverability of the software. It also uncovers data accuracy, integrity, and system availability issues. DR testing can be used to gauge the effectiveness of error handling and auditing in software as well. Important questions to answer using DR testing include: Does the software fail securely, and how does it report errors upon downtime? Is there proper logging of the failure in place? Failover testing is part of disaster testing, and the accuracy of the tests is dependent on how closely a real disaster can be simulated. Since this can be a costly proposition, proper planning, resource, and budget allocation is necessary, and testing by simulating disasters must be undertaken for availability assurance.
Software should be tested to assure privacy. For software that handles personal data, privacy testing must be part of the test plan. This should include the verification of organizational policy controls that impact privacy. It should also encompass the monitoring of network traffic and the communication between end points to assure that personal information is not disclosed. Tests for the appropriateness of notices and disclaimers when personal information is collected must also be conducted. This is critically important when collecting information from minors or children, and privacy testing of protected data, such as the age of the child, and of parental controls cannot be ignored in such situations. Both opt-in and opt-out mechanisms need to be verified. The privacy escalation response mechanisms invoked upon a privacy breach must also be tested for accuracy of documentation and correctness of processes.
During the software acceptance phase, before software release, the end user needs to be assured that the software meets their specified requirements. This can be accomplished using user acceptance testing (UAT), which is also known as end user testing. As the penultimate step before software release, UAT is a gating mechanism used to determine whether the software is ready for release; it can help with security because it provides the opportunity to prevent insecure software from being released into production or to the end user. Additionally, it brings the benefit of extending software testing to the end users, as they are the ones who perform the UAT before accepting the software. The results of the UAT are used to provide the end user with confidence in the software's reliability. It can also be used to identify design flaws and implementation bugs that are related to the usability of the software.
Prerequisites of UAT include the following:
UAT is generally performed as a black box test that focuses primarily on the functionality and usability of the application. It is most useful when performed in an environment that most closely simulates the real world or production environment. Sometimes UAT is performed in the actual production environment postdeployment to get a more accurate picture of the software's usability. When this is the case, however, the test should be conducted within an approved change window with the possibility of rolling back.
The final step in the successful completion of a UAT is a go/no-go decision, best implemented with a formal sign-off. The decision is to be captured in writing and is the responsibility of the signature authority representing the end users.
Coding bugs, design flaws, behavioral anomalies (logic flaws), errors, faults, and vulnerabilities all constitute software defects, as depicted in Figure 5.12, and once any defect is suspected and/or identified, it needs to be appropriately reported, tracked, and addressed before release. In this section, we will focus on how to report and track software defects. In the following section, we will learn about how these defects can be addressed based upon the potential impact they have and what corrective actions can be taken.
Software defects need first to be reported and then tracked. Defect reports must be comprehensive and detailed enough to provide the software development team the information necessary to determine the root cause of the issue so that it can be addressed.
The goal of reporting defects is to ensure that they get addressed. Information that must be included in a defect report is discussed in the following subsections.
A unique number or identifier must be given to each defect report so that each defect can be tracked appropriately. Do not clump multiple issues into one defect. Each issue should warrant its own defect report. Most defect tracking tools have an automated means to assign a defect ID when a new defect is reported.
Provide a concise yet descriptive title for the defect. For example, “Image upload fails.”
Provide a summary of the defect to elaborate on the defect title you specified. For example, “When attempting to insert an image into a blog, the software does not allow the upload of the image and fails with an error message.”
If the defect is not reproducible, then the defect will not get fixed, so detailed steps are necessary as to how the defect can be reproduced by the software development team. For example, it is not sufficient to say that the “Upload feature does not work.” Instead, it is important to list the steps taken by the tester, such as:
It is important to describe the expected result of the operation so that the development teams can understand the discrepancy from intended functionality. The best way to do this is to tie the defect ID to the requirement identifier in the requirements traceability matrix (RTM). This way, any deviations from the intended functionality specified in the requirements can be reviewed and verified.
If possible and available, a screenshot of the error message should be attached. This proves very helpful to the software development team because:
An example of a screenshot is depicted in Figure 5.13.
Note that, if the screenshot image contains sensitive information, it is advisable to not capture the screenshot. If, however, a screenshot is necessary, then appropriate security controls, such as masking of the sensitive information in the defect screenshot or role-based access control, should be implemented to protect against disclosure threats.
If possible, it is recommended to categorize the defect based on whether it is a functional issue or an assurance (security) one. You can also subcategorize the defect. Figure 5.14 is an example of categories and subcategories of software defects.
This way, pulling reports on the types of defects in the software is made easy. Furthermore, it makes it easy to find out the security defects that are to be addressed before release.
Capturing the environment in which the defect was evident is important. Some important considerations to report on include:
The version of the product in which the defect was found is an important aspect of defect reporting. This makes it possible to compare versions to see if the defect is universal or specific to a particular version. From a security perspective, the build number can be used to determine the relative attack surface quotient (RASQ) between versions, based on the number of security defects that are prevalent in each version release.
The individual who detected the defect must be specified so that the development team members know whom they need to contact for clarification or further information.
The date and time (if possible) as to when the defect was reported needs to be specified. This is important in order to track the defect throughout its life cycle (covered later in this chapter) and determine the time it takes to resolve a defect, as a means to identify process improvement opportunities.
This is to indicate the tester’s determination of the impact of the defect. This may or may not necessarily be the actual impact of the defect, but it provides the remediation team with additional information that is necessary to prioritize their efforts. This is often qualitative in nature, and some examples of severity types are:
The priority indicator is directly related to the extent of impact (severity) of the defect and is assigned based on the amount of time within which the defect needs to be addressed. It is a measure of urgency and supports the availability tenet of software assurance. Some common examples of priority include mission critical (0 to 4 hours), high (>4 to 24 hours), medium (>24 to 48 hours), and low (>48 hours).
Every defect that is reported automatically starts with the “new” status, and as it goes through its life cycle the status is changed from “new” to “confirmed”, “assigned”, “work-in-progress”, “resolved/fixed”, “fix verified”, “closed”, “reopened”, “deferred”, and so on.
When a software defect is assigned to a development team member so that it can be fixed, the name of the individual who is working the issue must be specified.
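The reporting fields described above can be captured in a simple record structure. The sketch below uses a Python dataclass; the field names follow the subsections above, the severity and priority values are illustrative, and the counter is a stand-in for the automated ID assignment a real defect tracking tool provides.

```python
import itertools
from dataclasses import dataclass, field
from datetime import datetime

_defect_ids = itertools.count(1)  # stand-in for the tracker's auto-assigned ID

@dataclass
class DefectReport:
    title: str                 # concise, e.g., "Image upload fails."
    summary: str               # elaborates on the title
    steps_to_reproduce: list   # without these, the defect will not get fixed
    expected_result: str       # ideally tied to an RTM requirement identifier
    category: str              # e.g., "security" or "functional"
    build_number: str          # version in which the defect was found
    tester: str                # whom to contact for clarification
    severity: str              # tester's impact estimate: critical/high/medium/low
    priority: str              # urgency, e.g., mission critical (0-4 h)
    status: str = "new"        # every defect starts in the "new" status
    defect_id: int = field(default_factory=lambda: next(_defect_ids))
    reported_at: datetime = field(default_factory=datetime.now)
```

Structuring the report this way makes it straightforward for a centralized tracking system to enforce unique IDs, required fields, and the status life cycle described earlier.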
Upon the identification and verification of a defect, the defect needs to be tracked so that it can be addressed accordingly. It is advisable to track all defects related to the software in a centralized repository or defect tracking system. Centralization of defects makes it possible to have a comprehensive view of the software functionality and security risk. It also makes it possible to ensure that no two individuals are working on the same defect. A defect tracking system should have the ability to support the following requirements:
Testing findings that are reported as defects need to be addressed. We can use the priority (urgency) and severity (impact) levels from the defect report to address software defects. High-impact, high-risk defects are to be addressed first. When agile or extreme programming methodologies are used, identified software defects need to be added to the backlog. Risk management principles (covered in Chapter 1) can be used to determine how the defect is going to be handled. Corrective actions have a direct bearing on the risk. These can include one or more of the following:
Knowledge of security defects in the software and ignoring the risk can have serious and detrimental effects when the software is breached. All security defects must be addressed and preferably mitigated.
Additionally, it is important to fix the defects in the development environment, attest the solution in the testing environment, verify functionality in the UAT environment, and only then release (promote) the fix to production environments, as illustrated in Figure 5.16.
It is not important for a CSSLP to have a thorough understanding of how each security tool can be used, but they must be familiar with what each tool can be used for and how it can impact the overall state of software security. Some of the common security tools include:
It is recommended that you be familiar with some of the common tools that are described in Appendix E.
Security testing validates the resiliency, recoverability, and reliability of software, whereas functionality testing is primarily focused on the reliability and secondarily on the recoverability aspects of software. It is imperative to complement functionality testing with security testing of software. Security testing can be used to determine the means and opportunities by which software can be attacked. Both white box and black box security testing are used to determine the threats to software. Knowledge of how to test for common software vulnerabilities, such as failures in input validation, output encoding, improper error handling, least privilege implementation, and use of unsafe programming libraries and interfaces, is important. Various tools are used to conduct security testing. Both functional and security defects need to be reported, tracked through their life cycle, and addressed using risk management principles. Fixing defects must never be performed directly in the production environment, and proper change management principles must be used to promote fixes from development and test environments into the UAT and production environment.
A. Redundancy
B. Recoverability
C. Resiliency
D. Reliability
A. Waterfall
B. Agile
C. Spiral
D. Prototyping
A. Logic
B. Scalability
C. Integration
D. Unit
A. Stress
B. Unit
C. Integration
D. Regression
A. Regression
B. Stress
C. Integration
D. Simulation
A. Source code analyzers
B. Fuzzers
C. Banner-grabbing software
D. Scanners
A. White box
B. Black box
C. Clear box
D. Glass box
A. Rules of engagement
B. Role-based access control mechanisms
C. Threat models
D. Use cases
A. Availability
B. Authentication
C. Nonrepudiation
D. Authorization
A. Injection flaws
B. Lack of reverse engineering protection
C. Cross-site scripting
D. Broken session management
A. Defect identifier
B. Title
C. Expected results
D. Tester name
A. Hashing
B. Cloaking
C. Masking
D. Watermarking
A. Truly random data without any consideration for the data structure
B. Variations of data structures that are known
C. Data that get interpreted as commands by a backend interpreter
D. Scripts that are reflected and executed on the client browser
A. Normal operational functionality is not restored automatically
B. Access to all functionality is denied
C. Confidentiality, integrity, and availability are not adversely impacted
D. End users are adequately trained and self-help is made available for the end user to fix the error on their own
A. Integration
B. Stress
C. Unit
D. Regression
A. The point at which the software will break
B. If the software can restore itself to normal business operations
C. The presence and effectiveness of risk mitigation controls
D. How a blackhat would circumvent access control mechanisms
A. Redundancy
B. Recoverability
C. Resiliency
D. Reliability
A. Integration
B. Regression
C. Unit
D. Penetration
A. Unit
B. Integration
C. Performance
D. Regression
A. Measure the resiliency of the software by attempting to exploit weaknesses
B. Detect the presence of loopholes and weaknesses in the software
C. Detect the effectiveness of security controls that are implemented in the software
D. Measure the skills and technical know-how of the security tester
Brown, J. 2009. Fuzzing for fun and profit. Krakow Labs Literature, 02 Nov.
Cannings, R., H. Dwivedi, and Z. Lackey. 2008. Hacking Exposed Web 2.0: Web 2.0 Security Secrets and Solutions. New York, NY: McGraw-Hill.
Cox, K., and C. Gerg. 2004. Anatomy of an attack: The five Ps. Managing Security with Snort and IDS Tools. Sebastopol, CA: O’Reilly.
Eilam, E., and E. J. Chikofsky. 2005. Reversing: Secrets of Reverse Engineering. Indianapolis, IN: Wiley.
Gallagher, T., B. Jeffries, and L. Landauer. 2006. Hunting Security Bugs. Redmond, WA: Microsoft.
Herzog, P. 2010. Open Source Security Testing Methodology Manual. ISECOM. Apr. 23, 2010. http://www.isecom.org/osstmm (accessed March 6, 2011).
Howard, M., and D. LeBlanc. 2003. Writing Secure Code. Redmond, WA: Microsoft.
Kaminski, G. n.d. Software logic mutation testing. Lecture.
Kelley, D. 2009. Black box and white box testing: Which is best? SearchSecurity.com. Nov. 18, 2009. http://searchsecurity.techtarget.com/tip/0,289483,sid14_00,00.html (accessed March 6, 2011).
Microsoft. 2008. Patterns & Practices: Performance Testing Guidance for Web Applications. CodePlex. Aug. 28, 2008. http://perftestingguide.codeplex.com (accessed March 6, 2011).
Microsoft Developer Network. 2010. Unit testing. http://msdn.microsoft.com/en-us/library/aa292197(VS.71).aspx (accessed Apr. 30, 2010).
Neystadt, J. 2008. Automated penetration testing with white-box fuzzing. MSDN: Microsoft Development, MSDN Subscriptions, Resources, and More. Microsoft, Feb. 2008. http://msdn.microsoft.com/en-us/library/cc162782.aspx (accessed March 6, 2011).
OWASP. 2008. OWASP Testing Guide V3. Sept. 15, 2008. http://www.owasp.org/index.php/OWASP_Testing_Project (accessed March 6, 2011).
PCI Security Standards Council. 2006. Security scanning procedures. Sept. 2006. https://www.pcisecuritystandards.org/pdfs/pci_scanning_procedures_v1-1.pdf (accessed March 6, 2011).
Petersen, B. 2010. Intrusion detection FAQ: What is P0f and what does it do? SANS.org. Apr. 30, 2010. http://www.sans.org/security-resources/idfaq/p0f.php (accessed March 6, 2011).
Piliptchouk, D. 2005. WS-security in the enterprise. ONJava.com. Sept. 2, 2005. http://onjava.com/pub/a/onjava/2005/02/09/wssecurity.html (accessed March 6, 2011).
Rogers, L. 2000. Cybersleuthing: Means, motive, and opportunity. InfoSec Outlook, June. http://www.sei.cmu.edu/library/abstracts/news-at-sei/securitysum00.cfm.
Scarfone, K., M. Souppaya, A. Cody, and A. Orebaugh. 2008. Technical Guide to Information Security Testing and Assessment. National Institute of Standards and Technology. Sept. 2008. http://csrc.nist.gov/publications/nistpubs/800-115/SP800-115.pdf (accessed Apr. 30, 2010).
Sundmark, T., and D. Theertagiri. 2008. Phase space analysis of session cookies. Thesis, Linköpings Universitetet, Sweden. http://www.ida.liu.se/~TDDD17/oldprojects/2008/projects/9.pdf (accessed March 6, 2011).
U.S. Department of Homeland Security. Security testing. Apr. 30, 2010. https://buildsecurityin.us-cert.gov/bsi/articles/best-practices/testing.html (accessed March 6, 2011).
Yarochkin, F. V., O. Arkin, M. Kydyraliev, S. Dai, Y. Huang, and S. Kuo. Xprobe2++: Low volume remote network information gathering tool. Department of Electrical Engineering, National Taiwan University. http://xprobe.sourceforge.net/xprobe-ng.pdf (accessed Apr. 30, 2010).
Zalewski, M. 2005. Silence on the Wire: A Field Guide to Passive Reconnaissance and Indirect Attacks. San Francisco, CA: No Starch.