9 Software Verification

Acronyms

BVA boundary value analysis
CAST Certification Authorities Software Team
CC1 control category #1
CC2 control category #2
CCB change control board
DC/CC data coupling and control coupling
DP discussion paper
FAA Federal Aviation Administration
IMA integrated modular avionics
ISR interrupt service routine
MC/DC modified condition/decision coverage
OOT object-oriented technology
PR problem report
PSAC Plan for Software Aspects of Certification
RAM random access memory
SAS Software Accomplishment Summary
SCMP Software Configuration Management Plan
SDP Software Development Plan
SLECI Software Life Cycle Environment Configuration Index
SQA software quality assurance
SQAP Software Quality Assurance Plan
SVCP Software Verification Cases and Procedures
SVP Software Verification Plan
SVR Software Verification Report
TRR test readiness review
WCET worst-case execution time

9.1 Introduction

Verification is an integral process that applies throughout the entire software life cycle. It starts in the planning phase and goes all the way through production release and even into maintenance.

The DO-178C glossary defines verification as: “The evaluation of the outputs of a process to ensure correctness and consistency with respect to the inputs and standards provided to that process” [1]. The verification guidance in DO-178C includes a combination of reviews, analyses, and tests. Reviews and analyses assess the accuracy, completeness, and verifiability of the outputs of each of the life cycle phases (including planning, requirements, design, code/integration, test development, and test execution). In general, a review provides a qualitative assessment of correctness, whereas an analysis provides repeatable evidence of correctness [1]. Testing is “the process of exercising a system or system component to verify that it satisfies specified requirements and to detect errors” [1]. All three approaches are used extensively when verifying safety-critical software.

When performing verification, the terms error, fault, and failure are commonly used. The DO-178C glossary defines each term as follows [1]:

  • “Error—With respect to software, a mistake in requirements, design, or code.”

  • “Fault—A manifestation of an error in software. A fault, if it occurs, may cause a failure.”

  • “Failure—The inability of a system or system component to perform a required function within specified limits. A failure may be produced when a fault is encountered.”

The primary purpose of verification is to identify errors, so that they can be corrected before they become faults or failures. In order to identify errors as early as possible, verification must start early in the software life cycle.

9.2 Importance of Verification

Ronald Reagan is credited with coining, or at least popularizing, the saying: “Trust but verify.” George Romanski puts it this way: “Many people can write software, but not many people would trust their lives with it until it was verified.” Verification is an important process in any software life cycle but is especially so for safety-critical software. In DO-178C, over half of the objectives are classified as verification objectives. The more critical the software, the more verification activities are required, and the more confidence we have that errors have been identified and removed.

According to the many program managers that I talk to and work with, over half of a safety-critical software project’s budget is dedicated to verification. Unfortunately, in too many projects, the verification activity is seen as a necessary evil; therefore, it is understaffed and approached as just a check mark activity. Despite the fact that volumes of evidence show that early error detection saves time and money, many projects continue to just get through the reviews to satisfy the certification authority and push testing off to the end. Too often, junior engineers are thrown at the reviews, analyses, and tests to just get it done. I do not have anything against junior engineers; in fact, I used to be one. However, developing good verification skills takes time and proper training. Verification should include experienced engineers. It is fine to use less experienced engineers as well; but those who have been through several projects are typically the ones who find the errors that will wreak havoc on the project if left undetected.

A colleague told me about how he once walked into the lab and observed a junior engineer performing a test of a minor function. While the small part of the system being tested was working properly, the engineer failed to notice the cockpit warnings that were being displayed from a major fault in the system. This was a case of an inexperienced engineer getting a check mark for his assigned task but missing the basic system understanding. It is like having a mechanic check your car and say the tires are properly inflated, while at the same time smoke is pouring from the engine. The challenge is to know the entire system and to not have target fixation (tunnel vision).

From a safety perspective, verification is absolutely essential. It is used to satisfy the regulations by confirming that the software performs its intended function—and only its intended function. Basically, without good verification, development assurance has no merit, because it is verification that builds confidence in the product.

9.3 Independence and Verification

Before going into some of the details of verification, let us briefly examine the topic of independence. If you have ever tried to edit your own work, you know how hard it is to find those sneaky typos. You know how it is supposed to read, so you just overlook them. The same situation is also true when verifying software data. Therefore, as the criticality of the software increases, the required independence between activities increases.

DO-178C defines independence as follows:

Separation of responsibilities which ensures the accomplishment of objective evaluation. (1) For software verification process activities, independence is achieved when the verification activity is performed by a person(s) other than the developer of the item being verified, and a tool(s) may be used to achieve equivalence to the human verification activity. (2) For the software quality assurance process, independence also includes the authority to ensure corrective action [1].

This chapter is concerned with the first half of the definition, which addresses verification independence. As can be seen from the definition, verification independence does not require a separate organization—only a separate person(s) or tool(s). Chapter 11 will explore the second half of the definition, which addresses software quality assurance independence.

DO-178C Tables A-3 to A-7 identify the verification objectives that require independence (shown with a filled circle [⚫]). For level A, 25 verification objectives require independence; for level B, only 13 verification objectives need to be satisfied with independence; and for levels C and D no verification objectives require independence. Independence for DO-178C Tables A-3 through A-5 is typically satisfied by having someone who did not write the data review it. However, DO-178C Table A-5 objectives 6 and 7 tend to require some analysis and/or test as well. The two independence objectives in DO-178C Table A-6 are normally satisfied by having someone who did not write the code write the tests. For DO-178C Table A-7, all level A and three level B objectives require independence. Table A-7 independence is normally satisfied by a combination of reviews (objectives 1–4) and analyses (objectives 5–9).*

DO-248C discussion paper (DP) #19 explains the typical interpretation of verification independence. This DP was based on a Certification Authorities Software Team (CAST) paper (CAST-26); therefore, it provides insight into how certification authorities normally interpret verification independence [2]. It should be noted that the CAST paper promoted more independence between development activities than DO-248C DP #19 identifies. DO-248C clarifies that development independence is only needed between the developer of the source code and the test specifications; other development activities do not require an independent person (or tool) [3]. DO-178C also clarifies this in section 6.2.e, which states: “For independence, the person who created a set of low-level requirements-based test cases should not be the same person who developed the associated Source Code from those low-level requirements” [1].

As mentioned elsewhere, the verification is only as good as the person or tool doing the verifying. Therefore, it is important to use skilled personnel or effective tools. In some cases the tools may need to be qualified (see Chapter 13).

There are a few DO-178C objectives that do not require independence; however, it is still a good practice to have it—particularly for levels A and B software. Experience shows that independent verification is one of the most effective ways to find errors early. Many mature companies use independence for their requirements and design reviews at all levels, even if it is not required, because it is effective at finding errors and saves time and money in the long run. Also, as will be discussed shortly, verifiers tend to have a different mentality than developers, so they are often able to find more errors than a developer reviewing his or her own work.

9.4 Reviews

Let us now examine the three verification approaches in DO-178C. This section considers reviews; the next two sections will examine analysis and test, respectively.

As noted earlier, a review is a qualitative assessment of an artifact for correctness and compliance to the required objectives [1]. Chapters 6 through 8 discussed reviews of requirements, design, and code. Most companies use a peer review process to review their artifacts. The peer review involves a team of reviewers—each with a specific purpose and focus. Chapter 6 provided suggestions for conducting an effective peer review. Typically, a review references the applicable standard(s) and includes a checklist to guide the reviewers. The reviewers document their comments and then each comment is appropriately addressed or dispositioned and verified. If the review leads to significant updates, it may be necessary to completely re-review the data.

9.4.1 Software Planning Review

Chapter 5 discussed the development of five software plans and three standards. For levels A, B, and C, DO-178C Table A-1 objectives 6 and 7 require that the plans comply with DO-178C and are coordinated [1]. These objectives are normally satisfied with a review of the plans and standards to ensure consistency and compliance with DO-178C guidance. The preferred practice is to review each document separately, followed by a review of all the plans and standards together once they are all written. Without the review of the documents together, inconsistencies and gaps may go undetected. I often emphasize: “Do not let the certification authority or authorized designee be the first to read your plans and standards together.”

9.4.2 Software Requirements, Design, and Code Reviews

Chapters 6 through 8 discussed the characteristics of good requirements, design, and code. Additionally, these chapters identified the objectives (from DO-178C Tables A-3 through A-5) for verifying each of the artifacts. The review of each life cycle data item should happen as early as possible in the life cycle to identify and correct errors proactively. Requirements, design, and code are often informally reviewed several times before the formal peer review. Additionally, coders normally perform their own debug tests prior to the code peer review. The informal reviews and debug activities sift out the big and most obvious errors earlier, reducing rework later.

9.4.3 Test Data Reviews

Please note that peer reviews are also used to verify test cases and procedures, analysis procedures and results, and test results. These reviews are discussed later in this chapter.

9.4.4 Review of Other Data Items

Reviews are also used to ensure accuracy and correctness of other important data items (e.g., Software Life Cycle Environment Configuration Index, Software Configuration Index, and Software Accomplishment Summary).

9.5 Analyses

An analysis is a verification activity that provides repeatable evidence of correctness [1]. There are several types of analyses performed during the safety-critical software life cycle. Two main categories of analysis are needed for compliance with DO-178C: (1) code and integration analyses and (2) coverage analyses. Other analyses are also frequently conducted, depending on the selected verification approach. The typical code/integration analyses are explained in this section. Coverage analysis is discussed later in this chapter (see Section 9.7).

Engineers (myself included) sometimes use the terms analysis or analyze loosely. However, analysis for DO-178C compliance has a specific meaning—it must be repeatable; therefore, it needs to be well documented. When reviewing data as an FAA designee, I frequently find substantial issues with analyses. Oftentimes, the so-called analysis is not written down; therefore, it is not repeatable. Likewise, when analyses are written, they are often missing criteria to determine success. Smoke-and-mirrors, hand-waving, or black-magic analyses are not acceptable. An analysis should have procedures and results. The procedures include the following [4]:

  • Purpose, criteria, and related requirements

  • Detailed instructions for conducting the analysis

  • Analysis acceptability and completion criteria

The analysis results include the following [4]:

  • Identification of the analysis procedure

  • Identification of data item analyzed

  • Identification of who performed the analysis

  • Analysis results and supporting data

  • Corrective actions generated as a result of analysis

  • Analysis conclusion with the substantiating data

Analyses should be performed with the appropriate level of independence, as identified by the DO-178C Annex A tables. If an analysis is used in lieu of a test, the same level of independence required for the testing objective is needed.

Analyses should be started as soon as feasible in order to identify issues as early as possible. Normally, once a code baseline is established, the analysis activities can begin. The final analyses will not be performed until the software is finalized, but preliminary analyses may uncover some issues. I recently consulted on a project that discovered their worst-case execution time (WCET) was about 110%–130% of what was required; essentially, they had negative margins, which meant that functions might not run. Unfortunately, the problem was found a few weeks before the targeted certification date and caused lengthy delays. When it comes to analysis, the ideal and reality often collide (usually because of staff shortage and schedule compression); however, the longer a team waits to start analyses, the more risk they incur.

The typical integration analyses performed for DO-178C compliance are briefly discussed.* The results of all integration analyses are normally summarized in a software verification report, which is explained in Section 9.6.7. Some analyses are also summarized in the Software Accomplishment Summary (SAS) as software characteristics. The analyses summarized in the SAS are noted later. The SAS itself is discussed in Chapter 12.

9.5.1 Worst-Case Execution Time Analysis

Knowing a program’s timing characteristics is essential for the successful design and execution of real-time systems. A critical timing measure is the WCET of a program. WCET is the longest possible time that it may take to complete the execution of a set of tasks on a given processor in the target environment. WCET analysis is performed to verify that the worst-case time stays within its allocation. Although the approach taken depends on the software and system architecture, WCET is typically both analyzed and measured.

When analyzing WCET, each branch and loop in the code is generally analyzed to determine the worst-case execution path(s) through the code. The times for each branch and loop in the worst-case execution path are then summed to give the WCET.* This time is verified against the time allocated in the requirements. The analysis needs to be confirmed with actual timing measurements. There are several factors that complicate WCET analysis, including interrupts, algorithms with multiple decision steps, data or instruction cache usage, scheduling methods, and use of a real-time operating system [3]. These factors should be considered during the development of the software in order to ensure the WCET will fall within the required margins. Use of cache memory (such as L1, L2, and L3 caches) or pipelining further complicates the WCET analysis and requires additional analysis to ensure that the cache and pipeline impact on timing is understood.
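
To make the path-summation idea concrete, the following sketch annotates a small, invented routine with per-branch worst-case cycle counts. The function, the cycle counts, and the clamp/filter behavior are illustrative assumptions only; real values come from the processor documentation, a timing tool, or target measurements.

    #include <stdint.h>

    #define MAX_SENSORS 8u
    #define LIMIT       1000

    static int16_t lag_filter(int16_t x)      /* placeholder filter for the example */
    {
        return (int16_t)(x / 2);
    }

    void process_sensors(const int16_t raw[MAX_SENSORS], int16_t filtered[MAX_SENSORS])
    {
        for (uint32_t i = 0u; i < MAX_SENSORS; i++)   /* loop overhead:  5 cycles/iteration */
        {
            if (raw[i] > LIMIT)                       /* compare/branch: 12 cycles          */
            {
                filtered[i] = LIMIT;                  /* clamp path:      4 cycles          */
            }
            else
            {
                filtered[i] = lag_filter(raw[i]);     /* filter path:    37 cycles (worst)  */
            }
        }
    }

    /* Worst-case path per iteration = 5 + 12 + 37 = 54 cycles (the longer branch
     * body is assumed taken every time). WCET(process_sensors) = 8 * 54 = 432
     * cycles, which is converted to time at the core clock rate, compared against
     * the allocation in the requirements, and then confirmed by measurement. */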

Oftentimes, tools are used to help identify the worst-case path. When this is the case, the accuracy of the tool needs to be determined. There are several approaches to confirming the tool accuracy, including (1) manually verifying the tool output, (2) qualifying the tool, or (3) running an independent tool in parallel and comparing results. Be warned that tools face challenges similar to those of manual analyses when cache or pipelining is used.

The WCET analysis approach and results are summarized in the SAS as part of the software characteristics (as will be discussed in Chapter 12).

9.5.2 Memory Margin Analysis

A memory margin analysis is performed to ensure that there are adequate margins for production operation and future growth. All used memory is analyzed, including nonvolatile memory (NVM), random access memory (RAM), heap, stack, and any dynamically allocated memory (if used).

The memory margin analysis approach and results are summarized in the SAS as part of the software characteristics.

As an example, stack usage analysis is typically performed by analyzing the source code and determining the deepest function call tree during routine processing and during interrupt processing. These functions are then used to determine the greatest amount of stack memory from the combined call trees. If this is performed at the source level, then further analysis is required to determine how much additional data are used in each function for saved registers, formal parameters, local variables, return address data, and any compiler-required intermediate results. Stack analysis performed on the executable image will factor this in. The analyzed stack usage amount is then compared with the amount of stack available to determine that there is adequate margin. For partitioned or multitasked software, this is repeated for all partitions and tasks, as each will have its own stack. As with the timing analysis, the stack usage analysis is typically confirmed using actual measurements, unless it is performed by a qualified stack analysis tool.
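
A minimal sketch of the arithmetic involved is shown below; the call chain, frame sizes, and allocated stack size are invented for illustration and would, in a real analysis, come from the executable image, map file, or compiler listings.

    #include <stdio.h>

    /* Hypothetical per-function stack frame sizes in bytes. */
    #define FRAME_MAIN_LOOP      48u
    #define FRAME_READ_INPUTS    64u
    #define FRAME_CONVERT_UNITS  96u
    #define FRAME_ISR_HANDLER    32u   /* worst-case interrupt on top of the deepest chain */

    #define STACK_ALLOCATED    1024u   /* from the linker command file (assumed) */

    int main(void)
    {
        /* Deepest call chain found by walking the call tree:
         * main_loop -> read_inputs -> convert_units, plus one ISR nesting level. */
        unsigned worst_case = FRAME_MAIN_LOOP + FRAME_READ_INPUTS +
                              FRAME_CONVERT_UNITS + FRAME_ISR_HANDLER;

        printf("worst-case stack: %u of %u bytes (%.1f%% margin)\n",
               worst_case, STACK_ALLOCATED,
               100.0 * (STACK_ALLOCATED - worst_case) / STACK_ALLOCATED);
        return 0;
    }

On target, the analyzed number is typically confirmed by pre-filling the stack with a known pattern and checking the high-water mark after representative operation.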

9.5.3 Link and Memory Map Analysis

Link analysis verifies that the modules for the software build are mapped correctly into the processor memory segments defined by the corresponding linker command files. The link analysis typically involves an inspection of the memory map file to verify the following:

  • The origin, maximum length, and attributes of each segment allocated by the linker correspond to the segment definition specified in the linker command file.

  • None of the allocated segments overlap.

  • The actual total allocated length of each segment is less than or equal to the maximum length specified in the linker command file.

  • The various sections defined by each source code module are mapped into the proper segments by the linker.

  • Only the expected object modules from the source code appear.

  • Only the expected procedures, tables, and variables from the linked object modules appear.

  • The linker assigns correct values to the output linker symbols.

In an integrated modular avionics (IMA) system, some of this checking may be performed against the configuration data as well.*
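
Some projects supplement the formal link analysis with automated sanity checks against linker-provided symbols, as sketched below. The sketch assumes a GNU ld-style toolchain; the symbol names and the maximum lengths are hypothetical and toolchain-dependent.

    #include <stdint.h>

    /* Symbols assumed to be defined in the linker command file; the names are
     * hypothetical and vary by toolchain. */
    extern const uint8_t __text_start__[];
    extern const uint8_t __text_end__[];
    extern const uint8_t __data_start__[];
    extern const uint8_t __data_end__[];

    #define TEXT_MAX_LENGTH  0x40000u  /* maximum segment length from the linker command file */
    #define DATA_MAX_LENGTH  0x08000u

    /* Returns nonzero if an actual allocated length exceeds the maximum specified
     * in the linker command file (one of the link analysis checks, automated here
     * as a sanity check rather than as the formal analysis itself). */
    int check_segment_lengths(void)
    {
        uintptr_t text_len = (uintptr_t)__text_end__ - (uintptr_t)__text_start__;
        uintptr_t data_len = (uintptr_t)__data_end__ - (uintptr_t)__data_start__;

        return (text_len > TEXT_MAX_LENGTH) || (data_len > DATA_MAX_LENGTH);
    }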

9.5.4 Load Analysis

The load analysis is sometimes performed in conjunction with the link analysis. It verifies the following:

  • All software components are built and loaded into the correct location.

  • Invalid software is not loaded onto the target.

  • Incorrect or corrupted software will not be executed.

  • Loading data are correct.
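
As one illustration of the last two items, field-loadable images commonly carry a cyclic redundancy check (CRC) computed at build time, which the target verifies before transferring control. The sketch below uses the common CRC-32 (IEEE) polynomial; the function names and the choice of CRC are assumptions for the example, not a mandated scheme.

    #include <stdint.h>
    #include <stddef.h>

    static uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
    {
        crc = ~crc;
        for (size_t i = 0; i < len; i++)
        {
            crc ^= data[i];
            for (int bit = 0; bit < 8; bit++)
            {
                /* Reflected CRC-32 (IEEE) polynomial 0xEDB88320. */
                crc = (crc >> 1) ^ (0xEDB88320u & (uint32_t)-(int32_t)(crc & 1u));
            }
        }
        return ~crc;
    }

    /* Returns nonzero only if the loaded image's computed CRC matches the CRC
     * stored with the image; the boot code would refuse to transfer control
     * otherwise, so corrupted or incorrect software is not executed. */
    int image_is_valid(const uint8_t *image, size_t image_len, uint32_t stored_crc)
    {
        return crc32_update(0u, image, image_len) == stored_crc;
    }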

9.5.5 Interrupt Analysis

For real-time systems, an interrupt analysis is often performed to verify that (1) all interrupts enabled by the software are correctly handled by the corresponding interrupt service routines (ISRs) defined in the software and (2) there are no unused ISRs. The interrupt analysis includes examining the source code to determine the set of enabled interrupts and verifying that an ISR exists for each enabled interrupt. Each ISR is generally analyzed to confirm the following:

  • The ISR is located in memory correctly.

  • At the beginning of the ISR the system context is saved.

  • The operations performed within the ISR are appropriate for the corresponding physical interrupt, for example, no blocking operations are permitted in the interrupt context.

  • All time-critical operations occurring within the ISR complete before any other interrupt can occur.

  • Interrupts are disabled prior to passing data to control loops.

  • The time that interrupts are disabled for passing data is minimized.

  • The system context is restored at the end of the ISR.
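
The following hypothetical ISR and data-handoff routine illustrate several of the properties listed above: minimal, non-blocking work inside the ISR and a short interrupt-disabled window for passing data. The interrupt attribute, register addresses, and interrupt-control primitives are toolchain- and target-specific assumptions.

    #include <stdint.h>

    /* Hypothetical hardware register addresses and target primitives. */
    #define ADC_RESULT_REG   (*(volatile uint16_t *)0x4001204Cu)
    #define ADC_CLEAR_FLAG() (*(volatile uint32_t *)0x40012050u = 1u)
    extern void disable_interrupts(void);
    extern void enable_interrupts(void);

    volatile uint16_t g_latest_sample;   /* shared with the control loop */
    volatile uint8_t  g_sample_ready;

    void __attribute__((interrupt)) adc_isr(void)
    {
        /* Context save/restore is generated by the compiler for functions marked
         * as interrupts on this (assumed) toolchain; otherwise it is done
         * explicitly in assembly at the start and end of the ISR. */

        uint16_t sample = ADC_RESULT_REG;   /* read the hardware result */

        /* Only time-critical, non-blocking work is done here: latch the data and
         * set a flag; no unbounded loops, no blocking calls. */
        g_latest_sample = sample;
        g_sample_ready  = 1u;

        ADC_CLEAR_FLAG();                   /* acknowledge the interrupt */
    }

    void control_loop_read_sample(uint16_t *out)
    {
        /* Interrupts are disabled only long enough to copy the shared data,
         * keeping the interrupt-disabled window minimal. */
        disable_interrupts();
        *out           = g_latest_sample;
        g_sample_ready = 0u;
        enable_interrupts();
    }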

9.5.6 Math Analysis

While not always necessary, some teams perform a math analysis to ensure that mathematical operations do not have a negative impact on the software operation. The math analysis is sometimes performed as part of the code review, and therefore it is not always documented as a separate analysis (if this is the case, it should be noted in the code review checklist). The analysis typically involves reviewing each arithmetic or logical operation in the code to do the following:

  • Identify the variables that comprise each arithmetic/logical operation.

  • Analyze each variable’s declaration (including its scaling) to verify that the variables used in the arithmetic/logic operation are properly declared and correctly scaled (with appropriate resolution).

  • Verify that overflow from the arithmetic operation cannot occur.

In some projects, static code analyzers are used to perform or supplement this analysis. Additionally, if math libraries are used, they will need to be verified for correctness as well (see Section 8.2.5.2). Of particular interest is the behavior of the floating point operations at the boundaries. A typical modern processor may have floating point states such as plus and minus zero, plus and minus infinity, denormalized (subnormal) numbers, and NaNs (not a number).* The behavior of floating point algorithms that could encounter such values should be specified in robustness requirements and verified.
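
Two small sketches of what such an analysis looks for are shown below: a fixed-point multiply with a widened intermediate and explicit saturation, and a validity check that rejects NaN and infinite floating point inputs. The scaling, ranges, and function names are invented for illustration.

    #include <stdint.h>
    #include <math.h>

    /* Fixed-point multiply of two Q15-scaled values: the intermediate product is
     * widened so it cannot overflow, and the result is saturated to the declared
     * range rather than wrapping. (Arithmetic right shift of the signed
     * intermediate is assumed, as on most targets.) */
    int16_t q15_multiply(int16_t a, int16_t b)
    {
        int32_t product = ((int32_t)a * (int32_t)b) >> 15;

        if (product > INT16_MAX) { product = INT16_MAX; }
        if (product < INT16_MIN) { product = INT16_MIN; }
        return (int16_t)product;
    }

    /* Floating point robustness at the boundaries: reject NaN and infinities
     * before they propagate into downstream calculations. The valid range is an
     * assumed requirement for the example. */
    int airspeed_is_usable(double airspeed_kts)
    {
        return !isnan(airspeed_kts) && !isinf(airspeed_kts) &&
               (airspeed_kts >= 0.0) && (airspeed_kts <= 1000.0);
    }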

9.5.7 Errors and Warnings Analysis

As noted in Chapter 8, during the build procedure, the compiler and linker may generate errors and warnings. Errors should be resolved; however, some warnings may be acceptable. Any unresolved warnings not justified in the build procedures need to be analyzed to confirm that they do not impact the software’s intended functionality.

9.5.8 Partitioning Analysis

If a system includes partitioning, a partitioning analysis is often performed to confirm the robustness of the partitioning. This analysis is particularly pertinent to IMA systems and partitioned operating systems. (See Chapter 21 for discussion of partitioning analysis.)

9.6 Software Testing

High-level and low-level requirements-based testing are major activities that are required for compliance with DO-178C. DO-178C Table A-6 summarizes the objectives for testing. The objectives include the development and execution of test cases and procedures to verify the following [1]:

  • DO-178C Table A-6 objective 1: “Executable Object Code complies with high-level requirements.”

  • DO-178C Table A-6 objective 2: “Executable Object Code is robust with high-level requirements.”

  • DO-178C Table A-6 objective 3: “Executable Object Code complies with low-level requirements.”

  • DO-178C Table A-6 objective 4: “Executable Object Code is robust with low-level requirements.”

  • DO-178C Table A-6 objective 5: “Executable Object Code is compatible with target computer.”

Testing is only part of the overall verification effort, but it is an important part and requires a tremendous amount of effort, particularly for higher software levels. Therefore, four sections are dedicated to the subject:

  • Section 9.6 provides a lengthy discussion of the testing topics, including (1) the purpose of software testing, (2) an overview of DO-178C’s test guidance, (3) a survey of testing strategies, (4) test planning, (5) test development, (6) test execution, (7) test reporting, (8) test traceability, (9) regression testing, (10) testability, and (11) test automation.

  • Section 9.7 explains verification of the test activities, including (1) review of test cases and procedures, (2) review of test results, (3) requirements coverage analysis, and (4) structural coverage analysis. This is referred to as verification of verification in DO-178C Table A-7.

  • Section 9.8 provides suggestions for problem reporting, since the testing effort is intended to identify problems.

  • Section 9.9 ends the chapter by providing some recommendations for the overall test program.

9.6.1 Purpose of Software Testing

The purpose of software testing is to uncover errors that were made during the development phases. Although some see this as a negative viewpoint, experience shows that success-based testing is ineffective. If someone is out to prove the software works correctly, it likely will. However, that does not help the overall quality of the product. As I often say, “ ‘Yes, Sir!’ men and women are generally not good testers.”

Many people consider a successful test one that does not find any errors. However, from a testing perspective, the opposite is true: a successful test is one that finds errors. Testing is normally the only part of the project that does not focus directly on success. Instead, testers try to break or destroy the software. Some consider this to be a cynical mentality and attempt to put a positive spin on the testing effort. However, in order to do the job properly, the focus needs to be kept on finding failures. If the errors are not found, they cannot be fixed, and they will make it to the field [6,7].

By its nature, testing is a destructive process. Good testers hunt errors and use creative destruction to break the software, rather than show its correctness. Edward Kit writes:

Testing is a positive and creative effort of destruction. It takes imagination, persistence, and a strong sense of mission to systematically locate the weaknesses in a complex structure and to demonstrate its failures. This is one reason why it is particularly hard to test our own work. There is a natural real sense in which we don’t want to find error in our own material [8].

Not everyone is effective at testing. In fact, in my experience, good testers tend to be in the minority. Most people want to make objects work rather than rip them apart [7]. However, Alan Page et al. point out that effective testers have a different DNA than most developers. Tester DNA includes (1) a natural ability for systems level thinking, (2) skills in problem decomposition, (3) a passion for quality, and (4) a love to discover how something works and how to break it [9]. Even though testers are often viewed as a pessimistic lot, their mission is critical. Testers look for the flaws in the software so the problems can be fixed and a high quality and safe product can be delivered.

DO-178C focuses on requirements-based tests to ensure that the requirements are met and that only the requirements are met. However, it is important to keep in mind that the requirements tell us how the program is supposed to behave when it is coded correctly. “They don’t tell us what mistakes to anticipate, nor how to design tests to find them” [6]. Good testers anticipate the errors and write tests to find them. “Software errors are human errors. Since all human activity, especially complex activity, involves error, testing accepts this fact and concentrates on detecting errors in the most productive and efficient way it can devise” [8]. The harsher the testing, the more confidence we can have in the quality of the product.

Some believe that software can be completely tested. However, this is a fallacy. Complete testing would indicate that every aspect of the software has been tested, every scenario has been exercised, and every bug has been discovered. However, in their book Testing Computer Software, Cem Kaner et al. point out that it is impossible to completely test software for three reasons [10]:

  • The domain of possible inputs is too large to test.

  • There are too many possible paths through the program to test.

  • The user interface issues (and thus the design issues) are too complex to completely test.

Kit’s book Software Testing in the Real World includes several test axioms, which are summarized here [8]:

  • Testing is used to show the presence of errors but not their absence.

  • One of the most challenging aspects of testing is knowing when to stop.

  • Test cases should include a definition of the expected output or result.

  • Test cases must be written for invalid and unexpected, as well as valid and expected, input conditions.

  • Test cases must be written to generate desired output conditions— not just the input. That is, output space as well as input space should be exercised through testing.

  • To find the maximum number of errors, testing should be an independent activity.

9.6.2 Overview of DO-178C’s Software Testing Guidance

Now that the overall purpose of software testing has been reviewed, let us consider what DO-178C says. DO-178C focuses on requirements-based testing: tests are written and executed to show that the requirements are met and to ensure that there is no unintended functionality. The more critical the software, the more rigorous the test effort.

9.6.2.1 Requirements-Based Test Methods

DO-178C section 6.4.3 promotes three requirements-based test methods.

  1. Requirements-based software/hardware integration testing [1]: This test method executes tests on the target computer to reveal errors when the software runs in its operating environment, since many software errors can only be identified in the target environment. Some of the functional areas verified during software/hardware integration testing are interrupt handling, timing, response to hardware transients or failures, databus or other problems with resource contention, built-in-test, hardware/software interfaces, control loop behavior, hardware devices controlled by software, absence of stack overflow, field-loading mechanisms, and software partitioning. This method of testing is normally performed by running tests against the high-level requirements, with both normal and abnormal (robustness) inputs, on the target computer.

  2. Requirements-based software integration testing [1]: This method of testing focuses on the interrelationships of the software to ensure that the software components (typically, functions, procedures, or modules) interact properly and satisfy the requirements and architecture. This testing focuses on errors in the integration process and component interfaces, such as corrupted data, variable and constant initialization errors, event or operation sequencing errors, etc. This testing is typically accomplished starting with the high-level requirements-based tests for hardware/software integration testing. However, requirements-based tests may need to be added to exercise the software architecture; likewise, lower level testing may also be needed to supplement the software/hardware integration testing.

  3. Requirements-based low-level testing [1]: This method generally focuses on compliance with the low-level requirements. It examines low-level functionality, such as algorithm compliance and accuracy, loop operation, correct logic, combinations of input conditions, proper response to corrupt or missing input data, handling of exceptions, computation sequences, etc.

9.6.2.2 Normal and Robustness Tests

DO-178C also promotes the development of two kinds of test cases: normal and robustness.*

9.6.2.2.1 Normal Test Cases

Normal test cases look for errors in the software with normal/expected conditions and inputs. DO-178C requires normal test cases for high- and low-level requirements for levels A, B, and C, and for high-level requirements for level D. Normal test cases are written against the requirements to exercise valid variables (using equivalence class and boundary value tests, which are discussed in Sections 9.6.3.1 and 9.6.3.2), verify operation of time-related functions under normal conditions, exercise state transitions during normal operation, and verify correctness of normal range variable usage and Boolean operators (when requirements are expressed as logic equations) [1].

9.6.2.2.2 Robustness Test Cases

Robustness test cases are generated to show how the software behaves when exposed to abnormal or unexpected conditions and inputs. DO-178C requires robustness test cases for high- and low-level requirements for levels A, B, and C, and for high-level requirements for level D. Robustness test cases consider invalid variables, incorrect state transitions, out-of-range computed loop counts, protection mechanisms for partitioning and arithmetic overflow, abnormal system initialization, and failure modes of incoming data [1]. DO-178C places strong emphasis on using the requirements as the means to determine the test cases. Requirements specify intended behavior; such behavior needs to be explored. Although the behavior is specified, testers should strive to explore different or additional behaviors when writing test cases, using their understanding of what the system is intended to do and checking that unintended behaviors are not present in the implementation (i.e., applying the break-it mentality). This is an open-ended process, as it is impossible to determine every additional behavior that may exist. The initial set of requirements serves as a guide, and any new test cases added to check behavior beyond that specified may lead to additional requirements, such as robustness requirements (since all tests used for certification credit need to trace to requirements).

Testing serves two purposes: (1) to ensure that the requirements are satisfied and (2) to show absence of errors. The second purpose is the most challenging and is where experience is extremely beneficial. Keep in mind that the requirements proposed initially are the minimal set. Missing or inadequate requirements, particularly robustness requirements, may be identified when developing test cases. Requirements changes should be proposed and tracked through a formal change process to ensure that they are properly considered.

Many projects apply requirements blinders when writing tests (i.e., the tests are written just using the written requirements). This may result in faster testing and fewer test failures, but it is not an effective way to find errors. I have seen numerous cases where the software testers have blessed a software load only to have an experienced and somewhat devious systems engineer come to the lab and break the software within 10 minutes. A good test engineer knows the requirements, how a designer/coder is likely to work, and the typical error sources. Something as simple as holding a button down for an extended period of time or pressing it for a very short time has led to some interesting problem reports.

James A. Whittaker’s book How to Break Software provides some thought-provoking strategies for software testing. Although his approach is a bit unconventional for the safety-critical realm, he does have some good suggestions that can be applied to robustness testing. He emphasizes that there are four fundamental capabilities of software [11]:

  1. Software accepts input from its environment.

  2. Software produces output and transmits it to its environment.

  3. Software stores data internally in one or more data structures.

  4. Software performs computations using input and stored data.

He then focuses on attacking the four software capabilities. He provides suggestions for testing input, testing output, testing data, and testing computations. The input and output testing require detailed knowledge of the software functionality (high-level functionality). The data and computation tests concentrate on the design (low-level functionality) [11].

9.6.3 Survey of Testing Strategies

Having read dozens of books on software testing, I am astounded at the width of the chasm between what is in the books and what happens in real safety-critical software projects. For example, software engineering literature uses the terms black-box testing and white-box testing extensively. Black-box testing is functional and behavioral testing performed without knowledge of the code. White-box testing (sometimes called glass-box testing) uses knowledge of the program’s internal structure and the code to test the software. These terms (especially white-box testing) tend to obscure the necessary connection to requirements. Therefore, these terms do not appear in DO-178C. In fact, DO-178C tends to stay clear of white-box testing strategies. Instead, DO-178C concentrates on testing the high- and low-level requirements to ensure that the code properly implements the requirements.

Despite the chasm between the literature and DO-178C, many of the concepts of black-box testing, and to some degree white-box testing, can be applied to safety-critical software. In my experience, black-box testing approaches are applicable to both high- and low-level requirements-based testing. White-box testing strategies are a little more challenging to apply. However, if the strategies are applied to the low-level requirements and architecture, rather than the code itself, they do provide some useful practices. Of course, this requires the low-level requirements to be written at an appropriate level. The DO-178C glossary defines low-level requirements as: “Software requirements developed from high-level requirements, derived requirements, and design constraints from which Source Code can be directly implemented without further information” [1]. If the low-level requirements are treated as design details just above the code, many of the white-box testing concepts can be applied. But, care must be taken to ensure that the tests are written against the requirements and not the code.

This section briefly examines several testing strategies that may be employed on a project. The information is provided at a high level to begin to bridge the chasm between DO-178C testing concepts and what is presented in software test textbooks.

9.6.3.1 Equivalence Class Partitioning

Pressman writes:

Equivalence partitioning is a black-box testing method that divides the input domain of a program into classes of data from which test cases can be derived. An ideal test case single-handedly uncovers a class of errors (e.g., incorrect processing of all character data) that might otherwise require many test cases to be executed before the general error is observed [12].

Equivalence partitioning requires an evaluation of equivalence classes. The DO-178C glossary defines equivalence class as: “The partition of the input domain of a program such that a test of a representative value of the class is equivalent to a test of other values of the class” [1]. Equivalence class testing considers classes of input conditions to be tested, where each class covers a large set of other possible tests. It “enables the tester to evaluate input or output variables systematically for each parameter in a feature” [9]. It is typically applied to situations where valid (normal) and invalid (robustness) inputs over a range are identified. Rather than testing every value of the range, only representative values are needed (typically on both sides of the boundaries). For large ranges of data, values in the middle and at the extremes of each end are also typically tested. In addition to ranges of values, equivalence classes consider similar groups of variables, unique values that require different handling, or specific values that must or must not be present.

Even though equivalence class partitioning may seem trivial, it can be quite challenging to apply. Its effectiveness relies on “the ability of the tester to decompose the variable data for a given parameter accurately into well-defined subsets in which any element from a specific subset would logically produce the same expected result as any other element from that subset” [9]. If the tester does not understand the system and the domain space, he or she may miss critical defects and/or execute redundant tests [9].

The following recommendations may help when looking for equivalence classes [7,10]:

  • Consider the invalid inputs. These are often where the vulnerabilities in the software and the bugs are found. These also serve as robustness tests.

  • Consider organizing classifications into a table with the following columns: input or output event, valid equivalence classes, and invalid equivalence classes.

  • Identify ranges of numbers. Typically for a range of numbers, the following are tested: values within the range (at both ends and in the middle), a value below the smallest number in the range, a value above the largest number in the range, and a non-number.

  • Identify membership in a group.

  • Consider variables that must be equal.

  • Consider equivalent output events. This can sometimes be challenging because it requires one to determine the inputs that generate the outputs.
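
As a concrete (and hypothetical) illustration, consider a requirement that an altitude setting be accepted only in the range 0 to 50,000 ft. The valid range and the two invalid classes can each be covered with a few representative values rather than exhaustive inputs; the function and test names below are invented for the example.

    #include <assert.h>

    /* Assumed unit under test: returns nonzero only for settings in [0, 50000]. */
    int altitude_setting_is_valid(long altitude_ft)
    {
        return (altitude_ft >= 0L) && (altitude_ft <= 50000L);
    }

    void test_altitude_setting_equivalence_classes(void)
    {
        /* Valid class: representative values in the middle and at both ends. */
        assert(altitude_setting_is_valid(0L));
        assert(altitude_setting_is_valid(25000L));
        assert(altitude_setting_is_valid(50000L));

        /* Invalid classes (robustness): one representative on each side. */
        assert(!altitude_setting_is_valid(-1L));
        assert(!altitude_setting_is_valid(50001L));
    }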

9.6.3.2 Boundary Value Testing

Boundary value analysis (BVA) is a test-case design technique that complements equivalence partitioning. Rather than selecting any element of an equivalence class, BVA leads to the selection of test cases at the “edges” of the class. Rather than focusing solely on input conditions, BVA derives test cases from the output domain as well [12].

BVA is closely related to equivalence class partitioning, with two differences: (1) BVA tests each edge of the equivalence class, and (2) BVA explores the outputs of equivalence classes [8]. Boundary conditions can be subtle and difficult to identify [8].

Boundary values are the biggest, smallest, soonest, shortest, loudest, fastest, ugliest members of the equivalence class (that is, the most extreme values). Incorrect equalities (e.g., > instead of >=) are also considered [10]. Programmers may accidentally create incorrect boundaries for linear variables; therefore, the boundaries are one of the best places to find errors. Boundary testing is especially effective for detecting looping errors, off-by-one errors, and erroneous relational operators [9]. It is important to consider outputs as well as inputs. Midrange values should also be examined.

A classical programming error is to be off by one in a loop count (e.g., the programmer may check for < when he or she should check for <=). Therefore, testing the boundaries of loops can be a good place to find errors. When testing a loop, one first bypasses the loop entirely; then one iterates through the loop once, twice, the maximum number of times, one less than the maximum number of times, and one more than the maximum number of times. This type of testing is often performed as part of low-level testing, since looping is normally a design detail rather than a functional requirements detail.
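
The hypothetical example below exercises a count-limited averaging routine at and around its loop boundaries; the function, buffer size, and defined out-of-range behavior are assumptions made for the illustration.

    #include <assert.h>
    #include <stdint.h>

    #define BUFFER_SIZE 10u

    /* Assumed unit under test: averages the first 'count' samples, where the
     * requirement limits 'count' to [1, BUFFER_SIZE] and defines a return of 0
     * for out-of-range counts. */
    int16_t average_samples(const int16_t samples[BUFFER_SIZE], uint32_t count)
    {
        if ((count == 0u) || (count > BUFFER_SIZE))
        {
            return 0;                          /* defined out-of-range behavior */
        }

        int32_t sum = 0;
        for (uint32_t i = 0u; i < count; i++)  /* a classic place for <= vs. < errors */
        {
            sum += samples[i];
        }
        return (int16_t)(sum / (int32_t)count);
    }

    void test_average_samples_boundaries(void)
    {
        const int16_t samples[BUFFER_SIZE] = { 10, 10, 10, 10, 10, 10, 10, 10, 10, 10 };

        assert(average_samples(samples, 0u) == 0);                 /* below the range        */
        assert(average_samples(samples, 1u) == 10);                /* smallest valid count   */
        assert(average_samples(samples, 2u) == 10);                /* just inside            */
        assert(average_samples(samples, BUFFER_SIZE - 1u) == 10);  /* maximum minus one      */
        assert(average_samples(samples, BUFFER_SIZE) == 10);       /* maximum                */
        assert(average_samples(samples, BUFFER_SIZE + 1u) == 0);   /* maximum plus one       */
    }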

9.6.3.3 State Transition Testing

When state transitions are used, state transition testing considers the valid and invalid transitions between each state. Invalid state transitions that could impact safety or functionality should be tested in addition to the valid transitions.
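
A minimal sketch of this idea, using an invented three-state mode controller, is shown below; both valid transitions and a sample of invalid transitions (which must leave the state unchanged) are exercised.

    #include <assert.h>

    typedef enum { MODE_OFF, MODE_STANDBY, MODE_ACTIVE } sys_mode_t;
    typedef enum { EVT_POWER_ON, EVT_POWER_OFF, EVT_ENGAGE, EVT_DISENGAGE } sys_event_t;

    /* Hypothetical transition function: only the listed transitions are valid;
     * any other event leaves the current mode unchanged. */
    sys_mode_t next_mode(sys_mode_t current, sys_event_t event)
    {
        switch (current)
        {
            case MODE_OFF:     return (event == EVT_POWER_ON)  ? MODE_STANDBY : MODE_OFF;
            case MODE_STANDBY: return (event == EVT_ENGAGE)    ? MODE_ACTIVE  :
                                      (event == EVT_POWER_OFF) ? MODE_OFF     : MODE_STANDBY;
            case MODE_ACTIVE:  return (event == EVT_DISENGAGE) ? MODE_STANDBY : MODE_ACTIVE;
            default:           return current;   /* defensive: unknown state unchanged */
        }
    }

    void test_mode_transitions(void)
    {
        /* Valid transitions from the (assumed) requirements. */
        assert(next_mode(MODE_OFF, EVT_POWER_ON) == MODE_STANDBY);
        assert(next_mode(MODE_STANDBY, EVT_ENGAGE) == MODE_ACTIVE);
        assert(next_mode(MODE_ACTIVE, EVT_DISENGAGE) == MODE_STANDBY);

        /* Invalid transitions (robustness): the mode must not change. */
        assert(next_mode(MODE_OFF, EVT_ENGAGE) == MODE_OFF);
        assert(next_mode(MODE_ACTIVE, EVT_POWER_OFF) == MODE_ACTIVE);
    }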

9.6.3.4 Decision Table Testing

Requirements sometimes include decision tables to represent complex logical relationships. Since the tables function as requirements, they need to be tested. Typically, conditions in the table are interpreted as inputs and actions are interpreted as outputs. Equivalence classes may be present in the table as well. Testers need to show that they have thoroughly exercised the requirements presented in the table. Testing may reveal deficiencies in the table itself (such as, incorrect logic or incomplete data), as well as incorrect implementation in the code.
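
The following hypothetical example shows a small decision table reproduced as a comment, an implementation of its logic, and one test per rule (column); the conditions, actions, and rules are invented for illustration.

    #include <assert.h>
    #include <stdbool.h>

    /* Hypothetical decision table for an overheat warning:
     *
     *   Condition            Rule 1   Rule 2   Rule 3   Rule 4
     *   temp over limit        T        T        F        F
     *   sensor valid           T        F        T        F
     *   Action: warning ON     X        -        -        -
     *   Action: sensor fault   -        X        -        X
     */
    typedef struct { bool warning_on; bool sensor_fault; } outputs_t;

    outputs_t evaluate_overheat(bool over_temp, bool sensor_valid)
    {
        outputs_t out;
        out.warning_on   = over_temp && sensor_valid;
        out.sensor_fault = !sensor_valid;
        return out;
    }

    void test_overheat_decision_table(void)
    {
        /* One test per column of the table, so every rule is exercised. */
        assert(evaluate_overheat(true,  true).warning_on   == true);   /* Rule 1 */
        assert(evaluate_overheat(true,  false).sensor_fault == true);  /* Rule 2 */
        assert(evaluate_overheat(false, true).warning_on   == false);  /* Rule 3 */
        assert(evaluate_overheat(false, false).sensor_fault == true);  /* Rule 4 */
    }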

9.6.3.5 Integration Testing

Integration is the process of putting the software components together. The components may work beautifully when performing the low-level testing on each component; however, the process of integrating the components may produce unexpected results. When components are integrated, several things can go wrong: data may be lost across an interface, one component may adversely affect another component, subfunctions may not properly contribute to the higher level functionality desired, global data may become corrupted, etc. [12]. To avoid such integration hurdles, an integration strategy is typically applied. Some common integration approaches are discussed in the following paragraphs.

Big bang integration: This approach throws the software components together to see if they work (they usually do not). Even though conventional wisdom tells us that big bang is a bad idea, it happens. I have seen multiple companies unsuccessfully attempt it. One of the main problems with big bang is that when a problem exists, it is difficult to track it down, because the integrity of the modules or their interaction is unknown. Some more effective integration strategies are now discussed.

Top-down integration: With this approach components are integrated moving downward through the hierarchy—typically starting with the main control module. Stubs are used for components that are not yet integrated. Stubs are replaced with one component at a time. Tests are conducted as each component is integrated.

Bottom-up integration: With this approach, the lower level components are tested using drivers. Drivers emulate components at the next level up the tree. Then the components are integrated up the hierarchy.

Combination of top-down and bottom-up integration (sometimes referred to as sandwich integration or hybrid integration): Oftentimes one branch at a time is integrated. With this approach not as many stubs and drivers are needed.

The preferred approach depends on the architecture and other programmatic details. Typically, it is best to test the most critical and high-risk modules first, as well as key functionality. Then functionality is added in an organized manner.
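
The sketch below illustrates the stub and driver mechanics in miniature: a stub stands in for a not-yet-integrated sensor component during top-down integration, and a driver exercises a low-level component directly during bottom-up integration. All names, values, and the placeholder computation are hypothetical.

    #include <assert.h>
    #include <stdint.h>

    /* Top-down integration: the higher-level module is integrated first, and a
     * stub stands in for the not-yet-integrated sensor component, returning a
     * canned, known value so the higher-level behavior can be exercised. */
    int16_t read_pressure_sensor(void)        /* stub for the real component */
    {
        return 500;                           /* canned value for early integration */
    }

    int16_t compute_altitude(void)            /* module under integration */
    {
        int16_t pressure = read_pressure_sensor();
        return (int16_t)(1000 - pressure);    /* simplified placeholder computation */
    }

    /* Bottom-up integration: a driver exercises the low-level component before
     * its real callers exist, checking its interface behavior directly. */
    void driver_for_read_pressure_sensor(void)
    {
        int16_t value = read_pressure_sensor();
        assert((value >= 0) && (value <= 1100));  /* interface range assumed from requirements */
    }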

9.6.3.6 Performance Testing

Performance testing evaluates the run-time performance of the software within the integrated system and identifies bottlenecks in the system. Performance tests are used to demonstrate that performance requirements (such as, timing and throughput) are satisfied. Race conditions and other timing-related issues are also examined. Performance testing typically occurs throughout the testing process and may lead to redesign or optimization setting changes, if results are inadequate. As noted in Section 9.5, these tests are often also used to help determine the margins of the software in the target environment.
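
A minimal sketch of one such measurement is timestamping a periodic function and comparing the elapsed cycles against its allocation, as shown below. The cycle-counter primitive, clock rate, and allocation are assumptions; real projects use whatever timing mechanism their target and test environment provide.

    #include <stdint.h>
    #include <stdio.h>

    extern uint32_t read_cycle_counter(void);   /* assumed target-specific primitive */
    extern void     control_law_step(void);     /* function under measurement (assumed) */

    #define CPU_HZ            100000000u        /* assumed 100 MHz core clock */
    #define ALLOCATED_CYCLES  (CPU_HZ / 1000u)  /* assumed 1 ms allocation per frame */

    void measure_control_law_step(void)
    {
        uint32_t start   = read_cycle_counter();
        control_law_step();
        uint32_t elapsed = read_cycle_counter() - start;  /* unsigned math handles wrap */

        printf("control_law_step: %lu cycles (allocation %lu)%s\n",
               (unsigned long)elapsed, (unsigned long)ALLOCATED_CYCLES,
               (elapsed > ALLOCATED_CYCLES) ? "  ** OVER BUDGET **" : "");
    }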

9.6.3.7 Other Strategies

There are several other strategies for testing that may be employed, depending on the system architecture and the requirements organization. For example, loop testing may be applied for simple loops, nested loops, and concatenated loops. Also, if a model is used in the requirements, model-based testing techniques may be applied (model-based development and verification are discussed in Chapter 14). Additionally, for algorithms, special testing may be needed, depending on the nature of the algorithm. For example, when testing the digital implementation of a first order lag filter, enough iterations need to be run to be able to show that the first order response actually curves instead of just being a linear ramp.

9.6.3.8 Complexity Measurements

Experience shows that complex code and interactions of complex portions of code are generally more prone to mistakes [9]. Complex code has a tendency to be buggy and difficult to maintain. Complexity measurements are often applied to code during the development phase to alert coders to risky code constructs. When possible, the software is redesigned or refactored to make it less complex. Coding standards often include a guideline to measure and minimize the complexity (e.g., McCabe’s cyclomatic complexity measure is often applied). However, if the code complexity is not resolved, it can be a good indicator to testers of where the code vulnerabilities exist. Complex software should be pounded hard (that is, subjected to more normal and robustness tests), because complex code will typically have more errors. The errors often creep in when the code is modified, because the code is difficult to understand and maintain.
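
For reference, cyclomatic complexity is essentially the number of decision points plus one. The invented function below has three decision points and therefore a complexity of 4; coding standards commonly flag functions above a chosen threshold (such as 10 or 15) for redesign or extra test attention.

    #include <stdbool.h>

    #define MIN_VALID  0
    #define MAX_VALID  100

    /* Three decision points (the three 'if' statements) + 1 = cyclomatic
     * complexity of 4. */
    int classify_input(int value, bool sensor_valid)
    {
        if (!sensor_valid)          /* decision 1 */
        {
            return -1;              /* invalid sensor */
        }
        if (value < MIN_VALID)      /* decision 2 */
        {
            return 0;               /* below range */
        }
        if (value > MAX_VALID)      /* decision 3 */
        {
            return 2;               /* above range */
        }
        return 1;                   /* in range */
    }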

One project that I lived through had a particularly complex feature. It was called the fragile code. Any time it was modified, the entire subsystem had to be completely retested, because the impact of the change was unpredictable. It ended up being an unacceptable situation for a safety-critical system and had to be redesigned late in the project. Highly complex or fragile code should be redesigned as early as possible. It is less agonizing to fix it up front than to try to track down phantom symptoms at the 11th hour.

9.6.3.9 Summary and Characteristics of a Good Test

Section 9.6.3 has surveyed several testing approaches. Most projects use a combination of many or all of the aforementioned approaches. There are others we could cover. Kaner et al. explain that regardless of the test approach, there are some common characteristics of a good test. Good test cases satisfy the following criteria [10]:

  • “It has a reasonable probability of catching an error.” Tests are designed to find errors—not merely to prove functionality. When writing a test one must consider how the program might fail.

  • “It is not redundant.” Redundant tests offer little value. If two tests are looking for the same error, why run both?

  • “It is the best of its breed.” Tests that are most likely to find errors are most effective.

  • “It is neither too simple nor too complex.” Overly complex tests are difficult to maintain, and it is hard to isolate the error when they fail. Overly simple tests are often ineffective and add little value.

9.6.4 Test Planning

Chapter 5 discussed the Software Verification Plan (SVP). The SVP provides the plan for how verification on the overall project will be performed. It also explains the test strategy and documentation. However, in addition to the SVP, most companies also develop a detailed test plan, which is often included in a Software Verification Cases and Procedures (SVCP) document. The SVCP document typically focuses on more than just testing; it also includes the procedures for reviews and analyses; however, this section concentrates on the test aspects of the SVCP.

The test planning portion of the SVCP (sometimes referred to as the test plan) goes into more detail than the SVP on the test environment (identifies equipment needed, test-related tools, test-station assembly), test case and procedure organization and layout, categories of tests, test naming conventions, test groupings, test tracing strategy, test review process and checklists, responsibilities, general test set-up procedures, test build procedures, test readiness review (TRR) criteria, and test station configuration audit or conformity plan. The SVCP is normally a living document that gets updated as the testing program matures.* As the program matures, a listing of all requirements is normally added, along with a notation of the test technique that will be applied. As tests are written, the SVCP is updated to include a summary of all the test cases and procedures that will be executed during the test execution, as well as traceability matrices showing traceability between the test cases and requirements and between the test cases and test procedures. Test time estimations and test execution plan (including order of execution) are also added prior to test execution. The SVCP may also include the regression test approach.

The test planning in the SVCP is beneficial because it helps ensure the following:

  • Everyone is on the same page and knows the expectations, including schedule, responsibilities, and assigned tasks.

  • The test development and execution effort is organized.

  • All necessary tests are developed and executed.

  • There’s no unnecessary redundancy.

  • Schedules are accurately developed (the plan allows the test manager to develop an accurate schedule rather than aim at a fictitious date mandated by project management).

  • Priorities are identified.

  • All tasks are staffed (the plan may help identify staff shortages).

  • Necessary equipment and procedures are identified.

  • Expected test layout is communicated.

  • Test cases and procedures are developed, reviewed, and dry run.

  • The TRR criteria are established ahead of time, so everyone knows what to expect.

  • The order of test execution is planned to efficiently execute the program.

  • The versions of the test cases and test procedures to be used for score (formal execution for certification credit) are identified.

  • Test set-up procedures are clear.

  • All requirements are tested.

  • Test groupings are logical.

  • The necessary test equipment is acquired and properly configured.

  • All test tools are properly controlled.

  • Test files are built in such a way that they represent the actual software that will be fielded.

  • Test stations are properly configured to represent the target environment.

  • The approach for auditing test station configuration is identified.

  • Risks are identified and mitigated.

The SVCP should be written to serve the project, not just to be a completed data item. It should be organized so that it is easy to update throughout the project. It provides evidence of an organized and complete test program. Otherwise, it is difficult to confirm that everything was properly executed.

Also note that Chapter 12 (Section 12.5) explains the expected maturity of software before flight testing for certification credit. This maturity criterion should be considered during test planning, since it could impact the test schedule and approach.

9.6.5 Test Development

The DO-178C expectations for testing and the typical test strategies employed have been examined. This section explores the development of test cases and procedures from a DO-178C perspective in more detail.

9.6.5.1 Test Cases

DO-178C defines a test case as: “A set of test inputs, execution conditions, and expected results developed for a particular objective, such as to exercise a particular program path or to verify compliance with a specific requirement” [1]. Typically, a template is provided for the test team to ensure that the test cases are laid out in a common manner. If the tests are automated, the format is particularly important. Additionally, a tool is sometimes used to extract information from the test case headers to generate a test summary or traceability report. Whether the tests are manual or automated, each test case usually includes the following:

  • Test case identification

  • Test case revision history

  • Test author

  • Identification of software under test

  • Test description

  • Requirement(s) tested

  • Test category (high-level, low-level, or both high- and low-level)

  • Test type (normal or robustness)

  • Test inputs

  • Test steps or scenarios (the actual test)

  • Test outputs

  • Expected results

  • Pass/fail criteria
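
As a hypothetical illustration (the identifiers, requirement numbers, and values are invented), a test case header capturing these fields might look like the following when embedded as a comment block in an automated test:

    /*------------------------------------------------------------------------
     * Test case ID       : TC_HLR_FMS_042                  (hypothetical)
     * Revision history   : 1.0, initial release
     * Author             : J. Tester
     * Software under test: FMS application, build 3.2.1
     * Description        : Verify altitude-alert annunciation at the selected
     *                      altitude threshold.
     * Requirement(s)     : SRS-FMS-1234, SRS-FMS-1235
     * Test category      : High-level
     * Test type          : Normal
     * Inputs             : Selected altitude = 10,000 ft; aircraft altitude
     *                      swept from 8,000 ft to 10,200 ft at 500 ft/min.
     * Steps              : 1. Load the test script on the test station.
     *                      2. Start the altitude sweep and record the alert
     *                         discrete output.
     * Expected results   : Alert asserted at 10,000 ft +/- 50 ft; no alert
     *                      below 9,900 ft.
     * Pass/fail criteria : Recorded alert altitude within the stated tolerance.
     *----------------------------------------------------------------------*/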

9.6.5.2 Test Procedures

In addition to test cases, DO-178C mentions the need for test procedures. A test procedure is “detailed instructions for the set-up and execution of a given set of test cases, and instructions for the evaluation of results of executing the test cases” [1]. Test procedures come in different shapes and sizes. Some are actually embedded in the test cases themselves; others are separate high-level documents that explain how to execute the test cases. The test procedures are the steps used to execute the test cases and obtain the test results. Sometimes they are manual and sometimes they are automated.

Each test step and how to verify pass/fail should be clear and repeatable. The person who executes the test is often not the one who wrote the test. In fact some companies and certification authorities insist on independent test execution. Vague test procedures are a common problem; the steps may be clear to the test author, but someone who is not familiar with the test or functionality may not be able to execute the test successfully. To ensure clarity and repeatability, it is a good practice to have someone who did not write the tests perform a dry run of the tests as early as possible. This may actually be part of the test review process, which is discussed later.

9.6.5.3 DO-178C Requirements

Table 9.1 summarizes the DO-178C minimum requirements for high- and low-level requirements testing, as well as normal and robustness testing.

Table 9.1 Summary of DO-178C Test Requirements

           High-Level   High-Level   Low-Level                    Low-Level
           Normal       Robust       Normal                       Robust
Level A    Required     Required     Required with independence   Required with independence
Level B    Required     Required     Required with independence   Required with independence
Level C    Required     Required     Required                     Required
Level D    Required     Required     Not required                 Not required

Source: RTCA DO-178C, Software Considerations in Airborne Systems and Equipment Certification, RTCA, Inc., Washington, DC, December 2011.

9.6.5.4 Low-Level Requirements Testing versus Unit Testing

Before exploring test execution, I want to briefly discuss unit testing. Some companies use unit testing for their low-level requirements testing, but this needs to be handled with care. Traditional unit testing is performed by writing tests against the code. This certainly is not prohibited and is a good practice for the code developers to perform to ensure their code does what they want (that is, debugging). However, DO-178C emphasizes that testing for certification credit is to be against the requirements. The concepts of unit testing can still apply when each module (or group of modules) is tested against the low-level requirements, but care must be taken to write tests against the requirements and not the code.* Additionally, testing module by module may create some issues for integration. If low-level testing is performed at the module or function level, a concrete plan for how the modules are integrated and tested together is needed. As discussed earlier, big bang integration is not advised and is normally unacceptable. Also, at the highest level of integration it may be difficult to confirm the correctness of interfaces between software modules or functions. When module-level testing is used, there is generally some additional software/software integration testing required in addition to the high-level software/hardware integration testing. When data coupling and control coupling analyses are discussed later (in Section 9.7), it should become more apparent why the software/software integration is needed.

9.6.5.5 Handling Requirements That Cannot Be Tested

During the test development phase, it may be discovered that some requirements are not testable. In some cases, the requirement may need to be rewritten (e.g., negative requirements, requirements without tolerances, or ambiguous requirements). In other cases, an analysis or code inspection might be used in lieu of a test. Such situations should be handled carefully, because in general, analysis and code inspection are not considered as effective as testing.* If an analysis or code inspection is used instead of a test, it must be justified why the analysis or code inspection will find the same kinds of errors as a test and why it is equivalent to (or better than) a test. I recommend including the justification in the analysis or code inspection itself, so it is recorded. The analysis should also be well documented and repeatable (as noted in Section 9.5). An analysis or code inspection that is used in lieu of a test needs the same level of independence as a test would.

9.6.5.6 Obtaining Credit for Multiple Levels of Testing

In some situations, a project may be able to write some tests so that they exercise both high- and low-level requirements. Generally, it is more likely that a high-level test can satisfy low-level requirements than that a low-level test can satisfy high-level requirements, but it depends on the integration approach and the granularity of the requirements. If a single test or group of tests is used to claim credit for multiple levels of requirements, there needs to be documented evidence that both levels of requirements are satisfied, including trace data and review records. In some cases, an analysis may be needed to demonstrate how the test(s) fully exercise both levels of requirements, particularly if the tests were not originally written to cover multiple levels of requirements. Such an analysis should be included as part of the test data.

9.6.5.7 Testing Additional Levels of Requirements

In some projects, in addition to high- and low-level requirements, there may be other intermediate levels of requirements. These requirements will also need to be tested, with the possible exception of level D software. As noted in the previous section, it might be possible to use one set of requirements-based tests to exercise multiple levels of requirements.

9.6.6 Test Execution

Test cases and procedures are written to be executed. This section discusses some of the things to consider when preparing for and executing tests.

9.6.6.1 Performing Dry Runs

Most programs do a dry run of the tests, in order to work out any issues and resolve problems prior to the for-score or formal runs that are used for certification credit. A dry run of tests is highly recommended. Trying to go straight from a paper review to formal execution typically results in some unexpected and undesired surprises. For higher software levels (levels A and B), it is a good practice to have an independent person who did not write the procedures execute the dry run. This helps work out issues with the procedures, as well as uncover unexpected failures.

9.6.6.2 Reviewing Test Cases and Procedures

Prior to test execution, the test cases and procedures should be reviewed (for levels A, B, and C) and put under change control. Oftentimes, the tests are informally run prior to the review, so that the results can be referenced during the review. The review process is further discussed in Section 9.7.

9.6.6.3 Using Target Computer versus Emulator or Simulator

Tests are typically executed on the target computer—particularly the hardware/software integration tests. Sometimes a target computer emulator or a host-based computer simulator is used to execute the tests. If this is the case, the differences between the emulator/simulator and the target computer need to be assessed to ensure that the ability to detect errors is the same as on the target computer. Showing the equivalence may be achieved by a difference analysis* or by qualifying the emulator or simulator (using the tool qualification approach described in DO-178C and DO-330, which will be discussed more in Chapter 13). It should be noted that some tests will likely need to be rerun on the target computer, even if an emulator or simulator is used, since there are some types of errors that may only be detected in the target computer environment.

9.6.6.4 Documenting the Verification Environment

The Software Life Cycle Environment Configuration Index (SLECI) (or equivalent) should be up-to-date prior to executing the tests for certification credit. The SLECI documents the details of the verification environment and is normally used to configure the test stations before tests are run for certification credit. Chapter 10 provides information on the SLECI contents.

9.6.6.5 Test Readiness Review

DO-178C does not discuss the TRR; however, most projects perform a TRR prior to executing the tests to ensure the following:

  • The requirements, design, source code, and executable object code have been baselined.

  • The tests are aligned with the baselined version of the requirements, design, and code.

  • The test cases and procedures have been reviewed and baselined.

  • Any issues from dry runs have been addressed.

  • The test builds have been completed and the part-number of the software under test is documented.

  • The approved test environment is used.

  • The test stations have been configured per the approved SLECI (or equivalent).

  • The test trace data is up-to-date (traceability is discussed later in this chapter).

  • The test schedule has been provided to those who are required to witness or support the test (for example, software quality assurance [SQA], customer, and/or certification liaison personnel).

  • Any known problem reports against the software under test have been resolved or agreed upon for deferral (if there are tests that are expected to fail, it is important to know the expected test failures going into the formal test run).

It should be noted that for some larger projects, the tests are executed in groups. In this situation, there may be a TRR for each group of tests. When this occurs, care must be taken to ensure that all requirements are fully evaluated and that any dependencies between the test groups are considered.

9.6.6.6 Running Tests for Certification Credit

Test execution is the process of executing test cases and procedures, observing the responses, and recording the results. Test execution is often run by someone who did not write the tests and is frequently witnessed by SQA and/or the customer.

During the formal test execution, the test procedures are followed and checked off as they are completed (they should already have been reviewed and baselined).* A test log is typically kept to record who performed the test, when it was performed, who witnessed the test, the test and software version information, and what the test results are. Oftentimes pass/fail will also be determined during the test execution. However, if the data needs to be analyzed, the pass/fail determination might occur later.

As the tests are run for certification credit (formal test execution), there may be some minor redlines* to the procedures. If a good dry run is performed, redlines should be minimal. Too many redlines tend to be an indicator of immature test procedures. When redlines are minimal, they are typically recorded on a hard copy or on a test log and are approved by engineering and SQA prior to release of the results. The redlined data are also included in a problem report, so that the tests are updated in the future.

When executed for certification credit, the tests may uncover some unexpected failures. When this happens, the failures are analyzed and appropriate action taken. Oftentimes, the tests and/or requirements are updated and tests are rerun. Test failures and resulting updates should be handled via the problem reporting process. For tests that are rerun, a regression analysis may be needed. See Section 9.6.9 for additional discussion on the regression test process.

9.6.7 Test Reporting

After the tests are run, the results are analyzed and summarized. Any test failures should result in a problem report and must be analyzed. The review of test failures is discussed in Section 9.7.

Typically, a Software Verification Report (SVR) is generated to summarize the results of verification activities. The SVR summarizes all verification activities, including reviews, analyses, and tests, as well as the results of the verification activities. The SVR often includes the final traceability data as well.

For any verification activities that were not successful, the SVR typically summarizes the analysis from the problem report and justifies why the failures are acceptable.

The verification report is normally reviewed by the certification authority and/or their designees, as well as the customer. Therefore, the report should be written for an audience that may not be familiar with the development and verification details.

9.6.8 Test Traceability

DO-178C section 6.5 emphasizes that trace data should exist to demonstrate bidirectional traces between all of the following [1]:

  • Software requirements and test cases

  • Test cases and test procedures

  • Test procedures and test results

These trace data are often included in the SVR, but might be included in the SVCP or even in a stand-alone document. The trace data accuracy should be verified during the review of test cases, test procedures, and test results. In some cases, the tracing is implicit because the documents are combined; for example, the test cases and procedures may be in the same document. In other cases, a naming convention may be used; for example, each test procedure may generate a test result with the same name (such as TEST1.CMD and TEST1.LOG). If this is the case, the strategy should be explained in the SVP and summarized in the SVR.
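
As a minimal illustration of the naming-convention strategy, the following C sketch scans a results directory and confirms that every TESTn.CMD procedure has a corresponding TESTn.LOG result. The directory name and file extensions are hypothetical, and most projects would perform this kind of check with their trace or test-management tooling; this is only a sketch of the idea using the POSIX directory API.

    #include <dirent.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>

    int main(void)
    {
        const char *dir_name = "./results";   /* hypothetical results directory */
        DIR *dir = opendir(dir_name);
        if (dir == NULL) { perror("opendir"); return 1; }

        struct dirent *entry;
        int missing = 0;
        while ((entry = readdir(dir)) != NULL) {
            const char *dot = strrchr(entry->d_name, '.');
            if (dot == NULL || strcmp(dot, ".CMD") != 0) { continue; }

            /* Build the expected result name: TESTn.CMD -> TESTn.LOG */
            char log_name[512];
            snprintf(log_name, sizeof(log_name), "%s/%.*s.LOG",
                     dir_name, (int)(dot - entry->d_name), entry->d_name);

            struct stat sb;
            if (stat(log_name, &sb) != 0) {
                printf("Missing result for procedure %s\n", entry->d_name);
                missing++;
            }
        }
        closedir(dir);
        printf("%d procedure(s) without results\n", missing);
        return missing == 0 ? 0 : 1;
    }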

9.6.9 Regression Testing

Regression testing is a re-execution of some or all of the tests developed for a specific test activity [8]. After the tests are executed for certification credit, there may be some additional tests that need to be run. These may be tests that failed during the initial run and were modified, or there may be new or modified tests resulting from new or modified functionality.

Pressman explains that a regression test run typically includes the following types of tests [12]:

  • A representative sample of tests that will exercise all software functions (to support side-effect/stability regression).

  • Tests that focus on the change(s).

  • Tests that focus on the potential impacts from the software changes (that is, tests of impacted data items).

Before such tests are run, a change impact analysis is needed. As a minimum, the change impact analysis considers traceability data and data and control flows to identify all changed and impacted data items (such as requirements, design, code, test cases, and test procedures). The change impact analysis will also identify what reverification is required. Reverification will typically involve reviews of the updated and impacted data; rework of any impacted analyses; rerun of new, changed, and impacted tests; and execution of regression tests (explained later). Chapter 10 provides some additional insight on what is typically considered during a change impact analysis.

Depending on the extent of the change, it is sometimes advisable to rerun all of the tests rather than try to justify a partial regression approach. This is particularly the case for automated tests. A thorough change impact analysis can take considerable time. It is sometimes easier to rerun the tests than to justify why they do not need to be rerun.

If a regression approach is taken and a small number of tests are run to address the change, it is advisable to run a subset of the overall test suite to confirm that there were no unintended changes. Some projects run a standard subset of tests every time a change is made, in addition to the tests of the changed or impacted software. The subset typically includes safety-related tests and tests that prove critical functionality. This is sometimes referred to as side-effect regression or stability regression, because it proves that the change did not introduce a side effect or impact the stability of the code base. These regression tests help to ensure that the changes do not introduce unintended behavior or additional errors.

9.6.10 Testability

Testability should be considered during the development. Chapter 7 explains that testable software is operable, observable, controllable, decomposable, simple, stable, and understandable. Chapter 7 also identifies features that help make software testable, including error or fault logging, diagnostics, test points, and accessible interfaces. Having testers involved during the requirements and design reviews can help make the software more testable.

9.6.11 Automation in the Verification Processes

Automation is commonly used during the verification activities. Tools do not replace the need to think, and users need to understand how the tools work in order to use them effectively; however, there are some tasks that tools can do more effectively and accurately than humans. Some examples of automation used during the verification processes are as follows:

  • Test templates are commonly used for the test schedule and tasks, test plan, test cases, test procedures, and the test report. The amount of automation for these varies greatly.

  • Test script tools provide a language to write automated tests, as well as a template to promote consistent test layout.

  • Traceability tools are used to help capture and verify the trace data. The data in the tool still need to be verified, but trace tools can help to identify missing test cases or procedures, missing results, untested requirements, etc.

  • Test execution tools execute test scripts and may be used to determine test pass or failure results. If the test output is not reviewed, the tool may need to be qualified, since it automates DO-178C objectives. Tool qualification criteria are discussed in Chapter 13.

  • Debugger tools are sometimes used during the test process to set breakpoints in order to examine or manipulate the software. These should be used cautiously during formal testing to ensure they do not incorrectly change the software. Also, an overabundance of breakpoints may lead to difficulties when proving the completeness of integration. Depending on how the debug tool is used, its functionality may need to be qualified.

  • Memory tools are used to detect memory problems, memory overwriting, memory that has been allocated but not freed, and uninitialized memory use [8].

  • Emulators or simulators are sometimes used during the test process to access internal data that may be difficult to access on the actual target.

  • Coverage tools are used to help with the assessment of structural coverage analysis. These tools frequently instrument the code in order to gather metrics on the code execution. Therefore, it may be necessary to run the tests with and without the instrumentation, compare the results, and ensure that the instrumentation does not impact the test results.

  • Static analysis tools are used to evaluate code complexity, compliance to rules, worst-case path, worst-case execution timing, worst-case stack usage, etc. Static analysis tools may also evaluate data and control flow to identify undefined variables, inconsistent interfaces among modules, and unused variables [7].

  • Test vector generators may be used when the requirements are documented in a formal manner or with models. Such tools generate test vectors to verify the accuracy of the implementation. Depending on how these tools are used, they may require qualification. It is also recommended that human-generated tests be used to supplement the tool-generated test vectors to thoroughly test the functionality. It is important that test vector generators use the requirements as inputs and not the code. There are some tools that generate requirements from the source code and then generate test vectors for those requirements; these should be avoided.

9.7 Verification of Verification

DO-178C Table A-7 is entitled “Verification of Verification Process Results.” It is probably the most discussed and debated table in DO-178C, primarily because it includes several objectives that are unique to the aviation industry and are not well documented in other software engineering literature (e.g., modified condition/decision coverage). The DO-178C Table A-7 objectives essentially require a project to evaluate the adequacy and completeness of their testing program. That is, it calls for the verification of the testing effort. The objectives of DO-178C Table A-7 verify the test procedures and results, the coverage of the requirements, and the coverage of the code structure. Each of the major subjects of verification of verification is discussed in this section. Figure 9.1 provides an overview of the typical process.


Figure 9.1 Overview of DO-178C software testing. (Adapted from RTCA DO-178C, Software Considerations in Airborne Systems and Equipment Certification, RTCA, Inc., Washington, DC, December 2011, DO-178C Figure 6-1. Used with permission from RTCA, Inc.)

9.7.1 Review of Test Procedures

DO-178C Table A-7 objective 1 is summarized as: “Test procedures are correct” [1]. The objective is to ensure that the test cases are correctly developed into test procedures. Typically, this is satisfied by a review of both the test cases and test procedures. Some argue that the objective only covers the test procedure review, since the test cases are addressed as part of objectives 3 and 4 (of DO-178C Table A-7). However, in my experience, this objective has been satisfied concurrently with objectives 3 and 4 to ensure that all of the requirements are tested, that the requirements are fully tested, and that the test cases and procedures are accurate, consistent, and compliant with the SVP.

DO-178C Table A-7 objective 1 applies to both high- and low-level testing and is required for levels A, B, and C. Level A requires independence. Typically, a peer review, similar to what is used for reviewing the requirements, design, and code, is used to satisfy the objective. (See Chapter 6 for information on the peer review process.) The typical inputs to the review are the requirements and design, SVP, SVCP, preliminary test results (not the same as dry run results), test cases, test procedures, and trace data. If analyses or code inspections are used in lieu of any tests, they are also provided. Oftentimes, someone on the peer review team runs the test to verify that it produces the same results that are presented in the review, is repeatable, and is accurate. For manual tests, this is particularly effective at working out ambiguities in the test procedures and the pass/fail criteria. Clearly, the peer review should take place before the tests are executed for certification credit.

The most common issues identified during these reviews are noted here:

  • Some requirements are not tested.

  • Robustness testing is not adequate.

  • The tests do not completely exercise the requirements.

  • Some tests are not well explained or commented (which makes maintenance difficult).

  • Some steps in the procedures are missing or are not clear.

  • The pass/fail criteria are not clear for some tests.

  • Tests have not been run to ensure that they work.

  • Traceability from requirements to test is not documented (only the test to requirements).

  • The test environment is not well documented.

9.7.2 Review of Test Results

After the tests are executed, the results are reviewed. DO-178C Table A-7 objective 2 is summarized as: “Test results are correct and discrepancies explained” [1]. The objective is to ensure the correctness of the test results and to confirm that any failed tests are analyzed and properly addressed. This objective is typically satisfied by a review of the test results and the software verification/test report. The review includes an evaluation of explanations for any failed tests (typically in problem reports and the software verification/test report). In some situations, test failures will require a fix to the software or the test cases/procedures. In general, if a dry run has been performed, the number of unexpected test failures is minimized. In an effort to shorten the schedule, some teams try to bypass the dry runs and go straight to formal testing. However, projects that attempt to go to formal testing without a thorough dry run normally end up doing the formal testing at least two times. In any case, all failures need to be documented in problem reports and addressed appropriately. Analysis of failures is also normally included in the SVR.

9.7.3 Requirements Coverage Analysis

Objectives 3 and 4 of DO-178C Table A-7 are referred to as the requirements coverage objectives. Objective 3 is summarized as: “Test coverage of high-level requirements is achieved” [1]. Similarly, objective 4 is summarized as: “Test coverage of low-level requirements is achieved” [1]. The purpose of these objectives is to confirm that all requirements have been tested (or verified by another means if properly justified). Typically, these objectives are evaluated during the test cases and procedures review. However, just before or after the final test run, there is often a final coverage analysis to ensure that all requirements are covered and nothing has been missed during the final stages of the test program (for example, a missing test case due to a late-breaking requirement change). The trace data between test cases and requirements are typically used to confirm this coverage. However, there must also be a technical evaluation to ensure that the test cases fully cover the requirements (this is normally evaluated during the test case review). Basically, if the test cases have been adequately reviewed for complete coverage of the requirements, this coverage analysis is rather trivial. If an automated tool is used to run the tests for certification credit, then it is important to check that the tool actually ran all of the tests and none were skipped. This can be checked manually or automated as part of the tool that runs the tests. If credit is taken for the tool performing the check, then the tool needs to be qualified. Since low-level requirements do not have to be tested for level D, the requirements coverage of low-level requirements does not apply for level D. In my experience, the coverage analysis is either included in or summarized in the SVR.
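
The mechanical part of the final coverage check is straightforward once the trace data exist. The following C sketch, with hypothetical requirement identifiers, compares the list of requirements against the requirement identifiers referenced by the test cases and reports any requirement with no test; in practice the identifiers would come from the trace tool or trace files rather than being hard-coded, and the technical adequacy of each test still has to be judged during the review.

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* Hypothetical requirement IDs and trace data. */
        const char *requirements[]    = { "SRS-100", "SRS-101", "SRS-102" };
        const char *traced_to_tests[] = { "SRS-100", "SRS-102" };
        const size_t n_req = sizeof(requirements) / sizeof(requirements[0]);
        const size_t n_trc = sizeof(traced_to_tests) / sizeof(traced_to_tests[0]);

        size_t uncovered = 0;
        for (size_t i = 0; i < n_req; i++) {
            int found = 0;
            for (size_t j = 0; j < n_trc; j++) {
                if (strcmp(requirements[i], traced_to_tests[j]) == 0) { found = 1; break; }
            }
            if (!found) {
                printf("Requirement %s has no test case\n", requirements[i]);
                uncovered++;
            }
        }
        printf("%zu uncovered requirement(s)\n", uncovered);
        return uncovered == 0 ? 0 : 1;
    }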

9.7.4 Structural Coverage Analysis

The remaining five objectives of DO-178C Table A-7 address structural coverage. Objectives 5–7 and 9 pertain to the test coverage of the software structure, to ensure that the code has been adequately exercised during the requirements-based testing. Objective 8 is similar, except it focuses on the coverage of the data and control coupling between code components (for example, modules).

Structural coverage has the following purposes:

  • Ensures that all code has been executed at least once.

  • Finds unintended functionality and untested functionality.

  • Identifies dead code* or extraneous code.

  • Helps confirm that deactivated code is truly deactivated.

  • Identifies a minimal set of combinations for testing (i.e., it does not require exhaustive testing).

  • Helps identify incorrect logic.

  • Serves as an objective completion criterion for the testing effort (although it may not address completeness of the robustness testing).

It must be noted that structural coverage analysis in DO-178C is not equivalent to structural testing that is mentioned in much of the software engineering literature. Structural coverage analysis is performed to identify any code structure that was not exercised during the requirements-based testing. Structural testing is the process of writing tests from the code to exercise the software. Structural testing is not adequate for DO-178C because it is not requirements based and, hence, does not satisfy the purpose of structural coverage analysis.

A brief warning is in order before examining the structural coverage criteria in more detail. Some teams proactively run their structural coverage tools throughout the test cases and procedures development. This is a good practice; however, some teams use the coverage data to identify the tests needed—that is, they write tests to satisfy the tool rather than allowing the requirements to drive the testing. This is not the intent of structural coverage and can result in inadequate testing. Some software results in over 50% coverage just by turning on the system—without running a single test. Therefore, it must be emphasized that tests are to be written and reviewed against the requirements before structural coverage data is collected. To avoid the temptation to just make the tool happy, a different engineer is generally responsible for the coverage analysis (this also helps satisfy the independence required for level A and B structural coverage objectives).

Each DO-178C Table A-7 structural coverage objective is briefly discussed in the following subsections.

9.7.4.1 Statement Coverage (DO-178C Table A-7 Objective 7)

Statement coverage is required for levels A, B, and C; it ensures that “every statement in the program has been invoked at least once” [1]. This is achieved by evaluating the coverage of the code during requirements-based testing and ensuring that every executable statement has been invoked at least once. Statement coverage is considered a relatively weak criterion because it does not evaluate some control structures and it does not detect certain types of logic errors. An if-then construct only requires the decision to be evaluated to true to cover the statements. An if-then-else requires the decision to be evaluated both true and false. Basically, statement coverage ensures that the code has been covered, but it might not be covered for the right reason.
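
For example, consider the following sketch (the function and its names are hypothetical). A single requirements-based test with primary_valid set to true executes every statement, so statement coverage is achieved, yet the false outcome of the decision is never exercised; decision coverage additionally requires a test with primary_valid set to false.

    #include <stdbool.h>
    #include <stdio.h>

    /* Illustrative only: hypothetical source-selection logic. */
    static int select_source(bool primary_valid)
    {
        int source = 2;          /* default to the secondary source */
        if (primary_valid) {
            source = 1;          /* switch to the primary source */
        }
        return source;
    }

    int main(void)
    {
        /* This single test achieves statement coverage (every statement runs)
         * but never exercises the false outcome of the decision. */
        printf("primary valid: source=%d\n", select_source(true));

        /* Decision coverage additionally requires the false outcome. */
        printf("primary invalid: source=%d\n", select_source(false));
        return 0;
    }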

9.7.4.2 Decision Coverage (DO-178C Table A-7 Objective 6)

Decision coverage is required for levels A and B and is commonly equated with branch or path coverage. However, decision coverage is slightly different. DO-178C defines decision and decision coverage as follows [1]:

  • “Decision – A Boolean expression composed of conditions and zero or more Boolean operators. If a condition appears more than once in a decision, each occurrence is a distinct condition.”

  • “Decision coverage – Every point of entry and exit in the program has been invoked at least once and every decision in the program has taken on all possible outcomes at least once.”

The confusion centers around the traditional industry understanding of branch point (which is typically an if-then-else, do-while, or case statement) versus the DO-178C literal definition of decision. Basically, using the literal definition of DO-178C, a decision is not synonymous with a branch point. The main difference is that a Boolean value that is output on an input/output (I/O) port may have a binary effect on the behavior of the system (e.g., wheels up when TRUE and wheels down otherwise). It is important to make sure that this value is evaluated both TRUE and FALSE during test. This could be treated as a boundary value problem or a decision coverage problem.

In general, decision coverage ensures that all paths in the code are taken and that, for any Boolean assignment, both the true and false outcomes are exercised, whether the value is output directly to an external destination or used in a construct that results in a branch.
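
As a sketch of that situation (with hypothetical names), consider a Boolean assignment whose value drives an external discrete output and never appears in an if/else. Under the DO-178C definition, tests should still drive the expression to both true and false, even though a branch-only coverage tool might report nothing to cover here.

    #include <stdbool.h>
    #include <stdio.h>

    /* Illustrative only: a decision with no branch in the code. */
    static bool gear_up_command(bool handle_up, bool weight_on_wheels)
    {
        bool cmd_up = handle_up && !weight_on_wheels;   /* the decision */
        return cmd_up;   /* value drives a discrete output (hypothetical) */
    }

    int main(void)
    {
        /* Exercise both outcomes of the decision, even without an if/else. */
        printf("airborne, handle up : %d (expect 1)\n", gear_up_command(true, false));
        printf("on ground, handle up: %d (expect 0)\n", gear_up_command(true, true));
        return 0;
    }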

9.7.4.3 Modified Condition/Decision Coverage (DO-178C Table A-7 Objective 5)

Modified condition/decision coverage (MC/DC) is required for level A software; it has been the subject of much debate for most of my career and will probably continue to generate heated discussions after my retirement.

DO-178C defines condition, decision, and modified condition/decision coverage as follows [1]:

  • “Condition – A Boolean expression containing no Boolean operators except for the unary operator (NOT).”

  • “Decision – A Boolean expression composed of conditions and zero or more Boolean operators. If a condition appears more than once in a decision, each occurrence is a distinct condition.”

  • “Modified condition/decision coverage – Every point of entry and exit in the program has been invoked at least once, every condition in a decision in the program has taken all possible outcomes at least once, every decision in the program has taken all possible outcomes at least once, and each condition in a decision has been shown to independently affect that decision’s outcome. A condition is shown to independently affect a decision’s outcome by: (1) varying just that condition while holding fixed all other possible conditions, or (2) varying just that condition while holding fixed all other possible conditions that could affect the outcome.”

Even though an MC/DC criterion has been applied to aviation projects since the late-1980s/early-1990s, it is still rarely discussed in the mainstream software engineering literature. It was developed for the aviation industry to allow comprehensive criteria to evaluate test completion, without requiring that each possible combination of inputs to a decision be executed at least once (i.e., exhaustive testing of the input combinations to a decision, known as multiple condition coverage). While multiple condition coverage might provide the most extensive structural coverage measure, it is not feasible in many cases, because a decision with n inputs requires 2^n tests [13]. MC/DC, on the other hand, generally requires a minimum of n + 1 test cases for a decision with n inputs.
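
As a worked sketch (hypothetical function), the decision below has three conditions, so exhaustive multiple condition coverage would need 2^3 = 8 test cases, while the four tests shown achieve MC/DC: each condition is toggled between a pair of tests while the other conditions are held fixed, and the toggle changes the decision outcome.

    #include <stdbool.h>
    #include <stdio.h>

    /* Illustrative only: a decision with n = 3 conditions. */
    static bool alert_enabled(bool a, bool b, bool c)
    {
        return (a && b) || c;
    }

    /* A minimal MC/DC test set (n + 1 = 4 test cases):
     *
     *   a  b  c  | outcome | independence pairs
     *   T  T  F  |    T    | with (F,T,F) for a; with (T,F,F) for b
     *   F  T  F  |    F    | with (T,T,F) for a; with (F,T,T) for c
     *   T  F  F  |    F    | with (T,T,F) for b
     *   F  T  T  |    T    | with (F,T,F) for c
     */
    int main(void)
    {
        const bool vec[4][3] = { {true,  true,  false},    /* expect true  */
                                 {false, true,  false},    /* expect false */
                                 {true,  false, false},    /* expect false */
                                 {false, true,  true} };   /* expect true  */
        for (int i = 0; i < 4; i++) {
            printf("a=%d b=%d c=%d -> %d\n", vec[i][0], vec[i][1], vec[i][2],
                   alert_enabled(vec[i][0], vec[i][1], vec[i][2]));
        }
        return 0;
    }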

Most projects apply MC/DC at the source code level, because source code coverage tools are more readily available. However, a few projects have applied the structural coverage criteria at the object code or executable object code level. There has been some debate on this approach, as noted in the Certification Authorities Software Team (CAST) position paper CAST-17, entitled Structural Coverage of Object Code. The position paper provides a summary of the motivations behind object code coverage and some of the issues to be addressed [14]. The CAST-17 paper has served as the foundation for project-specific Federal Aviation Administration (FAA) issue papers. DO-178C section 6.4.4.2.b modified the wording from DO-178B to make it clear that object code or executable object code coverage are acceptable approaches: “Structural coverage analysis may be performed on the Source Code, object code, or Executable Object Code” [1]. DO-248C frequently asked question #42, entitled “What needs to be considered when performing structural coverage at the object code level?” also clarifies this topic [3].

It is tempting to write more about structural coverage—especially MC/DC, since I have spent considerable time investigating, debating, and evaluating it over the last several years. However, because it has been so widely discussed and debated in the aviation community, there are publicly available documents that cover the topic well. Items 1–5 in the “Recommended Reading” section of this chapter identify some particularly helpful resources.

9.7.4.4 Additional Code Verification (DO-178C Table A-7 Objective 9)

DO-178C Table A-7 objective 9 is summarized as follows: “Verification of additional code, that cannot be traced to Source Code, is achieved” [1].

This objective was added for DO-178C, although the text in DO-178B also required it. Most projects perform structural coverage on the source code (although a few do perform object code coverage or machine code coverage). However, the executable object code is what actually flies. For level A software, there needs to be some kind of analysis to ensure that the compiler does not generate untraceable or nondeterministic code.

For projects that perform their structural coverage on the source code, a source-to-object code traceability analysis is normally performed. In my experience, this includes a comparison between the source code and the object code to ensure that the compiler is not adding, deleting, or morphing the code. The analysis is usually applied to a sample of the actual code, rather than 100% of the code.* The sample should include all constructs that are allowed in the source code and comprise at least 10% of the actual code base. It is also recommended that combinations of the constructs be evaluated. The analysis should be performed using the compiler settings that will be used for the actual code generation. The analysis normally involves a line-by-line comparison of the source code and the object code (or machine code). Any object code (or machine code) that is not directly traceable to the source code is analyzed and explained. In some situations, the analysis may identify unacceptable compiler behavior, which requires action. Compiler features such as register tracking, instruction scheduling, and branch optimization can lead to problems with the analysis. Highly optimized compiler settings and certain languages (e.g., languages with object-oriented features) can also result in untraceable code, which may not be certifiable. The analysis requires an engineer with knowledge of the specific language, assembly, machine code, and compilers. If the source code changes, there will need to be an assessment to ensure that no additional code constructs are added and that the analysis remains valid. Once this analysis is performed, it may be used for multiple projects, as long as the compiler and its settings are unchanged and the code uses the same constructs.

For the majority of projects that I have worked on, the source-to-object code analysis has not uncovered significant issues, as long as the optimization is kept minimal and mature languages and compilers are used. However, the analysis is still needed and should be done thoroughly.

9.7.4.5 Data Coupling and Control Coupling Analyses (DO-178C Table A-7 Objective 8)

I have saved the best for last in this section on structural coverage. The data coupling and control coupling (DC/CC) objective (Table A-7 objective 8) was included in DO-178B but was not clear. In particular, DO-178B varied from the overall software engineering definition of data coupling and control coupling but did not clearly explain the intent. According to members from the RTCA Special Committee #167 that developed DO-178B, the data and control coupling objective was added late in the committee deliberations and did not have extensive discussion. Consequently, there has been confusion about what was actually meant by the objective. Thankfully, DO-178C has clarified it. However, the clarification will likely cause some challenges throughout the industry, because many companies do not comply with the clarified objective (i.e., their interpretation of the original DO-178B objective is not consistent with the DO-178C clarification).

DO-178C defines data coupling, control coupling, and component as follows [1]:

  • “Data coupling—The dependence of a software component on data not exclusively under the control of that software component.”

  • “Control coupling—The manner or degree by which one software component influences the execution of another software component.”

  • “Component—A self-contained part, combination of parts, subassemblies, or units that performs a distinct function of a system.”

According to DO-178C, the intent of Table A-7 objective 8 is to perform an “analysis to confirm that the requirements-based testing has exercised the data and control coupling between code components” [1]. CAST-19 states that the purpose of data and control coupling analyses is as follows:

To provide a measurement and assurance of the correctness of these modules/components’ interactions and dependencies. That is, the intent is to show that the software modules/components affect one another in the ways in which the software designer intended and do not affect one another in ways in which they were not intended, thus resulting in unplanned, anomalous, or erroneous behavior. Typically, the measurements and assurance should be conducted on R-BT [requirements-based tests] of the integrated components (that is, on the final software program build) in order to ensure that the interactions and dependencies are correct, the coverage is complete, and the objective is satisfied [16].*

Unfortunately, this is not how many developers have interpreted the criteria. Many have applied it as a design and code review activity to ensure that the code accurately implements the control and data flow of the design. This activity during development is important to ensure that the final DC/CC analysis will be successful and to satisfy DO-178C Tables A-4 and A-5 objectives; however, this is not the intent of the Table A-7 objective. In addition to design and code reviews, those who comply with DO-178C also need to ensure that their tests cover the data couples and control couples between components (such as, modules or functions). Some organizations have applied the DC/CC criteria as intended, and it can be quite challenging—particularly, if design and code reviews were inadequate or if the integration approach is not sound. If the requirements-based testing and structural coverage measurements are performed on an integrated system (rather than module by module) without instrumenting the code, this helps improve confidence.

In order to successfully perform DC/CC analysis using requirements-based tests, it is important to ensure during development that the architecture is consistent with the requirements and that the code complies with the architecture. Satisfying DO-178C’s guidance on DC/CC analysis normally involves four steps, as described in the following (a brief code sketch follows the four steps).

  1. First, the software architecture must be documented (DO-178C Table A-2 objective 3). DO-178C section 11.10 provides guidance on what to document in the software design description, including the following related to data or control flow: data structures, software architecture, internal and external input/output, data flow and control flow of the design, scheduling procedures and inter-processor/inter-task communication mechanisms, partitioning methods, and descriptions of software components [1]. Chapter 7 discussed the design process.

  2. Second, the software architecture and code are reviewed and/or analyzed for consistency. DO-178C Table A-4 objective 9 references DO-178C section 6.3.3., which explains that one purpose of the design review and/or analysis is to ensure the correct relationship between “components of the software architecture. This relationship exists via data flow and control flow…” [1]. DO-178C Table A-5 objective 2 references DO-178C section 6.3.4.b, which explains that one purpose of the code review and/or analysis is to “ensure that the Source Code matches the data flow and control flow defined in the software architecture” [1]. Normally, both the design and code review/analysis activities involve a detailed checklist or questionnaire to consider common data coupling and control coupling issues. Table 9.2 provides some example issues to consider during the reviews and analyses; the specifics will vary depending on the architecture, language, environment, etc. These are merely examples.

  3. Third, requirements-based integration tests are developed (DO-178C Table A-5). DO-178C section 6.4.3.b explains that requirements-based software integration testing is performed to ensure that “the software components interact correctly with each other and satisfy the software requirements and software architecture” [1]. The integration testing is intended to identify the following kinds of errors [1]: incorrect initialization of variables and constants, parameter passing errors, data corruptions (especially global data), incorrect sequencing of events and operations, and inadequate end-to-end numerical resolution. Some argue that including the requirements-based emphasis in the DO-178C explanation of DC/CC analysis weakens the criteria, because the architecture also needs to be exercised. However, during design review the consistency between architecture and requirements should be verified. If that occurs, then the requirements-based testing should also exercise the architecture. Also, as noted in DO-178C section 6.4.3.b the architecture should be considered during the DC/CC analysis.

    Table 9.2 Example Data and Control Coupling Items to Consider During Design and Code Review/Analysis

    Example Data Coupling Items to Consider:

    • All external inputs and outputs are defined and are correct
    • All internal inputs and outputs are defined and are correct
    • Data is typed correctly/consistently
    • Units are consistent and agree with data dictionary
    • Data dictionary and code agree and are both complete
    • Data is sent and received in the right order
    • Data is used consistently
    • Data corruption is prevented or detected
    • Data is initialized or read-in before being used
    • Stale or invalid data is prevented or detected
    • Data miscompares or data dropouts are prevented or detected
    • Unexpected floating point values are prevented or detected
    • Parameters are passed properly
    • Global data and data elements within global data constructs are correct
    • I/O is properly accessed from external sources
    • All variables are set (or initialized) before being used
    • All variables are used
    • Overflow and underflow is identified and correct
    • Local and global data are used
    • Arrays are properly indexed
    • Code is consistent with the design

    Example Control Coupling Items to Consider:

    • Order of execution is identified and correct
    • Rate of execution is identified and correct
    • Conditional execution is identified and correct
    • Execution dependencies are identified and correct
    • Execution sequence, rate, and conditions satisfy the requirements
    • Interrupts are identified and correct
    • Exceptions are identified and correct
    • Resets are identified and correct
    • Responses to power interrupts are identified and correct
    • Foreground schedulers execute in proper order and at the right rate
    • Background schedulers are executed and not stuck in infinite loop
    • Code is consistent with the design

  4. Fourth, tests are analyzed to confirm that the data and control coupling between components are exercised by the requirements-based tests (DO-178C Table A-7 objective 8). If Steps 1–3 are not performed adequately, it will be difficult (probably impossible) to adequately complete Step 4. As far as I know, there are no commercially available tools to perform the data coupling and control coupling analysis. Most companies either develop their own tool(s) or perform the analyses manually. Hopefully, there will be some expansion of the better commercially available tools in the future to help with the data and control coupling analyses.
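
To make the data couple and control couple terminology concrete, the following C sketch shows two hypothetical components: a sensor manager that owns an airspeed value, and a warning manager that obtains that value by calling across the component boundary. The call is a control couple (one component influences the execution of another), and the airspeed value is a data couple (data not exclusively under the warning manager's control). The requirements-based integration test in main exercises both sides of the couple; a DC/CC analysis would confirm that such tests exist for every couple identified in the architecture. All names are invented for illustration.

    #include <stdbool.h>
    #include <stdio.h>

    /* --- Sensor manager component (hypothetical) --------------------- */
    static int s_airspeed_kts = 0;            /* data owned by this component */

    void sensor_update(int raw_airspeed_kts)
    {
        s_airspeed_kts = raw_airspeed_kts;
    }

    int sensor_get_airspeed(void)
    {
        return s_airspeed_kts;
    }

    /* --- Warning manager component (hypothetical) --------------------- */
    bool overspeed_warning(void)
    {
        /* Control couple: the warning manager invokes the sensor manager.
         * Data couple: the airspeed value is not exclusively under the
         * warning manager's control. */
        return sensor_get_airspeed() > 350;
    }

    /* --- Requirements-based integration test (illustrative) ----------- */
    int main(void)
    {
        sensor_update(300);
        printf("below limit: warning=%d (expect 0)\n", overspeed_warning());

        sensor_update(360);
        printf("above limit: warning=%d (expect 1)\n", overspeed_warning());
        return 0;
    }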

Item 6 in the “Recommended Reading” section of this chapter provides some additional insight into data coupling and control coupling analyses. Additionally, since partitioning is closely related to data and control coupling, Chapter 21 discusses some additional items worth considering when developing an approach to satisfy DO-178C Table A-7 objective 8.

9.7.4.6 Addressing Structural Coverage Gaps

When measuring code coverage obtained through requirements-based tests, some gaps in coverage may be identified. An analysis is performed to determine the reason for the gaps. The following actions may be taken:

  • If the coverage gap is caused because of missing tests, additional test cases are added and executed.

  • If the gap is caused because of a missing requirement (an undocumented feature), the requirements are updated and tests are added and executed.

  • If the gap identifies dead code or extraneous code, the dead or extraneous code is removed and an analysis is performed to assess the effect and the need for reverification. Typically, some reverification is needed. See Chapter 17 for information on dead and extraneous code.

  • If the gap is caused by deactivated code, the deactivation approach is analyzed to ensure it is adequate and consistent with requirements and design. See Chapter 17 for discussion of deactivated code.

  • If the gap is not addressed by any of the aforementioned, an additional analysis is needed to ensure that the code is adequately verified and works as intended. Typically, this analysis is either included in or summarized in the SVR.

9.7.4.7 Final Thoughts on Structural Coverage Analysis

Before we leave the subject of structural coverage, I want to summarize four important things.

First, when structural coverage credit is claimed, it should be based on requirements-based tests that pass (and satisfy the requirements they are traced to). If the test fails, the coverage may not be valid.

Second, structural coverage is typically measured by instrumenting the source code to identify which parts of it have been exercised during test execution. Because the instrumentation changes the source code, it is important to assess whether the changes it introduces are benign (as they should be). The instrumentation can also affect code optimization, because the added coverage-recording features change the information flow. Therefore, it is necessary to rerun the tests without the instrumentation and compare the instrumented and non-instrumented results before claiming structural coverage credit.

Third, not all structural coverage tools fully satisfy DO-178C’s criteria. In particular, be cautious of tools used to measure coverage in concurrent code that have not been designed to measure coverage in the presence of tasking constructs. Tools should be carefully evaluated and thoroughly understood before investing time and money. The FAA research report, entitled Software Verification Tools Assessment Study, provides some interesting insight. The FAA-sponsored research proposed a test suite to assess commercially available structural coverage tools. Anomalies were found in each of the three evaluated tools. This research demonstrates that care must be taken when selecting a tool, since some tools may not meet the DO-178C definition of structural coverage [17].

Lastly, the structural coverage analysis results are typically either included in or referenced in the SVR. Any structural coverage gaps and their analysis will be evaluated closely by certification authorities and/or their designees; therefore, they should be clearly stated and thoroughly justified.

9.8 Problem Reporting

Although problem reporting is included as part of the configuration management section in DO-178C and is further explored in Chapter 10 of this book, it is discussed in this chapter on verification, because many of the software problems are discovered during verification. Problem reports can be generated at any phase of the project and are usually initiated once a data item (e.g., requirements, design, code, test cases, or test procedures) has been reviewed and baselined. The problem reports are used to manage change between baselines and to address known errors with the software and its life cycle data.

During the verification process, it is common to discover problems with systems requirements, software requirements, design, and source code. Also, problems with the plans, the life cycle processes, tools, or even the hardware may be discovered. Problem reports are used to document issues with processes, data, and product. Problem reporting typically does not occur until after a life cycle data item has been reviewed, updated (if needed), and baselined. Once that has occurred, it is typical for all changes to be documented in a problem report. Some companies have two classes of problem reports—one for code and one for process and documentation errors. They sometimes have yet another class for future product enhancements. I tend to prefer one problem reporting system to document all kinds of issues or possible changes to the software, with a good classification system for each problem report (i.e., the problem report may be classified as a code issue, documentation issue, process issue, future enhancement, etc.).

Each problem report (PR) typically contains the following information:

  • PR number

  • Date PR written

  • Problem summary

  • Affected life cycle data items and their versions

  • Problem type (e.g., code, design, requirements, documentation, enhancement, process, tool, other)

  • Problem severity (impact or consequence) from aircraft safety and compliance perspective (e.g., safety impact, certification/compliance impact, major functional/operational impact, minor functional/operational impact, documentation only)

  • Problem priority from the software developer’s perspective (e.g., fix immediately, fix as soon as possible, fix before next release, fix before certification, fix if possible, do not fix)

  • PR author name and problem discovery date

  • How the problem was found

  • Functional area of the problem

  • Problem description and how to reproduce it

  • Suggested fix (usually an optional field)

  • Supporting data (typically attached to the PR)

  • Person assigned to investigate the problem

  • Problem analysis and comments (i.e., where the problem investigation is documented)

  • Status (e.g., open, in-work, fixed but not yet verified, verification in progress, fixed and verified, cannot be reproduced, deferred, duplicate, or canceled [if it is not a valid problem])

It should be noted that there are not always severity and priority classifications. Some problem reporting systems just have one classification and some have two. It will depend on the nature of the project. The PR classification scheme should be explained in the Software Configuration Management Plan (SCMP) to ensure it is consistent and clear to everyone involved. Chapter 10 provides additional information on problem reporting, including some of the certification authorities’ recommended classifications.

Writing good PRs is challenging for many engineers. However, failure to write them well makes it difficult to confirm that changes are properly assessed, implemented, and verified. Following are some practical suggestions for writing and addressing PRs:

Suggestion 1: Document all problems in a PR, unless the document, data, or product has not yet been reviewed, in which case the problem should be noted in the review record. Verbal and e-mail reporting of problems are not trackable or actionable.

Suggestion 2: Write a descriptive, concise, accurate, and unique problem summary. The problem summary may end up in the SAS, if the problem is deferred; therefore, it could have high visibility. Even nondeferred problem reports may be read by managers, customers, systems engineers, safety personnel, SQA, certification liaison personnel, and occasionally, certification authorities. Therefore, the problem summary should be written in such a way that it makes sense even for someone not knowledgeable of the software details. Likewise, it should be value added for those who do know the software details. Essentially, extra time should be taken to write a good problem summary, since it could be around for a while.

Suggestion 3: Number and classify each PR. The classification may change as the problem is investigated, but it is important to have some idea of the severity or priority of the issue. Chapter 10 provides more discussion on the classification approach.

Suggestion 4: Ensure that the process for classifying PRs is clearly documented in the SCMP. As noted earlier, some projects will have multiple classification schemes (e.g., one for severity and one for priority). The classification approach must be clear to the developers, customer, change control board (CCB), certification liaison personnel, management, and anyone else who will read the PRs and make decisions using the PRs.

Suggestion 5: Only identify one problem in each PR. It is sometimes tempting to group related problems together; however, these are difficult to close if some are fixed and some are not.

Suggestion 6: Ensure that the PR is legible and understandable. A clear and complete description of the problem is needed, as well as specific information on how the problem was noted, how to reproduce the problem, and what the problem is. Using attachments with excerpts from the data item or screen shots of the problem can be very helpful. If the problem is not clearly described, proper action may not be taken.

Suggestion 7: Immediately document problems when they are discovered. Even obvious bugs should be reported; otherwise, they may not be fixed.

Suggestion 8: Document nonreproducible bugs in a PR. Such bugs may become more apparent as the project progresses.

Suggestion 9: Generally, the PR author should spend time explaining the problem but should not try to solve the problem. Some suggestions for resolution may be included, but the solutions will come during the investigation.

Suggestion 10: Ensure that the PRs are professional and nonjudgmental. It does not help to blame the coder, management, or the architect. Finger pointing is not productive. The PR should focus on the technical issue, not the people.

Suggestion 11: Once a problem is reported, it should be reviewed by one or more people to determine the next steps. A CCB (or a PR review board) is generally utilized to evaluate new problems and to make assignments. Once all necessary data is gathered and a recommendation for a solution is established, the CCB decides to approve the solution or defer the solution.

Suggestion 12: During the investigation phase of the problem, the engineer should consider related issues. The reported problem may just be a symptom of a bigger issue. The investigator looks for the root cause, not just the symptoms.

Suggestion 13: Remember that problem investigation and debugging can take time. Some problems are obvious, but some may take weeks to evaluate. Investigations should be assigned to experienced engineers who have a good understanding of the overall system and the development process.

Suggestion 14: Ensure that proposed solutions clearly identify what items will be changed and impacted and what the expected change will be to each item. Basically, a simplified change impact analysis is needed for each change. The better this analysis is documented, the easier it will be to support the consolidated change impact analysis for the software baseline.

Suggestion 15: Once a data item has been reviewed and baselined, it is only changed with a PR (or equivalent change authorization vehicle). This applies to CC1 (control category #1) data items (such as requirements, design, code, and verification cases/procedures). CC1/CC2 is explained in Chapter 10. It can be tempting to fix obvious errors encountered while changing a data item for a documented problem. It is not wrong to fix such errors; however, the issues and solutions need to be documented in a PR and not just fixed without a record.

Suggestion 16: Evaluate the problems together. Throughout the project, developers and management (as well as customers, quality, and certification liaison personnel) should have an understanding of the overall group of PRs. Sometimes problems are related, but if the technical people are not in tune with the bigger picture, they may miss the connections.

Suggestion 17: Be aware of the tendency for bug morphing [9]. This occurs when a bug is reported and the investigator gets off track (perhaps he or she finds some other issues). If other issues are identified during a PR investigation, additional PRs should be created to address them.

Suggestion 18: Do not be afraid to report duplicate issues. Some developers hesitate to enter a PR because they think it might be a duplicate. They should quickly review the list of existing PRs and submit their problem if they do not clearly see the problem reported. It should also be noted that when there is a troubled area in the software, there may be multiple PRs that are related but are slightly different (oftentimes, there are multiple symptoms of an underlying issue). Therefore, it is better to write a potentially duplicate PR than to not report the issue. The CCB (or PR review board) will do the more thorough assessment to determine duplication.

Suggestion 19: To ensure quality and consistent PRs, assign an engineer to review PRs for readability, accuracy, completeness, etc. Oftentimes, this person also leads the CCB (or PR review board) and ensures the PR is mature before being evaluated by the board.

Suggestion 20: Verify the resolution(s) to the problem prior to closing the PR. The verification process may include re-review of requirements, design, code, and test data. Typically, an independent reviewer is preferred. The data item should go through the same level of rigor as it initially did (that is, the same processes, checklists, etc.).

Suggestion 21: If a PR is deferred, write a justification for deferral in the PR. This justification should explain why there are no safety, operational, performance, or compliance issues if the problem is not fixed prior to certification. The justification will receive considerable scrutiny, so it must be clearly stated and well supported by the data. The justification will be evaluated by the CCB (and/or the PR review board), the customer, safety personnel, and certification liaison personnel to determine if deferral is feasible. Once the decision to defer is agreed upon, the justification is included in the SAS for agreement with certification authorities and/or their designees.

Suggestion 22: Include SQA in the PR deferral and closure process. SQA may be a member of the CCB or may have a separate approval process.

9.9 Recommendations for the Verification Processes

Some claim that there is no such thing as a best practice when it comes to verification, since so much depends on the specific situation. However, there are several practices that make a project go more smoothly and increase the probability of success. This section provides a summary of recommendations based on my involvement in numerous projects. Some of them summarize the aforementioned material; others have not yet been covered.

Recommendation 1: Plan early and adjust frequently. No project can predict all of the issues that will arise. However, planning is still important and can prevent or reduce some of the larger issues. In addition to planning, continual adjustments must be made to address the issues that actually arise.

Recommendation 2: Strive for a realistic schedule. Fictitious schedules are one of my pet peeves. Dream-based planning and head-in-the-sand management are just not effective. One time I was working on a schedule and told the manager that I needed to consider the tasks before I could project the schedule. He looked at me like I was crazy. He said: “I have my management hat on.” I do not understand why a management hat cannot be based on reality. As one of my colleagues likes to say: “Good leaders don’t ignore reality.” Far too much scheduling is based on business agendas, rather than reality. Most engineers will work like maniacs to meet a realistic, even aggressive, schedule. However, too many impossible or dream-based schedules cause them to become demoralized and demotivated.

Recommendation 3: Design for robustness. It is nearly impossible to test in robustness (i.e., to implement robustness during the test phase). It is better to anticipate abnormal input and to design the system to address that input.
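
To illustrate the idea, the following sketch shows robustness designed in at the input boundary. The component name, range limits, and safe default are hypothetical; on a real project they would come from the system requirements.

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical input handler: the limits and default are illustrative only. */
    #define ALT_MIN_FT  (-2000L)
    #define ALT_MAX_FT  (60000L)

    typedef struct {
        int32_t altitude_ft;
        bool    valid;
    } AltitudeInput;

    /* Range-check the raw value before it is used; out-of-range data is
       flagged invalid so downstream logic can fall back to a safe default
       instead of propagating a bad value. */
    AltitudeInput validate_altitude(int32_t raw_ft)
    {
        AltitudeInput out;
        out.valid       = (raw_ft >= ALT_MIN_FT) && (raw_ft <= ALT_MAX_FT);
        out.altitude_ft = out.valid ? raw_ft : 0;  /* safe default when invalid */
        return out;
    }

Designing such checks into the requirements and code up front gives the testers something concrete to exercise with abnormal inputs, rather than hoping robustness can be added during the test phase.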

Recommendation 4: Design for testability. As noted earlier, it is helpful for developers to build the system so that it will support testing. For example, making key data structures visible at an outer level makes testing easier than trying to verify behavior through data that is difficult to access. It is also helpful to have a tester involved in the requirements and design reviews, because he or she will be thinking ahead about the test aspects.
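
As a simple illustration of testability, the sketch below exposes a read-only accessor for an otherwise file-static value. The names are invented; a real project would define such visibility rules in its design standards.

    #include <stdint.h>

    /* Internal state is file-static and not directly observable. */
    static int32_t filter_output;

    void filter_step(int32_t input)
    {
        /* Placeholder for the real filtering computation. */
        filter_output = input;
    }

    /* Read-only accessor added for test visibility: low-level tests can confirm
       the intermediate result directly instead of inferring it from downstream
       outputs that may mask the value. */
    int32_t filter_get_output(void)
    {
        return filter_output;
    }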

Recommendation 5: Start verification early and continue it throughout the project. Verification should start as early as possible in the development effort. Reviews of requirements, design, and code should take place as the data is generated. It is extremely beneficial to have informal technical reviews of the data prior to holding the formal review. Putting verification off to the end is not effective, can be very expensive, and can harm the relationship with the certification authority and the customer.

Recommendation 6: Define roles during reviews. During peer reviews, it is helpful to define roles. For example, one person may focus on how well the tests exercise the requirements; one person may concentrate on the accuracy of the traces; one may evaluate correctness and accuracy of the test cases; and one may execute the test procedures to determine repeatability.

Recommendation 7: Involve the testers during the review phase. If testers are involved in the review of requirements and design, they can help ensure that the software is testable and can become familiar with the requirements that they will be testing.

Recommendation 8: Start testing activities early. As the requirements mature, someone should begin the test planning and test case development.

Recommendation 9: Give testers time and opportunity to understand the system. The better they understand the system’s intent and its actual functionality, the better they can identify the errors. When the requirements are split among the testers and no one really has an overall understanding of the system, testing is not as effective.

Recommendation 10: Know the environment. Testers should understand the software’s operational environment. Knowledge of the development environment, the hardware, and the interfaces is also important for effective testing.

Recommendation 11: Testers should be encouraged and allowed to question everything. Obviously, the questions do not need to be asked out loud, but the best way to find errors is to continually ask “what if?” or “how does it work?” or “why does it work?” A non-curious verifier tends to be ineffective.

Recommendation 12: Use experienced people and allow them to apply their experience. Obviously, the pool of experts is limited. However, experienced testers should be used for the key tasks. The junior engineers can play a role as well, and some of their work is incredibly ingenious. But, the senior and more experienced personnel should be allowed to lead. Some companies institute a form of pair testing—where two people work together on tests. This can be an effective way to train the junior engineer and utilize the experienced testers.

Recommendation 13: Encourage testers to think beyond the requirements—especially early in the project. If only the requirements are tested and the requirements are wrong or incomplete, some serious issues could be discovered late or may not be discovered at all.

Recommendation 14: Test critical and key functionality first. If the basic things do not work, the rest will not really matter.

Recommendation 15: If something is confusing, it should be tested more. If the requirements, design, and/or code are confusing, there is usually a reason. It might be that the developer did not understand the problem well (causing omissions or mistakes) or that he or she understood it too well (leading to oversimplification). Confusing requirements or design often lead to errors in the code. In many cases, the confusing area may need to be redesigned and recoded, or at least cleaned up.

Recommendation 16: Expect to spend more time testing complex areas. As noted earlier, errors tend to hide in the caverns of complexity. If something is complex, it will likely have more issues and will take more time to verify and to fix when the problems are discovered.

Recommendation 17: Realize that quality needs to be built in—not tested in at the end. Testing is an indicator of the overall quality of the product, but waiting until the test phase to address quality is too late. Quality should be proactively built in. Involving testers in the development processes and developing and executing tests early (1) helps identify issues while they can still be efficiently fixed and (2) improves quality. Sometimes testing will uncover an issue that leads to redesign and reimplementation; it is better to find these errors early.

Recommendation 18: Testers should focus on the technical functionality—not the people. As mentioned, testers are often seen as a negative group. Their negativity helps identify problems that need to be resolved. However, they must take care to focus the negative energy on the software and not the people who wrote it.

Recommendation 19: Use as much independent verification as possible. Even though DO-178C does not require independence for many of the verification activities, some degree of independence is quite effective. It truly is difficult to find errors in your own work.

Recommendation 20: Write accurate PRs and keep abreast of the overall status of the program’s problems. PRs should be generated as soon as problems are noted. Likewise, the PRs should be as accurate as possible—bad data does not help anyone. Additionally, management should regularly read the PRs in order to properly manage the project.

Recommendation 21: Automate wisely. Automation should only be used when it makes sense. Some projects automate just to be automating. Engineers love to build tools. Sometimes it can take longer to build and qualify a tool than to just test the product manually. Also, tools do not replace the need to think. If not used wisely, tools can give a false sense of security.

Recommendation 22: Set the team up for success. An effective manager fosters teamwork, encourages each member to do his or her best, utilizes diversity, encourages a nonhostile and productive work environment, rewards talent, encourages testers to find errors, deals with issues proactively, and makes decisions based on data rather than feelings (particularly, when it comes to schedules).

Recommendation 23: Foster creativity. Thorough testing requires creativity. Sometimes the testing requires more creativity than the development, because the testers have to think how to break the software rather than make it work. Creativity can be encouraged with free time to think (I call this dream time), fun activities, and competitions.

Recommendation 24: Invest in training for the testers. Good testers are always learning. Managers should provide opportunities for training and growth—even if it takes a day or two away from the project.

Recommendation 25: Implement continuous improvement. I once saw a comic strip where a guy was chopping down trees in the forest with an axe. He had a schedule to keep and did not have time to learn how to use the chainsaw in his truck. Sometimes we get so focused on the task at hand that we fail to look for opportunities to improve. Obviously, at the end of the project lessons learned should be captured and actions taken. However, it is also valuable to correct course during the project when something is not working or when there is a more effective way to do it.

Recommendation 26: Identify risks and issues and deal with them proactively. As issues arise, they should be addressed. They do not go away on their own.

Recommendation 27: When outsourcing some or all of the testing, do not just “throw it over the wall.” Outsourcing or subcontracting will require close oversight, training, and continual communication. This is discussed more in Chapter 26.

Recommendation 28: Keep a master list of common issues and errors and make sure your team is aware of them. It is beneficial to compile a list of common issues based on input from experienced testers, problem reports, and literature. The list can be updated over time. Testers should be educated on the common mistakes. Obviously, they should not limit their verification to just these, but these will be a good start at shaking out problems in the system.

Recommendation 29: Be prepared for the challenges. One never knows what kind of challenges he or she will encounter when verifying software. Be ready to roll with the punches. Some of the most common challenges are: ambiguous requirements, missing requirements, determining how much robustness is enough, addressing schedule pressure (since testing is typically the last activity), exercising the architecture as well as the requirements, managing change (keeping up with changes to requirements, design, and code), keeping morale up, dealing with problems that are discovered late in the process, and maintaining an adequate tester-to-developer ratio. Every program has its own special set of issues. Be ready for anything.

References

1. RTCA DO-178C, Software Considerations in Airborne Systems and Equipment Certification (Washington, DC: RTCA, Inc., December 2011).

2. Certification Authorities Software Team (CAST), Verification independence, Position Paper CAST-26 (January 2006, Rev. 0).

3. RTCA DO-248C, Supporting Information for DO-178C and DO-278A (Washington, DC: RTCA, Inc., December 2011).

4. RTCA DO-254, Design Assurance Guidance for Airborne Electronic Hardware (Washington, DC: RTCA, Inc., April 2000).

5. Certification Authorities Software Team (CAST), Addressing cache in airborne systems and equipment, Position Paper CAST-20 (June 2003, Rev. 1).

6. C. Kaner, J. Bach, and B. Pettichord, Lessons Learned in Software Testing (New York: John Wiley & Sons, 2002).

7. G. J. Myers, The Art of Software Testing (New York: John Wiley & Sons, 1979).

8. E. Kit, Software Testing in the Real World (Harlow, England, U.K.: ACM Press, 1995).

9. A. Page, K. Johnston, and B. Rollison, How We Test Software at Microsoft (Redmond, WA: Microsoft Press, 2009).

10. C. Kaner, J. Falk, and H. Q. Nguyen, Testing Computer Software, 2nd edn. (New York: John Wiley & Sons, 1999).

11. J. A. Whittaker, How to Break Software: A Practical Guide to Testing (Boston, MA: Addison-Wesley, 2003).

12. R. S. Pressman, Software Engineering: A Practitioner’s Approach, 7th edn. (New York: McGraw-Hill, 2010).

13. K. J. Hayhurst, D. S. Veerhusen, J. J. Chilenski, and L. K. Rierson, A Practical Tutorial on Modified Condition/Decision Coverage, NASA/TM-2001-210876 (Hampton, VA: Langley Research Center, May 2001).

14. Certification Authorities Software Team (CAST), Structural coverage of object code, Position Paper CAST-17 (June 2003, Rev. 3).

15. Certification Authorities Software Team (CAST), Guidelines for approving source code to object code traceability, Position Paper CAST-12 (December 2002).

16. Certification Authorities Software Team (CAST), Clarification of structural coverage analyses of data coupling and control coupling, Position Paper CAST-19 (January 2004, Rev. 2).

17. V. Santhanam, J. J. Chilenski, R. Waldrop, T. Leavitt, and K. J. Hayhurst, Software Verification Tools Assessment Study, DOT/FAA/AR-06/54 (Washington, DC: Office of Aviation Research, June 2007).

Recommended Readings

1. K. J. Hayhurst, D. S. Veerhusen, J. J. Chilenski, and L. K. Rierson, A Practical Tutorial on Modified Condition/Decision Coverage, NASA/TM-2001-210876 (Hampton, VA: Langley Research Center, May 2001). This tutorial provides a practical approach to assess aviation software for compliance with the DO-178B (and DO-178C) objective for MC/DC (DO-178C Table A-7 objective 5). The tutorial presents a five-step approach to evaluate MC/DC coverage without a coverage tool. The tutorial also addresses factors to consider when selecting and/or qualifying a structural coverage analysis tool. Tips for reviewing MC/DC artifacts and pitfalls common to structural coverage analysis are also discussed.

2. J. J. Chilenski, An Investigation of Three Forms of the Modified Condition Decision Coverage (MCDC) Criterion, DOT/FAA/AR-01/18 (Washington, DC: Office of Aviation Research, April 2001). This report compares three forms of MC/DC and provides justification for why MC/DC should be part of the software system development process. The three forms of MC/DC are compared theoretically and empirically for minimum probability of error detection performance and ease of satisfaction.

3. V. Santhanam, J. J. Chilenski, R. Waldrop, T. Leavitt, and K. J. Hayhurst, Software Verification Tools Assessment Study, DOT/FAA/AR-06/54 (Washington, DC: Office of Aviation Research, June 2007). This report documents the investigation of criteria to effectively evaluate structural coverage analysis tools for use on projects intended to comply with DO-178B (and now DO-178C). The research effort proposed a test suite to increase objectivity and uniformity in the application of the structural coverage tool qualification criteria. The prototype test suite identified anomalies in each of the three coverage analysis tools evaluated, demonstrating the potential for a test suite to help evaluate a tool’s compatibility with the DO-178B (and now DO-178C) objectives.

4. J. J. Chilenski and J. L. Kurtz, Object-Oriented Technology Verification Phase 3 Handbook—Structural Coverage at the Source-Code and Object-Code Levels, DOT/FAA/AR-07/17 (Washington, DC: Office of Aviation Research, June 2007). This handbook provides guidelines for meeting DO-178B (and now DO-178C) structural coverage analysis objectives at the source code versus object code or executable object-code levels when using object-oriented technology (OOT) in commercial aviation. The differences between source code and object code or executable object-code coverage analyses for the object-oriented features and MC/DC are identified. An approach for dealing with the differences is provided for each issue identified. While the focus is OOT, many of the concepts are applicable to non-OOT projects.

5. CAST-17, Structural coverage of object code (Rev 3, June 2003). This paper, written by the international Certification Authorities Software Team (CAST), explains some of the motivations behind structural coverage at the object code or executable object code level and identifies issues to be addressed when using such an approach.

6. J. J. Chilenski and J. L. Kurtz, Object-Oriented Technology Verification Phase 2 Handbook—Data Coupling and Control Coupling, DOT/FAA/AR-07/19 (Washington, DC: Office of Aviation Research, August 2007). This handbook provides guidelines for the verification (confirmation) of data coupling and control coupling within OOT in commercial aviation. Coverage of inter-component dependencies is identified as an acceptable measure of integration testing in both non-OOT and OOT software to satisfy DO-178B (and now DO-178C) Table A-7 objective 8. This approach is known as coupling-based integration testing.

*The details of the verification objectives are discussed later; for now, the focus is on which objectives require independence.

CAST is a team of international certification authorities who strive to harmonize their positions on airborne software and aircraft electronic hardware in CAST papers.

*DO-178C sections 6.3.4.f and 6.3.5 allude to several of these as part of code and integration verification.

*Care must be used when gathering WCET data. It is quite possible to have a WCET greater than 100% yet have the system perform safely. For example, the worst-case branches in a procedure may be mutually exclusive. The problem gets worse when the mutual exclusion occurs across components, that is, the worst-case time in component A cannot occur at the same time as the worst-case time in component B. Yet if both components are in the same execution frame, the WCET data may sum them both.
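
A purely illustrative sketch of this mutual-exclusion effect follows; the component names and timing numbers are invented for the example.

    #include <stdbool.h>

    /* Hypothetical worst-case path times:
       component_1: 4 ms in mode A, 1 ms in mode B.
       component_2: 1 ms in mode A, 4 ms in mode B. */
    static void component_1(bool mode_a) { (void)mode_a; /* ... */ }
    static void component_2(bool mode_a) { (void)mode_a; /* ... */ }

    void execution_frame(bool mode_a)
    {
        component_1(mode_a);
        component_2(mode_a);
    }

    /* Summing the per-component worst cases gives 4 + 4 = 8 ms for the frame,
       but the two worst-case paths are mutually exclusive (the frame runs in
       mode A or mode B, never both), so the true frame WCET is 4 + 1 = 5 ms.
       A summed figure above the frame budget may therefore reflect analysis
       pessimism rather than an actual overrun. */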

CAST-20, entitled Addressing cache in airborne systems and equipment [5], provides additional details on the concerns regarding WCET when cache and/or pipelining is used.

*IMA systems are briefly discussed in Chapters 20 and 21. Likewise, configuration data is examined in Chapter 22.

*A NaN can be a quiet NaN or a signaling NaN.

It should be noted that some technologies, such as formal methods, may alleviate the need for some of the testing activities. Formal methods are discussed in Chapter 16.

*Per DO-178C section 6.4.2.

*Rather than having one living document, some projects have a test plan which is completed up front, as well as a verification cases and procedures document that evolves throughout the program.

*The ability to write low-level tests module by module will depend on how the low-level requirements are organized.

*Code inspections are particularly subjective when used to confirm that executable object code satisfies the requirements. When used to satisfy DO-178C Table A-6 objectives, the focus of the code inspection is on how the executable object code meets the requirements, not a repeat of the code review (DO-178C Table A-5 objectives).

*DO-178C section 4.4.3.b.

DO-178C section 6.4.1.a.

FAA Order 8110.49 (change 1) chapter 16 provides some guidance on managing the development and verification environments when an emulator or simulator is used. The Order can be found on FAA’s website: www.faa.gov

*Even automated tests require procedures and need a test log.

*A redline is a modification to the procedure. It is normally documented using a red pen on a hard copy or using change tracking for an electronic copy.

*Dead code is: “Executable Object Code (or data) which exists as a result of a software development error but cannot be executed (code) or used (data) in any operational configuration of the target computer environment. It is not traceable to a system or software requirement. The following exceptions are often mistakenly categorized as dead code but are necessary for implementation of the requirements/design: embedded identifiers, defensive programming structures to improve robustness, and deactivated code such as unused library functions” [1].

Extraneous code is: “Code (or data) that is not traceable to any system or software requirement. An example of extraneous code is legacy code that was incorrectly retained although its requirements and test cases were removed. Another example of extraneous code is dead code” [1].
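
To make the distinction concrete, the hypothetical C fragment below contrasts defensive code, deactivated code, and extraneous (dead) code; the names and configuration switch are invented for illustration.

    #include <stdint.h>

    #define CONFIG_OPTION_B 0    /* deactivated-code switch: option B is unused in
                                    this configuration but intentionally retained */

    int32_t compute(int32_t x)
    {
        if (x < 0) {
            return 0;            /* defensive programming: not dead code, even if
                                    upstream logic should never pass a negative x */
        }
    #if CONFIG_OPTION_B
        return x * 3;            /* deactivated code: traceable to a requirement of
                                    another configuration */
    #else
        return x * 2;
    #endif
    }

    /* Extraneous (and dead) code: never called and not traceable to any
       requirement; per the definitions above it should be removed or its
       retention justified. */
    static int32_t legacy_scale(int32_t x)
    {
        return x * 10;
    }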

*CAST-12, entitled Guidelines for approving source code to object code traceability, provides additional information on this topic [15].

*Brackets added for clarification.
