5.3. System Validation (Acceptance Tests)

Insofar as software validation means establishing that the software meets its requirements, software validation can be done in the laboratory. But validation without the software qualifier means design validation—that is, showing that the product meets customer needs and is fit for its intended use—and that is something that must be done by the customers themselves in actual use. For medical devices, this means clinical trials or clinical evaluation.

The design of clinical trials is outside the scope of this paper, and beyond my expertise as well. Medical professionals and statisticians need to be involved. A big issue in design validation is the labeling, or claims for what the device is good for. If the marketing literature makes a claim for an application of the device, the claim has to be substantiated with clinical data. Medical manufacturers cannot get away with the type of intellectual sloppiness we are used to in other types of advertising. It is insufficient to claim “new and improved”; you will have to demonstrate it. So some caution about claims is warranted. You do not have to make claims outside your specified range. You also cannot control how the doctor uses the device. At least in the United States, the doctor can use devices and drugs “off-label,” that is to say, contrary to the indications in the labeling that accompanies the device. But to do so is to put himself and his practice at risk, and, as a practical matter, a legal defense that the warnings should have prevented misuse is usually not sufficient to exonerate a firm from blame.

5.4. Traceability

Traceability is an analytical activity in which you establish the links among requirements, design elements, implementation, and especially tests. Remember, the software system test is a demonstration that all of the software requirements have been fulfilled, so it is traced directly from the requirements. And it is not sufficient to just generate a matrix showing that each requirement number has a corresponding number in the test document. There needs to be a review of the tests against the specific requirements to verify that each test exercises its requirement correctly.
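
A check like the following can automate the first pass of that review by listing coverage gaps; the requirement and test identifiers (SRS-xxx, TC-xxx) are hypothetical placeholders, and the human review is still what establishes that each test exercises its requirement correctly.

    # Minimal traceability check: every requirement must be covered by at least
    # one test, and every test must reference a real requirement.
    requirements = {"SRS-001", "SRS-002", "SRS-003"}        # from the SRS
    test_to_reqs = {                                        # from the test protocol
        "TC-101": {"SRS-001"},
        "TC-102": {"SRS-002", "SRS-003"},
        "TC-103": {"SRS-009"},                              # stale reference
    }

    covered = set().union(*test_to_reqs.values())
    print("Requirements with no test:", sorted(requirements - covered))
    print("Tests citing unknown requirements:",
          sorted(t for t, reqs in test_to_reqs.items() if reqs - requirements))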

Requirements should give rise to design elements. If there are design elements without corresponding formal requirements, it is a mistake one way or another. It could be a sign of tacit requirements that need to be stated explicitly so that they are not forgotten upon further development, and are written down to ensure that they are tested. Or it could be a sign of gold-plating at the design level. Gold-plating is bad for several reasons. First, the system should not have unnecessary elements that may cause failure—the system is more elaborate than it needs to be and complexity works against reliability [16]. Second, from the point of view of the project, implementation that was not part of the requirements is something that the company did not sign up to pay for.

Traceability is particularly important to risk management. If the output of the risk analysis process is hazards that require mitigation, it is crucially important to verify that the software does indeed mitigate the hazards. It is a grievous mistake to identify a hazard, attempt to mitigate it, and still fail to deliver a safe system because a lack of oversight during testing allowed the identified hazard to occur. It is hard enough already to get these systems right and ensure that they are safe. To fail to verify that we do what we promise in the requirements is the worst kind of mistake.

5.5. Metrics

As for any quality system, metrics are valuable to the FDA. Software is highly complex and unlikely to be defect free. It is impractical to test forever, and difficult to know whether the software is of sufficient quality to be safe and effective. This is where metrics are important. The number of defects, their discovery rate, and their severity should all be on a downward trend before release. In addition, the types of defects are of interest to detect problems and trends, and manufacturers are required to analyze defects for trends [25].

As usual, the FDA does not specify which metrics are important or how to collect them, but there are implicit requirements for metrics to establish the other types of evidence that the FDA is looking for, such as the list of anomalies and their effect on the system. One thing the FDA realizes is Deming’s insight that it is more expensive to test quality into a product than to design it in. For software, it is impossible to test quality in for all but the simplest of projects. Software is enormously complex; the system could exist in trillions of states, cannot be shown to be correct past any level of complexity greater than the trivial, and is unlikely to be 100% bug-free. The metrics will show that the software is on a path toward a minimum number of defects.

This is not a discussion of software metrics, and it is not the main focus of the FDA or the design control process. The purpose of metrics is to establish that the trends are in the right direction and that the process is in fact controlled. You can of course create as elaborate a set of software metrics as you like. There are whole books on the subject. However, a little is worth a lot, and a lot is not necessarily worth a lot more. In fact, a lot can be worth far less, if the time the team spends collecting and analyzing metrics subtracts from the time available to get the software itself as correct as it can be.

Nor are all defects created equal. Some obviously are more serious than others. It is essential to focus the team’s energies on the most serious matters. This is just a commonplace of criticality partitioning. Your policies should establish what defect severity means, usually based on the harm that could arise if the defect were to occur. Broad categories are better—you don’t want to have to spend a lot of mental energy deciding what the classification is. And you should give severity the benefit of the doubt. It is also useful to separate the priority of a defect from its severity. Priority is a business matter; severity is an objective matter.
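
A minimal sketch of how those policies might look in a defect tracker follows; the category names are invented, and the point is only that severity and priority are recorded as separate fields.

    from dataclasses import dataclass
    from enum import Enum

    class Severity(Enum):       # objective: the harm if the defect were to occur
        CATASTROPHIC = 1
        MAJOR = 2
        MODERATE = 3
        MINOR = 4               # nuisance, cosmetic

    class Priority(Enum):       # business decision: when it gets fixed
        NOW = 1
        NEXT_RELEASE = 2
        BACKLOG = 3

    @dataclass
    class Defect:
        id: str
        summary: str
        severity: Severity      # broad category, given the benefit of the doubt
        priority: Priority      # assigned by the CCB; may differ from severity

    # A nuisance bug irritating an important customer: low severity, high priority.
    d = Defect("DEF-042", "Trend label overlaps the axis", Severity.MINOR, Priority.NOW)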

Sometimes it is important to the business to fix a nuisance bug that is irritating an important customer, whereas the policy will usually say that a nuisance bug would not prohibit shipping the product, provided that the nuisance does not create a hazard. The regulatory bodies don’t care much about nuisance bugs, provided that the product is safe. The exception is difficulty of use, because human factors analysis tells us that something that is not easy to use is prone to user error, and the errors themselves may create hazards.

Major defects would be something like a system reset or hang. These will be evident from the hazard analysis. Moderate defects are less critical to delivery of the essential function of the device. Minor defects are things like nuisances, misspellings, cosmetic errors, some label in the wrong color, and so forth. (Insofar as labels for some parameters may be required to be a certain color by international standard or convention, it may be a moderate defect if one label were the wrong color, whereas if another label were the wrong color it would be only minor.)

The defect analysis does not usually belong only to the software team. In the first place, the defect may not even be software related; unfortunately, software is mostly what users see, so they begin by blaming the software. But in many cases it could be a hardware defect, a faulty system, or a misunderstanding on the part of the user. In the second place, the software team is probably too self-interested to make correct judgments about priority.

If you are extremely lucky, the requirements are well understood and beautifully written, so that the only kinds of defects you have are noncompliance of the implementation with the requirements. But usually this is not the case; the toughest bugs are bugs in requirements. This is where the CCB comes into play.

The purpose of the defect metric is to establish a trend toward zero. For the highest level of concern, there should be no defects of major or catastrophic severity. You may also have the policy that there are no defects of moderate severity. If you plot the number of open defects versus time, you should see a negative slope. The line should converge on zero defects—some use this as the metric to determine when testing is complete and they can ship product. The line will have brief jumps when more effort is focused on testing—this could be normalized away by using a metric like the number of defects found per hour of testing. More time should pass between each defect found as bugs are being fixed.
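
As a sketch with invented numbers, the normalization mentioned above and the trend check might look like this; a negative slope on the open-defect count is what you want to see before release.

    # Hypothetical weekly data: open defects, hours of testing, defects found.
    weeks         = [1, 2, 3, 4, 5, 6]
    open_defects  = [40, 34, 30, 31, 22, 15]   # brief jump in week 4: extra test effort
    test_hours    = [20, 25, 30, 60, 35, 30]
    defects_found = [12, 10, 9, 14, 6, 3]

    # Defects found per hour of testing normalizes away bursts of test effort.
    find_rate = [f / h for f, h in zip(defects_found, test_hours)]

    # Least-squares slope of open defects versus time; negative means converging.
    n = len(weeks)
    mean_w, mean_d = sum(weeks) / n, sum(open_defects) / n
    slope = (sum((w - mean_w) * (d - mean_d) for w, d in zip(weeks, open_defects))
             / sum((w - mean_w) ** 2 for w in weeks))

    print("defects found per test hour:", [round(r, 2) for r in find_rate])
    print("open-defect trend slope per week:", round(slope, 2))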

What you don’t want to see is a positive slope. It is entirely possible in an unstable system to see more defects introduced as other defects are fixed. This is a sign of trouble, and usually also a sign of poor design. It may be necessary to go back to an earlier phase in the software life cycle in order to fix design errors that are causing too many defects.

You don’t want to see the number of changes increase as the software gets close to release (Fig. 4.7 A, B). The size of the triangles represents the size of the change. The dark triangle in the corner is the number of re-injection errors. Re-injection errors are mistakes you made while making other changes. The industry standard is about 1 in 8; the FDA experience is 80% [1]. (The error re-injection rate for your organization is another metric that would be good to have. It probably varies with the phase of development or the work product. Knowing it would help you estimate schedules and determine the amount of regression testing that is called for.)


Figure 4.7 A, B: The change trend.

You want the rate of change to the software to be trending down, as in Figure 4.7 (A), so that the re-injection errors get smaller as well. A mistake that I have often seen is the urge to ram in changes at the last minute, as in Figure 4.7 (B). Each change has a risk of introducing a defect elsewhere in the system. Last-minute changes have the same re-injection error rate as any other change—in fact, they are probably worse, because people will be taking shortcuts with testing under schedule pressure. So while it appears that a lot of work gets done at the last minute, it really just extends deadlines, because the defects are genuine—and it is far worse to have your customers discover your defects than to discover them yourself and have a chance to fix them. The stabilization phase represented by Figure 4.7 (A) is still going to happen, but it will happen after release, when it is the most expensive to fix.
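
If you do track the re-injection rate for your organization, the arithmetic is simple; the numbers below are invented for illustration and happen to land near the 1-in-8 figure quoted above.

    changes_made           = 120   # changes merged during the release cycle
    defects_traced_to_them = 15    # defects whose root cause was one of those changes

    reinjection_rate = defects_traced_to_them / changes_made
    print(f"re-injection rate: {reinjection_rate:.1%}")   # 12.5%, about 1 in 8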

The CCB can play a role here in limiting willy-nilly change. (Sometimes this board is the source of the problem, but that is another matter.) It is important for the board to remind themselves of the definition of a defect: does it so interfere with the form, fit, or function of the device that it cannot be used for its intended purpose?

Early in the project, it is okay to be proactive in fixing defects and to accept a low bar for what constitutes a defect. Late in the project, after the accrual of a great deal of verification and testing, changes should not be made lightly. This is especially the case because, under time pressure, the risk of creating a defect elsewhere and not finding it because of the limitations of regression testing is so high.

5.6. FDA Regulatory Approval Process

The first thing to determine is whether your product is a medical device, and will thus fall under the purview of the CDRH. A medical device is an instrument, apparatus, or implant used in the diagnosis or treatment of disease, including accessories, which is not a drug. It includes in vitro diagnostic equipment and the associated reagents [31]. (The technical definition is, of course, more elaborately legal than what I have stated here.) If a product meets the definition of a medical device, “it will be regulated by the Food and Drug Administration (FDA) as a medical device and is subject to premarketing and postmarketing regulatory controls” [32]. Other medical products are also regulated by the FDA, but they are regulated by other centers within the umbrella agency, instead of CDRH.

Since devices may contain software, the software in them is regulated by the FDA. Furthermore, software that is itself the medical device is regulated. Software for use in blood bank management must meet FDA regulations, as does software used in the quality system or for manufacturing medical products. This last is important—it means that software used to control the process to manufacture the device also should meet the V&V requirements we’ve been discussing.

5.7. Device Risk Classes

The FDA classifies devices into three risk classes related to the risk presented by use of the device, but the classification is really a method to define the regulatory requirements. The risk class also governs what type of approval process the device will go through. “Most Class I devices are exempt from Premarket Notification 510(k); most Class II devices require Premarket Notification 510(k); and most Class III devices require Premarket Approval [PMA]” [33]. There are various exemptions and limitations on exemptions which move the regulatory requirements boundaries around. Thus some Class II devices are not required to have a 510(k), whereas others require a PMA. In any case, all medical devices are subject to a baseline set of general controls, as defined in the QSR. The general controls mean that “[a]ll medical devices must be manufactured under a quality assurance program, be suitable for the intended use, be adequately packaged and properly labeled, and have establishment registration and device listing forms on file with the FDA” [3].

Class I is the lowest-risk class. These devices present minimal potential harm in their use. They are also often simple devices. Examples include exam gloves, bandages, hand-held surgical instruments, and preformed gold denture teeth. Most external prosthetics fall in this category, such as hearing aids, artificial limbs, and eyeglasses. Artificial hands are an example of a device that is even exempt from GMP requirements except for records and complaints. Medical library software or other nondiagnostic software would be considered Class I.

The FDA considers general controls sufficient for Class I devices. However, Class I devices with software are subject to design control. Devices of greater complexity, risk, or uncertainty may not have sufficient assurance of safety and effectiveness solely through general controls. These will be Class II or Class III devices.

Devices where we know how to establish safety and effectiveness—because we have done so for similar products—will generally be Class II with a path to market using 510(k). If we don’t have enough information to ensure safety and effectiveness, the device will be Class III and will usually require PMA before it can be sold. Class III devices are riskier: “Class III devices are usually those that support or sustain human life, are of substantial importance in preventing impairment of human health, or which present a potential, unreasonable risk of illness or injury” [35].

It is possible for a malfunction in a Class II device to cause serious injury or death, but its primary function is not life-support, which is what distinguishes it from Class III. Generally, diagnostic equipment is Class II. The reason for this is that a malfunction in such a device could result in incorrect data. The clinician could mistakenly use this incorrect data and deliver wrong therapy. Ultimately, however, the clinician is still responsible for using judgment when relying on the device and its information.

Pulse oximeters, blood pressure machines, and x-ray machines are all Class II devices. X-ray machines are interesting because not only do they provide images for diagnosis, but they also deliver harmful radiation. Hence they have an additional hazard in that a malfunction could result in a harmful dose that injures the patient. Medical devices that emit radiation—such as x-ray, ultrasound, or laser—are also regulated by CDRH and have special standards they must meet in addition to the usual QSR standards.

The highest-risk devices and the ones requiring the most scrutiny are Class III. Pacemakers are Class III because they pace the heart and might cause injury or death if they fail, whereas a device that tests pacemaker leads is a Class II device because it provides diagnostic information. Apparently because of their troubled history, silicone breast implants are Class III.

Higher-risk classes, by their nature, require more risk analysis and risk management. In addition, to establish that the risk analysis process has been conducted with due diligence, the higher risk classes require more stringent design control, more thorough documentation of the development process, and more extensive verification. Moreover, much of this additional documentation will have to be supplied in the premarket application [3]. I will have more to say about the ramifications of risk class on the software process later.

5.7.1. Determining Your Device’s Risk Class

While the amount of risk presented by a device is a general guideline to what classification it might be, you can’t rely on guesswork. The classification is based at least in part on how comfortable the agency is with the technology of a device and its history, that is, how long it has been on the market. For instance, you might think a gas machine that ventilates a patient during surgery and is used to deliver anesthetic agents would be a life-support apparatus and hence Class III. So it was 15 years ago; it is now Class II.

The best place to go for more information about devices, risk classes, and the premarket application process is the CDRH’s Device Advice website at www.fda.gov/cdrh/devadvice/. Here you will find the Product Classification database (under the CDRH Databases tab) [36]. You can search in this database for the keywords that describe your device. The keywords are not always obvious, so you may have to make several attempts. For example, the patient ventilator is known as “gas-machine, anesthesia.” The “review panel” field is the medical specialty applicable to the device. For example, I selected “anesthesiology” and browsed the 174 entries to find “gas-machine, anesthesia.”
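
If you prefer a programmatic query, the openFDA project exposes the same classification data through a web service. The sketch below assumes the endpoint and field names (device_name, device_class, regulation_number) as I understand them from the openFDA documentation; verify them before relying on the results.

    import json
    import urllib.parse
    import urllib.request

    # Query the openFDA device classification endpoint for a keyword.
    query = urllib.parse.urlencode({
        "search": 'device_name:"anesthesia"',
        "limit": 5,
    })
    url = f"https://api.fda.gov/device/classification.json?{query}"

    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)

    for rec in data.get("results", []):
        print(rec.get("device_name"), "| class", rec.get("device_class"),
              "| 21 CFR", rec.get("regulation_number"))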

The search will describe the risk class for the device, the submission type, and whether it is exempt from GMP requirements or has special requirements. There is usually a link to a regulation number, which is the actual Code of Federal Regulations Title 21 definition of the device and its classification.

Novel devices that are not in the database will require PMA. The next section will address the premarket application process necessary to get the device on the market in the United States.

5.7.2. Premarket Submissions

The process for getting approval for the sale of a medical device in the United States can take one of two pathways. A simplified application process known as a premarket notification or 510(k), after the section of the federal law that allows it, is available for many products already on the market. If a medical device is “equivalent” to a device that was available in 1976, companies can pursue a 510(k) approval path. The purpose of the 510(k) submission is to demonstrate that your device is as safe and effective as a device already being marketed. In other words, you show that your device is “substantially equivalent” to another device. This other device is commonly known as the “predicate device.”

A device is substantially equivalent if it:

• Has the same intended use as the predicate; and has the same technological characteristics as the predicate

   or

• Has the same intended use as the predicate; and has different technological characteristics, and the information submitted to the FDA does not raise new questions of safety and effectiveness and demonstrates that the device is at least as safe and effective as the legally marketed device [14]

Once FDA personnel are convinced of the substantial equivalence, they will send a letter allowing the device to be marketed. If the FDA determines the device is not substantially equivalent, the manufacturer may submit another 510(k) with additional data, petition for reclassification, or submit a PMA.

There is a fourth option for novel but lower-risk-class devices. Prior to 1997, all new devices were Class III by default and required a PMA. The risk class could only be lowered by reclassifying the device. Nowadays, applicants can use the de novo process. This is a written request, made within 30 days of a not-substantially-equivalent determination, that argues for classification into Class I or II. If successful, a 510(k) is sufficient, and the device can then serve as the predicate for other devices as well.

5.7.3. When Is a 510(k) Required?

Obviously, the first time that a device is offered on the market, if it is not exempt and does not require a PMA, it requires a 510(k). This needs to happen at least 90 days before the device goes on sale, to give the FDA sufficient time to respond.

It is not the intent of the FDA that every change must have a 510(k). However, products evolve. There are two significant changes that would trigger a 510(k). First, if the intended use of the device changes. If you make a new claim in the labeling for an application, you would need a 510(k) supporting that claim. So, for instance, if you had a device that screened for liver cancer, and you discovered it could also screen for stomach cancer, you would need another 510(k) to substantiate that the device is indeed capable of screening for stomach cancer.

For this reason, one needs to be careful about claims of “new and improved.” Unlike the people selling laundry soap or public policy, the FDA will require that you offer scientific evidence that the device is in fact improved.

The second type of change that needs a 510(k) would be something that “could significantly affect its safety or effectiveness” [14]. It is up to you and your team as experts in the problem domain and the device to determine whether a change affects safety and effectiveness, but the decision should be documented in the change control records and is subject to audit [7]. The addition of a new alarm might be a reason for a 510(k). Certainly the deletion of an existing alarm would need a 510(k) to justify why the alarm is no longer needed.

For software, minor bug fixes would not normally require a 510(k). Changes to a GUI, such as rearranging the controls or refining the text of messages, would probably not need one. Changes to requirements would have to be assessed on a case-by-case basis. New requirements that supported a new therapy or indication for use would require a 510(k).

Devices going through the investigational device exemption (IDE) or PMA process require advance approval from the FDA for changes. You do this by making a supplemental IDE or PMA submission.

By the way, companies do not get to decide whether their device is substantially equivalent. Sometimes it is obvious and the FDA is comfortable with the new device and its predicate. Sometimes it is less so. One company that the writer is familiar with sought a 510(k) on an innovative ultrasound technique for breast imaging and cancer screening using transmission technology, claiming other ultrasound imaging devices as the predicate. Unfortunately for this company, the FDA did not accept its predicate since all other ultrasound machines work with reflective technology. The company has since backed off the claim and is using reflective technology. They could also have switched to a PMA, but they would have been required to supply substantially more clinical data. A regulatory approval strategy unacceptable to the FDA can seriously disrupt corporate financial plans and the bottom line.

5.7.4. Premarket Approval

Remember that the FDA’s objective is to evaluate both the safety and effectiveness of a device. This is one way to look at the dividing line between the two approval pathways. If a similar device is already on the market, effectiveness has been established by the market itself—the need for further clinical trials is lessened. The 510(k) is available for devices like this. The judgment of whether a 510(k) is sufficient is complicated by the risk of harm presented by the device. Even if a new product is similar to a device already on the market, if it “support[s] or sustain[s] human life, [is] of substantial importance in preventing impairment of human health, or which present[s] a potential, unreasonable risk of illness or injury” [37], the FDA will require a PMA application to ensure that the device is safe and effective.

The PMA is really about science. While it is necessary to have approval before marketing the device, the PMA is the phase where the manufacturer presents laboratory and clinical data to establish through statistically valid scientific means that the device’s benefits exceed the risks and that it is effective at achieving what it promises.

Brand-new devices will first have to gain an IDE (not interactive development environment) to get approval for use on humans in the United States. Effectiveness of the device is as yet unknown at the start of the clinical trials; the IDE is a method to “try out” new devices to “collect safety and effectiveness data required to support a Premarket Approval (PMA) application or a Premarket Notification [510(k)] submission to FDA” [38]. (An IDE would be rare for a 510(k), unless the device were being investigated for a new intended use.)

A major component of the IDE is a clinical plan that defines the types of patients the device can be used in and the disease indications that it is hoped the device will alleviate. The IDE must be approved by an institutional review board—a clinical committee that evaluates the scientific basis of the study—or by the FDA if the device involves significant risk. In addition, patients must give their informed consent, the device must be labeled for investigational use only, and there are monitoring and reporting requirements. With an approved IDE, the manufacturer can ship the device for investigational use without complying with the quality system regulations, with one important exception: the device must still comply with the requirements for design control.

It is still necessary to provide in the IDE evidence that the device is safe. Any hardware standards for electrostatic discharge (ESD), electromagnetic interference (EMI), susceptibility to other electrical devices in the use environment, and leakage current must be met. A system and software risk analysis is required for the IDE, and the software V&V rigor must be sufficient for the level of concern for the software.

5.7.5. Software and Premarket Applications

The preceding is an overview of the regulatory process, which as a rule is not the direct responsibility of software engineers but rather of the firm’s regulatory organization. It is still useful to know what you may face if you have an idea for a new device. If you have regulatory responsibility, you would want to look further at the CDRH Device Advice website.

Developing the documentation for the premarket application—whether 510(k) or PMA—is an essential part of the development process for medical devices. The objective is to develop the documentation supporting the assertion that the software is validated as one piece of the larger picture of the device as a safe and effective medical product.

It is helpful to make the submission as reviewer friendly as possible. Keep jargon to a minimum and use an organization that makes sense. For complex products, the submission will probably be evaluated by several reviewers. A test document ought to be understandable by itself, although for software, it is going to be hard not to point to the upstream documents.

In tests, use a minimum of handwriting and don’t do things that would prompt questions, triggering another review cycle. Provide enough information that the reviewers will not feel compelled to ask for more.

The documentation requirements are based on the level of concern for the software, not whether the approval process is 510(k) or PMA. And the documentation and process requirements are no more than are required to build safe, quality software that works, anyway.

5.8. Software Level of Concern

The software level of concern is the software equivalent of the device risk class: a way of thinking about risk as applied to software. It is an evaluation of the extent of the harm to which the software could contribute through failure or latent design flaws. The lowest level of concern is minor—failures are unlikely to cause injury to the operator or patient. Moderate level of concern applies when there is a risk of minor injury. When a failure could cause serious injury or death, the software is of major level of concern. Failure includes providing incorrect diagnostic data that result in wrong therapy. Serious injury in this context means permanent bodily damage or the need for surgical intervention to prevent permanent damage [3]. (ANSI/AAMI/IEC 62304:2006 has a similar classification for the risk of software, but uses Class A for minor level of concern, Class B for moderate, and Class C for the highest level of concern.)
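
The correspondence can be captured in a simple lookup; this sketch paraphrases the definitions above and is not regulatory text.

    from enum import Enum

    class LevelOfConcern(Enum):   # FDA premarket guidance terminology
        MINOR = "minor"           # failure unlikely to cause injury
        MODERATE = "moderate"     # failure could cause minor injury
        MAJOR = "major"           # failure could cause serious injury or death

    # Approximate correspondence with the IEC 62304 software safety classes.
    IEC_62304_CLASS = {
        LevelOfConcern.MINOR: "A",
        LevelOfConcern.MODERATE: "B",
        LevelOfConcern.MAJOR: "C",
    }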

The level of concern is distinct from the risk class of the device, although it is related to it. A device itself could be Class II, not requiring PMA or clinical trials, but have major level-of-concern software in it. A higher level-of-concern device could have lower level-of-concern software in it, but the software will always take on the level of concern of the device. However, once software is a major level of concern, it is always a major level of concern, even if risk control measures reduce the risk to a minor level. This is because you must always consider the ways that risk control measures could fail, in which case the software would not remain at a minor level of concern.

An anesthesia gas machine, which provides life support during surgery, would be one example. Any premarket submission would require documentation for major level-of-concern software. On the other hand, even minor level-of-concern software requires documentation for major level of concern if it is a component of a Class III device. Remember, the risk class of the device—I, II, or III—governs the extent of clinical data needed and the postmarket surveillance. The level of concern of the software governs the rigor of the development process and the extent of the documentation provided for the premarket submission.

To figure out the level of concern for the software, go to the CDRH document, “Guidance for the Content of Premarket Submissions,” [3] and answer the questions in Tables 1 and 2.

You will want to write down your rationale for choosing the level of concern. It is the first component of the premarket submission. I have usually seen this done as a repetition of and answer to each of the questions posed.

Not all the software in the device or system has to have the same level of concern. The architecture could be designed such that the safety-critical software is isolated from less critical software, so that a failure in the lower-risk software could not cause harm. “Isolated” means that a malfunction in one piece of software could not possibly affect another—in other words, there has to be some measure of memory protection or better yet, the applications should run on different processors.

This notion of safety partitioning allows the team to concentrate resources on the safety-critical software, rather than doing a superficial job across all of the software because the standards are too high. It also lets you change the less critical software more easily or at least with less extensive regression testing. So, for example, the system design might have a processor devoted to safety-critical functions, and a separate GUI running on a different processor. The two communicate over a serial link. As long as the interface does not change, and the safety-critical software has been shown to meet its safety requirements regardless of failure of the communications link, the criticality of the GUI has been reduced. This is an advantage for the overall safety of the device, since the user interface can be more readily adapted to correct flaws in the human factors design. You will, of course, have to justify the risk classification in the level of concern statement of the different components if you use this strategy.
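
As an illustration of the kind of interface that has to stay stable, here is a minimal sketch of a framed message between the GUI processor and the safety-critical processor. The field layout, sync byte, and CRC choice are invented for the example; the design point is that as long as the framing and the message set do not change, the GUI can be revised with regression testing focused on its own behavior.

    import struct
    import zlib

    # Hypothetical frame: sync byte, message id, payload length, payload,
    # CRC-32 over everything that precedes the CRC.
    SYNC = 0xA5

    def frame(msg_id: int, payload: bytes) -> bytes:
        body = struct.pack("<BBH", SYNC, msg_id, len(payload)) + payload
        return body + struct.pack("<I", zlib.crc32(body))

    def unframe(data: bytes) -> tuple[int, bytes]:
        body, (crc,) = data[:-4], struct.unpack("<I", data[-4:])
        if zlib.crc32(body) != crc:
            raise ValueError("CRC mismatch: corrupt frame")      # discard it
        sync, msg_id, length = struct.unpack("<BBH", body[:4])
        if sync != SYNC or length != len(body) - 4:
            raise ValueError("malformed frame")
        return msg_id, body[4:]

    # The safety-critical side validates every frame and must remain safe even
    # if the GUI never sends another valid one.
    msg_id, payload = unframe(frame(0x10, b"\x01\x02"))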

For hardware, risk analysis includes an estimate of the probability of a hazard occurring. Because software does not fail in statistical ways, software risk analysis is conducted assuming that the probability is 100% [3, 25]. That is, if the failure can occur, it is assumed to occur. Hence software cannot rely on a low probability of occurrence to reduce the need for risk control. Since risk controls for software include a careful, rigorous process, software of higher-risk classes must always be constructed with the process appropriate for that level of risk [25].

Assuming that a software failure will always occur makes sense at one level—if there is a logic flaw in the software, it will always fail when presented with the failure-inducing inputs. On the other hand, there are failures that software can do nothing about—the failure of a CPU register or stack memory during runtime. These are hardware failures and are treated with hardware probabilities of occurrence. The risk of errant software is also mitigated with hardware risk-control measures, such as safety interlocks and watchdogs.
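
To make the assumption concrete, here is a sketch of a simple risk index in which a software cause is always scored at the maximum probability; the 1-to-5 scales are invented, and a real scheme would come from your risk management SOP.

    def risk_index(severity: int, probability: int, software_cause: bool) -> int:
        """Severity and probability on invented 1-5 scales."""
        if software_cause:
            probability = 5          # assume the failure will occur
        return severity * probability

    print(risk_index(severity=4, probability=2, software_cause=False))  # 8
    print(risk_index(severity=4, probability=2, software_cause=True))   # 20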

ANSI/AAMI/IEC 62304 describes the process requirements for the different risk levels of software. This will generate control, testing, and verification documentation. The FDA’s “Guidance for the Content of Premarket Submissions for Software Contained in Medical Devices” describes the documents that you must provide as part of the premarket submission.

These documents emerge from the software development process for the risk class of software that is part of the regulatory submission. If you are actually putting together the premarket submission, I highly recommend the guidance. However, to provide an overview of what the content looks like, I discuss each level of concern in the following sections.

5.9. Software Documentation Requirements for Premarket Submissions

5.9.1. Premarket Documentation, Minor Level of Concern

The bare minimum documentation required in the premarket submission for any software-controlled device is that required for minor level-of-concern software. For higher levels of concern, some of the items would have more detail.

• A statement of the level of concern for the software, and the rationale supporting why that level of concern was chosen.

• An overview of the features and environment in which the software operates.

• The device risk analysis for both hardware and software, and the control measures where appropriate.

• Summary of the software functional requirements. This does not need to be the full SRS for minor level-of-concern software. You still should write a full SRS, of course; here you summarize it for the sake of the reviewers.

• Traceability from requirements, especially hazard mitigations, to V&V testing.

• The system- or device-level testing, including the pass/fail criteria and the test results.

• A software revision history. This should summarize major changes to the software and internal releases made during development. It includes a statement of the software release version label and date.

5.9.2. Premarket Documentation, Moderate Level of Concern

Moderate level-of-concern software requires more documentation and more detail in the submission.

• Statement of the level of concern.

• An overview of the features and environment in which the software operates.

• The device risk analysis, as above.

• The complete SRS.

• An architecture design chart. A diagram or similar depiction that shows the relationship of major design entities in the software to each other and to hardware and data flows.

• The software design description. This is the document that describes the how in relation to the what in the SRS. This should have enough detail to make it clear that the design implements the requirements.

• Traceability analysis, as above.

• A summary of the software development environment that explains the software life cycle and the processes used to manage the life cycle activities, including a summary of SCM.

• The complete software system test protocol including the pass/fail criteria and the test results. You will also need to include a summary of the activities conducted for unit-, integration-, and system-level V&V.

• A software revision history. This should summarize major changes to the software and internal releases made during development. It includes a statement of the software release version label and date.

• A report of the software defects remaining in the device along with a statement about their impact on safety or effectiveness, including what effect they may present to operator usage and the human factors environment.

5.9.3. Premarket Documentation, Major Level of Concern

Major level-of-concern software includes everything in the moderate category, but with more detail. The required testing documentation is much more extensive. In particular, you must include the test protocols for unit and integration testing in addition to the software system test protocol. The documents for major level-of-concern software follow:

• A statement of the level of concern.

• An overview of the features and environment in which the software operates.

• The device risk analysis.

• The complete SRS.

• An architecture design chart.

• The software design description.

• Traceability analysis, as above.

• A summary of the software life cycle development plan and the processes used to manage the life cycle activities. Include an annotated list of the control documents generated during development, such as release letters, and the detailed SCM plan and software maintenance plan. These documents could just be copies of your SOPs.

• A description of the activities conducted for unit-, integration-, and system-level V&V. Include the actual unit-, integration-, and software system–test protocols with the pass/fail criteria, test results, a test report, and summary.

• A software revision history. This should summarize major changes to the software and internal releases made during development. It includes a statement of the software release version label and date.

• A report of the software defects remaining, as above.

5.9.4. Premarket Documentation, Off-the-Shelf Software

As the software industry matures and software grows in complexity, the argument to use off-the-shelf (OTS) software in medical devices rather than building your own becomes ever more compelling. It has advantages for adding features, time to market, and even reliability. But it does put a burden on the manufacturer to provide evidence that off-the-shelf (or software of unknown provenance) used in the device is of quality commensurate with the level of concern of the software. There is more about V&V of OTS and SOUP software in Section 6.1, of the same name. In this section, I describe the minimum requirements for the documentation to include in the premarket submission.

Fully identify the OTS software, including versions, and what product documentation the end user will get. Describe the hardware specifications that the software is tested with. Explain why the OTS software is appropriate for the device and outline any design limitations.
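
A sketch of such an identification record follows, with invented fields and values; the point is only to show the kind of information that has to be pinned down for each OTS component.

    from dataclasses import dataclass, field

    @dataclass
    class OtsSoftwareRecord:                 # hypothetical inventory record
        name: str
        version: str
        vendor: str
        end_user_documentation: list[str]
        tested_hardware: str
        rationale: str                       # why it is appropriate for the device
        design_limitations: list[str] = field(default_factory=list)

    rtos = OtsSoftwareRecord(
        name="ExampleRTOS",                  # invented product name
        version="4.2.1",
        vendor="Example Vendor Inc.",
        end_user_documentation=["user manual rev C"],
        tested_hardware="ARM Cortex-M4 reference board, 128 KB RAM",
        rationale="deterministic scheduling for the alarm task",
        design_limitations=["no memory protection between tasks"],
    )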

Describe the configuration and the installation, and plans for controlling the configuration and changing it if an upgrade is needed. For general-purpose computers and operating systems, describe the measures in place to prevent the use in the device of software that the medical application has not been tested with. It may be necessary to disable storage devices to prevent the downloading of nonspecified software.

Provide a description of what the OTS software does—an equivalent to the SRS, but for the OTS software. Include a discussion of the links to outside software, such as networks.

Describe the V&V activities and the test results that demonstrate the OTS software is appropriate for the device, given the hazards associated with it.

Finally, you have to include a hazard analysis of the OTS software as part of your system hazard analysis. This is conducted according to the procedures outlined previously in Section 4, Risk Management. This will result in a list of hazards presented by the OTS software, the steps you are taking to mitigate those risks, and the residual risk.

If the residual risk ranks as a major level of concern, you will have to include more documentation than that described so far. You are expected to:

• Show that the OTS software developer used product development methods that are adequate for a safety-critical device. This is best done with an audit of their construction policies, which are expected to meet FDA requirements for major level of concern software.

• Demonstrate that the V&V is adequate for a safety-critical device, including the V&V that you undertook to qualify the OTS software.

• Provide assurances that the software can still be supported if the original developer disappears. This is best done by owning the source code.

Note: If an audit of the OTS developer’s design processes is not possible and the OTS software remains at major level of concern after mitigation, it may not be appropriate in a medical device application [26].

5.9.5. The Special 510(k)

The FDA has gained enough confidence in design control that, for certain types of changes, a full documentation package for the 510(k) is not required. Another advantage is that the FDA will respond in 30 days [40]. The device must have been designed and approved for market originally with a 510(k), and the manufacturer must provide a statement to that effect. The changes will also have to be made according to design control and kept in the DHF and DMR, subject to audit. But when you submit the 510(k), you only have to include the documentation for the change. You should provide the regression testing that validates that the change did not have unintended consequences, but you only have to provide the “test plans, pass/fail criteria, and summary results rather than test data” [3].

This is a good option for a bug-fixing release. There are two kinds of change that rule out a special 510(k). The first is a change in the intended use of the device; in fact, you should include in the submission the new labeling with the changes highlighted so the FDA can check this. The second kind of change, which also results in a traditional 510(k), is a change in the fundamental scientific technology—such things as changing a manual device to automatic, or incorporating feedback into the function of the device. These would require a full 510(k).

5.10. The Review Process and What to Expect from the FDA

Once you have assembled your documentation package, you can submit it to the FDA, along with a fee. (The fee is reduced for small businesses.) You submit two copies, one of which can be electronic. Don’t bind them, because the FDA will just rip them up and rebind them for distribution to the reviewers.

Within 2 weeks you will receive an acknowledgement of receipt and a K number, which is a tracking number for the submission. Within 30 days, the FDA will tell you if your submission is not administratively complete, that is, if they think you have left something out. For a 510(k), after 90 days you can start asking the FDA for a status report on your submission. They may have questions; this starts a review cycle. Normally you have 30 days to respond to the questions, although you can request an extension. There is no statutory end to the cycle; if FDA staff remain unconvinced, you will have to answer their questions or withdraw the application and resubmit.

The PMA is similar but the time frame is 180 days. For more information, see the Device Advice web pages at www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/HowtoMarketYourDevice/PremarketSubmissions/PremarketApprovalPMA/default.htm

6. Special Topics

6.1. Software of Unknown Provenance

Software of unknown provenance (FDA uses the term “Software of Unknown Pedigree”) is, according to the ANSI 62304 definition, a “SOFTWARE ITEM that is already developed and generally available and that has not been developed for the purpose of being incorporated into the MEDICAL DEVICE (also known as ‘off-the-shelf software’) or software previously developed for which adequate records of the development PROCESSES are not available[.]” In other words, SOUP comes in two flavors: third-party software (OTS, off-the-shelf), or internal software that has not been developed using rigorous methods.

Compliance to the process standards that we have been discussing may lead one to think that it is necessary to build every part of the software, not just the application, in order to ensure adequate quality. Companies have done just that—they built not only the application, but the chip set, the compiler for the chip set, the operating system, and the rest of the peripherals.

The industry has moved away from self-development to OTS software because it makes a lot of sense to do so. It is less costly to buy than to build; you can get products to market faster if you don’t have to wait to build it yourself; there is an improvement in the quality because you are buying product (an RTOS, say) from developers who are expert in that problem domain instead of doing it yourself. Moreover, OTS software will have the benefit of many thousands of hours of concurrent testing in many different environments to establish its reliability [41]. It is a basic economic principle for businesses to focus on what they are good at and outsource what they can.

Using OTS software puts a burden on the medical device manufacturer to validate that the software thus used is fit for its intended purpose and used correctly. There is guidance from the FDA about what practices need to be applied [26]. This is discussed in more detail in Section 6.1.2, Third-Party Validation. The documentation for the premarket submission was described in Section 5.9, Software Documentation Requirements for Premarket Submissions.

The other form of SOUP is internal software, whether legacy or prototype software. It may be software that has been in use for some time. This “legacy code” was perhaps developed before the revisions to the standards and guidance documents that tightened up development processes or was software used in a lower-risk-class device that is migrating to a higher level of concern.

Prototype software is used to explore indeterminate requirements or demonstrate device feasibility. It is a valuable method for reducing project risk. It can be developed using lighter-weight, agile methods with reduced paperwork and much less error handling or hazard analysis. Or, it could use languages or development environments unsuitable to the final product. The idea is to generate code more quickly so as to determine the scope and feasibility of the project.

However, because prototypes are used to refine and validate requirements, they may have been developed without any formal requirements analysis. Nor are they often developed according to any formal design control. Of greater concern, they are usually done well before the hazard analysis—they can be used as a way to explore the hazards—and hence prototype software is not, by definition, safe.

6.1.1. Prototypes—Throwaway or Evolutionary?

As mentioned, prototyping is a way to reduce project risk. But it can introduce risks to the project as well, if the prototype creates expectations of performance and completeness that are not warranted, or if the company attempts to ship what amounts to a prototype.

Throwaway Prototypes—There are a couple of approaches to take to prototyping. The first is the development of a throwaway prototype. The prototyping is done with methods or languages faster than the final methods, but that are not appropriate for the final delivery. It is particularly valuable for exploring user interfaces. You could, for instance, mock up a user interface in Visual Basic running under Windows that mimics what you will later build for a small screen on an embedded device with much less memory and CPU power. There are no design control issues because it is not going into the final product. Instead, it can be part of the design input or design verification in that it demonstrates the methods and requirements that are going to work.

There are risks with throwaway prototyping. Some might object that it costs too much to do the work in the prototype, throw it away, and then do the work for real. But the point is that you are making a prototype because “it is cheaper to develop a throwaway prototype, learn lessons the cheap way, and then implement the real code with fewer mistakes” [42] than to make the mistakes in your expensive system, and then throw it away anyway or, worse yet, accept the bad solution because it is the only one you had time for.

It is also possible that the real work gets delayed while engineers are playing with the prototype. For this and the cost reason above, it is important to abandon the prototype as soon as it has answered the questions about feasibility or correctness of design it was asked to answer.

Prototyping can be used as an excuse to not follow the rules. But best practices for any software development apply to prototypes as well. For example, use of the VCS on a rapidly changing prototype allows us to back up when a new idea is in the wrong direction. Documenting the code helps us understand it a few weeks later when we want to improve it.

Prototyping is also sometimes used as a way of pressing on with development without talking to customers—their opinions are messy, after all, and they may want something different from what we want to build for them. A huge amount of formality is not required, but it almost never makes sense for engineers in the back room to be cooking something up without having done a preliminary assessment of what it is the customer wants. Numerous products fail because they provide solutions to problems people don’t have. (My food was fine before it was genetically modified, thank you very much.)

The prototype may create unrealistic expectations for how fast development can proceed, or what the final performance might be (because the prototype doesn’t have to do any real work). Consider “crippling” the prototype in some obvious way so that non-technical people don’t take it in their heads that the software is more complete than it is.

But the principal danger is in keeping the throwaway prototype. It sometimes happens that businesses think the prototype is good enough and it would take too long or cost too much to “redo” it. This is even a temptation to the developers. Did the prototype really get thrown away or did the code creep back in? You must be sure to get agreement among the stakeholders that you are building disposable software that will be thrown away. It is not the final product.

The FDA allows that it is reasonable to use a prototype to explore feasibility of an approach before developing the design input requirements. Design control does not apply during the feasibility phase. But they warn against “the trap of equating the prototype design with a finished product design” [16]. Reviewers have learned enough to know what to look for.

Evolutionary Prototyping—If there is an insurmountable likelihood that someone in authority is going to want to ship the prototype, it may be advisable to use this second prototyping method. Evolutionary prototyping is a type of life-cycle model. It is especially valuable when you don’t know exactly what it is you need to build. In it, you implement the riskiest or most visible parts of the system first, and then iteratively refine those subsystems while evolving the rest of the system. You don’t discard code—you evolve it into the delivered system. Unlike throwaway prototyping, an evolutionary prototype’s “code quality needs to be good enough to support extensive modification, which means that it needs to be at least as good as the code of a traditionally developed system” [42].

One of the ramifications of evolutionary prototyping is that you will have to revisit the requirements throughout the development process. As you develop the system and provide answers to what the requirements should really be, you will have to align the upstream documents with the implementation. This may lead you to think that it is best to wait until the end to capture the requirements retrospectively. Not so! Even if the requirements are uncertain, it is still worth analysis to the extent that the team’s knowledge will support it. Requirements become more of a planning tool: this is what we plan to build next, not what we think will be in the final product. The team needs to understand which requirements to build first, and which are likely to change.

Unit test harnesses fit in nicely with evolutionary prototyping. Each unit can be constructed and tested to the highest standard in isolation. You may not know when to sound an alarm, but you can build and test the software to drive a digital-to-analog converter to make a multi-frequency alarm sound. If the principles of evolutionary prototyping are followed, the code quality will be sufficient to provide a good input into the retrospective V&V process described below.
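
A sketch of that idea: a unit test for a hypothetical tone-table generator that will eventually feed the digital-to-analog converter, written before anyone has settled the final alarm policy.

    import math
    import unittest

    def alarm_samples(freqs_hz, sample_rate_hz=8000, duration_s=0.1):
        """Hypothetical helper: sum of sine tones, normalized to [-1.0, 1.0]."""
        n = int(sample_rate_hz * duration_s)
        raw = [sum(math.sin(2 * math.pi * f * i / sample_rate_hz) for f in freqs_hz)
               for i in range(n)]
        peak = max(abs(s) for s in raw) or 1.0
        return [s / peak for s in raw]

    class TestAlarmSamples(unittest.TestCase):
        def test_length_and_range(self):
            samples = alarm_samples([440, 880])
            self.assertEqual(len(samples), 800)
            self.assertTrue(all(-1.0 <= s <= 1.0 for s in samples))

        def test_silence_when_no_tones(self):
            self.assertTrue(all(s == 0.0 for s in alarm_samples([])))

    if __name__ == "__main__":
        unittest.main()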

There are a few things to look out for. As with throwaway prototyping, it is important to manage expectations. Often, quick progress is made on the visible parts of the system. But in safety-critical systems, substantial and difficult work is not visible, yet must be done for the final product. You must make the stakeholders aware of this.

As with any evolutionary development, there is a risk of poor design. You can only change the requirements so much before the design starts to creak. Evolutionary development is still the best method to deal with changing requirements, but you may have to plan some time for refactoring a design when it becomes unwieldy under the weight of changed requirements and expectations. For these reasons, it is important to use experienced personnel for evolutionary prototyping projects, who can better anticipate the ways that requirements may change and build more maintainable, robust designs.

Finally, evolutionary prototyping runs a risk of poor maintainability. Sometimes, if it is developed extremely rapidly, it is also developed sloppily, without due consideration for requirements scrubbing and proper code maintenance. Again, the unit test approach helps here. But don’t let evolutionary prototyping become a mask for code-and-fix development. You still have the tools of the retrospective V&V, if that is required, but you want the code quality at entry to that process to be sufficient that you don’t end up throwing the code away.

In any case, software not developed according to the software development process of the firm must nonetheless have an equivalent process to validate it for use in the final product. We will discuss what these processes look like next.

6.1.2. Third-Party Validation

If the firm uses software for the manufacture of medical devices, their design, or in the quality system itself, that software also requires validation. This could seem like an overwhelming task, but it need not be. Remember that the purpose of validation is to show that software is fit for its intended purpose. Most of the time, this means that the third-party validation called for is a matter of demonstrating that the software fulfills the needs of your organization and provides the correct, reliable answer, not that it is necessary to validate every implemented functionality.

Computers and desktop applications are universal in modern business practice; it is hard to imagine being able to do our work at all, let alone remain competitive, if we don’t take advantage of the tools that are available. Businesses cannot afford to build all their own applications, nor does it make much sense to do so. A medical equipment company is much better off concentrating on its core competency—medical devices—than attempting to reinvent the wheel (in spite of what your engineers may sometimes argue!). Businesses use Microsoft Word to create their process instructions, or may use Access to keep a database of defects. Excel is popular for modeling systems and analyzing data. Do we have to validate the feature clutter of Word? Bioengineers may use the Internet to look at frequently asked questions about our devices—do we need to validate the Internet?

The short answer: No. Windows, Linux, and real-time operating systems are being used in medical devices without the entire operating system having been validated. It is, as usual, a function of the level of concern of the software. Windows 95 is a bad choice for a Class III medical device expected to keep a patient alive. However, it is an adequate choice for a GUI front end to configure a device, provided that the configuration is established to work within the specified limits.

Not all the features of the operating system will have been validated. But the device itself will still have to be validated—that is, shown to work for its intended purpose—and the device’s interaction with the underlying operating system needs to be validated, so that we convince ourselves that we have correctly used the OS services.

I think of third-party validation as showing that the way we have set up and use third-party software is correct. It is probably not necessary to choose every menu item, go through every dialog box, and press every button to validate an application. We can argue that the vendor has done a satisfactory amount of functional testing. (If this is not the case, the product is so buggy we wouldn’t use it anyway, and the market will winnow out the defective products.) What we need to show is that the way we use the application, the way it interacts with our process, and the process itself provide the answers we expect.

As mentioned, a VCS of some kind is required for software development. Because it is so essential to the software development process, and it is used as the repository of the code that will go into the device, it needs to be validated as well.

I would argue that I don’t need to exercise every feature or push every toolbar button and show that it does what the manual says. In fact, I don’t care that much whether the GUI works entirely correctly. What really matters is that the tool keeps track of my software and that I am using it correctly.

In the above, I wrote about the methods for managing the software configuration. What is necessary in the validation is to show that the pieces of the VCS that you choose to use do what you intend. So, if you are going to use branches, you would want to show that branching the code gets the version you expect in the location you expect to find it. This is your requirement. Create a test that establishes that this is what happens. Clone the reference files to the branch location and use a difference tool to show that they are the same. Open one or two in an editor and verify that the version is what you expect to see. Change a file on the branch, check it back in, and verify that the information about the check-in is as expected. Verify that the code in the main branch is still untouched. Change a file in the same line in both the branch and the main. Integrate the branch change into the main and show that the merge detects the conflicting change; then, when you fix it (using the automated tools or whatever process the VCS supports), examine the results and establish that the conflict was resolved in a satisfactory way.
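
A few of these checks can even be scripted so that the evidence is repeatable. The sketch below is illustrative only; it assumes Git as the VCS and a repository containing a file named module.c, neither of which is a requirement.

```python
import subprocess

def git(*args):
    """Run a git command, fail loudly on error, and return its output."""
    return subprocess.run(["git", *args], check=True,
                          capture_output=True, text=True).stdout

# Requirement 1: branching produces an exact copy of the reference files.
original = git("show", "main:module.c")        # reference copy held on main
git("checkout", "-b", "vcs-validation")        # create and switch to the branch
assert git("diff", "main") == "", "branch must start identical to main"

# Requirement 2: a change committed on the branch leaves main untouched.
with open("module.c", "a") as f:
    f.write("/* change made on the validation branch */\n")
git("commit", "-am", "VCS validation: branch-only change")
assert git("show", "main:module.c") == original, "main must be unchanged"
print("Branching behaves as required; archive this output in the DHF.")
```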

These don’t need to be elaborate tests—remember, the vendor did some testing, or they would likely not still be in business. But you want to convince yourself and others that the software is installed correctly and that you are using it correctly.

What is important is that you follow the validation principles and document what you have done. So, you want to write down the requirements—that is to say, the functions you expect the tool to do for you. Then you want to create tests that will demonstrate that the function works. Execute the tests and archive the results in the DHF.

If you use a product for data analysis or modeling, you could of course validate the whole system—but this is going to be difficult because you cannot do the type of white-box testing and construction analysis called for in the guidance documents. But that is not really necessary. Depending on how critical the application is to the safety of your device, you can make one of four arguments about validation.

• You can validate it all yourself with formal tests. These will of necessity be black-box tests unless you can get access to the source.

• You can seek a certification from the vendor that the software has been validated in a manner similar to the design control policies. Some vendors are stepping up to this, but expect to pay for it.

• You can argue wide distribution and validation through use. That is, if you have been using a product successfully for a long time, and many other people do as well, it can be assumed to work satisfactorily.

• You can argue irrelevance. If you use a word processor to create process instructions, it is merely a tool to create the written document. The written document is the thing that matters, not the typist or the printer it was printed on.

What is important is to validate that your models are correct. Excel is an excellent product; I have no doubt it provides the right answer 99.999999% of the time, provided that the equations are correct. This is where the difficulty lies—in showing that the equations are correct, the cell references are the right ones, the constants have sufficient decimal places and so on. Validating Excel is like validating arithmetic—I think we can trust the Greeks for that one. What we have to do is show that we used it correctly. Many times I have seen people pull out their models without examining the assumptions or even the correctness of the implementation. So rather than worry extensively about what Excel is up to, if you establish that the model gives expected results, you can conclude that Excel did not introduce errors.
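
As an illustration, a spot check of a spreadsheet model against an independent hand calculation might look like the following sketch. It assumes the openpyxl library and a hypothetical workbook dose_model.xlsx whose cell B5 on the "Model" sheet holds the model’s output; substitute your own cells and reference values.

```python
from openpyxl import load_workbook

# data_only=True returns the values Excel last calculated, not the formulas.
wb = load_workbook("dose_model.xlsx", data_only=True)
ws = wb["Model"]

# Reference result worked out by hand (or with a second, independent tool)
# for the same inputs that are entered on the "Model" sheet.
expected_dose_mg = 12.5

actual_dose_mg = ws["B5"].value
assert abs(actual_dose_mg - expected_dose_mg) < 1e-6, (
    f"model output {actual_dose_mg} disagrees with the hand calculation")
print("Spreadsheet model matches the independent calculation.")
```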

What you do want to avoid is the story (okay, I have it second hand) about a large medical company that paid someone to type data from an Excel spreadsheet into a program they used for statistical analysis, because no one had validated CTRL-C/CTRL-V in Windows. Aside from the fact that this sounds so absurd that I wonder how it can be true, why didn’t someone (1) take, oh, 20 minutes and perform a validation if they were so worried, or (2) realize that having a human retype the data is many times more error-prone than relying on the software could possibly be?

To sum up: off-the-shelf software requires some level of validation. Write down the requirements that you expect the software to fulfill for you. Create some tests to show that the software meets these requirements, and document them. What you are doing is showing why it is that you think your tools are giving you the right answer.

I want next to discuss a couple of issues that arise when the process does not work as smoothly as we could wish. These are retrospective validation and the validation of software of unknown provenance.

6.1.3. Retrospective V&V

Retrospective verification and validation is the reality of systems with legacy code, and sometimes occurs in new product development, when for business reasons the product was not developed strictly according to the regulatory guidelines or my recommendations in this paper. It is allowed, but realize that it is slightly off-color and against the spirit of proper software development. Software should be constructed from the start using careful methods in order to ensure the highest quality.

Sometimes there is a temptation to think that all we have to do is clean up a prototype, perhaps software provided under contract, wave a magic quality wand, and test in quality. This seems like a way to get a product to market early and can create the illusion of sufficient quality to be a safe product. However, this does not lend itself to an evaluation of what the product should do—meet customer needs. Instead, it results in what the product can do being accepted as the requirements. Worse, the testing may demonstrate what the prototype product does do instead of what it ought to do.

Nevertheless, there are times when a retrospective V&V is warranted, or at least safe, especially if it is applied to evolutionary prototyping. Sometimes, due to circumstances beyond our control, we are faced with legacy code or prototype software that we are expected to incorporate into our device. What do we do?

The difference between legacy and prototype is that we’re stuck with legacy—we have a product and we need to provide a retrospective V&V because it was not done during the construction. With prototype software, we have the opportunity from a project point of view to take what the prototype has taught us about the requirements and the design, throw the prototype away, and start over doing the formal work required for a safety-critical device.

Software of unknown provenance implies that what is desired is software of known provenance. Provenance in this context means not documentation about what the product does (reverse engineering or design recovery) but the documentation of the process used to develop the software. In particular, the evaluation of provenance needs to establish:

• That the software is safe and its ability to cause harm was analyzed and reduced

• That practices were used to ensure adequate quality and maintainability

• That functionality is correct by tracing the implemented result back to the requirements.

We cannot change the design processes used if we accept the product as is. All we can do is enhance them with a post hoc analysis. Just because the provenance is unknown or the software was developed without formal rules does not mean that it is unusable or lacks quality. The objective is to establish that what we have is good enough to be used in a medical device—that it is safe and effective.

The first step is to proceed with a plan. The problem with the SOUP is probably that it was developed without a plan in the first place. The retrospective validation is an opportunity to rectify the oversight. As with any design control process, start with a documented plan to evaluate the adequacy of the SOUP. There are four stages to the plan:

• Determine what the SOUP does.

• Perform a risk analysis.

• Determine the validity of the requirements the SOUP implements.

• Evaluate the quality attributes of the SOUP.

First, determine what the SOUP claims to do. Find out what you can about the requirements—from the implementation, if necessary—to gain an understanding about the problem the SOUP intends to solve. Figure out how the SOUP works—what the underlying design is. Look into the code and see the way it was constructed. Evaluate any test data or test results—is there something that provides evidence that the SOUP actually works? This information may not exist separately from the code. It may be necessary to reverse engineer the higher level concepts from the implementation.

Perform a risk analysis as you would on any subsystem. Start by evaluating the level of concern of the SOUP. Remember, the higher the level of concern, the more critical the risk evaluation and the care for construction. If the SOUP is of low level of concern, accepting it becomes less of a problem.

The risk analysis should evaluate both the risks in the SOUP itself and any indirect risks it may impose through its own failure. Determine whether the SOUP has an unacceptable risk of a hazard, or whether it can interfere with other parts of the system designed to mitigate hazards. Consider whether the impact of a failure in the SOUP can be reduced through safety partitioning. If SOUP can be segregated to a low level of concern, the risk it presents because of its lack of provenance is reduced. If the software is of moderate or major level of concern and its risk is unacceptable, throw it away. Learn from it, yes, but build it from scratch with the correct risk mitigations.

Once you have determined the requirements that the SOUP actually implements, compare them to the requirements of the medical device. Are the requirements correct for the intended purpose? Can they be made close enough? If not, don’t use it. Also, if the SOUP came from a non-medical environment, evaluate how it will function in the medical device’s environment. For example, an application might freely share its information with anyone, whereas in a medical environment, some of that information might be electronic protected health information and require controls on who can see it.

Finally, consider the quality of the SOUP. Establish which quality attributes matter and the acceptance criteria for adequacy. For example, your coding standard could use Pascal-style variable names, but the SOUP is written in lower case with underbars. This would not affect the quality, really, so you would make an exception. There are other quality attributes, of course, that may have varying levels of importance to you.

Providing provenance to SOUP is a method for incorporating prototypes, external custom code, and legacy code that were developed without formal controls into medical products. This can reduce project risk and speed time to market, but it requires a documented process in its own right, and a focused hazard analysis, for the product risk to be acceptable.

6.2. Security and Privacy—HIPAA

The Health Insurance Portability and Accountability Act (HIPAA) is a U.S. law primarily concerned with the portability of health insurance coverage when people change jobs. It also establishes standards for healthcare transactions. It is of interest from the point of view of software development because of its intent to protect the privacy of patients and the integrity and privacy of their medical records.

6.2.1. Who Must Comply

Protection of privacy is mostly the responsibility of the healthcare provider [43]; unless you are in the business of providing software that directly handles patient records for reporting or billing, compliance with the provisions of the HIPAA is usually indirect. The healthcare provider will be doing the heavy lifting, but the security provisions may impose requirements on the software that you are creating for their use. (Or they may provide market opportunities for devices useful for protecting medical data or authenticating users.)

The security aspects of the HIPAA are known as the security rule. The U.S. Department of Health and Human Services (HHS) has published a series of introductory papers discussing the security rule on its website, www.cms.hhs.gov/SecurityStandard/. Quoting from the web page, “[the] rule specifies a series of administrative, technical, and physical security procedures for covered entities to use to ensure the confidentiality of electronic protected health information.”

The “covered entities” that the rule applies to are “any provider of medical or other health care services or supplies who transmits any health information in electronic form in connection with a transaction for which HHS has adopted a standard” [44]. The “transactions for which HHS has adopted a standard” is a reference to the Electronic Data Interchange (EDI) definitions having to do with health care that HHS has enumerated.

In fact, there is some ambiguity about to whom the security rule applies. There is an exemption for researchers, for example, provided they are not actually part of the covered entity’s workforce. Insofar as a researcher is a covered entity and deals with Electronic Protected Health Information (EPHI), they would have to comply. Hence, companies researching whether their products are safe and effective in clinical trials would also have to comply if they access EPHI.

This also applies to vendors who have access to EPHI during “testing, development, and repair” [45]. In this circumstance, the vendor is operating as a “business associate” and must implement appropriate security protections. The methods for doing so are flexible, however, so it ought to be possible for the covered entity and the business associate to come up with reasonable methods.

One simple method to achieve compliance with the security rule for vendors or researchers is to “de-identify” the data. “If electronic protected health information [EPHI] is de-identified (as truly anonymous information would be), it is not covered by this rule because it is no longer electronic protected health information” [45]. By making the data anonymous, it is no longer technically electronic protected health information, and thus not subject to the regulations.
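
As a simple illustration, de-identification can be as straightforward as stripping the directly identifying fields before the data leave the covered environment. The field names below are hypothetical, and a real effort must address every identifier the rule enumerates, not just these.

```python
IDENTIFYING_FIELDS = {"name", "ssn", "address", "phone", "medical_record_no"}

def de_identify(record: dict) -> dict:
    """Return a copy of the record with the direct identifiers removed."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}

patients = [
    {"name": "J. Doe", "ssn": "000-00-0000", "age": 54, "glucose_mg_dl": 110},
]
anonymous = [de_identify(p) for p in patients]
print(anonymous)   # [{'age': 54, 'glucose_mg_dl': 110}]
```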

Not everything is EPHI anyway. If the data are not in electronic form, they are not covered by the security rule, which does, after all, only apply to electronic protected health information. “Electronic” in this sense means data stored in a computer that can itself be programmed. The issue is the accessibility of the computer, not so much the physical format of the data. Therefore, personal phone calls or faxes are exempt, whereas a system that returns a fax in response to a phone menu would be dealing with EPHI and subject to the rule [45].

Patients themselves are not covered entities and thus are not subject to the rule [45]. It is nice to know that you are allowed to see your own health data, and discuss it with your doctor.

So even though your data may not be subject to the security rule, you would nevertheless want to make reasonable efforts to protect them against loss, damage, or unauthorized access, if only to prevent competitors from seeing them. But you would not be required to maintain a complete security process including a security risk assessment and a security management plan.

The provisions of the security rule may not be directly applicable to a medical device manufacturer. Nevertheless, they will be important to your customers. It may be necessary to provide the technical security solutions so that your customer can implement the required administrative policies. On the other hand, if the purpose of your software is to provide EPHI data handling, you will find that your customer is required to obtain satisfactory written assurances from your business that you will safeguard EPHI. You will need to follow the full set of regulations in the security rule including security risk assessment and a security management plan. If your hardware or software has access to EPHI, the healthcare provider will have to assess whether you also need to comply [46].

6.2.2. Recommended Security Practices

We have established some guidelines for determining the extent to which the security rule may impact your business. We next turn to a discussion of the type of issues that might be important.

Malicious Software. One aspect that may affect anyone providing software into the medical environment is the requirement for the “covered entity [to] implement: ‘Procedures for guarding against, detecting, and reporting malicious software.’ Malicious software can be thought of as any program that harms information systems, such as viruses, Trojan horses or worms” [46]. The reasoning is that malicious software could damage, destroy, or reveal EPHI data. This means that your customers will require of you assurances that your software is not an open door to malicious code that could harm the provider computer network or other devices. You may be required by the customer to provide assurances that your installation software is protected from viruses.

If your device is connected to the Internet, it may be necessary to provide anti-virus software along with regular updates to prevent just such an occurrence. It is probably insufficient to trust the healthcare provider employees to always engage in appropriate safe computing—you might want to consider using an input device dedicated to your device, or somehow protected from general use, lest it acquire a virus and infect your system. For example, rather than using a standard USB thumb drive, you could use a device that does the same thing but with a custom connector, so that it could not be plugged into an unknown computer that may be infected with a virus.

Malicious software is a more significant issue for software written to run on general-purpose computers. It is less an issue for many embedded systems whose programs execute from read-only memory and hence are difficult or impossible to infect.

Administrative Support. While monitoring log-ins and managing passwords is generally the responsibility of the healthcare provider, device makers sometimes want to limit access to functionality in the device (i.e., information relevant to engineering or system diagnostics). If the engineering mode provided access to EPHI, a single password to your device that could not be changed would not be an adequate security safeguard.
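
A minimal sketch of the alternative, using Python purely for illustration: a salted, hashed credential that can be changed in the field, rather than a single fixed password built into the device.

```python
import hashlib
import hmac
import os

def set_password(store: dict, new_password: str) -> None:
    """Store a salted hash so the credential can be changed in the field."""
    store["salt"] = os.urandom(16)
    store["hash"] = hashlib.pbkdf2_hmac(
        "sha256", new_password.encode(), store["salt"], 100_000)

def check_password(store: dict, attempt: str) -> bool:
    """Compare an attempt against the stored hash in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", attempt.encode(), store["salt"], 100_000)
    return hmac.compare_digest(candidate, store["hash"])

device_credentials = {}                                   # per-device store
set_password(device_credentials, "set-at-installation")   # changed on site
assert check_password(device_credentials, "set-at-installation")
assert not check_password(device_credentials, "factory-default")
```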

The administrative policies of covered entities may also require regular reviews of information system activities for internal audits. To do this, they may need your device or software to provide records of log-ins, file accesses, and security accesses [45].
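
For example, the device might export audit records along these lines. The field names and the one-record-per-line JSON format are assumptions for illustration; the rule does not prescribe a format.

```python
import json
import time

def audit_event(user: str, action: str, resource: str) -> str:
    """Format one audit record for export to the provider's review process."""
    return json.dumps({
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "user": user,
        "action": action,      # e.g. "login", "read", "export"
        "resource": resource,  # an identifier, never the protected data itself
    })

with open("audit.log", "a") as log:
    log.write(audit_event("service_tech_01", "login", "engineering_mode") + "\n")
```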

Physical Security. You must have the ability to back up the data and restore it in the event of a disaster, that is, somehow get the data out of the device and into a secure facility if the data are part of health information. For example, if your device contains “electronic medical records, health maintenance and case management information, digital recordings of diagnostic images, [or] electronic test results” [46], the healthcare provider would need to be able to archive this information. It is also important to provide for obliterating EPHI data from your device at end of use or disposal.

As for physical safeguards, you would want to avoid doing anything that would make it impossible for an organization to impose some standards. For example, you wouldn’t want to broadcast EPHI or make it available on a web page or some other method such that restricting it to only the people who need to know it becomes impossible.

This extends to physical media that might be used to store EPHI. The provider has to establish rules about how the media goes into or out of the facility, how it is re-used, and how it is disposed of so that protected data are not revealed to unauthorized personnel. In the case of re-use, “it is important to remove all EPHI previously stored on the media to prevent unauthorized access to the information” [47]. If you are making a storage device, the provider may want to be able to identify each device individually so that they can track them.

Risk Analysis. As is the case with risk analysis for the safety of the software or the device, depending on how close you are to the EPHI data, you may need to carry out a formal risk assessment, wherein you evaluate the potential threats and vulnerabilities to those threats and develop a risk management plan in response [48].

The threat is twofold: unauthorized access and loss of data. Both must be guarded against. CMS has a good discussion and example of risk analysis as applied to security concerns. Interestingly enough, many of the same issues and analytical practices are relevant to device risk analysis. The example has good hints for both. The document HIPAA Security Guidance for Remote Use of and Access to Electronic Protected Health Information, available at www.cms.hhs.gov/SecurityStandard/, is a useful specific discussion of remote vulnerabilities and possible risk management strategies.

The security rule is enforced by the Office for Civil Rights; violation may bring down civil monetary penalties, not to mention possible tort awards. Moreover, there is something of an ethical obligation for healthcare providers and others in the medical industry to exercise due care with private information.

While security is often not a direct concern to the manufacturers of medical devices, as information technology evolves and the desire to share information from individual diagnostic devices increases, it will become increasingly important. In addition, there are best practices for protecting data—such as guarding against viruses, unauthorized access, or data corruption—that are the sorts of things we should be doing anyway. We want our medical devices to be of the highest quality and serve customer needs; some measure of data integrity ought to be a given.

7. Summary

The development of software for medical systems is not significantly different from rigorous methods traditionally applied for other safety-critical systems. It comprises the stepwise refinement of requirements and design artifacts until delivery of the final system. The quality systems are focused on customer needs and intended uses to the point of requiring scientific evidence that customer needs are fulfilled and the device is fit for its intended purpose.

There are extensive regulations, standards, required procedures, and guidance documents, but only a small part of them interacts directly with software. I presented an overview of quality system regulation and ISO 9001. Thanks to the Global Harmonization Task Force, requirements for the U.S. market and the rest of the world are not very different. You would clearly want to know more about aspects of the regulations if you had regulatory responsibility in your firm, but for the purposes of software development it is really design control and the guidance documents related to software that are of the most interest.

Design control provides a waterfall model of development for purposes of discussing the phases of development and the deliverables development should produce. It consists of design input, where you refine customer needs into engineering specifications; design output, where you build designs that meet the specifications; and design reviews to establish that the design output meets the design input. An important life cycle activity for any safety-critical system is hazard analysis and risk management. This is a major source of requirements during the design input phase, and must be constantly verified for correctness and completeness during the design output phase.

Since software is so easy to change, software change management, including a software problem resolution process as the source of software changes, is critical to show that defects are chased to their conclusion, and no changes are made that are not fully verified and validated. Verification and validation are important activities throughout the development process, and we have seen some of the methods and deliverables associated with this. To close the loop on requirements, we have seen how important it is to trace requirements through implementation and verification to show that what we said we would do, we in fact did.

The same kinds of principles apply to off-the-shelf software. We cannot control the construction of software obtained from third parties. It is preferable if it was constructed using techniques like design control. But if we don’t know or can’t prove that, we perform a risk assessment to determine whether we can use it or have to build it properly ourselves.

The principal interaction of software development with the regulatory system, aside from audits, is the premarket submission. These take three forms, with the intensity of the documentation based on the level of concern of the software. The level of concern is minor if the software by itself cannot cause harm to the patient or user. If the software could cause a minor injury, the level of concern is moderate. If the software, by itself, could cause serious injury or death, the level of concern is major. The goal of the software risk management process is to reduce the risk after mitigations to a minor level of concern. The purpose of the premarket submission documents is to show that the software does reduce the risk to a minor level of concern, through careful construction and the design control procedures that ensure that careful construction.

Preparing the premarket submission should be trivial. Provided that all the V&V activity has been captured in the DHF, the submission is just a matter of collecting the documents from this file. That is the ideal world, but for some reason there is a great temptation to stay in “feasibility”—when design control doesn’t apply—until the whole thing is working, and then to try to apply enough retrospective V&V to get the design through the approval process.

But writing it all up afterward and reverse engineering the requirements and other deliverables out of the completed product is not the most efficient way to operate. It may superficially appear to be, because design control carries overhead: change management, writing things down in multiple places, and then having to trace through and find and fix every occurrence when a change is called for.

But if the change management is so cumbersome, perhaps you need to revisit your change management methods and tools. The process ought not to be so obnoxious that the main effort in our working lives is to circumvent it. The test of process is that it should make our lives easier. We version code from the beginning of development so that we can backtrack if we go down a flawed pathway. We do code reviews close to construction so that we can find defects when the code is fresh in our minds and before we invest a lot of time and effort in testing. We build in units so that we can debug them in a simple, controlled environment, rather than debug the units when we have integrated them with the other subsystems, confounding the unit bugs with the integration bugs.

The irony of software quality is that it serves to shorten schedules. I cannot say it better than Steve McConnell:

The General Principle of Software Quality is that improving quality reduces development costs . . . . The single biggest activity on most projects is debugging and correcting code that doesn’t work properly. Debugging and associated refactoring and other rework consume about 50 percent of the time on a traditional, naïve software-development cycle . . . . Reducing debugging by preventing errors improves productivity. Therefore, the most obvious method of shortening a development schedule is to improve the quality of the product and decrease the amount of time spent debugging and reworking the software. [42]

In circumventing or avoiding software quality, you are missing out on the true power of design control. It is not some overbearing process that the FDA came up with to make engineers’ lives difficult. It is not even a regulatory hurdle designed to protect established firms as a barrier to entry to competitors (although some would have you believe that is its main purpose). In fact, it is just good engineering practice.

Requirements are important. Why would a development team not want to decide what they are going to build? Wouldn’t it benefit everyone to have as much agreement up front as possible? Then the marketing representatives can be careful about over-promising. The engineers can build what their customers want rather than guessing, gold-plating, or just building whatever they like. The business can avoid paying more for development than the plans call for. Managers can make reasonable decisions about which product to pursue based on good estimates of the relative costs and risks.

To develop requirements after the fact is missing their value. I am going to say something extreme here. Requirements are vitally important as a planning and negotiating tool before product development begins. Their utility declines as the product is built. By the time the product is finished, they are useless: if you want to know how the product works, go use it.

This is an exaggeration of course, and biased toward the point of view of a design engineer. There are more customers for requirements than the development staff, including the testers and test plan writers. They need to know what the development team intended, written in a human language, so that they can compare the result with the assertion of performance.

But my point is that the upstream phases should not be short-changed. They have enormous value for creating the right product, for implementing it correctly, and for keeping to schedules. They comprise the means through which we build products that are safe, effective, and fit to their purpose.

I have done my utmost to ensure that the information in this chapter is accurate. I have proposed processes and methods that I believe will result in validated software that is safe, effective, and appropriate for its intended use. It is my intent that these processes be adequate to establish to regulatory bodies that the software is safe and effective. But the circumstances that you will see with your device and your company vary from mine. No method guarantees approval. Remember to use your own judgment, and be diligent.

8. FAQS

Mine is a very simple device that cannot harm a patient, but does contain software. Do I have to comply with the QSR?

• Yes, any device automated with software must meet the requirements of 21 CFR §820.30 design controls, including Class I devices.

My software is an addition to a finished medical device. Do I have to comply with the QSR?

• Yes, a product meant to augment the performance of a medical device is an accessory and as such is subject to the QS regulation.

I am making a modification to existing software in a legacy product that was designed without design control. Do I have to bother?

• Yes. The design control requirements apply to changes to existing designs even if the original design was not subject to the requirements.

I used Microsoft Excel to help me design my product. Do I have to validate Excel?

• No, but you do have to validate that the answers Excel provided were the correct ones through an inspection, sample comparison to hand calculations, or some reasonable method to verify that it was calculating correctly and you were using it correctly.

Am I finished tracing when I have traced requirements to design and tests?

• No. You want to trace from the design back to requirements as well. This is so that you can uncover tacit requirements or gold-plating. [1]

How do I achieve 100% code coverage in a functional test? Some of the conditions are defensive code that should never appear.

• Coverage testing is best done in white-box (structural) testing, stepping through the code on a debugger.

I have requirements that cannot be tested with functional tests, such as a requirement for the initial state of microcontroller registers. Should I leave out the requirements or the tests from the software system test plan?

• Not all requirements are functional requirements. Other requirements are the processor that the software will run on, or the development language. These still need to be verified, but they will be verified by means other than functional tests, such as an analysis or inspection.

What are the differences between the software documentation requirements for a 510(k) submission versus a PMA?

• The software documentation requirements are governed by the level of concern of the software, not the premarket submission. Each risk class of software has specific documentation requirements, and it does not differ with the premarket process.

Okay then, what are the software documentation differences between moderate and major levels of concern software?

• Mostly the provision of unit test data. For a full discussion, see Section 5.9.3, Premarket Documentation, Major Level of Concern.

My software is merely a diagnostic function for the device to help field service troubleshoot it (hence, minor level of concern). But the device itself is Class III. Does my software have to be treated as major level of concern?

• Software associated with a major level of concern device is itself major level of concern. This means that it requires a full hazard analysis. Of course, if it cannot harm anyone, it will already have achieved reduction to minor level of concern.

What are the differences between the FDA and other international regulatory bodies?

• They differ slightly in detail, with the FDA being more specific. (The FDA requires a design history file, for example.) Most of the differences are relative to the quality system for the organization as a whole. There are no practical differences between the methods that you would use to produce quality software for the United States versus the rest of the world.

How long can I expect before I get a response to my premarket submission?

• The FDA is required to provide a response to the premarket submission within 30 days. This can be as simple as an acknowledgment that the package is in order. More likely, it is a list of further questions, asking for further responses from you. A PMA normally takes 180 days for approval, although it may take longer; a 510(k) takes 90 days, according to the FDA. In practice, the rounds of requests for more information can take an unpredictable amount of time.

My device has proprietary algorithms protected from competitors as a trade secret. Do I have to show them to the FDA?

• While FDA inspectors may ask for design control procedures, blank forms, and required design review and design verification or validation records, where confidential information appears it may be blacked out. You will have to show the information if the inspection is related to a marketing submission, however. [7]

How can I most easily comply with the security policies of the HIPAA?

• Not everyone falls under the limits of the HIPAA and not all data are protected. But if you have any doubts, the easiest thing to do is to “de-identify” the data. If the data cannot be associated with an individual, it is exempt from the security rule.

Is encryption required for electronic protected health information?

• No, it would not be essential for a dial-up connection. But it should be seriously considered for transmitting medical data, especially over the Internet.

What are the sanctions for violating the HIPAA?

• The Office for Civil Rights is the enforcement arm, and this agency can levy civil financial penalties. In addition, companies may fire workers who they feel have violated HIPAA policies.

References

[1] Center for Devices and Radiological Health. General principles of software validation. Final guidance for industry and FDA staff. Washington, DC: FDA; 2002. Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/GuidanceDocuments/ucm085281.htm.

[2] Allen L. The Hubble Space Telescope optical systems failure. NASA Technical Report, NASA-TM-103443, Washington, DC: National Aeronautics and Space Administration; 1990. Available at: ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19910003124_1991003124.pdf.

[3] Center for Devices and Radiological Health. Guidance for the content of premarket submissions for software contained in medical devices. Washington, DC: FDA; 2005. Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/GuidanceDocuments/ucm089543.htm.

[4] International Organization for Standardization. Home page. Available at: www.iso.org/iso/home.htm.

[5] Seeley RS. Exporting: getting small device companies through the CE marking maze. MD&DI, October 1995. Available at: www.devicelink.com/mddi/archive/95/10/014.html.

[6] Trautman KA. The FDA and worldwide quality systems requirements guidebook for medical devices. Milwaukee (WI): ASQ Quality Press; 1997.

[7] Lowery A, et al. Medical device quality systems manual: a small entity compliance guide. Washington, DC: Center for Devices and Radiological Health; 1996. Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/PostmarketRequirements/QualitySystemsRegulations/MedicalDeviceQualitySystemsManual/default.htm.

[8] U.S. 21 CFR §820.1 Scope. Guidance document.

[9] U.S. 21 CFR §821. Medical Device Tracking Requirements.

[10] Food and Drug Administration. Statement on Medtronic’s Voluntary Market Suspension of Their Sprint Fidelis Defibrillator Leads, Oct 15, 2007. Available at: www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/default.htm.

[11] U.S. 21 CFR §820.70(i) Automated processes. Guidance document.

[12] U.S. 21 CFR §820.90(a) Control of nonconforming product.

[13] U.S. 21 CFR §820.198 Complaint Files.

[14] Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/HowtoMarketYourDevice/PremarketSubmissions/PremarketNotification510k/default.htm.

[15] Available at: www.gpoaccess.gov/CFR.

[16] Center for Devices and Radiological Health. Design control guidance for medical device manufacturers. Washington, DC: FDA; 1997. Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/GuidanceDocuments/ucm070627.htm.

[17] U.S. 21 CFR §820.30(a) General.

[18] U.S. 21 CFR §820.30 Design controls. Guidance document.

[19] U.S. 21 CFR §820.30(c) Design input, Guidance document.

[20] Leveson NG. Safeware: system safety and computers. Upper Saddle River, NJ: Addison-Wesley; 1995.

[21] U.S. 21 CFR §820.30(d) Design output.

[22] Global Harmonization Task Force. GHTF Guidance, 4.4.7 Design verification.

[23] U.S. 21 CFR §820.30(g) Design validation.

[24] ANSI/AAMI/IEC 62304:2006. Medical device software—Software life cycle processes.

[25] Center for Devices and Radiological Health. Guidance for industry, FDA reviewers and compliance on off-the-shelf software use in medical devices. Washington, DC: FDA; September 9, 1999. Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/GuidanceDocuments/ucm073778.htm.

[26] Economics focus: an unhealthy burden. The Economist. June 30, 2007, p. 88.

[27] Freedman DP, Weinberg G. Handbook of walkthroughs, inspections, and technical reviews. 3rd ed. New York: Dorset House; 1990.

[28] Wiegers K. Peer reviews in software: a practical guide. Boston: Addison-Wesley; 2002.

[29] Beck K. Test-driven development: by example. Boston: Addison-Wesley; 2003.

[30] Food and Drug Administration. Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/Overview/ClassifyYourDevice/ucm051512.htm.

[31] Food and Drug Administration. Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/Overview/ClassifyYourDevice/ucm051512.htm.

[32] Food and Drug Administration. Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/Overview/default.htm.

[33] Food and Drug Administration. Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/Overview/ClassifyYourDevice/ucm051549.htm.

[34] Food and Drug Administration. Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/Overview/GeneralandSpecialControls/default.htm.

[35] Food and Drug Administration. Product classification. Search classification database. Available at: www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfPCD/classification.cfm.

[36] Food and Drug Administration. Available at: www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/HowtoMarketYourDevice/PremarketSubmissions/PremarketNotification510k/ucm070201.htm.

[37] Dunn WR. Practical design of safety-critical computer systems. Solvang, CA: Reliability Press; 2002.

[38] McConnell S. Rapid development: taming wild software schedules. Redmond, WA: Microsoft Press; 1996.

[39] U.S. 45 CFR 160.103 General Administrative Requirements. Available at: www.gpoaccess.gov/CFR/retrieve.html.

[40] Security 101 for covered entities. HIPAA Security Series, vol. 2, Paper 1, p. 2. Available at: www.cms.hhs.gov/EducationMaterials/Downloads/Security101forCoveredEntities.pdf.

[41] Federal Register, vol. 68, no. 34. Thursday, February 20, 2003. Rules and regulations 8361.

[42] U.S. Department of Health and Human Services. Security standards: administrative safeguards. HIPAA Security Series, vol. 2, Paper 2. Washington, DC: DHHS; May 2005, rev. March 2007. Available at: www.cms.hhs.gov/EducationMaterials/Downloads/SecurityStandardsAdministrativeSafeguards.pdf.

[43] U.S. Department of Health and Human Services. Security standards: physical safeguards. HIPAA Security Series, vol. 2, Paper 3. Washington, DC: DHHS; February 2005, rev. March 2007. Available at: www.cms.hhs.gov/EducationMaterials/Downloads/SecurityStandardsPhysicalSafeguards.pdf.

[44] U.S. Department of Health and Human Services. Basics of risk analysis and risk management. HIPAA Security Series, vol. 2, Paper 6. Washington, DC: DHHS; June 2005, rev. March 2007. Available at: www.cms.hhs.gov/EducationMaterials/Downloads/BasicsofRiskAnalysisandRiskManagement.pdf.

Bibliography

Beck K. Extreme programming explained. Boston: Addison-Wesley; 2000.

Boehm B, Turner R. Balancing agility and discipline: a guide for the perplexed. Boston: Addison-Wesley; 2004.

Burnstein I. Practical software testing. New York: Springer; 2003.

Food and Drug Administration. CFR—Code of Federal Regulations Title 21. Quality system regulation 21 CFR Part 820. Available at: www.gpoaccess.gov/CFR/retrieve.html.

Food and Drug Administration. Device advice: device regulation and guidance. Home page. Available at: www.fda.gov/cdrh/devadvice.

International Organization for Standardization. ISO 14971:2007. Medical devices—Application of risk management to medical devices.

McConnell S. Code complete. 2nd ed. Redmond, WA: Microsoft Press; 2004.

Plum T. Reliable data structures in C. Cardiff, NJ: Plum Hall; 1985.
