CHAPTER 52

APPLICATION CONTROLS

Myles Walsh

52.1 PROTECTION IN APPLICATION DEVELOPMENT

52.2 PROTECTING ONLINE FILES

52.2.1 Types of Data Corruption

52.2.2 Database Management Subsystems

52.2.3 Lock on Update

52.2.4 Two-Phase Commit

52.2.5 Backup Files and System Logs

52.2.6 Recovery and Restart

52.2.7 Backout

52.2.8 Roll-Forward Recovery

52.2.9 Distributed Databases

52.3 PROTECTING BATCH FILES

52.3.1 Backup File Creation

52.3.2 Audit Controls

52.4 ENSURING THAT INFORMATION IN THE SYSTEM IS VALID

52.4.1 Validation Controls

52.4.2 Diagnostic Utilities

52.5 CONCLUDING REMARKS

52.6 FURTHER READING

52.7 NOTE

52.1 PROTECTION IN APPLICATION DEVELOPMENT.

In computer installations where systems development takes place, there are technologies that tend to enhance security. These technologies, together with mandatory organizational procedures and standards, force analysts and programmers to adhere to guidelines when they are developing in-house applications or systems to be marketed. This chapter reviews some of the methods programmers use to prevent and identify problems involving data corruption or unavailability.

One of the underpinnings of modern programming is the technology known as the database management system (DBMS). Many applications are developed using this technology. A contemporary relational database management system (RDBMS) supports relational databases. Relational databases themselves are based on an underlying technology developed in the 1960s and implemented through the remainder of the twentieth century. It seems certain that the technology will continue to be used for the foreseeable future.

RDBMSs are sets of programs that provide users with the tools to perform these tasks:

  • Create database structures (file or table layouts, and screens or forms).
  • Enter information into the structures.
  • Establish cross-references among the files or tables.
  • Manipulate (sort, index, and summarize) the information in the structures.
  • Import information from nonrelational database structures, and export information to nondatabase structures. These facilities provide interfaces between applications using an RDBMS and applications using conventional file structures.
  • Provide for the security and the integrity of the database.

Many DBMSs also include the tools to create a data repository. Data repositories, expanded versions of data dictionaries that store information about databases, are documentation databases that include descriptive information about all the resources included in an information systems environment. Data dictionaries are often integrated with an RDBMS, and, if used as intended, force documentation standards in information systems environments. Common file or table structures, common program modules, and common field definitions contribute significantly to the reduction of confusion and of communication breakdowns in an enterprise.

There are two primary methods of organizing files and databases for access and processing: batch and online. Protection of online files and databases requires additional planning when the systems using them are being designed, and special precautions when they are being used. Protection of batch files is more straightforward because the creation of backup copies is an inherent part of the batch process.

Another issue is data validation during data entry or during data transfer from older systems. Every attempt should be made to ensure that data are entered or transferred and validated expeditiously. Failure to get correct data into a system quickly delays the creation and display of information, the very purpose of almost all systems, and inadequate attention to validation as data enter the system causes the system to produce incorrect information. The acronym “GIGO” (garbage in, garbage out) has become part of the everyday lexicon. Unfortunately, some development techniques, when improperly used, contribute to the creation of incorrect information; examples include prototyping, joint application development (JAD), and rapid application development (RAD). By definition, a prototype is a model put together to demonstrate that the concept of a proposed system is valid; prevalidated test data are used to test the concept. Too often, however, a successfully tested prototype is put into production without adequate validation procedures. The same situation can arise with RAD and JAD projects, which can tempt practitioners to implement new software products without considering their impact on other systems within the organization.

For related topics, see Chapter 38 in this Handbook on writing secure code, Chapter 39 on software development and quality assurance, Chapter 40 on managing software patches, and Chapter 47 on operations security and production controls.

52.2 PROTECTING ONLINE FILES.

Data can be damaged through hardware problems, errors in software, and problems during concurrent access by two or more users or processes.

52.2.1 Types of Data Corruption.

Data corruption implies incorrect data. Corruption can occur because of physical factors or logical errors in programs.

Physical corruption occurs through breakdown or other failures of hardware such as computers and network equipment, especially of mass-storage devices such as magnetic disks, tapes and cartridges, and optical drives. Data corruption during transmission can occur through electromagnetic perturbations of communications cables and radio-frequency noise that affects wireless transmission. Fiber optic cables are susceptible to cross-talk and to disruption caused by physical bends or kinks in the cables; CD-ROM (compact-disk read-only memory), CD-WORM (write-once, read-many), and CD-RW (read-write) disks are susceptible to dirt and abrasion. In addition to problems in the transmission or storage media, improper settings or the effects of wear in the equipment can cause errors. Examples include misalignment of magnetic heads on disks and tapes; bad contacts in wireless transmission equipment, causing noise; and improper positioning of lasers in optical media. See Chapters 22 and 23 for extensive discussion of physical factors that can cause data corruption.

Logical corruption occurs through programming errors such as incorrect sums, bad arithmetic formulas, incorrect logical conditions, bad data in look-up tables, out-of-bounds conditions allowing reading and writing in the wrong areas of memory, and the effects of malicious software.

It is sometimes possible to identify the source of errors by examining their characteristics. For example, physical corruption characteristically shows data written inappropriately across blocks rather than in single fields defined by software applications. Such physical corruption usually has no relationship with the damaged records other than physical proximity on the disk; therefore, a cardinal sign of physical corruption is damage to files from completely different applications—a block of database records contiguous to a block of text files, for example.

In contrast, logical corruption characteristically shows errors in the same field in multiple records. Another class of logical errors shows bad values for boundary conditions (e.g., the smallest or largest possible value) but correct data within the range. Such errors rarely cross application boundaries unless there is a logical relationship among the damaged files; for example, an error in a spreadsheet (almost always due to input or programming errors by the user) may propagate to text files if object linking and embedding (OLE) are used to insert the spreadsheet results into the document. Other documents will remain unaffected by such logical errors. For more extensive discussion of logical errors, see Chapter 39 on software development and quality assurance.

52.2.2 Database Management Subsystems.

In the 1960s, representing complex data, such as accounting records, forced programmers to define their own file structures to capture relationships among the data; for example, an order header file would be linked to the corresponding order detail records through hard-coded relationships in the application programs. Each programmer or programming team had to define its own data access paths and code them explicitly. Coordinating access to multiple individually named files caused headaches for programmers. For example, it was easy to make errors such as forgetting to increment counters representing how many detail records corresponded to a master (index) record (i.e., line counts in order headers disagreeing with the actual number of detail lines). Because there was no particular protection for the files in such systems, it was easy to replace or delete individual files by mistake, leading to massive logical corruption. Deletion of header records could leave fragments of inaccessible detail records in associated files. Programs had to keep pointers up to date for forward and backward chaining. Backups sometimes failed through operator error when not all related files were included. Furthermore, every system had its own unique methods for managing the data, causing maintenance and learning headaches.

The late 1960s saw many large programming shops defining their data access methods by using library routines that all programmers could share, but there was still a heavy investment in learning the new rules whenever a programmer changed jobs. In the early 1970s, the programming field saw explosive growth in implementation of database management subsystems, where the interface to the database controlled utility functions such as indexing, pointing, and chaining. A typical DBMS would protect all files, now called data sets, against accidental erasure, and would force all reads, writes, appends, and locks to be mediated by the DBMS routines. Such interfaces usually provide for identification and authorization codes to control access to the data. In addition to these barriers to accidental or uncontrolled access, a DBMS also typically provides backup utilities to ensure that all data sets are copied together and to prevent accidental restoration of the wrong version of a data set. A DBMS also often provides ad hoc query tools that can substitute for specially written programs to accomplish simple requests; some also have more powerful report functions for formatting output with headers, footers, and calculations. Finally, a DBMS usually provides logging facilities to keep records of different types of access to the database.

The most important rules enforced by a DBMS are referential integrity to prevent common logical data corruption, automatic uniqueness constraints to preclude duplicate and conflicting records, and locking for safe concurrent access.

  • Referential integrity. Referential integrity in DBMS design ensures that every dependent record has a primary key value that matches an existing primary key in the master file. In an order database, for example, order numbers in a header file (often called the order master) are the primary keys. Each order header record contains unique information about an order, such as customer number, date placed, total price of materials, taxes, and shipping costs. The order detail file contains the dependent records, each of which can be located using its order number as the primary key. Each detail record contains information about a specific part of the corresponding order, such as an item number, quantity ordered, price, extended price, and special charges or discounts. If an order header record is to be deleted, all the order detail records must be deleted first; otherwise, the detail records would be left with no way to locate them through their primary key value. Similarly, no detail record can be added unless the master record with the same primary key already exists.
  • Uniqueness constraints. A modern DBMS allows configuration of nonrepeating primary keys; for example, in an order database, the order number would typically be a nonrepeating or unique key, because there should never be two orders with the same identifying number. Setting the uniqueness property would preclude adding a second header record with the same value in the order number field as another order. Both rules are illustrated in the brief sketch following this list.
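
The following minimal Python sketch illustrates how these two rules work together in an order master/detail pair. It is an illustration only, not the implementation used by any particular DBMS; the class names, field names, and order numbers are invented for the example.

# Illustrative sketch of uniqueness and referential-integrity enforcement.
# A real DBMS enforces these rules internally; the names here are invented.

class OrderDatabase:
    def __init__(self):
        self.headers = {}   # primary key (order number) -> header record
        self.details = {}   # order number -> list of dependent detail records

    def add_header(self, order_no, header):
        # Uniqueness constraint: no two headers may share a primary key.
        if order_no in self.headers:
            raise ValueError(f"duplicate order number {order_no}")
        self.headers[order_no] = header
        self.details.setdefault(order_no, [])

    def add_detail(self, order_no, detail):
        # Referential integrity: the master record must already exist.
        if order_no not in self.headers:
            raise ValueError(f"no order header for {order_no}")
        self.details[order_no].append(detail)

    def delete_header(self, order_no):
        # Referential integrity: dependent details must be deleted first.
        if self.details.get(order_no):
            raise ValueError(f"order {order_no} still has detail records")
        self.headers.pop(order_no, None)
        self.details.pop(order_no, None)

db = OrderDatabase()
db.add_header(1001, {"customer": 42, "date": "2024-01-15"})
db.add_detail(1001, {"item": "A-7", "qty": 3, "price": 19.95})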

52.2.3 Lock on Update.

When more than one user accesses a database, it is possible to experience conflicts over the use of specific records. A classic example occurs in an inventory database in which there are 15 units of part 1 in stock. User Albert needs to take 5 units of part 1 out of inventory, leaving a total of 10. His program reads the inventory record for part 1 and prepares to update it to show only 10 units. If, while this is going on, user Betty needs 3 units of part 1, her program reads the same record and still sees 15 units available, because user Albert's program has not yet written its update. After user Albert's program completes its update, the record shows 10 units available; but when user Betty's program then overwrites that record to show 12 units for part 1, the inventory total is off by 5 units (the record shows 12 units when only 7 actually remain). This situation is an example of a race condition as described in Section 39.4.1.6 of Chapter 39 in this Handbook.

To avoid this kind of logical corruption, a DBMS provides facilities for locking parts of the database. In the inventory example, user Albert's program would lock the inventory record for part 1 (or the entire inventory data set) until the update is completed. That way, user Betty's program would have to wait for user Albert's program to unlock the data before being able to act on the inventory record.
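
A brief Python sketch of the same scenario follows, using a threading lock as a stand-in for a DBMS record lock. The part name and quantities simply follow the example above; the sketch is illustrative rather than a model of any specific DBMS.

import threading

# The lock is a stand-in for a DBMS record lock on the part 1 inventory record.
inventory = {"part_1": 15}
record_lock = threading.Lock()

def take_from_inventory(part, qty):
    # Lock on update: read, modify, and write as one protected unit so that
    # a concurrent user cannot read the stale quantity in between.
    with record_lock:
        on_hand = inventory[part]
        inventory[part] = on_hand - qty

albert = threading.Thread(target=take_from_inventory, args=("part_1", 5))
betty = threading.Thread(target=take_from_inventory, args=("part_1", 3))
albert.start(); betty.start()
albert.join(); betty.join()
print(inventory["part_1"])   # 7; without the lock, 10 or 12 would be possible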

The obvious symptom of a bad locking strategy is a discrepancy between the database value and the real-world value. However, such a discrepancy is not by itself proof of logical corruption, because the same divergence could arise from events in the real world not reflected in the database. In the inventory example, the actual inventory may have been reduced by theft or increased by an unrecorded addition of materials.

52.2.3.1 Unconditional versus Conditional Locking.

There are two types of locking strategy: conditional locking and unconditional locking. Conditional locking attempts to obtain a lock, but if the required record or the entire data set is already locked, the DBMS returns control to the calling program with a status indicator of this condition. The application program can then be written to loop until the lock is obtained. The unconditional lock request hangs the program until the lock is granted. The DBMS or the operating system provides automatic queuing using a first-in, first-out queue.
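
The difference between the two strategies can be sketched in Python as follows, with a threading lock again standing in for a DBMS record or data-set lock; the retry interval and function names are assumptions of the illustration.

import threading
import time

record_lock = threading.Lock()   # stand-in for a DBMS record or data-set lock

def conditional_update(update):
    # Conditional lock: the request returns immediately with a status if the
    # resource is already locked, and the program loops until it succeeds.
    while not record_lock.acquire(blocking=False):
        time.sleep(0.01)             # wait briefly (or do other work) and retry
    try:
        update()
    finally:
        record_lock.release()

def unconditional_update(update):
    # Unconditional lock: the call simply blocks until the lock is granted
    # (a DBMS typically queues such requests first-in, first-out).
    with record_lock:
        update()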

52.2.3.2 Deadlocks.

Unconditional locking carries risks if multiple resources are locked by programs. For example, if program A unconditionally locks resource 1 and program B locks resource 2, trouble will occur when program A then attempts to lock resource 2 while program B tries to lock resource 1. Neither program will release the resource it already holds until it obtains the one it is waiting for, so both will wait forever, or until one of the programs is forcibly terminated. Such a situation is known as a deadlock or, more colorfully, as a deadly embrace. If programmers insist on using unconditional locking, the deadlock-prevention strategy is to ensure that all programs accessing the database lock resources in the same order (e.g., lock 1, then lock 2) and unlock in the reverse order (unlock 2, then unlock 1).
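
The ordering rule can be sketched as follows in Python; the two resources and the two programs are placeholders for whatever records or data sets the applications actually lock.

import threading

lock_1 = threading.Lock()   # resource 1
lock_2 = threading.Lock()   # resource 2

def program_a():
    # Both programs follow the same rule: lock resource 1, then resource 2.
    with lock_1:
        with lock_2:
            pass   # ... update both resources here ...
    # Leaving the nested blocks releases lock 2 first, then lock 1.

def program_b():
    # Same ordering as program_a, so a circular wait (deadlock) cannot form.
    with lock_1:
        with lock_2:
            pass   # ... update both resources here ...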

Other strategies are to keep transactions as short as possible and to avoid the necessity for operator interactions that would keep the records locked for long periods of time.

52.2.4 Two-Phase Commit.

Sometimes many records or data sets must be locked for complex transactions to be completed. For example, in a hospital's clinical systems database, discharging a patient could require modifications in data sets such as the patient-master, treatment-detail, nursing assignment master and details, doctor assignment master and details, and data sets for the financial functions. Locking everything that might be needed and waiting for a human being to enter all the appropriate data could take seconds to minutes, during which time all the affected records would be locked and unavailable to everyone else on the system. In the extreme, if an operator were to leave in the middle of a transaction, other users could be blocked out of large parts of the database for an indeterminate length of time, even for hours. The delays resulting from such locking around human intervention led to the principle that no transaction can be allowed to lock around a human intervention.

Another problem with locking around human intervention is that a system failure could terminate processing while the database was in an inconsistent state. For example, in the order-entry case described earlier, the DBMS might have updated the order detail by adding an item but not yet updated the order header to show the new total cost of the order. To reduce the likelihood of such an occurrence, a DBMS can support the two-phase commit as an aid to making changes as fast as possible and thus reducing the window of vulnerability for interruption. In the two-phase commit, the DBMS obtains copies of all the records needed when the operator begins a transaction. Once the operator has taken all necessary steps for completing the transaction, the DBMS locks and reads all the changed records again and compares the current values with the initial values; if there are no differences, the DBMS makes all the required changes and immediately unlocks all the records. However, if the current values have been modified since the operator requested the initial copies, then some other process has been active, and so the DBMS reports that verification is needed. The operator typically is given the choice of how to proceed; for example, if there are no items left in inventory, the order may have to be delayed or canceled, whereas if there are enough items, the operator need merely reinitiate the transaction with the new initial values.
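
The read, verify, and apply cycle described above can be sketched in Python as follows. The in-memory dictionary, the record keys, and the simple lock bookkeeping are assumptions of the illustration, not the mechanism of any particular DBMS.

import copy

# In-memory stand-ins for the records touched by one transaction and for
# the DBMS record locks; both are assumptions of this illustration.
database = {"order_hdr_55": {"total": 100.00}, "order_dtl_55_3": {"qty": 2}}
locked_keys = set()

def two_phase_update(keys, changes):
    # Phase 1: take unlocked copies while the operator prepares the transaction.
    initial = {k: copy.deepcopy(database[k]) for k in keys}

    # ... operator keys in the transaction; nothing is locked during this time ...

    # Phase 2: lock, re-read, and compare; apply the changes only if nothing
    # else has modified the records since the initial copies were taken.
    locked_keys.update(keys)
    try:
        if any(database[k] != initial[k] for k in keys):
            return "VERIFY NEEDED"      # another process changed the data
        for k, new_value in changes.items():
            database[k] = new_value     # make all changes while locked
        return "COMMITTED"
    finally:
        locked_keys.difference_update(keys)   # unlock immediately

print(two_phase_update(["order_hdr_55", "order_dtl_55_3"],
                       {"order_hdr_55": {"total": 150.00},
                        "order_dtl_55_3": {"qty": 3}}))   # COMMITTED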

52.2.5 Backup Files and System Logs.

When information in an online file or database is updated, the old information—that is, the information that was in the record before the change was made—is overlaid; unless steps are taken, it disappears without a trace. For this reason, many DBMSs allow an image of the original record to be copied to the transaction log file. In other cases, where the exact history of a particular group of records must be preserved, as with insurance and medical data, an application may append new records but not delete old ones.

If online files and databases were never damaged or lost, data loss would be of no concern. However, in an imperfect world, steps must be taken so that damaged or lost online files and databases can be recovered. For further details of data backup and recovery, see Chapter 57 in this Handbook.

In order to recover online files and databases, it is first necessary to make periodic backup copies and to make log copies of records that have been updated in the time between making the backups. How often a backup copy is made depends on how dynamic the files or databases are. In most enterprises, a significant number of total or partial files and databases are copied daily. It is not uncommon for computer operations departments to spend several hours each day doing backups. Whenever a backup copy of a file or database is created, there are two correct copies at that point in time. To explain how online file and database backup and system logging work, think of a single online file. The file is taken off-line and is no longer accessible to online transactions at 4 A.M., and a copy is made. Both copies of this file are identical at that time. At 6 A.M. the original file is put back online, and transactions recorded during that interval are run to update that file. From that point on, with each update transaction, the differences between that file and its backup increase.

At 2 P.M., if for some reason the file is no longer usable, the backup file is eight hours behind. At this point, the log file becomes critical to the restoration process. The log file is in time sequence and contains copies of each updated record both before and after the update was performed; it also contains copies of the transaction records themselves.

52.2.6 Recovery and Restart.

After it has been determined that an online file or database has been corrupted or destroyed, a procedure known as recovery and restart is initiated. The first step in this procedure is to copy the backup to create a new original. The next step uses the log file to reapply, in time sequence, all the transactions that had been executed since the backup copy was made.
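
In outline, and with an assumed log format (a time stamp, a record key, and an after-image for each update), the procedure looks like this Python sketch:

def recover(backup, log):
    # Step 1: copy the backup to create a new original.
    restored = dict(backup)
    # Step 2: reapply the logged updates in time sequence.
    for entry in sorted(log, key=lambda e: e["time"]):
        restored[entry["key"]] = entry["after"]
    return restored

backup = {"cust_1": {"balance": 100}}
log = [
    {"time": "08:15", "key": "cust_1", "after": {"balance": 75}},
    {"time": "09:40", "key": "cust_1", "after": {"balance": 125}},
]
print(recover(backup, log))   # {'cust_1': {'balance': 125}}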

Contemporary database systems have files that cannot be taken off-line. They are online 24 hours a day, seven days a week. In order to make backup copies, parts of the database are copied periodically (dynamic backup). Conceptually, a database can be broken up into parts. Each part can be backed up separately at different time periods. A number of schemes can be devised to back up some of the records. For example, copy to a backup file every fifth record of the database in one time period, say records 5, 10, 15, 20, and so on. A bit later, copy to another backup file records 1, 6, 11, 16, 21, and so on. If a conflict occurs between copying the record for backup and a transaction attempting to update the record, have an appropriate procedure established to let one or the other take place first. Recovery and restart, as well as backout procedures, work the same way, with the additional complexity of establishing priorities for which gets done first when conflicts occur. Even though these conflicts increase complexity, they are resolved when the procedures for recovery, restart, and backout are created.
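
The interleaved-subset approach mentioned above can be sketched as follows; the record numbering and the cycle of five are assumptions taken from the example.

def backup_subset(records, subset, cycle=5):
    # Dynamic backup sketch: copy only the records whose number falls in this
    # period's subset; the remaining subsets are copied in later periods, so
    # the database as a whole never has to be taken off-line.
    return {rec_no: rec for rec_no, rec in records.items()
            if rec_no % cycle == subset % cycle}

database = {n: {"data": n * 10} for n in range(1, 21)}   # records 1 through 20
period_1 = backup_subset(database, subset=5)             # records 5, 10, 15, 20
period_2 = backup_subset(database, subset=1)             # records 1, 6, 11, 16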

52.2.7 Backout.

The log file is also used in a process known as backout. This process is initiated when online update transactions fail to complete, leaving incomplete or partial updates in files or databases. For example, an update transaction is supposed to add amounts to three fields in three separate files. After making two of the three updates, the transaction terminates abnormally because of a program malfunction. Eventually the program gets corrected, but something has to be done to undo the two partial updates; otherwise, when the program is fixed and the transaction is rerun, those two updates would be reapplied, resulting in erroneous duplication. The backout procedure is applied to transactions that did not complete properly and whose partial updates would otherwise leave errors in the files or databases.

Recovery using log files requires marking the start and end of every transaction; if the log file records show a transaction start without the corresponding transaction end, recovery processes can recognize the transaction as incomplete. Such markers correspond to what are called checkpoints in the program design.
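
A minimal Python sketch of backout follows. The log format, with BEGIN, UPDATE, and COMMIT entries and a before-image on each update, is an assumption of the illustration.

def backout_incomplete(database, log):
    # Any transaction with a BEGIN marker but no matching COMMIT is treated
    # as incomplete, and its partial updates are undone from the before-images.
    committed = {entry["txn"] for entry in log if entry["type"] == "COMMIT"}
    for entry in reversed(log):          # undo in reverse time order
        if entry["type"] == "UPDATE" and entry["txn"] not in committed:
            database[entry["key"]] = entry["before"]
    return database

db = {"field_a": 11, "field_b": 22, "field_c": 3}
log = [
    {"type": "BEGIN",  "txn": 7},
    {"type": "UPDATE", "txn": 7, "key": "field_a", "before": 1, "after": 11},
    {"type": "UPDATE", "txn": 7, "key": "field_b", "before": 2, "after": 22},
    # transaction 7 abended before its third update and never logged a COMMIT
]
print(backout_incomplete(db, log))   # {'field_a': 1, 'field_b': 2, 'field_c': 3}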

52.2.8 Roll-Forward Recovery.

Another approach to recovery is to start from a known good state and redo all transactions that are known to have been accomplished correctly. This roll-forward recovery requires a database backup and a transaction log file to be synchronized, so that the first record in the transaction log file represents the first transaction immediately following a database backup. With these data in hand, it is possible to reconstruct all the modifications up to, but not including, the last complete transaction. All incomplete transactions are discarded, although the recovery program typically prints out all the details available for transactions that were not completed.

52.2.9 Distributed Databases.

A distributed database is one whose contents are stored in multiple databases on different computer platforms. Databases can be distributed over many sites, using several different architectures.

The simplest and most susceptible to failure is the single database server that houses the database and shares its contents among several local area networks. Whenever the server or the database is off-line, or during a crash, all clients lose access to the database. The backup and recovery of a single server database follows the procedures described earlier.

A second architecture is the replicated database server, in which there are duplicate copies of the database on multiple servers. This type of system allows clients to access information from any of several copies of the database. With a replicated server environment, clients can still have access to the database when one of the servers is off-line. Such replicated systems usually have the additional benefit of being able to balance transaction traffic so as to keep any one server from being overwhelmed. The trade-off is the database synchronization process, which increases the complexity of the replicated server architecture. Although the backup and recovery of a distributed database also can follow the procedures described earlier, it is complicated by the synchronization requirement.

A third architecture is known as a partitioned database server, in which specific subsets of a database are on two or more database servers. There could be a marketing database on one server, an accounting and finance database on a second server, and an inventory database on a third server. Because it is generally not possible to have mutually exclusive subsets of databases, synchronization must still be part of this architecture. Backup and recovery of partitioned databases requires application of the procedures just described for each of the subsets.

There is another distributed architecture known as federated database servers. This architecture is used in support of databases on two or more servers, made up of ordinarily incompatible storage models such as hierarchical and relational models, supported by different DBMSs. With the federated architecture, a single unified database definition, or schema, is created and stored on a combined database server. That server acts as an interface between application programs and the databases residing on the other servers. Queries and other transactions are sent to the combined database server, which translates these into queries and transactions to the underlying databases. Responses then return from the underlying databases to the unified schema to formulate a response to the user. Backup and recovery procedures such as those described earlier are used for the underlying databases.

Although federated database architecture can be complex and expensive to maintain, it can be less complex and less expensive than the process of supporting application programs for each of the underlying DBMSs. Federated architecture is often used in the implementation of data warehouses, which are used to extract information from multiple internal databases, as well as external databases, to support management decision making.

52.3 PROTECTING BATCH FILES.

52.3.1 Backup File Creation.

The protection of files that are updated by batch processing programs is automatic, because the file is completely recopied and the original is left in the form it was in before the program executed. Therefore, each processing cycle leaves its own backup. The name “batch processing” comes from the idea of sequentially processing transactions in a group. There are always two or more files involved: the master file that is being updated and one or more transaction files that contain the information used to update the master. The process copies master file records that have no update activity into a new master file, updates the master records that do have activity and copies them into the new master file, and does not copy those master records that are flagged as deletions in the activity file. When the process completes, there are at least three files: the original master, the activity file(s), and the new master. Backups are the original master and the activity file(s). In the next processing cycle, the new master becomes input, together with that cycle's activity file(s). If a problem is encountered with the master file, the prior cycle's master file and all subsequent activity files can be used to generate an updated master. Keeping two or three generations of master and activity files is a common practice. The term “generation” is often applied to the batch processing cycles—generation 0, generation 1, generation 2—or as grandfather, father, and son. Recently these latter terms have been replaced by “grandparent,” “parent,” and “child.”
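
The update cycle itself can be sketched in a few lines of Python. The dictionaries and the "update"/"delete" action codes are assumptions of the illustration; the point is that the old master and the activity file survive the run untouched and therefore serve as this cycle's backup.

def batch_update(old_master, activity):
    # Build a new master from the old master and the activity file.  Records
    # with no activity are copied unchanged, updated records are copied in
    # their new form, and records flagged for deletion are not copied at all.
    new_master = {}
    for key, record in old_master.items():
        action, new_record = activity.get(key, (None, None))
        if action == "delete":
            continue
        new_master[key] = new_record if action == "update" else record
    return new_master

gen_0 = {"emp_1": {"rate": 20}, "emp_2": {"rate": 25}, "emp_3": {"rate": 30}}
activity = {"emp_2": ("update", {"rate": 27}), "emp_3": ("delete", None)}
gen_1 = batch_update(gen_0, activity)   # gen_0 and activity remain as backup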

52.3.2 Audit Controls.

Another security measure applied when working with batch files involves using control totals to ensure that the batch process has executed with accuracy. Specific controls may include counts of records. There also may be control totals of the values in specific fields of the input records to compare with totals in the output, after processing additions and subtractions. For example, in a payroll system, information from time cards is submitted to a computer for processing. Along with the time cards is a transmittal that contains a count of the time cards, and totals of all the straight time and overtime hours compiled by the department that originated the time cards. A procedure is run in which the control totals on the transmittal are checked against the totals actually input from the time cards, to ensure correctness before initiating the payroll process. If discrepancies are encountered, they need to be investigated and corrected before taking the next step in processing the payroll.
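
A sketch of such a control-total check, using the payroll example, follows; the field names on the time cards and the transmittal are assumptions of the illustration.

def check_transmittal(time_cards, transmittal):
    # Recompute the record count and the hour totals from the input actually
    # received, and compare them with the control totals on the transmittal
    # before the payroll run is allowed to proceed.
    computed = {
        "count": len(time_cards),
        "straight_time": sum(card["straight_time"] for card in time_cards),
        "overtime": sum(card["overtime"] for card in time_cards),
    }
    return {name: (transmittal[name], computed[name])
            for name in computed if computed[name] != transmittal[name]}

cards = [{"straight_time": 40, "overtime": 5}, {"straight_time": 38, "overtime": 0}]
transmittal = {"count": 2, "straight_time": 78, "overtime": 5}
print(check_transmittal(cards, transmittal))   # {} -- the batch balances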

52.4 ENSURING THAT INFORMATION IN THE SYSTEM IS VALID.

The next sections present a brief overview of validation; for more extensive coverage of this topic, see Chapter 47 in this Handbook on operations security and production controls.

52.4.1 Validation Controls.

Whether an application updates its files by means of batch or online processing, or by some combination of both, validation of input is paramount. The term “GIGO” (garbage in, garbage out) is not heard as often as it used to be, yet it is as true as ever. Most contemporary systems use online files and databases, with information entered interactively, allowing for validation at the source. Several specific validation techniques have been applied to reduce the amount of incorrect information input into files and databases.

52.4.1.1 Methods for Identifying Input Errors and Unauthorized Modifications.

Data input errors are common. Fours and nines, ones and sevens, and sixes and zeroes can easily be mistaken for one another. Operators can leave out a digit or insert an extraneous one. Transposition is another type of mistake that is made from time to time. For example, an order entry operator may key in 3286 instead of 3826 (a transposition error), or perhaps 7790 instead of 7796 (a transcription error). A check digit is an extra digit added to a generated variable as a suffix. For example, a five-digit stock number could carry an extra, sixth digit; this check digit, as it is called, is calculated by applying an algorithm (performing some arithmetic) to the first five digits to produce a single-digit result. Errors such as those above are reported when the check-digit calculation performed on the keyed value produces a digit that differs from the keyed check digit. To minimize input errors, check digits can be used on any variables that are program-generated: for example, part numbers, employee numbers, and codes of various types. For more on check digits, see Chapter 4.
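
As an illustration, the Python sketch below implements one simple, hypothetical scheme (a positionally weighted sum reduced to a single digit); actual systems use a variety of algorithms, including those discussed in Chapter 4.

def check_digit(stock_number):
    # One simple scheme: weight each of the five digits by its position,
    # sum the products, and keep the last digit of the sum as the check digit.
    digits = [int(d) for d in f"{stock_number:05d}"]
    weighted_sum = sum(weight * digit for weight, digit in enumerate(digits, start=2))
    return weighted_sum % 10

def is_valid(code):
    # The stored or keyed value carries the check digit as a sixth-digit suffix.
    stock_number, keyed_check = divmod(code, 10)
    return check_digit(stock_number) == keyed_check

print(check_digit(38265))   # 8, so the full six-digit code is 382658
print(is_valid(382658))     # True
print(is_valid(386258))     # False -- the 2/6 transposition is detected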

A single check digit can sometimes conceal the existence of double or multiple input errors in an input string because of the relatively simple arithmetic schemes used. More complex schemes compute a check value over an entire string of digits, sometimes called a check sum, which provides greater power to help identify input errors or fraud; credit card numbers, for example, carry a final check digit computed over the entire account number.

An extension of the check sum, the hash total, is a common tool for identifying logical or physical data corruption. A hash total is simply the meaningless sum of numerical values, such as part numbers, rather than the meaningful sum of quantities. Recalculation of a hash total can usually indicate if the data have been corrupted since the last time the hash total was computed.
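
For example, a hash total over part numbers might be kept and rechecked as in this short sketch (the record layout is an assumption of the illustration):

def hash_total(records):
    # The sum of part numbers has no business meaning; it exists only so that
    # it can be recomputed later and compared with the stored total.
    return sum(record["part_number"] for record in records)

records = [{"part_number": 1042, "qty": 3}, {"part_number": 2087, "qty": 1}]
stored_total = hash_total(records)          # saved when the batch was accepted
records[1]["part_number"] = 2078            # a later transposition error
print(hash_total(records) == stored_total)  # False -- corruption is detected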

A digital signature is a cryptographically generated value computed over the data using a secret key. Reapplying the signature process to unmodified data produces a matching value, so any alteration of the data can be detected. In addition, as described in Chapters 7 and 37 in this Handbook, using a public key cryptosystem allows one to verify the authenticity as well as the integrity of signed data.

52.4.1.2 Range Checks.

Range checks offer another way of validating information at its source. In situations where quantities and currency values are being entered, lower and upper limits can be established. For example, a range check could be used on quantity ordered to ensure that the value entered is within specific bounds, such as no less than 10 or no more than 60 of a certain item. Although not completely error proof, such limits can at least eliminate obvious errors and reduce those not so blatant. Range checking with limits of 10 and 60 will catch the extra-zero error, where an operator enters 100 or 600, as well as the dropped-digit error of 1 or 6.
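
In code, a range check is a one-line test applied at the point of entry, as in this sketch using the limits from the example:

def range_check(quantity, low=10, high=60):
    # Reject any quantity outside the configured bounds at the point of entry.
    return low <= quantity <= high

for qty in (35, 100, 6):
    print(qty, "accepted" if range_check(qty) else "rejected")
# 35 is accepted; 100 (extra-zero error) and 6 (dropped-digit error) are rejected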

52.4.1.3 Validity Checks Using Tables.

Using tables of values or valid codes is one of the best ways to ensure that only correct information is entered into files and databases. In contemporary files and databases, data type is specified as a property of each field during the design phase, and these properties can be used to make sure that text information is not entered into numeric fields and vice versa. Properties are also used to filter out invalid data configurations. Although it is impossible to filter out misspelled last names and other open-ended types of information, it is possible to filter out invalid state codes. A table containing the 51 valid state codes (the District of Columbia is the 51st) can be used to ensure that none but those 51 can be entered. Other such entities (e.g., department codes, product codes or types, and price classes) are prime candidates for table validation.

It is also possible to build combination tables. For example, if a certain product code falls within a specific range, then only four price classes are allowable. To be concrete, when a product code falls within the 600 to 699 range, the price class can only be R, PR, OC, or CD. Such tables are used to validate information as it is initially entered into files and databases, so that entry of invalid information, whether intentional or inadvertent, is inhibited.
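
A combination table of this kind can be sketched as a lookup keyed by product-code range; the codes and price classes below are the illustrative values from the text, and the structure itself is an assumption of the example.

# Product codes in the 600-699 range may carry only one of four price classes.
ALLOWED_PRICE_CLASSES = {range(600, 700): {"R", "PR", "OC", "CD"}}

def valid_combination(product_code, price_class):
    for code_range, classes in ALLOWED_PRICE_CLASSES.items():
        if product_code in code_range:
            return price_class in classes
    return True   # no combination rule is configured for this product code

print(valid_combination(642, "PR"))   # True
print(valid_combination(642, "XX"))   # False -- the entry is inhibited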

Tables of forbidden combinations are particularly important in real-time control systems for processes that can go seriously wrong as a result of bad inputs. For example, in December 1992, an explosion at a chemical plant in the Netherlands was traced to a data input error that could have been prevented by an edit check that verified a table of forbidden chemical mixtures. An excerpt from a report on the event by Meine van der Meulen of The Netherlands Organization for Applied Scientific Research, Department of Industrial Safety, states:

The accident started with a typing error in a prescription by a Laboratory worker. Instead of tank 632 he typed tank 634. In tank 632 there was stored resin feed classic (UN-1268) and normally used in the batch process. In tank 634 DCDP (dicyclopentadiene) was stored. The operator, who had to check if the tank contents was equal with the prescription, filled the reactor with the wrong chemicals.

…a heavy explosion occurred which caused the death of 3 firemen of the works fire brigade and injured 11 workers [including] 4 firemen of the works fire brigade. The damage was estimated at several [10s] of millions NL guilders.1

Tables generally should be used only for information that is relatively static, because of the need for maintenance. Frequent table maintenance can lead to errors that have a cascading effect. If a table is incorrect, then all the information supposedly validated by that table could be incorrect throughout the files and databases.

52.4.2 Diagnostic Utilities.

Production programs need to operate on verifiably correct data. Every production program should include diagnostic utilities to scan databases for logically impossible values or deviations from reality. For example, a diagnostic routine might check to be sure that every header total matches the total computed from the linked line item values. Because older data may have been accepted under older range-validation rules, all records should be checked to ensure that they conform to the current range constraints. Similarly, any other logical constraints in an application system should be explicitly addressed in the diagnostic program. The output from the program should give all the details appropriate for identifying and correcting the incorrect data. For example, instead of simply indicating

BAD VALUE IN PRICE FIELD, RECORD 1234

the output should explain something like

PRICE FOR ITEM 1234 = $45.67 WHICH EXCEEDS CONFIGURED
LIMIT OF $40.00 SHOWN IN INVENTORY MASTER RECORD 78999
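
A diagnostic routine of this kind might look like the following Python sketch, which checks header totals against the sums of their linked line items; the record layouts and field names are assumptions of the illustration.

def diagnose_orders(headers, details):
    # Scan every order and report, in detail, any header total that does not
    # match the sum computed from the linked line items.
    messages = []
    for order_no, header in headers.items():
        computed = sum(line["extended_price"] for line in details.get(order_no, []))
        if computed != header["order_total"]:
            messages.append(
                f"ORDER {order_no}: HEADER TOTAL {header['order_total']:.2f} "
                f"DOES NOT MATCH SUM OF LINE ITEMS {computed:.2f}")
    return messages

headers = {1234: {"order_total": 45.67}}
details = {1234: [{"extended_price": 20.00}, {"extended_price": 20.00}]}
print("\n".join(diagnose_orders(headers, details)))
# ORDER 1234: HEADER TOTAL 45.67 DOES NOT MATCH SUM OF LINE ITEMS 40.00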

52.5 CONCLUDING REMARKS.

In another application of the 80/20 rule, 20 percent of the design and development work is expended in handling 80 percent of the data (the normal and the correct) going through a system, whereas 80 percent of the design and development work is expended for handling 20 percent of the data (the errors, the exceptions, and the unusual situations) going through the system. It is safe to say that more effort should be expended on validation, and on backup and recovery procedures, than is spent on those processes that handle validated data and the routine tasks of application processing.

Failure to include adequate backup and recovery procedures will jeopardize the operational integrity of an entire organization. The reliability of contemporary computer hardware together with built-in redundancy makes outages due to equipment failure less likely. Unfortunately, the same cannot be said for software and procedures. In any case, operational systems do fail, although rarely. Adequate backup and recovery processes and procedures have to be put in place to handle system failures. These processes and procedures range from disaster recovery to single system or application recovery. To ensure that sites, enterprises, and applications can be brought back after failures requires that control processes and procedures be put in place during the design and development process and kept up to date as the situation requires.

52.6 FURTHER READING

Post, G. V. Database Management Systems: Designing and Building Business Applications. New York: McGraw-Hill, 2004.

Romney, M. B., and P. J. Steinbart. Accounting Information Systems, 11th ed. Upper Saddle River, NJ: Prentice Hall, 2008.

Shelley, G. B., T. J. Cashman, and H. J. Rosenblatt. Systems Analysis and Design, 4th ed. Boston: Course Technology, 2001.

Sommerville, I. Software Engineering, 8th ed. Boston: Addison Wesley, 2006.

Stair, R. M., and G. W. Reynolds. Principles of Information Systems, 5th ed. Boston: Course Technology, 2001.

52.7 NOTE

1. Meine van der Meulen, “Large Accident at CINDU Plant in The Netherlands (additional info),” RISKS 14, No. 22, January 4, 1993.
