Choosing between MODIFY and UPDATE

You can use either the MODIFY or UPDATE statement to update a master data set with information in a transaction data set. Chapter 6 includes examples that use the UPDATE statement. Chapter 7 includes examples that use the MODIFY statement.

The MODIFY statement has many applications, whereas the UPDATE statement is limited to updating a master data set. You can use the MODIFY statement to perform the following tasks:

  • process a file sequentially to apply updates in place (without a BY statement)

  • make changes to a master data set in place by applying transactions from a transaction data set

  • update the values of variables by directly accessing observations based on observation numbers

  • update the values of variables by directly accessing observations based on the values of one or more key variables
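
The four tasks above can be sketched as follows. This is a minimal illustration, not an example taken from Chapter 6 or 7: the data sets STOCK and TRANS and the variables PARTNO, QTY, PRICE, NEWQTY, and OBSNUM are hypothetical names chosen for this sketch.

```sas
/* Hypothetical master and transaction data sets (assumed for illustration). */
data stock;
   input partno qty price;
   datalines;
101 10 2.50
102 20 4.00
103 30 1.25
;

data trans;
   input partno qty;
   datalines;
102 25
103 35
;

/* 1. Sequential processing, no BY statement: every observation is
      read in order and rewritten in place. */
data stock;
   modify stock;
   price = round(price * 1.10, 0.01);
run;

/* 2. Apply transactions in place by matching BY values. */
data stock;
   modify stock trans;
   by partno;
run;

/* 3. Direct access by observation number with POINT=.
      STOP is required because POINT= never reaches end of file;
      REPLACE is explicit because STOP ends the step before the
      implicit REPLACE would run. */
data stock;
   obsnum = 2;
   modify stock point=obsnum;
   qty = 0;
   replace;
   stop;
run;

/* 4. Direct access by key value with KEY=. The master data set
      needs an index on the key variable. */
proc datasets library=work nolist;
   modify stock;
   index create partno / unique;
quit;

data stock;
   set trans(rename=(qty=newqty));
   modify stock key=partno;
   if _iorc_ = 0 then do;      /* key found: update in place */
      qty = newqty;
      replace;
   end;
   else _error_ = 0;           /* key not found: skip, clear the error flag */
run;
```

In each step the output data set name matches the data set named in the MODIFY statement, which is what makes the update happen in place rather than in a copy.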

Only one application of MODIFY is comparable to UPDATE: using MODIFY with the BY statement to apply transactions to a data set. While MODIFY is a more powerful tool than UPDATE, UPDATE is still the tool of choice in some cases. Table 1.4 helps you choose whether to use UPDATE or MODIFY with BY.

Table 1.4. UPDATE versus MODIFY with BY

| Issue | MODIFY with BY | UPDATE |
|---|---|---|
| Disk space | Saves disk space because it updates data in place. | Requires more disk space because it produces an updated copy of the data set. |
| Sort and index | For good performance, it is strongly recommended that both data sets be sorted and that the master data set be indexed. | Requires that both data sets be sorted. |
| When to use | Use only when you expect to process a small portion of the data set. | Use if you expect to process most of the data set. |
| Duplicate BY values | Allows duplicate BY values in both the master and transaction data sets. | Allows duplicate BY values in only the transaction data set. |
| Scope of changes | Cannot change the data set descriptor information, so changes such as adding or deleting variables or variable labels are not valid. | Can make changes that require a change in the descriptor portion of a data set, such as adding new variables. |
| Error checking | Automatically generates the _IORC_ return code variable, whose value can be examined for error checking. | Needs no error checking because transactions without a corresponding master record are not applied, but are added to the data set. |
| Data set integrity | Data might be only partially updated if the task terminates abnormally. | No data loss occurs because UPDATE works on a copy of the data. |
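
The disk space and error checking rows can be illustrated with a short sketch. The data sets STOCK, TRANS, and STOCK_UPDATED and the variables PARTNO and QTY are hypothetical names assumed for this example; transaction 104 deliberately has no matching master record.

```sas
/* Hypothetical data, already in PARTNO order (assumed for illustration). */
data stock;
   input partno qty;
   datalines;
101 10
102 20
103 30
;

data trans;
   input partno qty;
   datalines;
102 25
104 40
;

/* UPDATE: writes an updated copy to a new data set. The unmatched
   transaction (104) is added to the output without any error checking. */
data stock_updated;
   update stock trans;
   by partno;
run;

/* MODIFY with BY: changes STOCK in place. Examine _IORC_ for each
   transaction to decide what to do when no master record matches. */
data stock;
   modify stock trans;
   by partno;
   select (_iorc_);
      when (%sysrc(_sok)) replace;     /* match found: rewrite in place */
      when (%sysrc(_dsenmr)) do;       /* no matching master record */
         _error_ = 0;                  /* clear the error condition */
         output;                       /* append the transaction as a new observation */
      end;
      otherwise do;
         put 'ERROR: unexpected return code ' _iorc_=;
         stop;
      end;
   end;
run;
```

Appending the unmatched transaction with OUTPUT is one possible design choice; a step could instead skip such transactions or write them to the log, depending on what the application requires.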
