Update a master data set with values from another data set. Do not replace a nonmissing value in the master data set with a missing value when the value for a common variable in the transaction data is missing. Also add a new variable to the updated master data set.
Examples 6.1, 6.2, 6.3, 6.4, and 6.5 illustrate ways to update common variables when combining data sets. The five examples apply the methods to the same master data set. Different transaction data sets are used in Examples 6.1, 6.2, and 6.3. Examples 6.4 and 6.5 use the same transaction data set and produce different resulting data sets.
Featured Step | DATA step |
Featured Step Options and Statements | UPDATE and BY statements |
Related Technique | PROC SQL, full join using the ON clause, COALESCE function |
A Closer Look | Allowing Missing Values to Replace Existing Values in the Master Data Set When Updating a Master Data Set |
Comparing the UPDATE and MERGE Statements | |
Using UPDATE and MERGE When There Are Multiple Observations in a BY Group |
Master data set BOOKLIST contains information about the chapters of three books. Transaction data set BOOKUPDATES contains updates of information that should be applied to BOOKLIST. It also has variable UPDATEDBY, which does not exist in BOOKLIST. Both data sets have unique occurrences of the combinations of values of BOOKID and CHAPTER.
BOOKLIST Obs bookid booktitle chapter author duedate editor 1 NF0586 Current Narratives 1 Smith, Rebecca 09/04/2010 Williams, Stephanie 2 NF0586 Current Narratives 2 Williams, Susan 09/04/2010 Williams, Stephanie 3 NF0586 Current Narratives 3 Torres, Christopher 09/11/2010 Williams, Stephanie 4 NF0586 Current Narratives 4 Torres, Christopher 09/11/2010 Williams, Stephanie 5 NF0586 Current Narratives 5 Powell, George 09/11/2010 Williams, Stephanie 6 NF0586 Current Narratives 6 Thompson, Tonya 09/11/2010 Williams, Stephanie 7 NF0586 Current Narratives 7 Allen, Linda 09/11/2010 Williams, Stephanie 8 NF0586 Current Narratives 8 Johnson, Tammy 09/11/2010 Williams, Stephanie 9 NF0586 Current Narratives 9 Kelly, Melissa 09/11/2010 Williams, Stephanie 10 NF0586 Current Narratives 10 Thompson, Tonya 09/11/2010 Williams, Stephanie 11 NF2413 Political Comments 1 Jones, Robin 07/31/2010 White, Michelle 12 NF2413 Political Comments 2 Sanchez, Brandon 08/07/2010 White, Michelle 13 NF2413 Political Comments 3 Jones, Robin 07/31/2010 White, Michelle 14 NF2413 Political Comments 4 Perez, Joshua 07/31/2010 White, Michelle 15 NF2413 Political Comments 5 Williams, Nicholas 07/31/2010 White, Michelle 16 NF2413 Political Comments 6 Patterson, Mary 08/14/2010 White, Michelle 17 NF2413 Political Comments 7 Torres, Christopher 08/07/2010 White, Michelle 18 NF2413 Political Comments 8 Robinson, Bonnie 08/07/2010 White, Michelle 19 NF2413 Political Comments 9 Brown, Patricia 08/07/2010 White, Michelle 20 NF8141 Favorite Essays 1 Clark, Todd 10/02/2010 Patterson, Daniel 21 NF8141 Favorite Essays 2 Barnes, David 10/02/2010 Patterson, Daniel 22 NF8141 Favorite Essays 3 Young, Richard 09/18/2010 Patterson, Daniel 23 NF8141 Favorite Essays 4 Barnes, David 10/02/2010 Patterson, Daniel 24 NF8141 Favorite Essays 5 Anderson, Daniel 09/18/2010 Patterson, Daniel 25 NF8141 Favorite Essays 6 Anderson, Daniel 09/18/2010 Patterson, Daniel 26 NF8141 Favorite Essays 7 Morris, Laura 09/18/2010 Patterson, Daniel 27 NF8141 Favorite Essays 8 Powell, George 09/18/2010 Patterson, Daniel BOOKUPDATES Obs bookid chapter author duedate updatedby 1 NF0586 4 Banks, James . JWE 2 NF0586 9 King, Weston 09/18/2010 JWE 3 NF8141 6 10/02/2010 SAW
Output 6.1a REVISEDLIST Data SetExample 6.1 REVISEDLIST Data Set Created with DATA Step Obs bookid booktitle chapter author duedate editor updatedby 1 NF0586 Current Narratives 1 Smith, Rebecca 09/04/2010 Williams, Stephanie 2 NF0586 Current Narratives 2 Williams, Susan 09/04/2010 Williams, Stephanie 3 NF0586 Current Narratives 3 Torres, Christopher 09/11/2010 Williams, Stephanie 4 NF0586 Current Narratives 4 Banks, James 09/11/2010 Williams, Stephanie JWE 5 NF0586 Current Narratives 5 Powell, George 09/11/2010 Williams, Stephanie 6 NF0586 Current Narratives 6 Thompson, Tonya 09/11/2010 Williams, Stephanie 7 NF0586 Current Narratives 7 Allen, Linda 09/11/2010 Williams, Stephanie 8 NF0586 Current Narratives 8 Johnson, Tammy 09/11/2010 Williams, Stephanie 9 NF0586 Current Narratives 9 King, Weston 09/18/2010 Williams, Stephanie JWE 10 NF0586 Current Narratives 10 Thompson, Tonya 09/11/2010 Williams, Stephanie 11 NF2413 Political Comments 1 Jones, Robin 07/31/2010 White, Michelle 12 NF2413 Political Comments 2 Sanchez, Brandon 08/07/2010 White, Michelle 13 NF2413 Political Comments 3 Jones, Robin 07/31/2010 White, Michelle 14 NF2413 Political Comments 4 Perez, Joshua 07/31/2010 White, Michelle 15 NF2413 Political Comments 5 Williams, Nicholas 07/31/2010 White, Michelle 16 NF2413 Political Comments 6 Patterson, Mary 08/14/2010 White, Michelle 17 NF2413 Political Comments 7 Torres, Christopher 08/07/2010 White, Michelle 18 NF2413 Political Comments 8 Robinson, Bonnie 08/07/2010 White, Michelle 19 NF2413 Political Comments 9 Brown, Patricia 08/07/2010 White, Michelle 20 NF8141 Favorite Essays 1 Clark, Todd 10/02/2010 Patterson, Daniel 21 NF8141 Favorite Essays 2 Barnes, David 10/02/2010 Patterson, Daniel 22 NF8141 Favorite Essays 3 Young, Richard 09/18/2010 Patterson, Daniel 23 NF8141 Favorite Essays 4 Barnes, David 10/02/2010 Patterson, Daniel 24 NF8141 Favorite Essays 5 Anderson, Daniel 09/18/2010 Patterson, Daniel 25 NF8141 Favorite Essays 6 Anderson, Daniel 10/02/2010 Patterson, Daniel SAW 26 NF8141 Favorite Essays 7 Morris, Laura 09/18/2010 Patterson, Daniel 27 NF8141 Favorite Essays 8 Powell, George 09/18/2010 Patterson, Daniel |
This example shows you how to update a master data set with only nonmissing values from observations in a transaction data set. The UPDATE and BY statements in the DATA step update the master data set for observations in the two data sets that match according to the values of the BY variables.
Data set BOOKLIST is the master data set, and it tracks information for the chapters of three books. The transaction data set BOOKUPDATES contains updated information about selected chapters in the books.
No duplicates are present in either the master data set or the transaction data set. Additional programming statements are required when duplicate BY values exist. The only variables that are present in the transaction data set besides the matching variables are the variables to be updated. This program also adds one new variable to the output data set that exists in the transaction data set.
The DATA step updates the two variables, AUTHOR and DUEDATE, in data set BOOKLIST with information from BOOKUPDATES when a variable value is not missing. It uses the default action of the UPDATE and BY statements to perform the updates. Observations are matched by two BY variables, BOOKID and CHAPTER.
Missing values are present in the first and third observations of the transaction data set BOOKUPDATES.
Assume both data sets were sorted or indexed by BOOKID and CHAPTER prior to the DATA step.
A MERGE statement could be used instead of the UPDATE statement. However, in this example, it is simpler to use the UPDATE statement. You can take advantage of its default action of not replacing a variable's nonmissing value in the master data set with a missing value when the variable value is missing in the matching observation in the transaction data set. For more information about when to choose UPDATE or MERGE, see "Comparing the UPDATE and MERGE Statements" in the following "A Closer Look" section and SAS documentation.
Create data set REVISEDLIST. Update data set BOOKLIST with information in BOOKUPDATES. Place the master data set first in the UPDATE statement. Specify how to match observations in the two data sets.
data revisedlist; update booklist bookupdates; by bookid chapter; run;
The following PROC SQL step creates a table equivalent to the data set that was created by the DATA step in the main example. A full join using the ON clause combines the two tables so that all matching and nonmatching rows in the two input tables are present in output table REVISEDBOOKLIST along with updates of specific rows that were contributed from BOOKUPDATES.
The COALESCE function performs the update of columns AUTHOR and DUEDATE in BOOKLIST. The first argument to each call to COALESCE is the column from BOOKUPDATES. The second argument is the column from BOOKLIST. The COALESCE function checks each of its arguments from left to right until it finds a nonmissing value. When the first argument is nonmissing, PROC SQL replaces the value in BOOKLIST with the value from BOOKUPDATES. When the first argument is missing, the value in BOOKLIST for the column remains the same.
The step also applies the COALESCE function to the matching columns, BOOKID and CHAPTER. In this example, you could select BOOKID and CHAPTER from the master table (B.BOOKID and B.CHAPTER) because all the transactions have a match in the master data set. However, if you have rows in your transaction table that do not have a match in your master table, the values of B.BOOKID are missing. Using the COALESCE function ensures that the unmatched values for the matching columns are copied to the output table.
Missing values are present in the first and third rows of table BOOKUPDATES. The missing values are not copied to REVISEDBOOKLIST.
The SELECT statement includes column UPDATEDBY, which is found only in the transaction table. Compared to the main example, you need to specify all columns you want to include in the output table when using PROC SQL. Unless you use the DROP or KEEP statements or options, the UPDATE statement in the main example copies all new variables in the transaction data set to the updated data set.
Create table REVISEDLIST. Apply the COALESCE function to the two matching columns so that if there are nonmatches in the transaction table, the nonmissing unmatched value is copied to the output table. Apply the COALESCE function to the columns that are being updated. Put the column contributed from the transaction table BOOKUPDATES first so that when it is nonmissing, it replaces the value in BOOKLIST.
Combine the two tables with a full join so that all matching and nonmatching rows are present in the output table. Specify an alias for each table. Specify the columns that match the two tables.
proc sql; create table revisedlist as select coalesce(u.bookid,b.bookid) as bookid, booktitle, coalesce(u.chapter,b.chapter) as chapter, coalesce(u.author,b.author) as author, coalesce(u.duedate,b.duedate) as duedate format=mmddyy10., editor, updatedby from booklist b full join bookupdates u on b.bookid=u.bookid and b.chapter=u.chapter; quit;
The DATA step in the preceding main example relies on the default action of the UPDATE statement that prevents missing values in the transaction data set from replacing existing values in the master data set. This action is controlled by UPDATE statement option UPDATEMODE=. This option can have one of two values: MISSINGCHECK or NOMISSINGCHECK. The default value for UPDATEMODE= is MISSINGCHECK. Because the default action is what this example requires, it was not necessary to add option UPDATEMODE= to the UPDATE statement.
If you want to allow missing values in the transaction data set to replace existing values in the master data set, add UPDATEMODE=NOMISSINGCHECK to the UPDATE statement. The DATA step in the main example is modified here to include this option:
data revisedmissing; update booklist bookupdates updatemode=nomissingcheck; by bookid chapter; run;
Output 6.1b displays a PROC PRINT of the output data set REVISEDMISSING that was created by the preceding DATA step. Two missing values have replaced existing values in the updated data set. These correspond to the two missing values in the transaction data set:
variable DUEDATE for BOOKID= "NF0586" CHAPTER=4
variable AUTHOR for BOOKID= "NF8141" CHAPTER=6
Output 6.1b REVISEDMISSING Data SetExample 6.1 REVISEDMISSING Data Set Created with DATA Step Obs bookid booktitle chapter author duedate editor updatedby 1 NF0586 Current Narratives 1 Smith, Rebecca 09/04/2010 Williams, Stephanie 2 NF0586 Current Narratives 2 Williams, Susan 09/04/2010 Williams, Stephanie 3 NF0586 Current Narratives 3 Torres, Christopher 09/11/2010 Williams, Stephanie 4 NF0586 Current Narratives 4 Banks, James . Williams, Stephanie JWE 5 NF0586 Current Narratives 5 Powell, George 09/11/2010 Williams, Stephanie 6 NF0586 Current Narratives 6 Thompson, Tonya 09/11/2010 Williams, Stephanie 7 NF0586 Current Narratives 7 Allen, Linda 09/11/2010 Williams, Stephanie 8 NF0586 Current Narratives 8 Johnson, Tammy 09/11/2010 Williams, Stephanie 9 NF0586 Current Narratives 9 King, Weston 09/18/2010 Williams, Stephanie JWE 10 NF0586 Current Narratives 10 Thompson, Tonya 09/11/2010 Williams, Stephanie 11 NF2413 Political Comments 1 Jones, Robin 07/31/2010 White, Michelle 12 NF2413 Political Comments 2 Sanchez, Brandon 08/07/2010 White, Michelle 13 NF2413 Political Comments 3 Jones, Robin 07/31/2010 White, Michelle 14 NF2413 Political Comments 4 Perez, Joshua 07/31/2010 White, Michelle 15 NF2413 Political Comments 5 Williams, Nicholas 07/31/2010 White, Michelle 16 NF2413 Political Comments 6 Patterson, Mary 08/14/2010 White, Michelle 17 NF2413 Political Comments 7 Torres, Christopher 08/07/2010 White, Michelle 18 NF2413 Political Comments 8 Robinson, Bonnie 08/07/2010 White, Michelle 19 NF2413 Political Comments 9 Brown, Patricia 08/07/2010 White, Michelle 20 NF8141 Favorite Essays 1 Clark, Todd 10/02/2010 Patterson, Daniel 21 NF8141 Favorite Essays 2 Barnes, David 10/02/2010 Patterson, Daniel 22 NF8141 Favorite Essays 3 Young, Richard 09/18/2010 Patterson, Daniel 23 NF8141 Favorite Essays 4 Barnes, David 10/02/2010 Patterson, Daniel 24 NF8141 Favorite Essays 5 Anderson, Daniel 09/18/2010 Patterson, Daniel 25 NF8141 Favorite Essays 6 10/02/2010 Patterson, Daniel SAW 26 NF8141 Favorite Essays 7 Morris, Laura 09/18/2010 Patterson, Daniel 27 NF8141 Favorite Essays 8 Powell, George 09/18/2010 Patterson, Daniel |
The UPDATE and MERGE statements can perform the similar task of updating a master data set with information that is supplied in a transaction data set. It is important to understand their usage differences so that you choose the better tool for your application. Table 6.1 presents an abridged comparison of the two statements. For detailed information about these statements, see SAS documentation.
UPDATE Statement | MERGE Statement |
---|---|
Can process only two data sets at a time | Can process an unlimited number of data sets |
Can update and add observations to the master data set | Can update and add observations to the output data set |
Can add and delete variables in the master data set | Can add and delete variables in the output data set |
Must list master data set first | Usually list master data set first, but not required |
Requires a BY statement | Updating usually uses a BY statement, but not required |
Requires the master and transaction data sets be sorted or indexed by the matching variables in the BY statement | Updating usually expects the input data sets to be sorted or indexed by the matching variables in the BY statement, but not required |
Must be no more than one observation in the BY group in the master data set (the DATA step will execute but with warnings, and SAS sets the ERROR variable to 1) | Updating usually expects no more than one observation in the BY group in the master data set, but not required (this condition does not set the ERROR variable to 1) |
Outputs observation at the end of the BY group | Outputs each observation at the bottom of the DATA step or by explicit OUTPUT statement |
Does not replace missing values in the master data set with missing values found in the matched observations in the transaction data set unless the UPDATE statement option UPDATEMODE= is set to NOMISSINGCHECK or if using special missing values | Replaces existing values in the first data set with missing values found in the matched observations in the second data set if the variables have the same names |
The next section describes with examples how the UPDATE and MERGE statements process your master and transaction data sets when they have multiple observations in a BY group.
The UPDATE and MERGE statements differ in how they combine data sets when either the master or transaction data set contains multiple observations per BY group. The next code samples contrast how the two statements handle these duplicates.
Transaction data set with multiple observations in a BY group: When you use the UPDATE statement and have more than one observation in the transaction data set in a BY group, only the last in the series of multiple observations updates the matching observation in the master data set. When you use the MERGE statement and have more than one observation in the transaction data set in a BY group, all transaction observations in that BY group are added to the master data set.
For example, the following DATA step creates the transaction data set DUPDATES and it has three observations in the BY group of BOOKID= "NF8141" and CHAPTER=6.
data dupdates; input bookid $ 1-6 chapter duedate : mmddyy10.; format duedate mmddyy10.; datalines; NF8141 6 10/02/2010 NF8141 6 10/22/2010 NF8141 6 11/01/2010 ;;;;
A DATA step that uses the UPDATE statement and a DATA step that uses the MERGE statement follow. Both DATA steps apply the transactions in DUPDATES to the original BOOKLIST data set that was created in the main example, and the two DATA steps produce different output data sets.
Using the UPDATE Statement | Using the MERGE Statement |
---|---|
data upddups; update booklist dupdates; by bookid chapter; run; | data mrgdups; merge booklist dupdates; by bookid chapter; run; |
PROC PRINTs of UPDDUPS and MRGDUPS show the differences in the results. Data set UPDDUPS has only one observation for BOOKID= "NF8141" CHAPTER=6, and this observation has the value for DUEDATE equal to the value of DUEDATE in the las observation in DUPDATES (11/01/2010). Data set MRGDUPS has three observations for BOOKID= "NF8141" CHAPTER=6, one for each of the three observations in MRGDUPS.
The output lists only variables BOOKID, CHAPTER, and DUEDATE for the observations where BOOKID= "NF8141". The rows for CHAPTER=6 in each output display are highlighted.
Using the UPDATE Statement | Using the MERGE Statement |
---|---|
UPDDUPS Obs bookid chapter duedate 20 NF8141 1 10/02/2010 21 NF8141 2 10/02/2010 22 NF8141 3 09/18/2010 23 NF8141 4 10/02/2010 24 NF8141 5 09/18/2010 25 NF8141 6 11/01/2010 26 NF8141 7 09/18/2010 27 NF8141 8 09/18/2010 | MRGDUPS Obs bookid chapter duedate 20 NF8141 1 10/02/2010 21 NF8141 2 10/02/2010 22 NF8141 3 09/18/2010 23 NF8141 4 10/02/2010 24 NF8141 5 09/18/2010 25 NF8141 6 10/02/2010 26 NF8141 6 10/22/2010 27 NF8141 6 11/01/2010 28 NF8141 7 09/18/2010 29 NF8141 7 09/18/2010 |
Master data set with multiple observations in a BY group: When you use the UPDATE statement and have more than one observation in the master data set in a BY group, the variables in common between the master and transaction data sets on the first matching observation in the BY group are updated. The variables in common between the master and transaction data sets in subsequent matching observations in that BY group of multiple observations are not updated. Additionally, values for any new variables that are present in the transaction data set are copied only to the first matching observation in a BY group with multiple observations. Missing values are assigned to these new variables for all subsequent matching observations in that BY group.
SAS issues a WARNING and sets the automatic variable _ERROR_ to 1 when your master data set contains more than one observation in a BY group. The DATA step can execute, but this is a process you should program carefully because the design of the UPDATE statement is to process only one observation per BY group in the master data set. Here is the WARNING message that SAS displays in the situation of multiple observations per BY group:
WARNING: The MASTER data set contains more than one observation for a BY group.
When you use the MERGE statement and have more than one observation in the master data set in a BY group, only the variables in common between the master and transaction data sets in the first matching observation in the BY group are updated. This part of the MERGE statement process is identical to that of the UPDATE statement.
The process of adding new variables when using the MERGE statement differs from that of the UPDATE statement. With the MERGE statement, the values for any new variables that are present in the transaction data set are copied to all matching observations in the series of duplicates, not just the first observation as is done with the UPDATE statement. The DATA step loads the observation in the transaction data set into the Program Data Vector (PDV). With new variables found only in the transaction data set, the values for those variables remain in the PDV until the next observation in the transaction data set is processed (unless you change them with SAS language statements in the DATA step).
Unlike the UPDATE statement, the MERGE statement does not consider multiple observations in a BY group to be an error, and it does not write a message to the SAS log. However, SAS does write a warning if more than one of the data sets in your MERGE statement has multiple observations in a BY group.
For example, the following DATA step creates the transaction data set NEWTITLES, and it has one observation with the BY-variable value of BOOKID= "NF8141." The variable VERSION in NEWTITLES is not present in the master data set BOOKLIST.
data newtitles; input bookid $ 1-6 booktitle $ 8-25 version; datalines; NF8141 Popular Essays 2 ;;;;
The master data set BOOKLIST has eight observations with the BY-variable value of BOOKID= "NF8141".
A DATA step that uses the UPDATE statement and a DATA step that uses the MERGE statement follow. Both DATA steps apply the transactions in NEWTITLES to the original BOOKLIST data set that was created in the main example, and they produce different output data sets. Note that these DATA steps have only one BY variable (BOOKID) compared to the two BY variables, BOOKID and CHAPTER, in the preceding DATA steps.
Using the UPDATE Statement | Using the MERGE Statement |
---|---|
data updtitles; update booklist newtitles; by bookid; run; | data mrgtitles; merge booklist newtitles; by bookid; run; |
PROC PRINTs of UPDTITLES and MRGTITLES show the similarities and differences in the results. Both data sets have an updated value for BOOKTITLE in only the first of the eight observations for BOOKID= "NF8141". A nonmissing value for VERSION is found only on the first of the eight observations for BOOKID= "NF8141" in data set UPDTITLES, while all eight observations for BOOKID= "NF8141" in data set MRGTITLES have a value for VERSION.
The following output lists only variables BOOKID, BOOKTITLE, CHAPTER, and VERSION for the observations where BOOKID= "NF8141".
Remember that using the UPDATE statement when there are multiple observations in a BY group in the master data set causes SAS to write warnings to the SAS log and to set the automatic variable _ERROR_ to 1.
Using the UPDATE Statement | Using the MERGE Statement |
---|---|
UPDTITLES Obs bookid booktitle chapter version 20 NF8141 Popular Essays 1 2 21 NF8141 Favorite Essays 2 . 22 NF8141 Favorite Essays 3 . 23 NF8141 Favorite Essays 4 . 24 NF8141 Favorite Essays 5 . 25 NF8141 Favorite Essays 6 . 26 NF8141 Favorite Essays 7 . 27 NF8141 Favorite Essays 8 . | MRGTITLES Obs bookid booktitle chapter version 20 NF8141 Popular Essays 1 2 21 NF8141 Favorite Essays 2 2 22 NF8141 Favorite Essays 3 2 23 NF8141 Favorite Essays 4 2 24 NF8141 Favorite Essays 5 2 25 NF8141 Favorite Essays 6 2 26 NF8141 Favorite Essays 7 2 27 NF8141 Favorite Essays 8 2 |
See Example 6.4 for how to write a DATA step that uses the MERGE statement to update a master data set that has multiple observations in a BY group.
Master and transaction data sets with multiple observations in a BY group: Updating or merging data sets when both master and transaction data sets have multiple observations in a BY group can be problematic. You should understand your data well before programming a DATA step where this situation exists.
The concepts that were illustrated in the previous two sections apply. When using the UPDATE statement, only the last observation in a BY group of multiple observations updates the first observation in the matching BY group of multiple observations. When using the MERGE statement, one-to-one matching occurs between observations in the master and transaction data sets within the matching BY group of multiple observations.
For example, the following DATA step creates the transaction data set MULTTITLES, and it has two observations with the BY-variable value of BOOKID= "NF8141". The variable VERSION in MULTTITLES is not present in the master data set BOOKLIST.
data multtitles; input bookid $ 1-6 booktitle $ 8-25 version; datalines; NF8141 Popular Essays 2 NF8141 Essays for All 3 ;;;;
The master data set BOOKLIST has eight observations with the BY-variable value of BOOKID= "NF8141".
A DATA step that uses the UPDATE statement and a DATA step that uses the MERGE statement follow. Both DATA steps apply the transactions in MULTTITLES to the original BOOKLIST data set that was created in the main example, and they produce different output data sets. Note that these DATA steps have only one BY variable (BOOKID) compared to the two BY variables, BOOKID and CHAPTER, in some of the preceding DATA steps.
Using the UPDATE Statement | Using the MERGE Statement |
---|---|
data updmult; update booklist multtitles; by bookid; run; | data mrgmult; merge booklist multtitles; by bookid; run; |
The following output for data set UPDMULT shows that only the first of the eight observations for BOOKID= "NF8141" has an updated value for BOOKTITLE and that this value comes from the second observation in MULTTITLES. The value of VERSION on the first observation also comes from the second observation in MULTTITLES.
The output for MRGMULT shows for BY group BOOKID= "NF8141" one-to-one matching of the two observations in MULTTITLES to the first two observations in BOOKLIST. The first observation in the BY group has values for BOOKTITLE and VERSION that were copied from the first observation in MULTTITLES. The second observation in the BY group has values for BOOKTITLE and VERSION copied from the second observation in MULTTITLES. The remaining six observations also have a value of VERSION from the second observation in MULTTITLES. Similar to the previous MERGE statement example, the value of VERSION from the second observation in MULTITLES is moved to the PDV and remains there until the next BY group is processed.
The following output lists only variables BOOKID, BOOKTITLE, CHAPTER, and VERSION for the observations where BOOKID= "NF8141".
Use of the UPDATE statement when there are multiple observations in the master data set BY group causes SAS to write warnings to the SAS log and to set the automatic variable _ERROR_ to 1.
When there are multiple observations in more than one of the data sets in the MERGE statement, SAS writes the following note to the SAS log and does not flag this as a warning or error.
NOTE: MERGE statement has more than one data set with repeats of BY values.
Using the UPDATE Statement | Using the MERGE Statement |
---|---|
UPDMULT Obs bookid booktitle chapter version 20 NF8141 Essays for All 1 3 21 NF8141 Favorite Essays 2 . 22 NF8141 Favorite Essays 3 . 23 NF8141 Favorite Essays 4 . 24 NF8141 Favorite Essays 5 . 25 NF8141 Favorite Essays 6 . 26 NF8141 Favorite Essays 7 . 27 NF8141 Favorite Essays 8 . | MRGMULT Obs bookid booktitle chapter version 20 NF8141 Popular Essays 1 2 21 NF8141 Essays for All 2 3 22 NF8141 Favorite Essays 3 3 23 NF8141 Favorite Essays 4 3 24 NF8141 Favorite Essays 5 3 25 NF8141 Favorite Essays 6 3 26 NF8141 Favorite Essays 7 3 27 NF8141 Favorite Essays 8 3 |
3.147.45.90