Integer constant named diff with value 999, and a String constant named age_of_film with value unknown.
err_code and err_desc.
You modified the transformation so that you didn't end up discarding the erroneous rows. In the error stream (the stream after the red dotted line), you fixed the rows by putting default values for the new fields. After that you returned the rows to the main stream.
If the errors are not severe enough to discard the rows, if you can somehow guess what data was supposed to be there instead of the error, or if you have default values for erroneous data, you can do your best to fix the errors and send the rows back to the main stream.
Instead of discarding the rows with no year information, you fixed them and sent them back to the main stream. The Group by step then grouped them under a separate category named unknown.
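The flow described above can be sketched outside PDI in plain JavaScript. This is only a simulation of the idea, not PDI API code: the field names diff and age_of_film and the defaults 999 and unknown come from the text, while the sample rows, the reference year, and the old/recent rule are assumptions made up for illustration.

```javascript
// Plain-JavaScript simulation of the row flow; this is not PDI API code.
// Assumptions: REFERENCE_YEAR, the sample rows, and the 'old'/'recent' rule are hypothetical.
const REFERENCE_YEAR = 2010;

// Main-stream calculation: fails when the year is missing or not numeric.
function classify(row) {
  const year = Number.parseInt(row.year, 10);
  if (Number.isNaN(year)) {
    throw new Error('year is missing or not a number');
  }
  const diff = REFERENCE_YEAR - year;
  return { ...row, diff, age_of_film: diff > 20 ? 'old' : 'recent' };
}

function processRows(rows) {
  const mainStream = [];
  for (const row of rows) {
    try {
      mainStream.push(classify(row));
    } catch (err) {
      // Error stream: fix the row with default values,
      // then return it to the main stream instead of discarding it.
      mainStream.push({ ...row, diff: 999, age_of_film: 'unknown' });
    }
  }
  return mainStream;
}

// Rough equivalent of a Group by step that counts rows per age_of_film.
function countByAge(rows) {
  const counts = {};
  for (const row of rows) {
    counts[row.age_of_film] = (counts[row.age_of_film] || 0) + 1;
  }
  return counts;
}

const films = [
  { title: 'Film A', year: '1975' },
  { title: 'Film B', year: '' },
  { title: 'Film C', year: '2005' },
  { title: 'Film D', year: 'n/a' },
];
const grouped = countByAge(processRows(films));
console.log(grouped);
```

In this sketch the two rows without a usable year are kept and end up in the unknown group rather than being dropped.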
There are no fixed rules for what to do with bad rows when you handle errors. You always have the option to discard the bad rows or try to fix them. Sometimes you can fix only a few and discard the rest. It always depends on your particular data and business rules.
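As a small illustration of fixing only some bad rows and discarding the rest, the following plain-JavaScript sketch applies a made-up policy. The rule itself (a missing year can be defaulted, a missing title cannot) and the function name handleBadRow are assumptions, not something PDI or the example transformation prescribes.

```javascript
// Hypothetical policy: a row with a title but no year is fixable with defaults;
// a row without a title is discarded.
function handleBadRow(row) {
  if (!row.title) {
    return null; // discard: there is no sensible default for a missing title
  }
  return { ...row, diff: 999, age_of_film: 'unknown' }; // fix with default values
}

const badRows = [
  { title: 'Film B', year: '' },
  { title: '', year: '' },
];

// Keep only the rows that could be fixed.
const recovered = badRows.map(handleBadRow).filter((row) => row !== null);
console.log(recovered.length);
```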
What does the PDI error-handling functionality do?
a. Prevents unexpected errors from happening
b. Captures errors that happen and discards erroneous rows so you can continue working with valid data
c. Captures errors that happen and sends erroneous rows to a new stream, letting you decide what to do with them
On the Packt website you will find a modified football match file named wcup_modified.txt. This modified file has some intentional errors.
Download the file and do the following:
var result_desc;
// Split the score into home and away goals
var result_split = Result.split('-');
var home_g = str2num(result_split[0]);
var away_g = str2num(result_split[1]);
// Describe the outcome of the match
if (home_g > away_g)
    result_desc = Home_Team + ' wins';
else if (home_g < away_g)
    result_desc = Away_Team + ' wins';
else
    result_desc = 'Nobody wins';
result_desc