Suppose you want Bouchard's rows to appear before the other rows. You can modify the transformation as follows:
You changed the transformation to give priority to Bouchard's issues.
You did this by using the Append Streams step. By specifying that the head hop was the one coming from Bouchard's file, you got the expected order: first the rows with the tasks assigned to Bouchard, sorted by progress descending, and then the rows with the tasks assigned to other programmers, also sorted by progress descending.
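The ordering can be pictured outside the tool with a few lines of Python. This is only an analogy, not how the Append Streams step is implemented: the function name `append_streams`, the field layout, and the sample rows are all hypothetical. It shows that each stream is sorted on its own and the head stream is then emitted in full before the tail stream.

```python
# Rough analogy of the Append Streams step: all head rows first, then all tail rows.
# The rows below are made-up sample data: (task, assignee, progress).

def append_streams(head, tail):
    """Return every row of head followed by every row of tail, preserving order."""
    return list(head) + list(tail)

bouchard_rows = [("Fix login", "Bouchard", 40), ("Write docs", "Bouchard", 90)]
other_rows = [("Refactor", "Smith", 70), ("Add tests", "Lee", 95)]

def progress(row):
    return row[2]

# Each stream is sorted by progress descending before it reaches the append.
head = sorted(bouchard_rows, key=progress, reverse=True)
tail = sorted(other_rows, key=progress, reverse=True)

result = append_streams(head, tail)
for row in result:
    print(row)
```

Note that the combined output is not globally sorted by progress: a row from the tail stream can have a higher progress value than a row from the head stream, because the append only fixes which stream comes first.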
Modify the previous exercise so that the final output is sorted by priority. Try two possible solutions:
Which of the two do you think would give better performance?
Refer to the Sort rows step issues in Chapter 3.
In which circumstances would you use the other option?
As you saw in the countries exercises, there are missing countries in the countries.xml file. In fact, the countries are there, but with different names. For example, Russia in the contestant file is Russian Federation in the XML file. Modify the transformation that looks for the language. Split the stream in two: one for the rows where a language was found and the other for the rows where no language was found. For this last stream, use a Value Mapper step to rename the countries you identified as wrong, that is, rename Russia as Russian Federation. Then look again for a language now with the new name. Finally, merge the two streams and create the output file with the result.
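Before building the transformation, it may help to see the intended data flow as a small Python sketch. This is an analogy under stated assumptions, not a solution in the tool itself: the `languages` dictionary stands in for the lookup against countries.xml, `rename` plays the role of the Value Mapper, and the contestant rows and language values are invented sample data.

```python
# Hypothetical stand-in for the countries.xml lookup: country name -> language.
languages = {"Russian Federation": "Russian", "Italy": "Italian"}

# Analogue of the Value Mapper configuration: wrong name -> name used in the XML.
rename = {"Russia": "Russian Federation"}

# Made-up contestant rows.
contestants = [
    {"name": "A", "country": "Italy"},
    {"name": "B", "country": "Russia"},
]

def lookup(rows):
    """Add a 'language' field to each row; None when the country is not found."""
    return [{**row, "language": languages.get(row["country"])} for row in rows]

# First lookup, then split the stream in two.
looked_up = lookup(contestants)
found = [row for row in looked_up if row["language"] is not None]
not_found = [row for row in looked_up if row["language"] is None]

# Rename the countries in the "not found" stream and look them up again.
renamed = [
    {**row, "country": rename.get(row["country"], row["country"])}
    for row in not_found
]
retried = lookup(renamed)

# Merge the two streams into the final result.
merged = found + retried
for row in merged:
    print(row)
```

Countries that are still not found after the rename keep a language of None in this sketch, so they remain easy to spot in the output.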