Now you will split your transformation into two smaller transformations so that each performs a specific task. Here are the instructions.
Save the data-preparation transformation in the transformations folder with the name top_scores_flow_preparing.ktr.
Save the file-generation transformation in the transformations folder with the name top_scores_flow_processing.ktr.
In the top_scores_flow_preparing transformation, right-click the step Copy rows to result and select Show output fields.
In the top_scores_flow_processing transformation, double-click the step Get rows from result.
In the job, for the first transformation entry, type ${Internal.Job.Filename.Directory}/transformations/top_scores_flow_preparing.ktr as the name of the transformation.
For the second transformation entry, type ${Internal.Job.Filename.Directory}/transformations/top_scores_flow_processing.ktr as the name of the transformation.
You split the main transformation in two: one for the preparation of data and the other for the generation of the files. Then you embedded the transformations into a job that executed them one after the other. By using the Copy rows to result step, you sent the flow of data outside the transformation, and by using the Get rows from result step, you picked up that data to continue with the flow. The final result was the same as before the change.
Notice that you split the last version of the transformation—the one with the subtransformations inside. You could have split the original. The result would have been exactly the same.
The copy/get rows mechanism allows you to transfer data between two transformations, creating a process flow. The following drawing shows you how it works:
The Copy rows to result step transfers your rows of data to the outside of the transformation. You can then pick up that data by using a Get rows from result step. In the preceding image, Transformation A copies the rows, and Transformation B, which executes right after Transformation A, gets the rows. If you created a single transformation with all steps from Transformation A followed by all steps from Transformation B, you would get the same result.
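The hand-off between the two transformations can be pictured with a small Python model. This is only a conceptual sketch, not PDI code: the function names, the field names, and the sample rows are all assumptions made for illustration.

```python
# Conceptual model of the copy/get rows mechanism (not the PDI API).
# All names and sample data below are hypothetical.

def transformation_a(source_rows):
    """Prepare the data, then 'Copy rows to result': hand the rows to the job."""
    return [dict(row, score=int(row["score"])) for row in source_rows]


def transformation_b(result_rows):
    """'Get rows from result': pick up the rows and continue the flow."""
    return sorted(result_rows, key=lambda r: r["score"], reverse=True)[:2]


def single_transformation(source_rows):
    """All steps of A followed by all steps of B, in one transformation."""
    prepared = [dict(row, score=int(row["score"])) for row in source_rows]
    return sorted(prepared, key=lambda r: r["score"], reverse=True)[:2]


def run_job(source_rows):
    """The job runs A, keeps its result rows in memory, then runs B on them."""
    result_rows = transformation_a(source_rows)
    return transformation_b(result_rows)


rows = [
    {"student": "A", "score": "71"},
    {"student": "B", "score": "93"},
    {"student": "C", "score": "85"},
]
top = run_job(rows)
```

In this model, `run_job(rows)` and `single_transformation(rows)` return the same rows, which mirrors the point that splitting the flow does not change the result.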
The copy of the dataset is made in memory, so this mechanism is best suited to small datasets. For bigger datasets, save the data to a temporary file or database table in the first transformation, and then read the dataset back from that file or table in the second transformation.
The Serialize to file/De-serialize from file steps are very useful for this, as the data and the metadata are saved together.
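The idea of saving data and metadata together can be sketched in Python as follows. This is an analogy under stated assumptions, not how the PDI steps are implemented: JSON is used here only as a stand-in for the file format, and the helper names, field descriptions, and sample rows are hypothetical.

```python
import json
import os
import tempfile


def serialize_to_file(path, fields, rows):
    """First transformation: write field metadata and rows into one file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"fields": fields, "rows": rows}, f)


def deserialize_from_file(path):
    """Second transformation: rebuild the dataset from that file alone."""
    with open(path, encoding="utf-8") as f:
        payload = json.load(f)
    return payload["fields"], payload["rows"]


# Hypothetical metadata and data for illustration.
fields = [
    {"name": "student", "type": "String"},
    {"name": "score", "type": "Integer"},
]
rows = [["B", 93], ["C", 85]]

# Create a temporary file to pass the dataset between the two stages.
fd, path = tempfile.mkstemp(suffix=".json")
os.close(fd)
try:
    serialize_to_file(path, fields, rows)
    restored_fields, restored_rows = deserialize_from_file(path)
finally:
    os.remove(path)
```

Because the field descriptions travel with the rows, the reading side does not need to declare the structure again.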
There is no limit to the number of transformations that can be chained using this mechanism. Look at the following image:
As you can see, you may have a transformation that copies the rows, followed by another that gets the rows and copies again, followed by a third transformation that gets the rows, and so on.
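A chain of any length can be modeled the same way. The sketch below is a hypothetical Python illustration, not PDI code: each "transformation" is a function that gets the rows, applies some work, and copies the rows again, and the job simply runs them in order.

```python
from functools import reduce


def get_and_copy(row_function):
    """Build a transformation that gets rows, processes each, and copies them on."""
    def transformation(result_rows):
        return [row_function(row) for row in result_rows]
    return transformation


def run_chain(transformations, initial_rows):
    """Run the transformations in order, passing result rows from one to the next."""
    return reduce(lambda rows, t: t(rows), transformations, initial_rows)


# Three hypothetical transformations chained one after another.
chain = [
    get_and_copy(lambda value: value + 1),
    get_and_copy(lambda value: value * 2),
    get_and_copy(lambda value: value - 3),
]
final_rows = run_chain(chain, [1, 2, 3])
```

Adding a fourth or fifth entry to `chain` requires no other change, which reflects the absence of a limit on the number of chained transformations.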
Modify the last exercise in the following way: