Executing a job or a transformation whose name is determined at runtime

Suppose that you have a couple of transformations, but you do not want to run all of them. The transformation to be executed will depend on conditions only known at runtime. If you have just two transformations, you could explicitly call one or the other in a simple fashion. On the other hand, if you have several transformations or if you do not even know the names of the available transformations, you must take another approach. This recipe shows you how.

Suppose that you want to run one of the three sample transformations described in the introduction. The transformation to run will be different depending on the time of day:

  • Before 8:00 in the morning, you will call the Hello transformation
  • Between 8:00 and 20:00, you will call the transformation that generates random numbers
  • From 20:00 to midnight, you will call the transformation that lists files
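The rule above can be sketched in a few lines of Python to make the boundaries explicit. This is only an illustration of the selection logic, not part of Kettle. The names hello and list_files are hypothetical placeholders; only gen_random appears in this recipe's log output.

```python
# Hypothetical transformation names; only gen_random is confirmed by the recipe's log.
HELLO = "hello"
GEN_RANDOM = "gen_random"
LIST_FILES = "list_files"


def pick_transformation(hour: int) -> str:
    """Return the name of the transformation to run for an hour of the day (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError(f"hour out of range: {hour}")
    if hour < 8:        # before 8:00
        return HELLO
    if hour < 20:       # from 8:00 up to, but not including, 20:00
        return GEN_RANDOM
    return LIST_FILES   # from 20:00 to midnight
```

Each range includes its lower bound and excludes its upper bound, so 8 falls in the second range and 20 in the third.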

Here's how to do it.

Getting ready

You will need the transformations described in the introduction. Make sure you have defined the variable ${OUTPUT_FOLDER} with the name of the destination folder. Also, make sure that the folder exists.

Also, define a variable named ${COMMON_DIR} with the path to the folder where you have the sample transformations, for example, c:/my_kettle_work/common.

How to do it...

Carry out the following steps:

  1. Create the transformation that will pick the proper transformation to run.
  2. Drag and drop a Get System Info step and use it to create a field named now with the system date.
  3. Drag and drop a Select Values step and use it to get the current hour. Select the Meta tab; add the field named now, for Type select String, and for Format, type HH. Rename the field as hour.
  4. Drag another Select Values step and use it to change the field hour to Integer.
  5. After the last step, add a Number range step. You will find it in the Transformation category.
  6. Double-click on the step. As Input field: select the field hour and as Output field: type ktr_name. Fill in the grid, as shown in the following screenshot:
    [Screenshot: the Number range grid]
  7. From the Job category, add a Set Variables step and use it to create a variable named KTR_NAME with the value of the field ktr_name. For variable scope type, leave the default Valid in the root job.
  8. Save the transformation and do a preview on the last step. Assuming that it is 3:00 pm, you should see something like the following:
    [Screenshot: preview of the last step]
  9. Save the transformation and create a job.
  10. Drag a Start job entry and two Transformation job entries into the canvas. Link the entries one after the other.
  11. Configure the first Transformation entry to run the transformation just created.
  12. Double-click on the second Transformation entry. For Transformation filename: type ${COMMON_DIR}/${KTR_NAME}.ktr and close the window.
  13. Run the job.
  14. Supposing that it is 3:00 pm, the log should look like the following:
    2010/12/04 15:00:01 - Spoon - Starting job...
    ...
    ... - Set Variable ${KTR_NAME}.0 - Set variable KTR_NAME to value [gen_random]
    ...
    ... - run the transformation - Loading transformation from XML file [C:/my_kettle_work/common/gen_random.ktr]
    ...
    2010/12/04 15:00:02 - Spoon - Job has ended.
    
  15. Browse the output folder (the folder defined in the variable ${OUTPUT_FOLDER}). You should see a new file named random.txt with ten random numbers in it.

Note

Note that this file is generated whenever you run the job between 8:00 and 20:00. At a different time of the day, you will see a different output.
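Steps 3 and 4 derive the hour in two moves: the date is first formatted as a two-digit string, and that string is then converted to an integer. A minimal Python sketch of the same idea follows, assuming that the %H directive plays the role of the HH format mask; the function name hour_of is made up for this illustration.

```python
from datetime import datetime


def hour_of(now: datetime) -> int:
    """Mimic the two Select Values steps: date -> 'HH' string -> integer."""
    hour_text = now.strftime("%H")  # two-digit, 24-hour clock, for example "15" or "07"
    return int(hour_text)           # "07" becomes 7, which a numeric range can test
```

The conversion to an integer matters because the Number range step compares numbers, not text.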

How it works...

When you execute a transformation from a job, you can either type the exact name of the transformation, or use a combination of text and variables instead.

In this recipe, you implemented the second option. As you did not know which of the three transformations you had to run, you created a transformation that set a variable with the proper name. Then, in the job, instead of typing the name of the transformation, you used that variable in combination with a variable representing the path to the .ktr file. When you ran the job, the first transformation set the name of the transformation to run depending on the current time. Finally, that transformation was executed.
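To see what the substitution amounts to, here is a small Python sketch that resolves the same ${NAME} placeholders with the standard library's string.Template. It only imitates the idea; it is not how Kettle is implemented, and the resolve helper is a name invented for this example.

```python
from string import Template


def resolve(text: str, variables: dict) -> str:
    """Replace every ${NAME} placeholder in text with its value."""
    # substitute() raises KeyError if a placeholder has no value,
    # which makes a missing variable easy to spot in this sketch.
    return Template(text).substitute(variables)


variables = {
    "COMMON_DIR": "c:/my_kettle_work/common",  # sample path from the Getting ready section
    "KTR_NAME": "gen_random",                  # value set by the first transformation at 3:00 pm
}
path = resolve("${COMMON_DIR}/${KTR_NAME}.ktr", variables)
print(path)  # c:/my_kettle_work/common/gen_random.ktr
```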

There's more...

In the recipe, you knew that whatever the value of the variable ${KTR_NAME}, the corresponding transformation existed. If you are not sure, insert a File Exists entry before the second Transformation entry. With this entry, you verify that a file with the name of the transformation exists before trying to execute it. This way, you keep your job from crashing.
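The check itself is simple. As a rough Python equivalent, assuming the same folder and naming convention used in the recipe (the function name transformation_exists is hypothetical):

```python
from pathlib import Path


def transformation_exists(common_dir: str, ktr_name: str) -> bool:
    """Return True if <common_dir>/<ktr_name>.ktr is an existing regular file."""
    return (Path(common_dir) / f"{ktr_name}.ktr").is_file()
```

In the job, the same decision is made by following the success or failure hop that leaves the file check entry.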

If instead of files you are working with a repository, you can also verify the existence of the transformation. Instead of verifying the existence of a file, you have to run a SELECT statement on the repository database to see if the transformation exists or not. If your transformation is in the root directory of the repository, then this is quite simple, but it can become a little more complicated if your transformation is deep in the transformations directory tree.
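The following Python sketch shows the shape of such a query for the simple case of a transformation in the root directory. It runs against an in-memory SQLite stand-in, not a real repository. The table name R_TRANSFORMATION, the columns NAME and ID_DIRECTORY, and the use of 0 as the root directory ID are assumptions made for this illustration; check them against your own repository database before relying on them.

```python
import sqlite3

# Stand-in for the repository database. Table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE R_TRANSFORMATION ("
    "ID_TRANSFORMATION INTEGER, ID_DIRECTORY INTEGER, NAME TEXT)"
)
conn.execute("INSERT INTO R_TRANSFORMATION VALUES (1, 0, 'gen_random')")


def exists_in_root(conn: sqlite3.Connection, name: str) -> bool:
    """Return True if a transformation with this name is in the root directory."""
    row = conn.execute(
        "SELECT COUNT(*) FROM R_TRANSFORMATION "
        "WHERE NAME = ? AND ID_DIRECTORY = 0",  # 0 as the root ID is an assumption
        (name,),
    ).fetchone()
    return row[0] > 0
```

For a transformation deeper in the tree, the query would also have to resolve the directory path to its ID, which is where the extra complexity comes from.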

Finally, all said so far about transformations is valid for jobs as well. In order to run a job, you can either type its exact name or use a combination of text and variables, just as you did in the recipe for running a transformation.

See also

The recipe named Getting information about transformations and jobs (repository based) in Chapter 9, Getting the Most Out of Kettle. This recipe shows you how to find out whether a transformation or job exists in a repository.
