Executing a PDI job from the Pentaho User Console

The Pentaho User Console (PUC) is a web application included with the Pentaho BI Server that lets you generate reports, browse cubes, explore dashboards, and more. Among the tasks you can perform is running Kettle jobs. As said in the previous recipe, everything in the Pentaho platform is made up of action sequences. Therefore, if you intend to run a job from the PUC, you have to create an action sequence that does it.

For this recipe, you will use a job that simply deletes all files with the .tmp extension found in a given folder. The objective is to run the job from the PUC through an action sequence.

Getting ready

In order to follow this recipe, you will need a basic understanding of action sequences and at least some experience with the Pentaho BI Server and Pentaho Design Studio, the action sequences editor.

Before proceeding, make sure you have a Pentaho BI Server running. You will also need Pentaho Design Studio; you can download the latest version from the following URL:

http://sourceforge.net/projects/pentaho/files/Design%20Studio/

In addition, you will need a job like the one described in the introduction of the recipe. The job should have a named parameter called TMP_FOLDER and simply delete all files with the .tmp extension found in that folder.

You can develop the job before proceeding (call it delete_files.kjb), or download it from the book's site.

Finally, pick a directory on your computer (or create one) containing some .tmp files to delete.
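
If you do not have a suitable directory at hand, a small script can prepare one. The following is a minimal sketch in Python, not part of the recipe itself; the folder name pdi_recipe_tmp and the file names are hypothetical, chosen only for illustration. It also creates one file without the .tmp extension so that you can later confirm that the job leaves other files alone.

```python
import tempfile
from pathlib import Path


def make_sample_folder(folder, tmp_count=3):
    """Create `folder` with a few .tmp files plus one file that should survive."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    created = []
    for i in range(tmp_count):
        path = folder / f"sample_{i}.tmp"
        path.write_text("scratch data\n")
        created.append(path)
    # A non-.tmp file, useful later to confirm that only .tmp files are removed.
    keep = folder / "keep_me.txt"
    keep.write_text("this file should remain\n")
    created.append(keep)
    return created


if __name__ == "__main__":
    # Hypothetical location; replace it with the folder you picked for the recipe.
    target = Path(tempfile.gettempdir()) / "pdi_recipe_tmp"
    for p in make_sample_folder(target):
        print(p)
```

Whatever folder you end up with, its full path is the value you will later type as the Default Value of the folder input.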

How to do it...

This recipe is split into two parts: first, you will create the action sequence, and then you will test it from the PUC.

  1. Launch Design Studio. If this is the first time you have launched it, create the solution project where you will save your work.
  2. Copy the sample job to the solution folder.
  3. Create a new action sequence and save it in your solution project with the name delete_files.xaction.
  4. Define an input that will be used as the parameter for your job: folder. As Default Value, type the name of the folder with the .tmp files, for example, c:\myfolder.
  5. Add a process action by selecting Execute | Pentaho Data Integration Job.
  6. Now, you will fill in the Input Section of the process action configuration. Give the process action a name.
  7. As Job File, type solution:delete_files.kjb.
  8. In the Job Inputs, add the only input you have: folder.
  9. Select the XML source tab.
  10. Search for the <action-definition> tag that contains the following line:
    <component-name>KettleComponent</component-name>
    
  11. You will find something similar to the following:
    <action-definition>
      <component-name>KettleComponent</component-name>
      <action-type>Pentaho Data Integration Job</action-type>
      <action-inputs>
        <folder type="string"/>
      </action-inputs>
      <action-resources>
        <job-file type="resource"/>
      </action-resources>
      <action-outputs/>
      <component-definition/>
    </action-definition>
    
  12. Replace the <component-definition/> tag with the following:
    <component-definition>
      <set-parameter>
        <name>TMP_FOLDER</name>
        <mapping>folder</mapping>
      </set-parameter>
    </component-definition>
    
  13. Save the file.
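
The manual edit in steps 10 to 12 can also be expressed as a small script, which may help if you need to apply the same mapping to several action sequences. The following is a minimal sketch that works on the action-definition fragment shown in the recipe, using Python's standard xml.etree.ElementTree module. The function name map_parameter is hypothetical, and the sketch assumes the fragment has the structure listed in step 11; it is not a tool shipped with Pentaho.

```python
import xml.etree.ElementTree as ET

# The fragment from step 11, before the edit.
ACTION_DEFINITION = """
<action-definition>
  <component-name>KettleComponent</component-name>
  <action-type>Pentaho Data Integration Job</action-type>
  <action-inputs>
    <folder type="string"/>
  </action-inputs>
  <action-resources>
    <job-file type="resource"/>
  </action-resources>
  <action-outputs/>
  <component-definition/>
</action-definition>
"""


def map_parameter(action_def, param_name, input_name):
    """Add a <set-parameter> block mapping an action input to a named parameter."""
    component_def = action_def.find("component-definition")
    if component_def is None:
        raise ValueError("no <component-definition> element found")
    set_param = ET.SubElement(component_def, "set-parameter")
    ET.SubElement(set_param, "name").text = param_name
    ET.SubElement(set_param, "mapping").text = input_name


if __name__ == "__main__":
    root = ET.fromstring(ACTION_DEFINITION)
    # Map the action sequence input "folder" to the job parameter TMP_FOLDER.
    map_parameter(root, "TMP_FOLDER", "folder")
    print(ET.tostring(root, encoding="unicode"))
```

The printed fragment contains the same set-parameter block that you typed by hand in step 12, although its whitespace will differ.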

Now, it is time to test the action sequence that you just created.

  1. Log in to the Pentaho BI Server and refresh the repository.
  2. Browse the solution folders and look for the delete_files action you just created. Double-click on it.
  3. You should see a window with the legend Action Successful.
  4. You can take a look at the Pentaho console to see the log of the job.
  5. Take a look at the folder defined in the input of your action sequence. There should be no tmp files.
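
If the folder holds many files, the last check is easier to script than to do by eye. The following minimal sketch lists any .tmp files still present; the function name remaining_tmp_files and the folder path are hypothetical, so replace the path with the Default Value you typed for the folder input.

```python
import tempfile
from pathlib import Path


def remaining_tmp_files(folder):
    """Return the names of .tmp files still present directly inside `folder`."""
    return sorted(p.name for p in Path(folder).glob("*.tmp") if p.is_file())


if __name__ == "__main__":
    # Hypothetical path; use the folder defined in your action sequence input.
    folder = Path(tempfile.gettempdir()) / "pdi_recipe_tmp"
    leftovers = remaining_tmp_files(folder)
    if leftovers:
        print("still present:", leftovers)
    else:
        print("no .tmp files left")
```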

How it works...

You can run Kettle jobs as part of an action sequence by using the Pentaho Data Integration Job process action located within the Execute category of process actions.

The main task of a PDI Job process action is to run a Kettle job. In order to do that, it has a series of checkboxes and textboxes where you specify everything you need to run the job, and everything you want to receive back after having run it.

The most important setting in the PDI process action is the name and location of the job to be executed. In this example, you had a .kjb file in the same location as the action sequence, so you simply typed solution: followed by the name of the file.

You can specify the Kettle log level, but it is not mandatory. In this case, you left the log level empty. The log level you select here (or Basic, by default) is the level of logging that Kettle writes to the Pentaho console when the job runs.

Besides the name and location of the job, you had to provide the name of the folder needed by the job. In order to do that, you created an input named folder and then you passed it to the job. You did this in the XML code by adding a <set-parameter> block in which <name> holds the job's named parameter, TMP_FOLDER, and <mapping> holds the name of the input, folder.

When you ran the action sequence, the job was executed, deleting all .tmp files in the given folder.
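
To make the effect of the job concrete, the following sketch reproduces it in plain Python. It is only a stand-in for illustration: the recipe does not show how delete_files.kjb is built internally, and this code is not what Kettle runs. The function name delete_tmp_files is hypothetical, and its folder argument plays the role of the TMP_FOLDER named parameter.

```python
import tempfile
from pathlib import Path


def delete_tmp_files(folder):
    """Delete every *.tmp file directly inside `folder`; return the deleted names."""
    deleted = []
    for path in sorted(Path(folder).glob("*.tmp")):
        if path.is_file():
            path.unlink()
            deleted.append(path.name)
    return deleted


if __name__ == "__main__":
    # Demonstrate on a throwaway directory so that no real files are touched.
    demo = Path(tempfile.mkdtemp())
    (demo / "a.tmp").write_text("x")
    (demo / "notes.txt").write_text("x")
    print("deleted:", delete_tmp_files(demo))
    print("left:", sorted(p.name for p in demo.iterdir()))
```

Like the job described in the recipe, the sketch removes only files with the .tmp extension and leaves everything else in the folder untouched.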

Note

Note that the action sequence in this recipe has just one process action (the PDI Job). This was done on purpose to keep the recipe simple, but it could have had other actions as well, just like any action sequence.

There's more...

The main reason for embedding a job in an action sequence is to schedule its execution with the Pentaho scheduling services. This is an alternative to using a system utility such as cron in Unix-based operating systems or the Task Scheduler in Windows.

See also

  • The recipe named Configuring the Pentaho BI Server for running PDI jobs and transformations in this chapter. It is recommended that you read this recipe before trying to run a job from the PUC.
  • The recipe named Executing a PDI transformation as part of a Pentaho process in this chapter. The topics explained in the There's more section apply equally to transformations and jobs. If you want to run a job from a repository, or if you want to know how to pass command-line arguments or variables to a job, then read this section.