Everything in the Pentaho platform is made of action sequences. An action sequence is, as its name suggests, a sequence of atomic actions that together accomplish small processes. Those atomic actions cover a broad spectrum of tasks, for example, getting data from a table in a database, running a piece of JavaScript code, launching a report, sending e-mails, or running a Kettle transformation.
For this recipe, suppose that you want to run the sample transformation to get the current weather conditions for some cities. Instead of running this from the command line, you want to interact with this service from the PUC. You will do it with an action sequence.
In order to follow this recipe, you will need a basic understanding of action sequences and at least some experience with the Pentaho BI Server and Pentaho Design Studio, the action sequences editor.
Before proceeding, make sure that you have a Pentaho BI Server running. You will also need Pentaho Design Studio. You can download the latest version from the following URL:
http://sourceforge.net/projects/pentaho/files/Design%20Studio/
Finally, you will need the sample transformation weather.ktr.
This recipe is split into two parts: First, you will create the action sequence, and then you will test it from the PUC. So carry out the following steps:
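Create a new action sequence in Pentaho Design Studio and save it as weather.xaction.
Define two inputs for the action sequence: city_name and temperature_scale.
Add a Pentaho Data Integration process action, which is located within the Get Data From category.
For Transformation File, type solution:weather.ktr. For Transformation Step, type current_conditions_normalized and for Kettle Logging Level, type or select basic.
In the Transformation Inputs section, add city_name and temperature_scale.
In the XML code of the action sequence, look for the <action-definition> tag that contains the following line: <component-name>KettleComponent</component-name>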
The definition looks similar to the following:

    <action-definition>
      <component-name>KettleComponent</component-name>
      <action-type>looking for the current weather</action-type>
      <action-inputs>
        <city_name type="string"/>
        <temperature_scale type="string"/>
      </action-inputs>
      <action-resources>
        <transformation-file type="resource"/>
      </action-resources>
      <action-outputs/>
      <component-definition>
        <monitor-step><![CDATA[current_conditions_normalized]]></monitor-step>
        <kettle-logging-level><![CDATA[basic]]></kettle-logging-level>
      </component-definition>
    </action-definition>
Below the <component-definition> tag, type the following:

    <set-parameter>
      <name>TEMP</name>
      <mapping>temperature_scale</mapping>
    </set-parameter>
    <set-argument>
      <name>1</name>
      <mapping>city_name</mapping>
    </set-argument>

In fact, you can type this anywhere between <component-definition> and </component-definition>. The order of the internal tags is not important.
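For reference, the component definition should end up looking roughly like the following once the mappings are in place. This sketch only combines the settings entered in the previous steps with the lines typed by hand; the comments are added here for explanation, and the monitor step shown is the name typed in the Transformation Step textbox.

```xml
<component-definition>
  <!-- Settings entered through the Design Studio form -->
  <monitor-step><![CDATA[current_conditions_normalized]]></monitor-step>
  <kettle-logging-level><![CDATA[basic]]></kettle-logging-level>
  <!-- Typed by hand: the named parameter TEMP receives the temperature_scale input -->
  <set-parameter>
    <name>TEMP</name>
    <mapping>temperature_scale</mapping>
  </set-parameter>
  <!-- Typed by hand: the first command-line argument receives the city_name input -->
  <set-argument>
    <name>1</name>
    <mapping>city_name</mapping>
  </set-argument>
</component-definition>
```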
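In the Output Section of the process action, for the name of the output dataset, type weather_result and for Output Rows Count Name, type number_of_rows.
After the process action, add a condition that checks number_of_rows==0.
Inside the condition, add a process action that builds a message with the text No results for the city {city_name}. For Output Name, type weather_result.
Finally, add weather_result as the output of the action sequence and save the file.

Now, it is time to test the action sequence that you just created.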
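Log in to the PUC and browse the solution until the weather.xaction that you just created shows up.
Run the action sequence, providing the name of a real city as city_name. You will see the current conditions for that city.
Run it again with my_invented_city as city_name. This time, you will see the following message:
Action Successful weather_result=No results for the city my_invented_city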
You can run Kettle transformations as part of an action sequence by using the Pentaho Data Integration process action located within the Get Data From category of process actions.
The main task of a PDI process action is to run a Kettle transformation. To do that, it provides a set of checkboxes and textboxes where you specify everything you need to run the transformation and everything you want to receive back after running it.
The most important setting in the PDI process action is the name and location of the transformation to be executed. In this example, you had a .ktr file in the same location as the action sequence, so you simply typed solution: followed by the name of the file.
Then, in the Transformation Step textbox, you specified the name of the step in the transformation that would give you the results you needed. The PDI process action (just as any regular process action) is capable of receiving input from the action sequence and returning data to be used later in the sequence. Therefore, in the drop-down list in the Transformation Step textbox, you could see the list of available action sequence inputs. In this case, you just typed the name of the step.
If you are not familiar with action sequences, note that the drop-down list in the Transformation Step textbox is not the list of available steps. It is the list of available action sequence inputs.
You have the option of specifying the Kettle log level. In this case, you selected Basic. This is the level of detail of the log that Kettle writes to the Pentaho console. Note that here, too, you have the option of selecting an action sequence input instead of one of the log levels in the list.
As said earlier, the process action can use any inputs from the action sequence. In this case, you used two inputs: city_name and temperature_scale. Then you passed them to the transformation in the XML code:
By putting city_name between <set-argument></set-argument>, you passed the city_name input as the first command-line argument.
By putting temperature_scale between <set-parameter></set-parameter>, you passed the temperature_scale input to the transformation as the value for the named parameter TEMP.
As mentioned, the process action can return data to be used later in the sequence. The textboxes in the Output Section are meant to do that. Each textbox you fill in will be a new data field to be sent to the next process action. In this case, you defined two outputs: weather_result and number_of_rows. The first contains the dataset that comes out of the step you defined in Transformation Step; in this case, current_conditions_normalized. The second has the number of rows in that dataset.
You used those outputs in the next process action. If number_of_rows was equal to zero, then you would overwrite the weather_result data with a message to be displayed to the user.
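In the XML code, this conditional step might look roughly like the following fragment. It is only a sketch under assumptions: it supposes that the message is produced with the platform's TemplateComponent and that conditional actions are wrapped in an <actions> element with a <condition> child. Neither detail is confirmed by this recipe, and the action-type text is made up for illustration, so compare the fragment with the XML that Design Studio generates for you.

```xml
<!-- Sketch only: the element names below are assumptions, not taken from the recipe -->
<actions>
  <!-- The nested action runs only when the transformation returned no rows -->
  <condition><![CDATA[number_of_rows==0]]></condition>
  <action-definition>
    <component-name>TemplateComponent</component-name>
    <action-type>message for an unknown city</action-type>
    <action-inputs>
      <city_name type="string"/>
    </action-inputs>
    <action-outputs>
      <!-- The generated text replaces the previous content of weather_result -->
      <output-message type="string" mapping="weather_result"/>
    </action-outputs>
    <component-definition>
      <template><![CDATA[No results for the city {city_name}]]></template>
    </component-definition>
  </action-definition>
</actions>
```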
Finally, you added weather_result as the output of the action sequence, so that the user either sees the current conditions for the required city, or the custom message indicating that the city was not found.
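The two inputs that the process action relies on are declared at the top of the action sequence. A minimal sketch of that declaration follows. It assumes that both values arrive as request parameters, which the recipe does not state explicitly, so treat the <sources> part as an illustration and not as the file the recipe produces.

```xml
<inputs>
  <!-- Assumption: the city name is read from the request -->
  <city_name type="string">
    <sources>
      <request>city_name</request>
    </sources>
  </city_name>
  <!-- Assumption: the temperature scale is read from the request as well -->
  <temperature_scale type="string">
    <sources>
      <request>temperature_scale</request>
    </sources>
  </temperature_scale>
</inputs>
```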
The following are some variants in the use of the Pentaho Data Integration process action:
When your transformation is in a file, you specify the location by typing or browsing for the name of the file. You have to provide the name relative to the solution folder. In the recipe, the transformation was in the same folder as the action sequence, so you simply typed solution: followed by the name of the transformation including the extension ktr.
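Behind the form, the file location is stored as a resource of the action sequence. The following fragment is a sketch of what that section may look like for solution:weather.ktr. The element names and the mime type are recalled from typical action sequence files and are not confirmed by this recipe, so check them against the XML code of your own file.

```xml
<!-- Sketch only: structure and mime type are assumptions -->
<resources>
  <transformation-file>
    <solution-file>
      <!-- Path relative to the folder that contains the action sequence -->
      <location>weather.ktr</location>
      <mime-type>text/plain</mime-type>
    </solution-file>
  </transformation-file>
</resources>
```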
If instead of having the transformation in a file it is located in a repository, then you should check the Use Kettle Repository option. The Transformation File textbox will be replaced with two textboxes named Directory and Transformation File. In these textboxes, you should type the name of the folder and the transformation exactly as they are in the repository. Alternatively, you can select the names from the available drop-down lists.
If your transformation defines or needs named parameters, Kettle variables or command-line arguments, you can pass them from the action sequence by mapping KettleComponent inputs.
First of all, you need to include them in the Transformation Inputs section. This is equivalent to typing them inside the KettleComponent action-definition XML element.
Then, depending on the kind of data to pass, you have to define a different element:
Element in the transformation | Element in the action sequence |
---|---|
Command-line argument | <set-argument> |
Variable | <set-variable> |
Named parameter | <set-parameter> |
In the recipe, you mapped one command line argument and one named parameter.
With the following lines, you mapped the input named temperature_scale with the named parameter TEMP:

    <set-parameter>
      <name>TEMP</name>
      <mapping>temperature_scale</mapping>
    </set-parameter>
In the case of a variable, the syntax is exactly the same.
In the case of arguments, instead of a name you have to provide the position of the argument: 1, 2, and so on.
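As an illustration, the following sketch maps one Kettle variable and two command-line arguments. The variable OUTPUT_DIR and the inputs output_dir and country_name are hypothetical names that do not appear in the recipe, and the <set-variable> element is assumed to follow the same name/mapping layout as <set-parameter>.

```xml
<!-- Hypothetical variable: OUTPUT_DIR takes its value from the input output_dir -->
<set-variable>
  <name>OUTPUT_DIR</name>
  <mapping>output_dir</mapping>
</set-variable>
<!-- First command-line argument, as in the recipe -->
<set-argument>
  <name>1</name>
  <mapping>city_name</mapping>
</set-argument>
<!-- Hypothetical second command-line argument -->
<set-argument>
  <name>2</name>
  <mapping>country_name</mapping>
</set-argument>
```

Every input named in a <mapping> element also has to be listed among the inputs of the process action.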
Design Studio does not implement the capability of mapping inputs with variables or named parameters. Therefore, you have to type the mappings in the XML code. If you just want to pass command-line arguments, then you can skip this task because by default, it is assumed that the inputs you enter are command-line arguments.
This way of providing values for named parameters, variables, and command-line arguments also applies to jobs executed from an action sequence.
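A process action that runs a job could then look like the following sketch. It assumes that a job is referenced through a job-file resource in the place where the recipe used transformation-file; that element name and the action-type text are assumptions and not something this recipe confirms. The mappings are the same as the ones used for the transformation.

```xml
<action-definition>
  <component-name>KettleComponent</component-name>
  <action-type>running a hypothetical weather job</action-type>
  <action-inputs>
    <city_name type="string"/>
    <temperature_scale type="string"/>
  </action-inputs>
  <action-resources>
    <!-- Assumption: job-file takes the place of transformation-file -->
    <job-file type="resource"/>
  </action-resources>
  <action-outputs/>
  <component-definition>
    <kettle-logging-level><![CDATA[basic]]></kettle-logging-level>
    <!-- Same mapping syntax as for a transformation -->
    <set-parameter>
      <name>TEMP</name>
      <mapping>temperature_scale</mapping>
    </set-parameter>
    <set-argument>
      <name>1</name>
      <mapping>city_name</mapping>
    </set-argument>
  </component-definition>
</action-definition>
```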
Reporting is a classic way of delivering data. In the PUC, you can publish not only Pentaho reports, but also third-party ones, for example, Jasper reports. However, what if the final user simply wants a plain file with some numbers in it? You can avoid the effort of creating it with a reporting tool. Just create a Kettle transformation that does it and call it from an action, in the same way you did in the recipe. This practical example is clearly explained by Nicholas Goodman in his blog post Self Service Data Export using Pentaho. The following is the link to that post, which also includes sample code for downloading:
http://www.nicholasgoodman.com/bt/blog/2009/02/09/self-service-data-export-using-pentaho/