Chapter 8. Integrating Kettle and the Pentaho Suite

In this chapter, we will cover:

  • Creating a Pentaho report with data coming from PDI
  • Configuring the Pentaho BI Server for running PDI jobs and transformations
  • Executing a PDI transformation as part of a Pentaho process
  • Executing a PDI job from the PUC (Pentaho User Console)
  • Generating files from the PUC with PDI and the CDA (Community Data Access) plugin
  • Populating a CDF (Community Dashboard Framework) dashboard with data coming from a PDI transformation

Introduction

Kettle, also known as PDI, is mostly used as a stand-alone application. However, it is not an isolated tool, but part of the Pentaho Business Intelligence Suite. As such, it can also interact with other components of the suite; for example, as the datasource for a report, or as part of a bigger process. This chapter shows you how to run Kettle jobs and transformations in that context.

The chapter assumes a basic knowledge of the Pentaho BI platform and the tools that make up the Pentaho Suite. If you are not familiar with these tools, it is recommended that you visit the wiki page (wiki.pentaho.com) or the Pentaho BI Suite Community Edition (CE) site: http://community.pentaho.com/.

As another option, you can get the Pentaho Solutions book (Wiley) by Roland Bouman and Jos van Dongen, which gives you a good introduction to the whole suite.

A sample transformation

The different recipes in this chapter show you how to run Kettle transformations and jobs integrated with several components of the Pentaho BI suite. In order to focus on the integration itself rather than on Kettle development, we have created a sample transformation named weather.ktr that will be used throughout the recipes.

The transformation receives the name of a city as the first command-line argument, for example Madrid, Spain. Then, it consumes a web service to get the current weather conditions and the forecast for the next five days for that city. The transformation also has two named parameters:

Name    Purpose                                                                              Default
TEMP    Scale for the temperature to be returned. It can be C (Celsius) or F (Fahrenheit).   C
SPEED   Scale for the wind speed to be returned. It can be Kmph or Miles.                    Kmph

The following diagram shows what the transformation looks like:

[Figure: the sample transformation]

It receives the command-line argument and the named parameters, calls the service, and retrieves the information in the desired scales for temperature and wind speed.
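As an illustration of this interface, the following Python sketch assembles a command line for Pan, the Kettle tool that runs transformations from a terminal. It only builds the argument list; it does not run anything. The helper name build_pan_command, the launcher name pan.sh, and the relative path to weather.ktr are assumptions made for the example, so adjust them to your installation. The -file and -param:NAME=value options are the usual way of passing a transformation file and named parameters to Pan, but check them against the documentation of your PDI version.

```python
from typing import List

# Accepted values, as listed in the named-parameter table above.
VALID_TEMP = {"C", "F"}
VALID_SPEED = {"Kmph", "Miles"}


def build_pan_command(
    city: str,
    temp: str = "C",
    speed: str = "Kmph",
    pan: str = "pan.sh",        # assumed launcher name; Pan.bat on Windows
    ktr: str = "weather.ktr",   # assumed location of the sample transformation
) -> List[str]:
    """Return the argument list for running weather.ktr with Pan.

    Hypothetical helper: it validates the two named parameters and
    appends the city as the single positional command-line argument.
    """
    if temp not in VALID_TEMP:
        raise ValueError(f"TEMP must be one of {sorted(VALID_TEMP)}, got {temp!r}")
    if speed not in VALID_SPEED:
        raise ValueError(f"SPEED must be one of {sorted(VALID_SPEED)}, got {speed!r}")
    if not city.strip():
        raise ValueError("The city argument must not be empty")

    return [
        pan,
        f"-file={ktr}",
        f"-param:TEMP={temp}",
        f"-param:SPEED={speed}",
        city,  # kept as one list element, so the comma and space survive
    ]
```

Calling build_pan_command("Madrid, Spain", temp="F") yields a list that can be handed to subprocess.run. Passing a list rather than a single string avoids shell-quoting problems with city names that contain spaces or commas.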

You can download the transformation from the book's site and test it. Do a preview on the next_days, current_conditions, and current_conditions_normalized steps to see what the results look like.

The following is a sample preview of the next_days step:

[Figure: sample preview of the next_days step]

The following is a sample preview of the current_conditions step:

[Figure: sample preview of the current_conditions step]

Finally, the following screenshot shows you a sample preview of the current_conditions_normalized step:

[Figure: sample preview of the current_conditions_normalized step]

Note

For details about the web service and how to interpret the results, you can take a look at the recipe named Specifying fields by using XPath notation (Chapter 2, Reading and Writing Files).

There is also another transformation named weather_np.ktr. This transformation does exactly the same thing, but it reads the city as a named parameter instead of reading it from the command line. The Getting ready section of each recipe will tell you which of these transformations to use.

Tip

Avoiding consuming the web service

You may not want to consume the web service (for example, because of its response time), or you may be unable to (for example, if you do not have Internet access). Besides, if you call a free web service like this one too often, your IP address might be banned from the service. Don't worry. Along with the sample transformations on the book's site, you will find alternative versions of the transformations that, instead of using the web service, read fictional sample data from a file containing the forecast for over 250 cities. These transformations are weather (file version).ktr and weather_np (file version).ktr. Feel free to use them instead. You should not have any trouble, as the parameters and the metadata of the data retrieved are exactly the same as in the transformations explained earlier.

Note

If you use transformations that do not call the web service, remember that they rely on the file with the fictional data (weatheroffline.txt). Wherever you copy the transformations, do not forget to copy that file as well.
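As a small safeguard against forgetting the data file, the following Python sketch copies one of the file-based transformations together with weatheroffline.txt. The function name copy_offline_transformation is hypothetical, and the sketch assumes that the data file sits in the same folder as the transformation and should end up next to it in the destination; verify how the transformation actually locates the file before relying on this.

```python
import shutil
from pathlib import Path
from typing import List, Union

# File name taken from the note above.
DATA_FILE = "weatheroffline.txt"


def copy_offline_transformation(
    ktr_name: str,
    src_dir: Union[str, Path],
    dest_dir: Union[str, Path],
) -> List[Path]:
    """Copy a file-based transformation and its data file to dest_dir.

    Hypothetical helper. Assumes both files live directly in src_dir.
    Raises FileNotFoundError before copying anything if either is missing.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    required = [src / ktr_name, src / DATA_FILE]

    # Check everything first so a partial copy is never left behind.
    for path in required:
        if not path.is_file():
            raise FileNotFoundError(f"Missing required file: {path}")

    dest.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(path, dest / path.name)) for path in required]
```

For example, copy_offline_transformation("weather (file version).ktr", "samples", "my_solution") copies both files into my_solution, creating that folder if needed. The folder names here are placeholders.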
