Pentaho Data Integration (PDI) is an engine along with a suite of tools responsible for the processes of extracting, transforming, and loading, better known as ETL processes. This book is meant to teach you how to use PDI.
In this chapter you will:
Before introducing PDI, let's talk about Pentaho BI Suite. The Pentaho Business Intelligence Suite is a collection of software applications intended to create and deliver solutions for decision making. The main functional areas covered by the suite are:
All this functionality can be used standalone as well as integrated. In order to run analysis, reports, and so on integrated as a suite, you have to use the Pentaho BI Platform. The platform has a solution engine, and offers critical services such as authentication, scheduling, security, and web services.
This set of software and services forms a complete BI Platform, which makes Pentaho Suite the world's leading open source Business Intelligence Suite.
Although it is outside the scope of this book, the Pentaho Demo is worth a brief introduction. The Pentaho BI Platform Demo is a preconfigured installation that lets you explore several capabilities of the Pentaho platform. It includes sample reports, cubes, and dashboards for Steel Wheels, a fictional store that sells all kinds of scale replicas of vehicles.
The demo can be downloaded from http://sourceforge.net/projects/pentaho/files/. Under the Business Intelligence Server folder, look for the latest stable version. The file you have to download is named biserver-ce-3.5.2.stable.zip for Windows and biserver-ce-3.5.2.stable.tar.gz for other systems.
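As a small illustration of the naming rule above, the following Python sketch picks the archive name for the operating system it runs on. The function name demo_archive_name is hypothetical, and the version string is simply the one quoted in the text; a newer stable release will carry a different number.

```python
import platform

# Version quoted in the text; newer stable releases will use a different number.
VERSION = "3.5.2.stable"


def demo_archive_name(system=None):
    """Return the demo archive file name for the given OS name.

    `system` is expected in the form returned by platform.system(),
    for example "Windows", "Linux", or "Darwin". (Hypothetical helper.)
    """
    if system is None:
        system = platform.system()
    # Windows users get the .zip file; every other system gets the .tar.gz file.
    extension = "zip" if system == "Windows" else "tar.gz"
    return f"biserver-ce-{VERSION}.{extension}"


if __name__ == "__main__":
    print(demo_archive_name())
```

The sketch only builds the file name; the download itself is still done by hand from the folder mentioned above.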
In the same folder you will find a file named biserver-getting_started-ce-3.5.0.pdf. The file is a guide that introduces you to the platform and gives you some guidance on how to install and run it. The guide even includes a mini tutorial on building a simple PDI input-output transformation.
You can find more about Pentaho BI Suite at www.pentaho.org.