Running the Initial Population Script

Before you run the initial population script in Listing 7.1, you need to perform the following steps.

  • Clearing all the tables, including the staging tables

  • Pre-populating the date dimension

  • Preparing customer and product test data

  • Creating test data in the source database

These steps are discussed in the subsections below.

Clearing the Tables

To clear all tables, use the script in Listing 7.2.

Listing 7.2. Script for truncating the tables
/******************************************************************/
/*                                                                */
/* truncate_tables.sql                                            */
/*                                                                */
/******************************************************************/
USE dw;

TRUNCATE customer_dim;
TRUNCATE product_dim;
TRUNCATE order_dim;
TRUNCATE date_dim;
TRUNCATE sales_order_fact;
TRUNCATE customer_stg;
TRUNCATE product_stg;

USE source;

TRUNCATE sales_order;

/* end of script                                                  */

You run the script in Listing 7.2 by invoking this command.

mysql> . c:mysqlscripts	runcate_tables.sql

MySQL will respond by telling you how many records are deleted from each table.

Database changed
Query OK, 7 rows affected (0.17 sec)

Query OK, 4 rows affected (0.05 sec)

Query OK, 20 rows affected (0.06 sec)

Query OK, 5 rows affected (0.06 sec)

Query OK, 20 rows affected (0.05 sec)

Query OK, 7 rows affected (0.06 sec)

Query OK, 3 rows affected (0.06 sec)

Database changed
Query OK, 8 rows affected (0.07 sec)

Pre-populating the Date Dimension

Once all the tables are empty, you can pre-populate the date dimension for the period of March 1, 2005 to December 31, 2010 by executing the pre_populate_date stored procedure discussed in Chapter 6, “Populating the Date Dimension.” Make sure that the current database is dw by calling:

mysql> use dw;

Then, call the stored procedure, passing the start date and the end date as arguments.

mysql> call pre_populate_date('2005-03-01', '2010-12-31'),

Preparing the Customer and Product Test Data

The next step after pre-populating the date_dim table is to prepare the customer and product test data. I’ve prepared the customer data in a CSV file and the product data in a fixed-width text file.

Here is the customer source data.

CUSTOMER NO,CUSTOMER NAME,STREET ADDRESS,ZIP CODE,CITY,STATE
1,Really Large Customers,7500 Louise Dr.,17050,Mechanicsburg,PA
2,Small Stores,2500 Woodland St.,17055,Pittsburgh,PA
3,Medium Retailers,1111 Ritter Rd.,17055,Pittsburgh,PA
4,Good Companies,9500 Scott St.,17050,Mechanicsburg,PA
5,Wonderful Shops,3333 Rossmoyne Rd.,17050,Mechanicsburg,PA
6,Loyal Clients,7070 Ritter Rd.,17055,Pittsburgh,PA
7,Distinguished Partners,9999 Scott St.,17050,Mechanicsburg,PA

And here is the product data.

PRODUCT CODE,PRODUCT NAME,PRODUCT GROUP
1          Hard Disk Drive                Storage
2          Floppy Drive                   Storage
3          LCD Panel                      Monitor

Preparing the Sales Orders

The last set of test data you need is sales orders. The script in Listing 7.3 can be used to insert 21 sales orders into the sales_order table in the source database.

Listing 7.3. Sales orders for testing initial population
/******************************************************************/
/*                                                                */
/* sales_order_initial.sql                                        */
/*                                                                */
/******************************************************************/

USE source;

INSERT INTO sales_order VALUES
  (1, 1, 1, '2005-02-01', '2005-02-01', 1000)
, (2, 2, 2, '2005-02-10', '2005-02-10', 1000)
, (3, 3, 3, '2005-03-01', '2005-03-01', 4000)
, (4, 4, 1, '2005-04-15', '2005-04-15', 4000)
, (5, 5, 2, '2005-05-20', '2005-05-20', 6000)
, (6, 6, 3, '2005-07-30', '2005-07-30', 6000)
, (7, 7, 1, '2005-09-01', '2005-09-01', 8000)
, (8, 1, 2, '2005-11-10', '2005-11-10', 8000)
, (9, 2, 3, '2006-01-05', '2006-01-05', 1000)
, (10, 3, 1, '2006-02-10', '2006-02-10', 1000)
, (11, 4, 2, '2006-03-15', '2006-03-15', 2000)
, (12, 5, 3, '2006-04-20', '2006-04-20', 2500)
, (13, 6, 1, '2006-05-30', '2006-05-30', 3000)
, (14, 7, 2, '2006-06-01', '2006-06-01', 3500)
, (15, 1, 3, '2006-07-15', '2006-07-15', 4000)
, (16, 2, 1, '2006-08-30', '2006-08-30', 4500)
, (17, 3, 2, '2006-09-05', '2006-09-05', 1000)
, (18, 4, 3, '2006-10-05', '2006-10-05', 1000)
, (19, 5, 1, '2007-01-10', '2007-01-10', 4000)
, (20, 6, 2, '2007-02-20', '2007-02-20', 4000)
, (21, 7, 3, '2007-02-28', '2007-02-28', 4000)
;

/* end of script                                                  */

Now, run the sales_order_initial.sql script by using this command.

mysql> . c:mysqlscriptssales_order_initial.sql

You’ll see that 21 new rows were inserted.

Database changed
Query OK, 21 rows affected (0.05 sec)
Records: 21  Duplicates: 0  Warnings: 0

Note

The historical data to load is from March 1, 2005. Therefore, the first two orders will not be loaded.


Running and Confirming the Initial Population

Now that the test data is ready, it’s time to perform the initial population. First of all, however, you need to set your MySQL date to 28 February, 2007, one day before the start of your data warehouse operation. You run the initial population at the end of the day, when no more data gets in and the source data is ready.

You are now ready to test the initial population. Run the dw_initial.sql population script in Listing 7.1 by using this command.

mysql> . c:mysqlscriptsdw_initial.sql

On your console, you should see the following messages.

Database changed
Query OK, 7 rows affected (0.06 sec)
Records: 7  Deleted: 0  Skipped: 0  Warnings: 0

Query OK, 7 rows affected (0.06 sec)
Records: 7  Duplicates: 0  Warnings: 0

Query OK, 3 rows affected (0.06 sec)
Records: 3  Deleted: 0  Skipped: 0  Warnings: 0

Query OK, 3 rows affected (0.07 sec)
Records: 3  Duplicates: 0  Warnings: 0

Query OK, 19 rows affected (0.06 sec)
Records: 19  Duplicates: 0  Warnings: 0

Query OK, 19 rows affected (0.10 sec)
Records: 19  Duplicates: 0  Warnings: 0

You can use the query in Listing 7.4 to check that nineteen sales orders have been loaded correctly.

Listing 7.4. Query to confirm the sales orders are loaded correctly
/******************************************************************/
/*                                                                */
/* confirm_initial_population.sql                                 */
/*                                                                */
/******************************************************************/
USE dw;

SELECT
  order_number on
, customer_name
, product_name
, date
, order_amount amount
FROM
  sales_order_fact a
, customer_dim b
, product_dim c
, order_dim d
, date_dim e
WHERE
    a.customer_sk = b.customer_sk
AND a.product_sk = c.product_sk
AND a.order_sk = d.order_sk
AND a.order_date_sk = e.date_sk
;

/* end of script                                                  */

Run the query in Listing 7.4 by using this command.

mysql> . c:mysqlscriptsconfirm_initial_population.sql

The result should be as follows.

Database changed
+--+------------------------+-----------------+-----------+--------+
|no| customer_name          | product_name    | date      | amount |
+--+------------------------+-----------------+-----------+--------+
| 3| Medium Retailers       | LCD Panel       | 2005-03-01| 4000.00|
| 4| Good Companies         | Hard Disk Drive | 2005-04-15| 4000.00|
| 5| Wonderful Shops        | Floppy Drive    | 2005-05-20| 6000.00|
| 6| Loyal Clients          | LCD Panel       | 2005-07-30| 6000.00|
| 7| Distinguished Partners | Hard Disk Drive | 2005-09-01| 8000.00|
| 8| Really Large Customers | Floppy Drive    | 2005-11-10| 8000.00|
| 9| Small Stores           | LCD Panel       | 2006-01-05| 1000.00|
|10| Medium Retailers       | Hard Disk Drive | 2006-02-10| 1000.00|
|11| Good Companies         | Floppy Drive    | 2006-03-15| 2000.00|
|12| Wonderful Shops        | LCD Panel       | 2006-04-20| 2500.00|
|13| Loyal Clients          | Hard Disk Drive | 2006-05-30| 3000.00|
|14| Distinguished Partners | Floppy Drive    | 2006-06-01| 3500.00|
|15| Really Large Customers | LCD Panel       | 2006-07-15| 4000.00|
|16| Small Stores           | Hard Disk Drive | 2006-08-30| 4500.00|
|17| Medium Retailers       | Floppy Drive    | 2006-09-05| 1000.00|
|18| Good Companies         | LCD Panel       | 2006-10-05| 1000.00|
|19| Wonderful Shops        | Hard Disk Drive | 2007-01-10| 4000.00|
|20| Loyal Clients          | Floppy Drive    | 2007-02-20| 4000.00|
|21| Distinguished Partners | LCD Panel       | 2007-02-28| 4000.00|
+--+------------------------+-----------------+-----------+--------+
19 rows in set (0.00 sec)

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.220.191.247