In this section I demonstrate how to implement an accumulating snapshot on sales orders. This accumulating snapshot tracks five sales order milestones: Order, Allocate, Pack, Ship, and Receive.
The five dates and their quantities come from the sales_order table in the source database. An order with a completed life cycle is described by five rows: one for the time the order is placed, one for the time the ordered products are allocated, one for the time the products are packed, one for the time the order is shipped, and one for the time the order is received by the customer. Each milestone also contains a status, whose value can be one of these: N for New, A for Allocate, P for Pack, S for Ship, and R for Receive.
For the sales_order table to handle the five different statuses, its structure must be changed. The script in Listing 14.3 changes the order_date column to status_date, adds a new column called order_status, and changes the order_quantity column to quantity. As the name implies, the order_status column is used to store one of N, A, P, S, or R. The status describes the value of the status_date column. If a record has a status of N, the value in the status_date column contains the order date. If the status is R, the status_date column contains the received date.
/******************************************************************/ /* */ /* order_status.sql */ /* */ /******************************************************************/ USE source; ALTER TABLE sales_order CHANGE order_date status_date DATE , ADD order_status CHAR(1) AFTER status_date , CHANGE order_quantity quantity INT ; /* end of script */ |
Run the order_status.sql script in Listing 14.3 by using this command.
mysql> . c:mysqlscriptsorder_status.sql
Here is how the response on your console should look like.
Database changed Query OK, 49 rows affected (0.40 sec) Records: 49 Duplicates: 0 Warnings: 0
Having changed the source database, you now need to add four new quantities and four date surrogate keys to the existing sales_order_fact table. Here are the new columns:
allocate_date_sk
allocate_quantity
packing_date_sk
packing_quantity
ship_date_sk
ship_quantity
receive_date_sk
receive_quantity
Figure 14.2 shows the schema with the eight new columns.
One sales order row in the fact table can now take the five milestones. In other words, the five milestone dates and quantities are accumulated in one sales order fact, hence the term accumulating snapshot.
You use the script in Listing 14.4 to add the four quantities and four date surrogate keys to the sales_order_fact table.
/******************************************************************/ /* */ /* add_four_milestones.sql */ /* */ /******************************************************************/ USE dw; ALTER TABLE sales_order_fact ADD allocate_date_sk INT AFTER order_date_sk , ADD allocate_quantity INT , ADD packing_date_sk INT AFTER allocate_date_sk , ADD packing_quantity INT , ADD ship_date_sk INT AFTER packing_date_sk , ADD ship_quantity INT , ADD receive_date_sk INT AFTER ship_date_sk , ADD receive_quantity INT ; /* end of script */ |
Run the add_four_milestones.sql script to add the new fact columns using this command.
mysql> . c:mysqlscriptsadd_four_milestones.sql
The following should be printed on the console as the response.
Database changed Query OK, 47 rows affected (0.36 sec) Records: 47 Duplicates: 0 Warnings: 0
You must now apply the database view role-playing technique on the date dimension for the new four dates. The script in Listing 14.5 creates the four date views you need.
Note
You created the order_date_dim view in Chapter 13.
/******************************************************************/ /* */ /* create_four_date_views.sql */ /* */ /******************************************************************/ USE dw; CREATE VIEW allocate_date_dim ( allocate_date_sk , allocate_date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date ) AS SELECT date_sk , date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date FROM date_dim ; CREATE VIEW packing_date_dim ( packing_date_sk , packing_date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date ) AS SELECT date_sk , date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date FROM date_dim ; CREATE VIEW ship_date_dim ( ship_date_sk , ship_date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date ) AS SELECT date_sk , date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date FROM date_dim ; CREATE VIEW receive_date_dim ( receive_date_sk , receive_date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date ) AS SELECT date_sk , date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date FROM date_dim ; /* end of script */ |
Run the script in Listing 14.5 using this command.
mysql> . c:mysqlscriptscreate_four_date_views.sql
Now, you need to revise your regular population script as well since the structure of the fact table has changed. The script in Listing 14.6 is the new regular population script. The five statements that handle the five milestone statuses and dates are commented. More than one transaction for a sales order can be recorded on the same date, in which case the relevant milestone dates are updated at the same time.
/******************************************************************/ /* */ /* dw_regular_14.sql */ /* */ /******************************************************************/ USE dw; /* CUSTOMER_DIM POPULATION */ TRUNCATE customer_stg; LOAD DATA INFILE 'customer.csv' INTO TABLE customer_stg FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY ' ' IGNORE 1 LINES ( customer_number , customer_name , customer_street_address , customer_zip_code , customer_city , customer_state , shipping_address , shipping_zip_code , shipping_city , shipping_state ) ; /* SCD 2 ON ADDRESSES */ UPDATE customer_dim a , customer_stg b SET a.expiry_date = SUBDATE(CURRENT_DATE, 1) WHERE a.customer_number = b.customer_number AND ( a.customer_street_address <> b.customer_street_address OR a.customer_city <> b.customer_city OR a.customer_zip_code <> b.customer_zip_code OR a.customer_state <> b.customer_state OR a.shipping_address <> b.shipping_address OR a.shipping_city <> b.shipping_city OR a.shipping_zip_code <> b.shipping_zip_code OR a.shipping_state <> b.shipping_state OR a.shipping_address IS NULL OR a.shipping_city IS NULL OR a.shipping_zip_code IS NULL OR a.shipping_state IS NULL) AND expiry_date = '9999-12-31' ; INSERT INTO customer_dim SELECT NULL , b.customer_number , b.customer_name , b.customer_street_address , b.customer_zip_code , b.customer_city , b.customer_state , b.shipping_address , b.shipping_zip_code , b.shipping_city , b.shipping_state , CURRENT_DATE , '9999-12-31' FROM customer_dim a , customer_stg b WHERE a.customer_number = b.customer_number AND ( a.customer_street_address <> b.customer_street_address OR a.customer_city <> b.customer_city OR a.customer_zip_code <> b.customer_zip_code OR a.customer_state <> b.customer_state OR a.shipping_address <> b.shipping_address OR a.shipping_city <> b.shipping_city OR a.shipping_zip_code <> b.shipping_zip_code OR a.shipping_state <> b.shipping_state OR a.shipping_address IS NULL OR a.shipping_city IS NULL OR a.shipping_zip_code IS NULL OR a.shipping_state IS NULL) AND EXISTS ( SELECT * FROM customer_dim x WHERE b.customer_number = x.customer_number AND a.expiry_date = SUBDATE(CURRENT_DATE, 1)) AND NOT EXISTS ( SELECT * FROM customer_dim y WHERE b.customer_number = y.customer_number AND y.expiry_date = '9999-12-31') ; /* END OF SCD 2 */ /* SCD 1 ON NAME */ UPDATE customer_dim a, customer_stg b SET a.customer_name = b.customer_name WHERE a.customer_number = b.customer_number AND a.expiry_date = '9999-12-31' AND a.customer_name <> b.customer_name ; /* ADD NEW CUSTOMER */ INSERT INTO customer_dim SELECT NULL , customer_number , customer_name , customer_street_address , customer_zip_code , customer_city , customer_state , shipping_address , shipping_zip_code , shipping_city , shipping_state , CURRENT_DATE , '9999-12-31' FROM customer_stg WHERE customer_number NOT IN( SELECT a.customer_number FROM customer_dim a , customer_stg b WHERE b.customer_number = a.customer_number ) ; /* RE-BUILD PA CUSTOMER DIMENSION */ TRUNCATE pa_customer_dim; INSERT INTO pa_customer_dim SELECT customer_sk , customer_number , customer_name , customer_street_address , customer_zip_code , customer_city , customer_state , shipping_address , shipping_zip_code , shipping_city , shipping_state , effective_date , expiry_date FROM customer_dim WHERE customer_state = 'PA' ; /* END OF CUSTOMER_DIM POPULATION */ /* PRODUCT_DIM POPULATION */ TRUNCATE product_stg ; LOAD DATA INFILE 'product.txt' INTO TABLE product_stg FIELDS TERMINATED BY '' OPTIONALLY ENCLOSED BY '' LINES TERMINATED BY ' ' IGNORE 1 LINES ( product_code , product_name , product_category ) ; /* SCD2 ON PRODUCT NAME AND GROUP */ UPDATE product_dim a , product_stg b SET expiry_date = SUBDATE(CURRENT_DATE, 1) WHERE a.product_code = b.product_code AND ( a.product_name <> b.product_name OR a.product_category <> b.product_category ) AND expiry_date = '9999-12-31' ; INSERT INTO product_dim SELECT NULL , b.product_code , b.product_name , b.product_category , CURRENT_DATE , '9999-12-31' FROM product_dim a , product_stg b WHERE a.product_code = b.product_code AND ( a.product_name <> b.product_name OR a.product_category <> b.product_category ) AND EXISTS ( SELECT * FROM product_dim x WHERE b.product_code = x.product_code AND a.expiry_date = SUBDATE(CURRENT_DATE, 1)) AND NOT EXISTS ( SELECT * FROM product_dim y WHERE b.product_code = y.product_code AND y.expiry_date = '9999-12-31') ; /* END OF SCD 2 */ /* ADD NEW PRODUCT */ INSERT INTO product_dim SELECT NULL , product_code , product_name , product_category , CURRENT_DATE , '9999-12-31' FROM product_stg WHERE product_code NOT IN( SELECT y.product_code FROM product_dim x, product_stg y WHERE x.product_code = y.product_code ) ; /* END OF PRODUCT_DIM POPULATION */ /* ORDER_DIM POPULATION */ INSERT INTO order_dim ( order_sk , order_number , effective_date , expiry_date ) SELECT NULL , order_number , status_date , '9999-12-31' FROM source.sales_order WHERE order_status = 'N' AND entry_date = CURRENT_DATE ; /* INSERTING New ORDER */ INSERT INTO sales_order_fact SELECT order_sk , customer_sk , product_sk , e.order_date_sk , NULL , NULL , NULL , NULL , f.request_delivery_date_sk , order_amount , quantity , NULL , NULL , NULL , NULL FROM source.sales_order a , order_dim b , customer_dim c , product_dim d , order_date_dim e , request_delivery_date_dim f WHERE a.order_status = 'N' AND a.order_number = b.order_number AND a.customer_number = c.customer_number AND a.status_date >= c.effective_date AND a.status_date <= c.expiry_date AND a.product_code = d.product_code AND a.status_date >= d.effective_date AND a.status_date <= d.expiry_date AND a.status_date = e.order_date AND a.request_delivery_date = f.request_delivery_date AND a.entry_date = CURRENT_DATE ; /* UPDATING the new sales order to Allocated status */ UPDATE sales_order_fact a , source.sales_order b , allocate_date_dim c , order_dim g SET a.allocate_date_sk = c.allocate_date_sk , a.allocate_quantity = b.quantity WHERE order_status = 'A' AND b.entry_date = CURRENT_DATE AND b.order_number = g.order_number AND a.order_sk = g.order_sk AND c.allocate_date = b.status_date ; /* UPDATING the allocated order to Packed status */ UPDATE sales_order_fact a , source.sales_order b , packing_date_dim d , order_dim g SET a.packing_date_sk = d.packing_date_sk , a.packing_quantity = b.quantity WHERE order_status = 'P' AND b.entry_date = CURRENT_DATE AND b.order_number = g.order_number AND a.order_sk = g.order_sk AND d.packing_date = b.status_date ; /* UPDATING the packed order to Shipped status */ UPDATE sales_order_fact a , source.sales_order b , ship_date_dim e , order_dim g SET a.ship_date_sk = e.ship_date_sk , a.ship_quantity = b.quantity WHERE order_status = 'S' AND b.entry_date = CURRENT_DATE AND b.order_number = g.order_number AND a.order_sk = g.order_sk AND e.ship_date = b.status_date ; /* UPDATING the shipped order to Received status */ UPDATE sales_order_fact a , source.sales_order b , receive_date_dim f , order_dim g SET a.receive_date_sk = f.receive_date_sk , a.receive_quantity = b.quantity WHERE order_status = 'R' AND b.entry_date = CURRENT_DATE AND b.order_number = g.order_number AND a.order_sk = g.order_sk AND f.receive_date = b.status_date ; /* end of script */ |
3.147.60.63