Accumulating Snapshots

In this section I demonstrate how to implement an accumulating snapshot on sales orders. This accumulating snapshot tracks five sales order milestones: Order, Allocate, Pack, Ship, and Receive.

The five dates and their quantities come from the sales_order table in the source database. An order with a completed life cycle is described by five rows: one for the time the order is placed, one for the time the ordered products are allocated, one for the time the products are packed, one for the time the order is shipped, and one for the time the order is received by the customer. Each milestone also contains a status, whose value can be one of these: N for New, A for Allocate, P for Pack, S for Ship, and R for Receive.

For the sales_order table to handle the five different statuses, its structure must be changed. The script in Listing 14.3 changes the order_date column to status_date, adds a new column called order_status, and changes the order_quantity column to quantity. As the name implies, the order_status column is used to store one of N, A, P, S, or R. The status describes the value of the status_date column. If a record has a status of N, the value in the status_date column contains the order date. If the status is R, the status_date column contains the received date.

Listing 14.3. Modifying the sales order table
/******************************************************************/
/*                                                                */
/* order_status.sql                                               */
/*                                                                */
/******************************************************************/
USE source;

ALTER TABLE sales_order
  CHANGE order_date status_date DATE
, ADD order_status CHAR(1) AFTER status_date
, CHANGE order_quantity quantity INT
;
/* end of script                                                  */

Run the order_status.sql script in Listing 14.3 by using this command.

mysql> . c:mysqlscriptsorder_status.sql

Here is how the response on your console should look like.

Database changed
Query OK, 49 rows affected (0.40 sec)
Records: 49  Duplicates: 0  Warnings: 0

Having changed the source database, you now need to add four new quantities and four date surrogate keys to the existing sales_order_fact table. Here are the new columns:

  • allocate_date_sk

  • allocate_quantity

  • packing_date_sk

  • packing_quantity

  • ship_date_sk

  • ship_quantity

  • receive_date_sk

  • receive_quantity

Figure 14.2 shows the schema with the eight new columns.

Figure 14.2. The schema for accumulating snapshots


One sales order row in the fact table can now take the five milestones. In other words, the five milestone dates and quantities are accumulated in one sales order fact, hence the term accumulating snapshot.

You use the script in Listing 14.4 to add the four quantities and four date surrogate keys to the sales_order_fact table.

Listing 14.4. Adding four date surrogate keys
/******************************************************************/
/*                                                                */
/* add_four_milestones.sql                                        */
/*                                                                */
/******************************************************************/

USE dw;

ALTER TABLE sales_order_fact
  ADD allocate_date_sk INT AFTER order_date_sk
, ADD allocate_quantity INT
, ADD packing_date_sk INT AFTER allocate_date_sk
, ADD packing_quantity INT
, ADD ship_date_sk INT AFTER packing_date_sk
, ADD ship_quantity INT
, ADD receive_date_sk INT AFTER ship_date_sk
, ADD receive_quantity INT
;

/* end of script                                                  */

Run the add_four_milestones.sql script to add the new fact columns using this command.

mysql> . c:mysqlscriptsadd_four_milestones.sql

The following should be printed on the console as the response.

Database changed
Query OK, 47 rows affected (0.36 sec)
Records: 47  Duplicates: 0  Warnings: 0

You must now apply the database view role-playing technique on the date dimension for the new four dates. The script in Listing 14.5 creates the four date views you need.

Note

You created the order_date_dim view in Chapter 13.


Listing 14.5. Creating four date views
/******************************************************************/
/*                                                                */
/* create_four_date_views.sql                                     */
/*                                                                */
/******************************************************************/

USE dw;

CREATE VIEW allocate_date_dim (
  allocate_date_sk
, allocate_date
, month_name
, month
, quarter
, year
, promo_ind
, effective_date
, expiry_date
)
AS SELECT
  date_sk
, date
, month_name
, month
, quarter
, year
, promo_ind
, effective_date
, expiry_date
FROM date_dim
;

CREATE VIEW packing_date_dim (
  packing_date_sk
, packing_date
, month_name
, month
, quarter
, year
, promo_ind
, effective_date
, expiry_date
)
AS SELECT
  date_sk
, date
, month_name
, month
, quarter
, year
, promo_ind
, effective_date
, expiry_date
FROM date_dim
;

CREATE VIEW ship_date_dim (
  ship_date_sk
, ship_date
, month_name
, month
, quarter
, year
, promo_ind
, effective_date
, expiry_date
)
AS SELECT
  date_sk
, date
, month_name
, month
, quarter
, year
, promo_ind
, effective_date
, expiry_date
FROM date_dim
;

CREATE VIEW receive_date_dim (
  receive_date_sk
, receive_date
, month_name
, month
, quarter
, year
, promo_ind
, effective_date
, expiry_date
)
AS SELECT
  date_sk
, date
, month_name
, month
, quarter
, year
, promo_ind
, effective_date
, expiry_date
FROM date_dim
;

/* end of script                                                  */

Run the script in Listing 14.5 using this command.

mysql> . c:mysqlscriptscreate_four_date_views.sql

Now, you need to revise your regular population script as well since the structure of the fact table has changed. The script in Listing 14.6 is the new regular population script. The five statements that handle the five milestone statuses and dates are commented. More than one transaction for a sales order can be recorded on the same date, in which case the relevant milestone dates are updated at the same time.

Listing 14.6. The revised daily DW regular population
/******************************************************************/
/*                                                                */
/* dw_regular_14.sql                                              */
/*                                                                */
/******************************************************************/

USE dw;

/* CUSTOMER_DIM POPULATION                                        */

TRUNCATE customer_stg;

LOAD DATA INFILE 'customer.csv'
INTO TABLE customer_stg
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '
'
IGNORE 1 LINES
( customer_number
, customer_name
, customer_street_address
, customer_zip_code
, customer_city
, customer_state
, shipping_address
, shipping_zip_code
, shipping_city
, shipping_state )
;

/* SCD 2 ON ADDRESSES                                             */

UPDATE
  customer_dim a
, customer_stg b
SET
  a.expiry_date = SUBDATE(CURRENT_DATE, 1)
WHERE
  a.customer_number = b.customer_number
AND (   a.customer_street_address <> b.customer_street_address
     OR a.customer_city <> b.customer_city
     OR a.customer_zip_code <> b.customer_zip_code
     OR a.customer_state <> b.customer_state
     OR a.shipping_address <> b.shipping_address
     OR a.shipping_city <> b.shipping_city
     OR a.shipping_zip_code <> b.shipping_zip_code
     OR a.shipping_state <> b.shipping_state
     OR a.shipping_address IS NULL
     OR a.shipping_city IS NULL
     OR a.shipping_zip_code IS NULL
     OR a.shipping_state IS NULL)
AND expiry_date = '9999-12-31'
;

INSERT INTO customer_dim
SELECT
  NULL
, b.customer_number
, b.customer_name
, b.customer_street_address
, b.customer_zip_code
, b.customer_city
, b.customer_state
, b.shipping_address
, b.shipping_zip_code
, b.shipping_city
, b.shipping_state
, CURRENT_DATE
, '9999-12-31'
FROM
  customer_dim a
, customer_stg b
WHERE
    a.customer_number = b.customer_number
AND (   a.customer_street_address <> b.customer_street_address
     OR a.customer_city <> b.customer_city
     OR a.customer_zip_code <> b.customer_zip_code
     OR a.customer_state <> b.customer_state
     OR a.shipping_address <> b.shipping_address
     OR a.shipping_city <> b.shipping_city
     OR a.shipping_zip_code <> b.shipping_zip_code
     OR a.shipping_state <> b.shipping_state
     OR a.shipping_address IS NULL
     OR a.shipping_city IS NULL
     OR a.shipping_zip_code IS NULL
     OR a.shipping_state IS NULL)
AND EXISTS (
SELECT *
FROM customer_dim x
WHERE b.customer_number = x.customer_number
AND a.expiry_date = SUBDATE(CURRENT_DATE, 1))
AND NOT EXISTS (
SELECT *
FROM customer_dim y
WHERE     b.customer_number = y.customer_number
      AND y.expiry_date = '9999-12-31')
;

/* END OF SCD 2                                                   */

/* SCD 1 ON NAME                                                  */

UPDATE customer_dim a, customer_stg b
SET a.customer_name = b.customer_name
WHERE     a.customer_number = b.customer_number
      AND a.expiry_date = '9999-12-31'
      AND a.customer_name <> b.customer_name
;

/* ADD NEW CUSTOMER                                               */

INSERT INTO customer_dim
SELECT
  NULL
, customer_number
, customer_name
, customer_street_address
, customer_zip_code
, customer_city
, customer_state
, shipping_address
, shipping_zip_code
, shipping_city
, shipping_state
, CURRENT_DATE
, '9999-12-31'
FROM customer_stg
WHERE customer_number NOT IN(
SELECT a.customer_number
FROM
  customer_dim a
, customer_stg b
WHERE b.customer_number = a.customer_number )
;

/* RE-BUILD PA CUSTOMER DIMENSION                                 */

TRUNCATE pa_customer_dim;

INSERT INTO pa_customer_dim
SELECT
  customer_sk
, customer_number
, customer_name
, customer_street_address
, customer_zip_code
, customer_city
, customer_state
, shipping_address
, shipping_zip_code
, shipping_city
, shipping_state
, effective_date
, expiry_date
FROM customer_dim
WHERE customer_state = 'PA'
;

/* END OF CUSTOMER_DIM POPULATION                                 */

/* PRODUCT_DIM POPULATION                                         */

TRUNCATE product_stg
;

LOAD DATA INFILE 'product.txt'
INTO TABLE product_stg
FIELDS TERMINATED BY ''
OPTIONALLY ENCLOSED BY ''
LINES TERMINATED BY '
'
IGNORE 1 LINES
( product_code
, product_name
, product_category )
;

/* SCD2 ON PRODUCT NAME AND GROUP                                 */

UPDATE
  product_dim a
, product_stg b
SET
  expiry_date = SUBDATE(CURRENT_DATE, 1)
WHERE
    a.product_code = b.product_code
AND (   a.product_name <> b.product_name
     OR a.product_category <> b.product_category )
     AND expiry_date = '9999-12-31'
;

INSERT INTO product_dim
SELECT
  NULL
, b.product_code
, b.product_name
, b.product_category
, CURRENT_DATE
, '9999-12-31'
FROM
  product_dim a
, product_stg b
WHERE
    a.product_code = b.product_code
AND (   a.product_name <> b.product_name
     OR a.product_category <> b.product_category )
AND EXISTS (
SELECT *
FROM product_dim x
WHERE     b.product_code = x.product_code
             AND a.expiry_date = SUBDATE(CURRENT_DATE, 1))
AND NOT EXISTS (
SELECT *
FROM product_dim y
WHERE    b.product_code = y.product_code
             AND y.expiry_date = '9999-12-31')
;

/* END OF SCD 2                                                   */

/* ADD NEW PRODUCT                                                */

INSERT INTO product_dim
SELECT
  NULL
, product_code
, product_name
, product_category
, CURRENT_DATE
, '9999-12-31'
FROM product_stg
WHERE product_code NOT IN(
SELECT y.product_code
FROM product_dim x, product_stg y
WHERE x.product_code = y.product_code )
;

/* END OF PRODUCT_DIM POPULATION                                  */

/* ORDER_DIM POPULATION                                           */
INSERT INTO order_dim (
  order_sk
, order_number
, effective_date
, expiry_date
)
SELECT
  NULL
, order_number
, status_date
, '9999-12-31'
FROM source.sales_order
WHERE
    order_status = 'N'
AND entry_date = CURRENT_DATE
;

/* INSERTING New ORDER                                            */

INSERT INTO sales_order_fact
SELECT
  order_sk
, customer_sk
, product_sk
, e.order_date_sk
, NULL
, NULL
, NULL
, NULL
, f.request_delivery_date_sk
, order_amount
, quantity
, NULL
, NULL
, NULL
, NULL
FROM
  source.sales_order a
, order_dim b
, customer_dim c
, product_dim d
, order_date_dim e
, request_delivery_date_dim f
WHERE
    a.order_status = 'N'
AND a.order_number = b.order_number
AND a.customer_number = c.customer_number
AND a.status_date >= c.effective_date
AND a.status_date <= c.expiry_date
AND a.product_code = d.product_code
AND a.status_date >= d.effective_date
AND a.status_date <= d.expiry_date
AND a.status_date = e.order_date
AND a.request_delivery_date = f.request_delivery_date
AND a.entry_date = CURRENT_DATE
;

/* UPDATING the new sales order to Allocated status               */

UPDATE
  sales_order_fact a
, source.sales_order b
, allocate_date_dim c
, order_dim g
SET
  a.allocate_date_sk = c.allocate_date_sk
, a.allocate_quantity = b.quantity
WHERE
    order_status = 'A'
AND b.entry_date = CURRENT_DATE
AND b.order_number = g.order_number
AND a.order_sk = g.order_sk
AND c.allocate_date = b.status_date
;

/* UPDATING the allocated order to Packed status                  */

UPDATE
  sales_order_fact a
, source.sales_order b
, packing_date_dim d
, order_dim g
SET
  a.packing_date_sk = d.packing_date_sk
, a.packing_quantity = b.quantity
WHERE
    order_status = 'P'
AND b.entry_date = CURRENT_DATE
AND b.order_number = g.order_number
AND a.order_sk = g.order_sk
AND d.packing_date = b.status_date
;

/* UPDATING the packed order to Shipped status                    */

UPDATE
  sales_order_fact a
, source.sales_order b
, ship_date_dim e
, order_dim g
SET
  a.ship_date_sk = e.ship_date_sk
, a.ship_quantity = b.quantity
WHERE
    order_status = 'S'
AND b.entry_date = CURRENT_DATE
AND b.order_number = g.order_number
AND a.order_sk = g.order_sk
AND e.ship_date = b.status_date
;

/* UPDATING the shipped order to Received status                  */

UPDATE
  sales_order_fact a
, source.sales_order b
, receive_date_dim f
, order_dim g
SET
  a.receive_date_sk = f.receive_date_sk
, a.receive_quantity = b.quantity
WHERE
    order_status = 'R'
AND b.entry_date = CURRENT_DATE
AND b.order_number = g.order_number
AND a.order_sk = g.order_sk
AND f.receive_date = b.status_date
;
/* end of script                                                  */

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.147.60.63