© Dmitry Anoshin, Dmitry Shirokov, Donna Strok 2020
D. Anoshin et al.Jumpstart Snowflakehttps://doi.org/10.1007/978-1-4842-5328-1_14

14. Time Travel

Dmitry Anoshin1 , Dmitry Shirokov2 and Donna Strok3
(1)
British Columbia, Canada
(2)
Burnaby, BC, Canada
(3)
Seattle, WA, USA
 
This chapter will cover the following tasks:
  • Working with previous versions of objects. In other words, Snowflake provides the ability to query historical data.

  • Creating copies/backups of data on the technical history of objects.

A specialty of the technical design of the Snowflake is that the data is stored in micro-partitions,1 which are immutable. This means that with any operations such as the addition or deletion of data, a new micro-partition is created, and the old one ceases to exist. Using special commands that extend standard SQL, you can easily access historical data.

In general, the user’s data in a system has the following lifecycle:
  • Data is created in data storage.

  • Depending on the license, all states of the data are stored during the retention period (Table 14-1). Users can work with a technical history of any object using SQL extensions.

  • At the end of the term, data moves to a particular zone called fail-safe.2 Accordingly, the actual data of the object, together with the related technical history, becomes inaccessible to the user. In this area, data is stored for seven days and can be recoverable only by Snowflake.

Table 14-1

Data Retention Period Depending on License

License

Description

Standard Edition

The default is one day. (This can be set to zero days.)

Snowflake Enterprise Edition and higher

For permanent objects, the range is from 0 to 90 days.

For transient3 objects, the default is one day. The range is from zero to one day.

The parameter DATA_RETENTION_TIME_IN_DAYS can be set on the whole account or on the object level, meaning database, schema, or table. According to this hierarchy, the parameter can be overridden.

The Snowflake Time Travel SQL extension provides some groups of statements:
  • Querying any version of data of the table using the following:
    • The statement SELECT with AT/BEFORE
      • The user can request the version of the table by specifying the exact time (using keyword TIMESTAMP)

        For example, to get data on August 5, 2019, use this:
        SELECT * FROM <table> AT (TIMESTAMP => 'Mon, 05 Aug 2019 13:30:00 -0700'::timestamp);
      • The user can request the version of the table by specifying the relative time, meaning the time difference in seconds from the present time (using keyword OFFSET).

        For example, select historical data from a table of 30 minutes ago using this:
        SELECT * FROM <table> AT (offset -60*30)
    • STATEMENT identifier of a certain transaction. Here’s an example:
      SELECT * FROM <table> BEFORE (STATEMENT => '<statement_id>');
  • Creating a clone of a table, a schema, or a whole database using the CREATE <TABLE>|<SCHEMA>|<DB> CLONE . <ORIG_OBJECT> statement

    Here’s an example:
    CREATE TABLE <table_restored> CLONE <original_table>);
  • Restoring an object using the UNDROP <TABLE>|<SCHEMA>|<DB> command. Here’s an example:

    UNDROP TABLE <table>

  • Additionally, the command SHOW TABLE HISTORY helps track versions of an object.

Figure 14-1 shows an example. Table 14-2 describes the process.
../images/482416_1_En_14_Chapter/482416_1_En_14_Fig1_HTML.jpg
Figure 14-1

Time Travel feature in Snowflake

Table 14-2

Working with Time Travel

Steps

Description

1

The user sets the retention period for the account to X and then to Y using the ALTER .. SET statement. Accordingly, this strategy applies to all objects in the account.

2

The user sets the retention period for the database or schema to X and then Y using the ALTER .. SET statement. This means that the retention time will be changed for all objects below the hierarchy.

3

The user DROPs the database or schema and then UNDROPs it without any problems because the retention time has not expired yet.

4

The user creates a new table called Table_1 and then adds some data. This means at a point in time, t1, the table contains certain data. Further, the user updates some rows in the table, so, at the time point t2, the table already has other data.

5

The user creates a new table called Table_2 with the previous state of table Table_1 at point of time t1 using the statement SELECT with AT t1 or SELECT with t1.

6

This mechanism also supports schema evolution, which means the user can add a new column to the table and add values into a new column. The user can request any version of the table, and data will be returned in the format it was in at the time of the request.

7

The last step is that the user can DROP and UNDROP the table and also can CREATE a new table as CLONE at any point in time during the retention time.

Time Travel Feature

Let’s look at how to use the Time Travel feature in practice.
  1. 1.

    Log into your Snowflake account.

     
  2. 2.

    Switch to Worksheets and execute the code in Listing 14-1 to check the current data retention parameter.

     
show parameters like '%DATA_RETENTION%' in account;
alter account set DATA_RETENTION_TIME_IN_DAYS = 2;
show parameters like '%DATA%_RETENTION' in database samples;
alter database samples
  set DATA_RETENTION_TIME_IN_DAYS = 1;
Listing 14-1

Checking Retention Parameters and Trying to Change Them

In Listing 14-1, we did the following:

  • We checked the current data retention parameter for the account using the show parameters.. in account command.

  • We changed the parameter to two days for the account using the alter account..set command.

  • We checked the current data retention parameter for the database using the show parameters..in account command. We can see that it changed to 2. Since all objects are attached to an account, the account parameters are automatically applied to all objects below the hierarchy.

  • We changed the parameter to 1 for the database using the alter database..set command. See Figure 14-2.
    ../images/482416_1_En_14_Chapter/482416_1_En_14_Fig2_HTML.jpg
    Figure 14-2

    Data retention parameter

  1. 3.

    Create a new sample table for the example by executing the code in Listing 14-2.

     
create or replace table samples.finance.stocks (
   id int,
    symbol string,
    name string);
insert into samples.finance.stocks
       values(1,'TDC', 'Teradata'),
       (2,'ORCL', 'Oracle'),
       (3,'TSLA', 'Tesla');
select * from samples.finance.stocks;
Listing 14-2

Creating a New Table

In Listing 14-2, we did the following:

  • We created a new empty table called stocks.

  • We populated the table with values. See Figure 14-3.
    ../images/482416_1_En_14_Chapter/482416_1_En_14_Fig3_HTML.jpg
    Figure 14-3

    Sample table

  1. 4.

    Modify the table and try to query the previous state. Wait about a minute after the previous commands and execute the code in Listing 14-3.

     
insert into samples.finance.stocks
values(5,'MSFT', 'Microsoft');
  delete from samples.finance.stocks
  where id = 3;
select * from samples.finance.stocks;
select * from samples.finance.stocks at
(offset => -1*60);
Listing 14-3

Changing Data in the Table and Checking the State of the Table

In Listing 14-3, we did the following:

  • We changed the data in the table, inserted a new row, and deleted one row.

  • We checked the current state of the table. See Figure 14-4.
    ../images/482416_1_En_14_Chapter/482416_1_En_14_Fig4_HTML.jpg
    Figure 14-4

    The last state of the table

  • We checked the state minutes ago using the following. See Figure 14-5.
    at (offset => -1*60).
    ../images/482416_1_En_14_Chapter/482416_1_En_14_Fig5_HTML.jpg
    Figure 14-5

    The previous state of the table

  1. 5.

    Drop and undrop the table, as shown in Listing 14-4.

     
drop table samples.finance.stocks;
select * from samples.finance.stocks;
   undrop table samples.finance.stocks;
Listing 14-4

Changing the Data in the Table and Checking the State of the Table

  1. 6.

    Create a new table as a clone of the previous state of the original table. Execute the code in Listing 14-5.

     
create table samples.finance.stocks_10m clone
   samples.finance.stocks at (offset => -10*60);
Listing 14-5

Creating a Clone of the Table

Summary

In this chapter, we covered the Time Travel feature. Moreover, you learned about the data lifecycle in Snowflake and how to work with the history of data objects in Snowflake.

Finally, we walked you through a few examples using the Time Travel feature.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.14.129.194