In Chapter 4, you worked through some examples of account controls and learned how to implement Snowflake features to keep bad actors out and protect your accounts. This chapter builds on Chapter 4 by implementing the corresponding monitoring and alerting.
Let’s take a step back and ask what other operational functions we can facilitate by building out monitoring, and see whether the Account Usage store can fulfill our needs. The functional requirements typically include the following.
Audit: Both internal and external auditors will want to access the Account Usage store and run ad hoc queries. We cannot know the full suite of future requirements, but we can make an educated guess at the likely use cases an audit function would want to satisfy.
Consumption charging: Every finance department wants to know how many credits have been consumed and by whom and looks at consumption growth trends over time. The procurement team will want to know consumption numbers to project future trends and inform contract renewal negotiations.
Metadata cataloging: Enterprise data model mapping, object and attribute capture, and data lineage are valid use cases.
Event monitoring: This chapter’s primary purpose is building tooling that frequently monitors activity and raises alerts.
Please ensure your account is Business Critical Edition, as the examples in this chapter depend upon Business Critical Edition features. Otherwise, please register a new trial account using a different email address, selecting the Business Critical option at https://signup.snowflake.com.
Snowflake continually enhances the Account Usage views and capabilities. Wherever possible, we want to future-proof our usage and enable new monitoring requirements with minimal rework. I suggest the operational support team conduct monthly reviews of Account Usage views to identify changes proactively. It is better to be on top of change and present our cybersecurity colleagues with opportunities to tighten controls and add value.
Automating each control using streams and tasks
brings its own challenges and the obligation to record the status and logging of each task run. For simplicity and to reduce the page count, I have not implemented a status setting or log in each procedure. I leave this for you to investigate but note that you may encounter concurrency issues with multiple high-frequency processes updating the same status and logging table. Likewise, the code presented is a skeleton. You may wish to implement additional attributes such as URLs to confluence pages containing control information or an active_flag for each parameter. Just note the need to extend all dependent objects such as views and MERGE statements.
You should also add comments, omitted here due to page constraints; hopefully, the code is sufficiently self-explanatory.
This chapter could—and probably should—be a book in its own right. The more I dig into the Account Usage store, the more I find of interest. After writing the monitoring suite over a year ago, it is a delight revisiting this subject, and I hope this chapter inspires you to investigate further. Please also see https://docs.snowflake.com/en/sql-reference/account-usage.html.
Let’s now discuss Snowflake Account Usage: what it is, how to use it, and practical hands-on suggestions to make your job easier when building monitoring tooling.
What Is the Account Usage Store?
All RDBMSs have a metadata catalog. Oracle has data dictionary views, dynamic performance views (v$), and a plethora of k$ and x$ tables, complex and hard to navigate. SQL Server and MySQL have information_schema, and Snowflake has followed the same pattern by implementing an information schema for each created database.
In addition, as a single, account-level, holistic source of information, Snowflake supplies an imported database confusingly called SNOWFLAKE, maintained by Snowflake, where the Account Usage store resides; it also contains an information schema. Please note some features to be aware of. Account Usage views have a latency of between 45 minutes and 3 hours, so transactions can take time to appear in the views, which typically store information for up to one year. There is, however, a way to access data in near real time by using information_schema, noting its data is only stored for 14 days. I demonstrate both approaches. The distinction is important. The Account Usage store is an aggregated view of Snowflake’s centrally maintained data presented as a shared database, whereas the information schema is driven from the global services layer, accessing the FoundationDB key-value store. We use both throughout this chapter.
Depending upon the role in use, the imported SNOWFLAKE database may not be visible, in which case change the role. ACCOUNTADMIN is the only role enabled by default to access Account Usage store content.
USE ROLE accountadmin;
Then refresh as shown in Figure 6-1, after which the Snowflake database is viewable. SNOWFLAKE and SNOWFLAKE_SAMPLE_DATA are both imported databases. An imported database is one supplied by Snowflake as part of the initial account delivery. Like any other object, they can be dropped.
Note
Do not drop the SNOWFLAKE database!
You may drop SNOWFLAKE_SAMPLE_DATA, but note UNDROP is not supported for an imported database; it must be re-created, along with re-granting imported privileges to roles.
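Should you need to re-create SNOWFLAKE_SAMPLE_DATA after dropping it, the usual approach is to import it from the Snowflake-provided share and re-grant access. The following is a sketch; the share name sfc_samples.sample_data is the commonly documented source but may differ in some accounts.

```sql
USE ROLE accountadmin;

-- Re-create the sample database from the Snowflake-provided share
CREATE DATABASE snowflake_sample_data FROM SHARE sfc_samples.sample_data;

-- Re-grant imported privileges so other roles can query the sample data
GRANT IMPORTED PRIVILEGES ON DATABASE snowflake_sample_data TO ROLE public;
```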
By default, only the ACCOUNTADMIN role can access the Account Usage views. Other roles can see the Snowflake database but cannot access its contents. Immutable history is restricted to one year, which is insufficient to support audit requests spanning more than one year. We therefore have a further requirement to periodically capture and retain all history.
Let’s dive into the Snowflake Account Usage store, along with an investigation of the global services layer. In line with the hands-on approach, I offer a script and walk-through with a strong recommendation to cut and paste each section as a practical learning experience. The code looks similar to that in Chapter 5, but the implementation is quite different.
Best Practices
Snowflake best practice suggests using the one-year immutable history to check events against, but I suggest this does not fulfill all requirements. There are scenarios where account settings have aged out of the history and are not available for comparison or reinstatement to their original values. I show you how to mitigate this risk in the history retention section. In other words, always expect the unexpected, don’t trust anything, prove assertions, and verify the recommendations match your requirements. Those words of wisdom have served me well in nearly 30 years of development.
Accessing the Account Usage Store
As discussed
in Chapter 5, the only way to grant entitlement is via a role. The first step is to design our approach, as shown in Figure 6-2.
Our approach is to create a new database (MONITOR) containing two schemas: one for static reference data and the second to wrap the source Account Usage views to enrich, contextualize, summarize, and pre-filter the source data to make reporting easier.
While we must explicitly enable access to the Account Usage store, each database information schema is immediately accessible.
Preamble
Our script starts with some declarations
enabling later automation.
SET monitor_database = 'MONITOR';
SET monitor_reference_schema = 'MONITOR.reference_owner';
SET monitor_owner_schema = 'MONITOR.monitor_owner';
SET monitor_warehouse = 'monitor_wh';
SET monitor_reference_role = 'monitor_reference_role';
SET monitor_owner_role = 'monitor_owner_role';
SET monitor_reader_role = 'monitor_reader_role';
The SET command allows us to declare a variable whose actual value is resolved when the script runs. You see how the labels are interpreted shortly; they provide a single point where we can make declarations for later automation.
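To confirm a label resolves as expected, a session variable can be queried directly and returns its literal value:

```sql
SET monitor_database = 'MONITOR';

-- Returns the literal string MONITOR
SELECT $monitor_database;

-- List all variables set in the current session
SHOW VARIABLES;
```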
Database, Warehouse, and Schema
The following command set changes role to SYSADMIN, then creates a database with 90-day time travel, a single extra-small warehouse, and two schemas within the database. Refresh your browser to see the new database and schemas.
USE ROLE sysadmin;
CREATE OR REPLACE DATABASE IDENTIFIER ( $monitor_database ) DATA_RETENTION_TIME_IN_DAYS = 90;
CREATE OR REPLACE WAREHOUSE IDENTIFIER ( $monitor_warehouse ) WITH
WAREHOUSE_SIZE = 'X-SMALL'
AUTO_SUSPEND = 60
AUTO_RESUME = TRUE
MIN_CLUSTER_COUNT = 1
MAX_CLUSTER_COUNT = 4
SCALING_POLICY = 'STANDARD'
INITIALLY_SUSPENDED = TRUE;
CREATE OR REPLACE SCHEMA IDENTIFIER ( $monitor_reference_schema );
CREATE OR REPLACE SCHEMA IDENTIFIER ( $monitor_owner_schema );
In the preceding example, the labels are expanded by using the IDENTIFIER keyword. Using the SYSADMIN role ensures all created objects remain owned by SYSADMIN, a safeguard against unexpected manipulation.
Your worksheet should reflect Figure 6-3 when refreshed.
Roles
Let’s switch to the SECURITYADMIN role to create custom roles for later use, again resolving labels via IDENTIFIER.
USE ROLE securityadmin;
CREATE OR REPLACE ROLE IDENTIFIER ( $monitor_reference_role ) COMMENT = 'MONITOR.monitor_reference Role';
CREATE OR REPLACE ROLE IDENTIFIER ( $monitor_owner_role ) COMMENT = 'MONITOR.monitor_owner Role';
CREATE OR REPLACE ROLE IDENTIFIER ( $monitor_reader_role ) COMMENT = 'MONITOR.monitor_reader Role';
In line with Snowflake’s best practices, let’s assign our newly created roles to SECURITYADMIN. (Note that we may choose to have a custom
local admin role as an intermediary role—some homework for you!)
GRANT ROLE IDENTIFIER ( $monitor_reference_role ) TO ROLE securityadmin;
GRANT ROLE IDENTIFIER ( $monitor_owner_role ) TO ROLE securityadmin;
GRANT ROLE IDENTIFIER ( $monitor_reader_role ) TO ROLE securityadmin;
Role Grants
With the SECURITYADMIN role in use, let’s enable each role to use the database and warehouse.
GRANT USAGE ON DATABASE IDENTIFIER ( $monitor_database ) TO ROLE IDENTIFIER ( $monitor_reference_role );
GRANT USAGE ON WAREHOUSE IDENTIFIER ( $monitor_warehouse ) TO ROLE IDENTIFIER ( $monitor_reference_role );
GRANT OPERATE ON WAREHOUSE IDENTIFIER ( $monitor_warehouse ) TO ROLE IDENTIFIER ( $monitor_reference_role );
GRANT USAGE ON SCHEMA IDENTIFIER ( $monitor_reference_schema ) TO ROLE IDENTIFIER ( $monitor_reference_role );
GRANT USAGE ON DATABASE IDENTIFIER ( $monitor_database ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT USAGE ON WAREHOUSE IDENTIFIER ( $monitor_warehouse ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT OPERATE ON WAREHOUSE IDENTIFIER ( $monitor_warehouse ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT USAGE ON SCHEMA IDENTIFIER ( $monitor_reference_schema ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT USAGE ON SCHEMA IDENTIFIER ( $monitor_owner_schema ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT USAGE ON DATABASE IDENTIFIER ( $monitor_database ) TO ROLE IDENTIFIER ( $monitor_reader_role );
GRANT USAGE ON WAREHOUSE IDENTIFIER ( $monitor_warehouse ) TO ROLE IDENTIFIER ( $monitor_reader_role );
GRANT OPERATE ON WAREHOUSE IDENTIFIER ( $monitor_warehouse ) TO ROLE IDENTIFIER ( $monitor_reader_role );
GRANT USAGE ON SCHEMA IDENTIFIER ( $monitor_owner_schema ) TO ROLE IDENTIFIER ( $monitor_reader_role );
Now let’s grant specific object entitlements for each of the two schemas.
GRANT CREATE TABLE ON SCHEMA IDENTIFIER ( $monitor_reference_schema ) TO ROLE IDENTIFIER ( $monitor_reference_role );
GRANT CREATE VIEW ON SCHEMA IDENTIFIER ( $monitor_reference_schema ) TO ROLE IDENTIFIER ( $monitor_reference_role );
GRANT CREATE SEQUENCE ON SCHEMA IDENTIFIER ( $monitor_reference_schema ) TO ROLE IDENTIFIER ( $monitor_reference_role );
GRANT CREATE STREAM ON SCHEMA IDENTIFIER ( $monitor_reference_schema ) TO ROLE IDENTIFIER ( $monitor_reference_role );
GRANT CREATE MATERIALIZED VIEW ON SCHEMA IDENTIFIER ( $monitor_reference_schema ) TO ROLE IDENTIFIER ( $monitor_reference_role );
GRANT CREATE TABLE ON SCHEMA IDENTIFIER ( $monitor_owner_schema ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT CREATE VIEW ON SCHEMA IDENTIFIER ( $monitor_owner_schema ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT CREATE SEQUENCE ON SCHEMA IDENTIFIER ( $monitor_owner_schema ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT CREATE FUNCTION ON SCHEMA IDENTIFIER ( $monitor_owner_schema ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT CREATE PROCEDURE ON SCHEMA IDENTIFIER ( $monitor_owner_schema ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT CREATE STREAM ON SCHEMA IDENTIFIER ( $monitor_owner_schema ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT CREATE MATERIALIZED VIEW ON SCHEMA IDENTIFIER ( $monitor_owner_schema ) TO ROLE IDENTIFIER ( $monitor_owner_role );
GRANT CREATE FILE FORMAT ON SCHEMA IDENTIFIER ( $monitor_owner_schema ) TO ROLE IDENTIFIER ( $monitor_owner_role );
Assigning Roles to Users
When all three roles have been created and all entitlement granted, we must enable one
or more users to interact with our empty schemas. We do this by granting each role to nominated users.
GRANT ROLE IDENTIFIER ( $monitor_reference_role ) TO USER <your_name>;
GRANT ROLE IDENTIFIER ( $monitor_owner_role ) TO USER <your_name>;
GRANT ROLE IDENTIFIER ( $monitor_reader_role ) TO USER <your_name>;
Enabling the Account Usage Store
The following command entitles monitor_owner_role to access the Account Usage store.
GRANT IMPORTED PRIVILEGES ON DATABASE snowflake TO ROLE IDENTIFIER ( $monitor_owner_role );
Switch to monitor_owner_role as this is the only custom role that can view Account Usage store objects and then run a sample query.
USE ROLE IDENTIFIER ( $monitor_owner_role );
USE DATABASE IDENTIFIER ( $monitor_database );
USE SCHEMA IDENTIFIER ( $monitor_owner_schema );
USE WAREHOUSE IDENTIFIER ( $monitor_warehouse );
We can now query the Account Usage views, in this example, selecting metadata for all active databases
in our account.
SELECT * FROM SNOWFLAKE.account_usage.databases WHERE deleted IS NULL;
Building Reference Data Sets
In the previous sections, we walked through setting up our environment. Now let’s focus on the practical implementation of monitoring objects. To do this, we must understand what we are monitoring and why.
Starting with a simple use case, our first objective is to examine the sample controls developed in Chapter 4 and determine whether a pattern-based approach can be derived, enabling future reuse. A design pattern is a code template that can easily be replicated, extended, or, better still, be data driven.
From experience, there are six different patterns to be developed that cover all known scenarios. Let’s begin our investigation into these patterns, starting with the simplest and adding notes to build up to the most complex. Note space does not permit a full explanation of each pattern, and this chapter builds upon principles outlined in Chapter 5, albeit in a different format.
With a fully historized reference data set, we achieve two objectives. First, we establish a temporal history for audit purposes; because settings age out of the history after a year, this is a weakness we must correct. Second, in the unlikely event a parameter is incorrectly set, we have a record of what the current value should be and of all historical values.
We must identify a way of ensuring parameter values are recorded for comparison, hence the need for reference data created in the reference_owner schema. We also introduce a way of historizing our reference data to ensure the data lineage is preserved, and here a brief digression is in order. Data warehouses are often required to re-create a report as of any point in time for which the data warehouse holds information. But if we only hold the most recent values for a record, we cannot reconstruct the history. Enter Slowly Changing Dimension (SCD) modeling. A full explanation is beyond the scope of this chapter, but more information is at https://en.wikipedia.org/wiki/Slowly_changing_dimension. For our purposes, we need to know account parameter values at any point in history. SCD type 2 (SCD2) provides the answer; hence all historization for reference data is SCD2 based.
Finally, to ease building out each control pattern, we need to identify a way of grouping like controls in our reference data. Please take time to understand the companion script, Chapter_6_Account_Usage_Demo.sql, and all will become clear.
Referencing Schema Objects
Our reference schema contains
three core tables along with their historized counterparts and a single secure view that denormalizes the data model for ease of use. A full explanation of denormalization is at https://en.wikipedia.org/wiki/Denormalization. Figure 6-4 shows the three tables and secure view we will create, along with their relationships.
The general pattern for creating a single-table SCD2 implementation is shown in Figure 6-5. Please refer to the accompanying script, Chapter_6_Account_Usage_Demo.sql, for the complete implementation of all three tables and the secure view. Note Chapter_6_Account_Usage_Demo.sql takes a few minutes to run as there are many SQL statements.
Figure 6-5 shows the full suite of SQL statements and data flow required to implement our example.
Let’s walk through Figure 6-5. We create sequence monitor_group_id_seq to generate a surrogate key. Then we create tables monitor_group and hist_monitor_group, plus stream strm_monitor_group, which captures records as they are inserted into the monitor_group table. We define view v_monitor_group against both the stream and the history table; it does the hard work of identifying delta records. With all objects created, we insert data into monitor_group and finally run the MERGE statement, which loads the SCD2 records into hist_monitor_group. Don’t worry, all will become clear; just know there are dependencies on object creation order, which I now explain.
Change role to monitor_reference_role and set context.
USE ROLE IDENTIFIER ( $monitor_reference_role );
USE DATABASE IDENTIFIER ( $monitor_database );
USE SCHEMA IDENTIFIER ( $monitor_reference_schema );
USE WAREHOUSE IDENTIFIER ( $monitor_warehouse );
Create a sequence.
CREATE OR REPLACE SEQUENCE monitor_group_id_seq START WITH 10000;
Create tables.
CREATE OR REPLACE TABLE monitor_group
(
monitor_group_id NUMBER(10) PRIMARY KEY NOT NULL,
monitor_group_name VARCHAR(255) NOT NULL,
last_updated TIMESTAMP_NTZ DEFAULT current_timestamp()::TIMESTAMP_NTZ NOT NULL
);
Create a stream to capture changes to monitor_group.
CREATE OR REPLACE STREAM strm_monitor_group ON TABLE monitor_group;
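The history table hist_monitor_group is created in the companion script, Chapter_6_Account_Usage_Demo.sql. Its minimal definition can be inferred from the view and MERGE statement that follow; a sketch:

```sql
CREATE OR REPLACE TABLE hist_monitor_group
(
monitor_group_id     NUMBER(10)    NOT NULL,
monitor_group_name   VARCHAR(255)  NOT NULL,
valid_from           TIMESTAMP_NTZ NOT NULL,
valid_to             TIMESTAMP_NTZ NOT NULL,
current_flag         VARCHAR(1)    NOT NULL
);
```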
Create a history view. Note that UNION ALL joins three sections: the first code block identifies records for INSERTs, the second for UPDATEs, and the third for DELETEs. An UPDATE is two actions: a logical DELETE and an INSERT. Hence, in the v_monitor_group declaration, dml_type is never set to ‘U’.
CREATE OR REPLACE VIEW v_monitor_group
AS
SELECT monitor_group_id,
monitor_group_name,
valid_from,
valid_to,
current_flag,
'I' AS dml_type
FROM (
SELECT monitor_group_id,
monitor_group_name,
last_updated AS valid_from,
LAG ( last_updated ) OVER ( PARTITION BY monitor_group_id ORDER BY last_updated DESC ) AS valid_to_raw,
CASE
WHEN valid_to_raw IS NULL
THEN '9999-12-31'::TIMESTAMP_NTZ
ELSE valid_to_raw
END AS valid_to,
CASE
WHEN valid_to_raw IS NULL
THEN 'Y'
ELSE 'N'
END AS current_flag,
'I' AS dml_type
FROM (
SELECT strm.monitor_group_id,
strm.monitor_group_name,
strm.last_updated
FROM strm_monitor_group strm
WHERE strm.metadata$action = 'INSERT'
AND strm.metadata$isupdate = 'FALSE'
)
)
UNION ALL
SELECT monitor_group_id,
monitor_group_name,
valid_from,
valid_to,
current_flag,
dml_type
FROM (
SELECT monitor_group_id,
monitor_group_name,
valid_from,
LAG ( valid_from ) OVER ( PARTITION BY monitor_group_id ORDER BY valid_from DESC ) AS valid_to_raw,
CASE
WHEN valid_to_raw IS NULL
THEN '9999-12-31'::TIMESTAMP_NTZ
ELSE valid_to_raw
END AS valid_to,
CASE
WHEN valid_to_raw IS NULL
THEN 'Y'
ELSE 'N'
END AS current_flag,
dml_type
FROM (
SELECT strm.monitor_group_id,
strm.monitor_group_name,
strm.last_updated AS valid_from,
'I' AS dml_type
FROM strm_monitor_group strm
WHERE strm.metadata$action = 'INSERT'
AND strm.metadata$isupdate = 'TRUE'
UNION ALL
SELECT tgt.monitor_group_id,
tgt.monitor_group_name,
tgt.valid_from,
'D' AS dml_type
FROM hist_monitor_group tgt
WHERE tgt.monitor_group_id IN
(
SELECT DISTINCT strm.monitor_group_id
FROM strm_monitor_group strm
WHERE strm.metadata$action = 'INSERT'
AND strm.metadata$isupdate = 'TRUE'
)
AND tgt.current_flag = 'Y'
)
)
UNION ALL
SELECT strm.monitor_group_id,
strm.monitor_group_name,
tgt.valid_from,
current_timestamp()::TIMESTAMP_NTZ AS valid_to,
NULL,
'D' AS dml_type
FROM hist_monitor_group tgt
INNER JOIN strm_monitor_group strm
ON tgt.monitor_group_id = strm.monitor_group_id
WHERE strm.metadata$action = 'DELETE'
AND strm.metadata$isupdate = 'FALSE'
AND tgt.current_flag = 'Y';
Determine the monitor groups. For example, we have account parameters, network policies, and time travel; there are others to add later as we encounter them, but for now, these three suffice. Create INSERT statements for each.
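As a sketch (the group names shown are illustrative; Chapter_6_Account_Usage_Demo.sql defines the definitive set), the INSERT statements might look like this, relying upon the last_updated column default:

```sql
INSERT INTO monitor_group ( monitor_group_id, monitor_group_name )
SELECT monitor_group_id_seq.NEXTVAL, 'Account Parameters';

INSERT INTO monitor_group ( monitor_group_id, monitor_group_name )
SELECT monitor_group_id_seq.NEXTVAL, 'Network Policies';

INSERT INTO monitor_group ( monitor_group_id, monitor_group_name )
SELECT monitor_group_id_seq.NEXTVAL, 'Time Travel';
```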
Run the MERGE statement to historize the data. As dml_type is never set to ‘U’, we only need to consider INSERT and (logical) DELETE operations. Note the number of rows inserted and the number of rows updated in the returned result set.
MERGE INTO hist_monitor_group tgt
USING v_monitor_group strm
ON tgt.monitor_group_id = strm.monitor_group_id
AND tgt.valid_from = strm.valid_from
AND tgt.monitor_group_name = strm.monitor_group_name
WHEN MATCHED AND strm.dml_type = 'D' THEN
UPDATE
SET tgt.valid_to = strm.valid_to,
tgt.current_flag = 'N'
WHEN NOT MATCHED AND strm.dml_type = 'I' THEN
INSERT
(
tgt.monitor_group_id,
tgt.monitor_group_name,
tgt.valid_from,
tgt.valid_to,
tgt.current_flag
)
VALUES
(
strm.monitor_group_id,
strm.monitor_group_name,
strm.valid_from,
strm.valid_to,
strm.current_flag
);
Now query the history table to ensure the MERGE created the three records, noting the valid_to date is set to ‘9999-12-31’ and current_flag = ‘Y’.
SELECT * FROM hist_monitor_group;
Testing
Please refer to the corresponding script, Chapter_6_Account_Usage_Demo.sql, where additional SQL commands are found to INSERT, UPDATE, and DELETE test parameters, always remembering to run the MERGE statement that historizes the data. An additional step for automation (possibly overkill, as maintenance is a very low-volume, occasional activity) would be to create a task; see Chapter 5 for an example.
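For example, a quick test of the UPDATE path might look like the following (the new group name is illustrative). After re-running the MERGE, the original row should show current_flag = ‘N’ with its valid_to closed off, alongside a new current row:

```sql
UPDATE monitor_group
SET    monitor_group_name = 'Account Parameters Renamed',
       last_updated       = current_timestamp()::TIMESTAMP_NTZ
WHERE  monitor_group_name = 'Account Parameters';

-- Re-run the MERGE statement, then inspect the history
SELECT * FROM hist_monitor_group ORDER BY monitor_group_id, valid_from;
```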
Completing the Reference Schema Data Flow
Due to space constraints, I have only provided a single-table walk-through. Two more tables conforming to the same pattern, including foreign key references, are required; they are built in the corresponding script, Chapter_6_Account_Usage_Demo.sql.
Note
If you add attributes to the sample code, ensure the MERGE statement is extended to capture the new attributes.
You should begin to see that we build our code according to patterns. Adopting this approach leads us to consider writing code to generate the patterns, ensuring repeatability and a consistent, rule-based approach to development. We explore this concept later in this book, as the implications for building the Snowflake Data Cloud are profound.
When complete, we can build the secure view on top to abstract our data model into a single usable object.
Creating a Secure View
Assuming Chapter_6_Account_Usage_Demo.sql has been run and all objects created, let’s now build a secure view to denormalize our reference data, delivering a single object for lookups. We use a secure view to prevent the underlying SQL from being visible (a security feature). Views also provide abstraction, meaning less code to write when referencing the underlying model.
USE ROLE IDENTIFIER ( $monitor_reference_role );
USE DATABASE IDENTIFIER ( $monitor_database );
USE SCHEMA IDENTIFIER ( $monitor_reference_schema );
USE WAREHOUSE IDENTIFIER ( $monitor_warehouse );
CREATE OR REPLACE SECURE VIEW v_monitor_data COPY GRANTS
AS
SELECT mg.monitor_group_id,
mg.monitor_group_name,
mc.monitor_control_id,
mc.monitor_control_name,
mp.monitor_parameter_id,
mp.monitor_parameter_name,
mp.monitor_parameter_value
FROM monitor_group mg,
monitor_control mc,
monitor_parameter mp
WHERE mg.monitor_group_id = mc.monitor_group_id
AND mc.monitor_control_id = mp.monitor_control_id;
We should check v_monitor_data returns expected values.
SELECT * FROM v_monitor_data;
And finally, grant entitlement to monitor_owner_role.
GRANT SELECT, REFERENCES ON v_monitor_data TO ROLE IDENTIFIER ( $monitor_owner_role );
Storing Monitored Output
This section describes how to store result sets, maintain a full audit history of monitoring activity, and allow extension when our monitoring requirements change. Figure 6-6 illustrates the components required to identify changes to our baseline security configuration held in reference data, compared to the actual values set on our account, with any differences reported for consumption by monitoring teams.
We begin (as usual) by setting our context.
USE ROLE IDENTIFIER ( $monitor_owner_role );
USE DATABASE IDENTIFIER ( $monitor_database );
USE SCHEMA IDENTIFIER ( $monitor_owner_schema );
USE WAREHOUSE IDENTIFIER ( $monitor_warehouse );
We must determine what is to be logged: the reference data, actual values, and context, plus anything useful in assisting future investigation, such as the record creation timestamp. From these requirements, our surrogate key sequence and primary logging table look like the following.
CREATE OR REPLACE SEQUENCE monitor_log_id_seq START WITH 10000;
CREATE OR REPLACE TABLE monitor_log
(
monitor_log_id NUMBER PRIMARY KEY NOT NULL,
event_description STRING NOT NULL,
event_result STRING NOT NULL,
monitor_control_name STRING NOT NULL,
monitor_parameter_name STRING NOT NULL,
monitor_parameter_value STRING NOT NULL,
last_updated TIMESTAMP_NTZ DEFAULT current_timestamp()::TIMESTAMP_NTZ NOT NULL
);
There is a second logging table for one complex use case described later.
Asynchronous Monitoring
This pattern runs asynchronously and should be called periodically from a task. The logged output should then be queried via a second task for event propagation. We adopt this approach because it is impossible to embed the output of the SHOW, LIST, and DESCRIBE commands in a view or other abstraction mechanism, such as a UDF.
Publication could be via external procedure wrapping email functionality or writing to an external stage. Monitoring screens can also be developed against the monitor_log table for use by your operational support team.
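A sketch of such a task follows, assuming the sp_check_parameters procedure developed later in this chapter; the task name and schedule are illustrative, and you may need to fully qualify the procedure name.

```sql
USE ROLE accountadmin;

CREATE OR REPLACE TASK task_check_parameters
WAREHOUSE = monitor_wh
SCHEDULE  = '60 MINUTE'
AS
CALL sp_check_parameters();

-- Tasks are created suspended; resume to start the schedule
ALTER TASK task_check_parameters RESUME;
```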
Account Parameters
The following pattern is for simple account parameter monitoring: a TRUE/FALSE flag setting for a named parameter.
You may recall in Chapter 4 that we developed controls to prevent data from being unloaded to user-specified Internet locations and restrict data unloads to specified, system-mapped Internet locations.
USE ROLE accountadmin;
ALTER ACCOUNT SET prevent_unload_to_inline_url = TRUE;
ALTER ACCOUNT SET require_storage_integration_for_stage_creation = TRUE;
ALTER ACCOUNT SET require_storage_integration_for_stage_operation = TRUE;
These are two separate controls with a common implementation pattern: we are setting account-level parameters.
Note our sample script Chapter_6_Account_Usage_Demo.sql initially creates only three monitor parameters in support of account parameter monitoring, providing our first set of controls to monitor.
USE ROLE IDENTIFIER ( $monitor_owner_role );
USE DATABASE IDENTIFIER ( $monitor_database );
USE SCHEMA IDENTIFIER ( $monitor_owner_schema );
USE WAREHOUSE IDENTIFIER ( $monitor_warehouse );
SELECT *
FROM MONITOR.reference_owner.v_monitor_data
WHERE monitor_parameter_value = 'TRUE';
Note the fully qualified object reference. We should see three rows returned, as per Figure 6-7.
But how do we compare the reference data to the actual values?
First, we must run the SHOW command to access the Snowflake global services layer, based on FoundationDB, the key-value store that underpins all Snowflake metadata operations. We return to this theme later in this chapter.
SHOW PARAMETERS LIKE 'prevent_unload_to_inline_url' IN ACCOUNT;
However, there is a problem: SHOW is not a true SQL command, since we cannot use it within a SQL statement. For example, we cannot create a view based on it; CREATE OR REPLACE VIEW v_show_parameters AS SHOW PARAMETERS does not work. We can, however, use RESULT_SCAN to convert SHOW output to a result set, but we must understand the interaction: when RESULT_SCAN is used with last_query_id(), we cannot run other commands in between, as the query ID changes.
SELECT "value" FROM TABLE ( RESULT_SCAN ( last_query_id()));
Instead, we must use a stored procedure to programmatically extract the value for a named parameter, as the two statements run consecutively within it. We extend this code pattern later for network parameter monitoring.
You may be wondering whether the stored procedure can be converted into a function and embedded in a SQL statement. In short, at the time of writing, using a function is not possible (but it would have been so much simpler).
Having explained and implemented all dependent objects on which account parameter monitoring relies, Figure 6-8 illustrates how the components relate, with sp_check_parameters being the next deliverable for development.
Figure 6-9 provides a flowchart explaining the logic contained in sp_check_parameters.
Let’s implement the sp_check_parameters stored procedure using JavaScript.
CREATE OR REPLACE PROCEDURE sp_check_parameters() RETURNS STRING
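The procedure body is elided here; please see Chapter_6_Account_Usage_Demo.sql for the definitive implementation. The following is a minimal sketch of the logic, simplified in that it assumes every v_monitor_data row holds an account parameter name and its expected value (the script's version also filters by control group):

```sql
CREATE OR REPLACE PROCEDURE sp_check_parameters() RETURNS STRING
LANGUAGE JAVASCRIPT
EXECUTE AS CALLER
AS
$$
   // Iterate over the expected parameters held in reference data
   var ref_rs = snowflake.createStatement({ sqlText:
      "SELECT monitor_control_name, monitor_parameter_name, monitor_parameter_value " +
      "FROM   MONITOR.reference_owner.v_monitor_data" }).execute();

   while ( ref_rs.next() )
   {
      var control   = ref_rs.getColumnValue(1);
      var parameter = ref_rs.getColumnValue(2);
      var expected  = ref_rs.getColumnValue(3);

      // SHOW and RESULT_SCAN must run consecutively in the same session
      snowflake.createStatement({ sqlText:
         "SHOW PARAMETERS LIKE '" + parameter + "' IN ACCOUNT" }).execute();
      var actual_rs = snowflake.createStatement({ sqlText:
         'SELECT "value" FROM TABLE ( RESULT_SCAN ( last_query_id()))' }).execute();

      // Log any mismatch between the expected and actual values
      if ( actual_rs.next() && actual_rs.getColumnValue(1) != expected )
      {
         snowflake.createStatement({ sqlText:
            "INSERT INTO monitor_log ( monitor_log_id, event_description, event_result, " +
            "monitor_control_name, monitor_parameter_name, monitor_parameter_value ) " +
            "SELECT monitor_log_id_seq.NEXTVAL, 'Parameter mismatch', '" +
            actual_rs.getColumnValue(1) + "', '" + control + "', '" +
            parameter + "', '" + expected + "'" }).execute();
      }
   }
   return 'Success';
$$;
```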
Call the stored procedure to test. This should not insert any records into monitor_log.
CALL sp_check_parameters();
SELECT * FROM monitor_log;
Finally, mis-set a parameter then retest.
USE ROLE accountadmin;
ALTER ACCOUNT SET prevent_unload_to_inline_url = FALSE;
USE ROLE IDENTIFIER ( $monitor_owner_role );
USE DATABASE IDENTIFIER ( $monitor_database );
USE SCHEMA IDENTIFIER ( $monitor_owner_schema );
USE WAREHOUSE IDENTIFIER ( $monitor_warehouse );
CALL sp_check_parameters();
SELECT * FROM monitor_log;
You should see a single record, as shown in Figure 6-10.
Network Parameters
Network policy monitoring is a variation on account parameter monitoring with two extra steps. Just as sp_get_parameter_value uses the SHOW command followed by a consecutive RESULT_SCAN query, interrogating network policies requires the same constructs, followed by a DESCRIBE and another RESULT_SCAN query.
Add reference data, changing the IP range to suit your local environment and noting your Snowflake account network policy should also be set.
USE ROLE IDENTIFIER ( $monitor_reference_role );
USE DATABASE IDENTIFIER ( $monitor_database );
USE SCHEMA IDENTIFIER ( $monitor_reference_schema );
USE WAREHOUSE IDENTIFIER ( $monitor_warehouse );
INSERT INTO monitor_parameter ( monitor_parameter_id, monitor_control_id, monitor_parameter_name, monitor_parameter_value ) SELECT monitor_parameter_id_seq.NEXTVAL, monitor_control_id, 'MY_NETWORK_POLICY', '192.168.0.0' FROM monitor_control WHERE monitor_control_name = 'SNOWFLAKE_NIST_NP1';
The queries required (in order) are as follows.
USE ROLE accountadmin;
CREATE OR REPLACE NETWORK POLICY my_network_policy
ALLOWED_IP_LIST=( '192.168.0.0' );
SHOW NETWORK POLICIES IN ACCOUNT;
SELECT "name" FROM TABLE ( RESULT_SCAN ( last_query_id()));
DESCRIBE NETWORK POLICY my_network_policy;
SELECT "name", "value" FROM TABLE ( RESULT_SCAN ( last_query_id()));
Build the dynamic INSERT INTO monitor_log using the same pattern as sp_check_parameters, then call and test your stored procedure.
System for Cross-Domain Identity Management (SCIM) Events
Monitoring SCIM events is distinctly different from all other monitoring insofar as accessing rest_event_history is only allowed using the ACCOUNTADMIN role. We can achieve the same outcome with some lateral thinking, a task (which can run as ACCOUNTADMIN), and a separate logging table. Please refer to https://docs.snowflake.com/en/sql-reference/functions/rest_event_history.html for more details.
For this example, we use SNOWFLAKE_NIST_RE1 as the control reference.
Create a sequence and table to capture rest_event_history output.
USE ROLE IDENTIFIER ( $monitor_owner_role );
USE DATABASE IDENTIFIER ( $monitor_database );
USE SCHEMA IDENTIFIER ( $monitor_owner_schema );
USE WAREHOUSE IDENTIFIER ( $monitor_warehouse );
-- Create Sequence
CREATE OR REPLACE SEQUENCE hist_rest_event_history_id_seq START WITH 10000;
CREATE OR REPLACE TABLE hist_rest_event_history
(
hist_rest_event_history_id NUMBER,
event_timestamp TIMESTAMP_LTZ,
event_id NUMBER,
event_type TEXT,
endpoint TEXT,
method TEXT,
status TEXT,
error_code TEXT,
details TEXT,
client_ip TEXT,
actor_name TEXT,
actor_domain TEXT,
resource_name TEXT,
resource_domain TEXT
);
Recognizing the need to report events in the past hour, using the ACCOUNTADMIN role, create a view over SNOWFLAKE.information_schema.rest_event_history. Note the use of the TABLE function.
USE ROLE accountadmin;
CREATE OR REPLACE SECURE VIEW v_rest_event_history COPY GRANTS
AS
SELECT reh.*
FROM TABLE ( snowflake.information_schema.rest_event_history (
'scim',
DATEADD ( 'minutes', -60, current_timestamp()),
current_timestamp(),
1000 )
) reh;
An alternative way to derive the time since the last run is to use your local monitor status table, if implemented. The INSERT statement should reference the view from a stored procedure invoked periodically by a TASK.
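A sketch of the periodic load follows, assuming the hist_rest_event_history table and sequence created previously; the task name, warehouse name, and schedule are illustrative, and you may prefer to wrap the INSERT in a stored procedure as suggested.

```sql
USE ROLE accountadmin;

CREATE OR REPLACE TASK task_load_rest_event_history  -- illustrative name
WAREHOUSE = monitor_wh                               -- substitute your warehouse
SCHEDULE  = '60 MINUTE'
AS
INSERT INTO monitor.monitor_owner.hist_rest_event_history
SELECT monitor.monitor_owner.hist_rest_event_history_id_seq.NEXTVAL,
       event_timestamp,
       event_id,
       event_type,
       endpoint,
       method,
       status,
       error_code,
       details,
       client_ip,
       actor_name,
       actor_domain,
       resource_name,
       resource_domain
FROM v_rest_event_history;

-- Tasks are created suspended; resume to begin the schedule
ALTER TASK task_load_rest_event_history RESUME;
```

Because the task is created by ACCOUNTADMIN, it runs with that role's privileges, satisfying the rest_event_history access restriction.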
Synchronous Monitoring
The remainder of our monitoring configuration is much more straightforward and relies upon accessing the Account Usage views. We still rely upon reference data to determine our baseline criteria, as reflected in the following sections.
Time Travel
Database time-travel monitoring does not require SHOW/DESCRIBE processing. Instead, it is a straightforward join between v_monitor_data and the Account Usage store (assuming the corresponding reference data has been created first and the monitoring databases have 90-day time travel enabled).
USE ROLE IDENTIFIER ( $monitor_reference_role );
USE DATABASE IDENTIFIER ( $monitor_database );
USE SCHEMA IDENTIFIER ( $monitor_reference_schema );
USE WAREHOUSE IDENTIFIER ( $monitor_warehouse );
INSERT INTO monitor_parameter ( monitor_parameter_id, monitor_control_id, monitor_parameter_name, monitor_parameter_value )
SELECT monitor_parameter_id_seq.NEXTVAL,
       monitor_control_id,
       'MONITOR',
       '90'
FROM   monitor_control
WHERE  monitor_control_name = 'SNOWFLAKE_NIST_TT1';

INSERT INTO monitor_parameter ( monitor_parameter_id, monitor_control_id, monitor_parameter_name, monitor_parameter_value )
SELECT monitor_parameter_id_seq.NEXTVAL,
       monitor_control_id,
       'TEST',
       '90'
FROM   monitor_control
WHERE  monitor_control_name = 'SNOWFLAKE_NIST_TT1';
Use the following query to test, noting the Account Usage views can take up to 3 hours to refresh.
USE ROLE IDENTIFIER ( $monitor_owner_role );
USE DATABASE IDENTIFIER ( $monitor_database );
USE SCHEMA IDENTIFIER ( $monitor_owner_schema );
USE WAREHOUSE IDENTIFIER ( $monitor_warehouse );
INSERT INTO monitor_log
SELECT monitor_log_id_seq.NEXTVAL,
d.database_name,
d.retention_time,
v.monitor_control_name,
v.monitor_parameter_name,
v.monitor_parameter_value,
current_timestamp()::TIMESTAMP_NTZ
FROM MONITOR.reference_owner.v_monitor_data v,
snowflake.account_usage.databases d
WHERE v.monitor_parameter_name = d.database_name
AND v.monitor_parameter_value != d.retention_time
AND v.monitor_group_name = 'TIME TRAVEL'
AND d.deleted IS NULL;
Testing can also be accomplished by changing != to = in this query.
Events
Many monitoring requirements can be achieved by directly querying the Account Usage store or information_schema and joining to the reference data. Examples of both approaches are offered next. Remember the key differences: Account Usage views have a latency of between 45 minutes and 3 hours and typically retain information for up to one year. In contrast, the information schema has no latency, but data is only retained for 14 days.
Reference Chapter 4 and our first control, SNOWFLAKE_NIST_AC1.
USE ROLE IDENTIFIER ( $monitor_owner_role );
USE DATABASE IDENTIFIER ( $monitor_database );
USE SCHEMA IDENTIFIER ( $monitor_owner_schema );
USE WAREHOUSE IDENTIFIER ( $monitor_warehouse );
Create secure view v_snowflake_nist_ac1.
CREATE OR REPLACE SECURE VIEW v_snowflake_nist_ac1 COPY GRANTS
COMMENT = 'Reference: Chapter 4 Account Security'
AS
SELECT 'SNOWFLAKE_NIST_AC1' AS control_name,
'MONITOR.monitor_owner.v_snowflake_nist_ac1' AS control_object,
start_time,
role_name,
database_name,
schema_name,
user_name,
query_text,
query_id
FROM snowflake.account_usage.query_history
WHERE role_name = 'ACCOUNTADMIN';
Note the absence of timeband filters. We should not constrain these lowest-level Account Usage views, though we may later choose to apply timeband filters. This is a building block for later use.
Check view v_snowflake_nist_ac1 has content:
SELECT * FROM v_snowflake_nist_ac1;
The equivalent real-time information schema view looks like this.
CREATE OR REPLACE SECURE VIEW v_rt_snowflake_nist_ac1 COPY GRANTS
COMMENT = 'Reference: Chapter 4 Account Security and our first control'
AS
SELECT 'SNOWFLAKE_NIST_AC1' AS control_name,
'MONITOR.monitor_owner.v_rt_snowflake_nist_ac1' AS control_object,
start_time,
role_name,
database_name,
schema_name,
user_name,
query_text,
query_id
FROM TABLE ( information_schema.query_history())
WHERE role_name = 'ACCOUNTADMIN';
Naturally, we would develop one wrapper view for each control and consolidate them into a single master view joined to our reference data, against which we may wish to apply filters.
CREATE OR REPLACE SECURE VIEW v_rt_snowflake_controls COPY GRANTS
AS
SELECT ac1.control_name,
ac1.control_object,
ac1.start_time,
ac1.role_name,
ac1.database_name,
ac1.schema_name,
ac1.user_name,
ac1.query_text,
ac1.query_id,
rd.monitor_group_name,
rd.monitor_control_name,
rd.monitor_parameter_name,
rd.monitor_parameter_value
FROM MONITOR.monitor_owner.v_rt_snowflake_nist_ac1 ac1,
MONITOR.reference_owner.v_monitor_data rd
WHERE ac1.control_name = rd.monitor_control_name
AND rd.monitor_control_name = 'SNOWFLAKE_NIST_AC1'
AND rd.monitor_parameter_name = 'ACCOUNTADMIN'
AND ac1.start_time >= DATEADD ( 'minutes', rd.monitor_parameter_value, current_timestamp()::TIMESTAMP_NTZ );
Only one source view is shown; all others should be added via UNION ALL. Also note the filter on start_time, restricting results to the past 60 minutes as set by reference data.
Check view v_rt_snowflake_controls has content:
SELECT * FROM v_rt_snowflake_controls;
External Monitoring Tools
Some third-party tools require direct access to Snowflake Account Usage views to function. We should not allow direct access to any Snowflake objects but instead create wrapper views and custom roles limiting data access to the minimal objects required to achieve each objective. While this approach may be uncomfortable for vendors, we must not compromise our account security and must insist on vendors conforming to our security policies. Some vendors make provision for customization through a configuration file, as with Datadog.
An example view for specific monitoring purposes would include all source view attributes and some custom attributes, which may look like the following.
CREATE OR REPLACE SECURE VIEW login_history COPY GRANTS
AS
SELECT lh.*,
current_account()||'.'||user_name AS object_name,
'Login History Wrapper View' AS source_name,
'snowflake.account_usage.login_history' AS source_path
FROM snowflake.account_usage.login_history lh;
Check view login_history has content:
SELECT * FROM login_history;
Note the removal of the v_ prefix because we want to present our view to the consuming user in (almost) native format to make integration easy.
We must grant entitlement on our views to a role.
GRANT SELECT ON login_history TO ROLE IDENTIFIER ( $monitor_reader_role );
Finally, grant the role to a user, which I leave you to complete.
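For completeness, the grant might look like the following, where the user name is illustrative:

```sql
USE ROLE securityadmin;

GRANT ROLE IDENTIFIER ( $monitor_reader_role ) TO USER my_monitoring_user;  -- illustrative user
```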
Cleanup
Finally, you will want to remove all objects created in this chapter.
USE ROLE sysadmin;
DROP DATABASE IDENTIFIER ( $monitor_database );
DROP WAREHOUSE IDENTIFIER ( $monitor_warehouse );
USE ROLE securityadmin;
DROP ROLE IDENTIFIER ( $monitor_reference_role );
DROP ROLE IDENTIFIER ( $monitor_reader_role );
DROP ROLE IDENTIFIER ( $monitor_owner_role );
History Retention
We may need to hold the full history of every action in our account from inception to the current date. While Snowflake retains a one-year immutable history, we may need to periodically copy data to local tables before information ages out.
The following list of snowflake.account_usage views should be reviewed for completeness in scope and attribute changes before invoking the data copy. I suggest data be copied in March and September to avoid year-end change-freeze blackouts.
task_history
access_history
automatic_clustering_history
copy_history
database_storage_usage_history
data_transfer_history
load_history
login_history
materialized_view_refresh_history
metering_daily_history
metering_history
pipe_usage_history
query_history
replication_usage_history
search_optimization_history
serverless_task_history
sessions
stage_storage_usage_history
storage_usage
warehouse_events_history
warehouse_load_history
warehouse_metering_history
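Using login_history as an example, an incremental copy might follow this pattern; the local table name and high-water-mark predicate are illustrative.

```sql
-- One-time: create an empty local copy matching the view's structure
CREATE TABLE IF NOT EXISTS hist_login_history AS
SELECT * FROM snowflake.account_usage.login_history WHERE 1 = 0;

-- Periodic: append only rows newer than the last copied event
INSERT INTO hist_login_history
SELECT lh.*
FROM snowflake.account_usage.login_history lh
WHERE lh.event_timestamp > ( SELECT COALESCE ( MAX ( event_timestamp ),
                                               '1900-01-01'::TIMESTAMP_LTZ )
                             FROM hist_login_history );
```

Each view has a different timestamp attribute, so review the column list before extending this pattern across the views above.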
Summary
This chapter introduced the Account Usage store and built a new database and associated schemas for later use. You then enabled access to the Account Usage store.
Next, you developed a reference schema with metadata storing our account level settings and denormalizing it for later use, demonstrating a sample but extensible data model.
A full monitoring solution is not possible in the limited space available; even so, I have significantly exceeded the page budget to make things clear. It is well worth the effort. I hope you agree and have found plenty to pique your interest. I have included additional code in the accompanying file to assist your understanding.
Let’s turn our attention to ingesting data, where you learn more techniques on your journey to building the Snowflake Data Cloud.