Dynamic Pivot

As another exercise, assume that you’re given the task of writing a stored procedure that produces a dynamic pivot in the database you are connected to. The stored procedure accepts the following parameters (all Unicode character strings): @query, @on_rows, @on_cols, @agg_func and @agg_col. Based on the inputs, you’re supposed to construct a PIVOT query string and execute it dynamically. Here’s the description of the input parameters:

  • @query. Query or table/view name given to the PIVOT operator as input

  • @on_rows. Column/expression list that will be used as the grouping columns

  • @on_cols. Column or expression to be pivoted; the distinct values from this column will become the target column names

  • @agg_func. Aggregate function (MIN, MAX, SUM, COUNT, and so on)

  • @agg_col. Column/expression given to the aggregate function as input

If you’re still confused regarding the requirements and the meaning of each input, skip the solution in Example 7-8. Instead, examine the invocation examples and the outputs that follow the listing and the explanation of the solution. Then try to provide your own solution before looking at this one.

Important

Important

Note that the solution in Example 7-8 follows bad programming practices and is insecure. I’ll use this solution to discuss flaws in its implementation and then suggest a more robust and secure alternative.

Example 7-8 shows a suggested solution for the task.

Example 7-8. Creation script for the sp_pivot stored procedure

USE master;
GO

IF OBJECT_ID('dbo.sp_pivot') IS NOT NULL
  DROP PROC dbo.sp_pivot;
GO

CREATE PROC dbo.sp_pivot
  @query    AS NVARCHAR(MAX),
  @on_rows  AS NVARCHAR(MAX),
  @on_cols  AS NVARCHAR(MAX),
  @agg_func AS NVARCHAR(MAX) = N'MAX',
  @agg_col  AS NVARCHAR(MAX)
AS

DECLARE
  @sql     AS NVARCHAR(MAX),
  @cols    AS NVARCHAR(MAX),
  @newline AS NVARCHAR(2);

SET @newline = NCHAR(13) + NCHAR(10);

-- If input is a valid table or view
-- construct a SELECT statement against it
IF COALESCE(OBJECT_ID(@query, N'U'),
            OBJECT_ID(@query, N'V')) IS NOT NULL
  SET @query = N'SELECT * FROM ' + @query;

-- Make the query a derived table
SET @query = N'( ' + @query + @newline + N'      ) AS Query';

-- Handle * input in @agg_col
IF @agg_col = N'*'
  SET @agg_col = N'1';

-- Construct column list
SET @sql =
  N'SET @result = '                                    + @newline +
  N'  STUFF('                                          + @newline +
  N'    (SELECT N'','' + '
           + N'QUOTENAME(pivot_col) AS [text()]'       + @newline +
  N'     FROM (SELECT DISTINCT('
           + @on_cols + N') AS pivot_col'              + @newline +
  N'           FROM' + @query + N') AS DistinctCols'   + @newline +
  N'     ORDER BY pivot_col'                           + @newline +
  N'     FOR XML PATH('''')),'                         + @newline +
  N'    1, 1, N''''),'

EXEC sp_executesql
  @stmt   = @sql,
  @params = N'@result AS NVARCHAR(MAX) OUTPUT',
  @result = @cols OUTPUT;

-- Create the PIVOT query
SET @sql =
  N'SELECT *'                                          + @newline +
  N'FROM'                                              + @newline +
  N'  ( SELECT '                                       + @newline +
  N'      ' + @on_rows + N','                          + @newline +
  N'      ' + @on_cols + N' AS pivot_col,'             + @newline +
  N'      ' + @agg_col + N' AS agg_col'                + @newline +
  N'    FROM '                                         + @newline +
  N'      ' + @query                                   + @newline +
  N'  ) AS PivotInput'                                 + @newline +
  N'  PIVOT'                                           + @newline +
  N'    ( ' + @agg_func + N'(agg_col)'                 + @newline +
  N'      FOR pivot_col'                               + @newline +
  N'        IN(' + @cols + N')'                        + @newline +
  N'    ) AS PivotOutput;'

EXEC sp_executesql @sql;
GO

I’m using this exercise both to explain how to achieve dynamic pivoting and to discuss bad programming practices and security flaws. I’ll start by discussing the logic behind the code, and then I’ll describe the bad programming practices and flaws and present a more robust and secure solution.

The stored procedure is created as a special procedure in master to allow running it in any database. Remember that dynamic execution is invoked in the context of the current database. This means that the stored procedure’s code will effectively run in the context of the current database, interacting with local user objects.

The code checks whether the input parameter @query contains a valid table or view. If it does, the code constructs a SELECT statement against the object, storing the statement back in @query. If @query doesn’t contain an existing table/view name, the code assumes that it already contains a query.

The code then makes the query a derived table by adding surrounding parentheses and a derived table alias (AS Query). The result string is stored back in @query. This derived table will be used both to determine the distinct values that need to be pivoted (from the column/expression stored in the @on_cols input parameter) and as the input table expression for the PIVOT operator.

Because the PIVOT operator doesn’t support * as an input for the aggregate function–for example, COUNT(*)–the code substitutes a * input in @agg_col with the constant 1.

The code continues by constructing a dynamic query string within the @sql variable. This string has code that constructs the column list that will later be served to PIVOT’s IN clause. The column list is constructed by a FOR XML PATH query. The query concatenates the distinct list of values from the column/expression stored in the @on_cols input parameter. The concatenation query string (stored in @sql) is invoked dynamically. The dynamic code returns through an output parameter a string with the column list, and it assigns it to the variable @cols.

The next section of code constructs the actual PIVOT query string in the @sql variable. It constructs an outer query against the derived table (aliased as Query), which is currently stored in @query. The outer query creates another derived table called PivotInput. The SELECT list in the outer query includes the following items:

  • The grouping column/expression list stored in @on_rows, which is the part that the PIVOT operator will use in its implicit grouping activity

  • The columns/expression to be pivoted (currently stored in @on_cols), aliased as pivot_col

  • The column that will be used as the aggregate function’s input (currently stored in @agg_col), aliased as agg_col

The PIVOT operator works on the derived table PivotInput. Within PIVOT’s parentheses, the code embeds the following items: the aggregate function (@agg_func) with the aggregate column as its input (agg_col), and the column list (@cols) within the parentheses of the IN clause. The outermost query simply uses a SELECT * to grab all columns returned from the PIVOT operation.

Finally, the PIVOT query constructed in the @sql variable is invoked dynamically.

More Info

More Info

For in-depth discussion of the PIVOT operator, refer to Inside T-SQL Querying.

The sp_pivot stored procedure is extremely flexible, though this flexibility comes at a high security cost, which I’ll describe later. To demonstrate its flexibility, I’ll provide three examples of invoking it with different inputs. Make sure you study and understand all the inputs carefully.

The following code produces the count of orders per employee and order year, pivoted by order month, and it generates the output shown in Table 7-12:

EXEC Northwind.dbo.sp_pivot
  @query    = N'dbo.Orders',
  @on_rows  = N'EmployeeID AS empid, YEAR(OrderDate) AS order_year',
  @on_cols  = N'MONTH(OrderDate)',
  @agg_func = N'COUNT',
  @agg_col  = N'*';

Table 7-12. Count of Orders per Employee and Order Year Pivoted by Order Month

empid

order_year

1

2

3

4

5

6

7

8

9

10

11

12

1

1996

0

0

0

0

0

0

1

5

5

2

4

9

2

1996

0

0

0

0

0

0

1

2

5

2

2

4

3

1996

0

0

0

0

0

0

4

2

1

3

4

4

4

1996

0

0

0

0

0

0

7

5

3

8

5

3

5

1996

0

0

0

0

0

0

3

0

1

2

2

3

6

1996

0

0

0

0

0

0

2

4

3

0

3

3

7

1996

0

0

0

0

0

0

0

1

2

5

3

0

8

1996

0

0

0

0

0

0

2

6

3

2

2

4

9

1996

0

0

0

0

0

0

2

0

0

2

0

1

1

1997

3

2

5

1

5

4

7

3

8

7

3

7

2

1997

4

1

4

3

3

4

3

1

7

1

5

5

3

1997

7

9

3

5

5

6

2

4

4

7

8

11

4

1997

8

6

4

8

5

5

6

11

5

7

6

10

5

1997

0

0

3

0

2

2

1

3

2

3

1

1

6

1997

2

2

2

4

2

2

2

2

1

4

5

5

7

1997

3

1

2

6

5

1

5

3

5

1

1

3

8

1997

5

8

6

2

4

3

6

5

3

7

2

3

9

1997

1

0

1

2

1

3

1

1

2

1

3

3

1

1998

9

9

11

8

5

0

0

0

0

0

0

0

2

1998

7

3

9

18

2

0

0

0

0

0

0

0

3

1998

10

6

12

10

0

0

0

0

0

0

0

0

4

1998

6

14

12

10

2

0

0

0

0

0

0

0

5

1998

4

6

2

1

0

0

0

0

0

0

0

0

6

1998

3

4

7

5

0

0

0

0

0

0

0

0

7

1998

4

6

4

9

2

0

0

0

0

0

0

0

8

1998

7

2

10

9

3

0

0

0

0

0

0

0

9

1998

5

4

6

4

0

0

0

0

0

0

0

0

The following code produces the sum of the value (quantity * unit price) per employee, pivoted by order year, and it generates the output shown in Table 7-13:

EXEC Northwind.dbo.sp_pivot
  @query    = N'
SELECT O.OrderID, EmployeeID, OrderDate, Quantity, UnitPrice
FROM dbo.Orders AS O
  JOIN dbo.[Order Details] AS OD
    ON OD.OrderID = O.OrderID',
  @on_rows  = N'EmployeeID AS empid',
  @on_cols  = N'YEAR(OrderDate)',
  @agg_func = N'SUM',
  @agg_col  = N'Quantity*UnitPrice';

Table 7-13. Sum of Value per Employee Pivoted by Order Year

empid

1996

1997

1998

3

19231.80

111788.61

82030.89

6

17731.10

45992.00

14475.00

9

11365.70

29577.55

42020.75

7

18104.80

66689.14

56502.05

1

38789.00

97533.58

65821.13

4

53114.80

139477.70

57594.95

2

22834.70

74958.60

79955.96

5

21965.20

32595.05

21007.50

8

23161.40

59776.52

50363.11

The following code produces the sum of the quantity per store, pivoted by order year and month, and it generates the output shown in Table 7-14:

EXEC pubs.dbo.sp_pivot
  @query    = N'
SELECT stor_id, YEAR(ord_date) AS oy, MONTH(ord_date) AS om, qty
FROM dbo.sales',
  @on_rows  = N'stor_id',
  @on_cols  = N'
CAST(oy AS VARCHAR(4)) + ''_''
  + RIGHT(''0'' + CAST(om AS VARCHAR(2)), 2)',
  @agg_func = N'SUM',
  @agg_col  = N'qty';

Table 7-14. Sum of Quantity per Store Pivoted by Order Year and Month

stor_id

1992_06

1993_02

1993_03

1993_05

1993_10

1993_12

1994_09

6380

NULL

NULL

NULL

NULL

NULL

NULL

8

7066

NULL

NULL

NULL

50

NULL

NULL

75

7067

80

NULL

NULL

NULL

NULL

NULL

10

7131

NULL

NULL

NULL

85

NULL

NULL

45

7896

NULL

35

NULL

NULL

15

10

NULL

8042

NULL

NULL

25

30

NULL

NULL

25

The implementation of the stored procedure sp_pivot suffers from bad programming practices and security flaws. As I mentioned earlier in the chapter, Microsoft strongly advises against using the sp_ prefix for user-defined procedure names. On one hand, creating this procedure as a special procedure allows flexibility; on the other hand, by doing so you’re relying on behavior that is not supported. It is advisable to forgo the flexibility obtained by creating the procedure with the sp_ prefix and create it with another prefix as a user-defined stored procedure in the user databases where you need it.

The code defines all input parameters with a virtually unlimited size (using the MAX specifier) and doesn’t have any input validation. Because the stored procedure invokes dynamic execution based on user input strings, it’s very important to limit the sizes of the inputs and to check those for potential SQL injection attacks. With the existing implementation it’s very easy for hackers to inject code that will do havoc and mayhem in your system. You can find discussions about SQL injection in Chapter 4 and in Books Online (URL: http://msdn2.microsoft.com/en-us/library/ms161953(SQL.90).aspx). As an example for injecting malicious code through user inputs, consider the following invocation of the stored procedure:

EXEC Northwind.dbo.sp_pivot
  @query    = N'dbo.Orders',
  @on_rows  = N'1 AS dummy_col ) DummyTable;
PRINT ''So easy to inject code here!
This could have been a DROP TABLE or xp_cmdshell command!'';
SELECT * FROM (select EmployeeID AS empid',
  @on_cols  = N'MONTH(OrderDate)',
  @agg_func = N'COUNT',
  @agg_col  = N'*';

The query string generated by the stored procedure looks like this:

SELECT *
FROM
  ( SELECT
      1 AS dummy_col ) DummyTable;
PRINT 'So easy to inject code here!
This could have been a DROP TABLE or xp_cmdshell command!';
SELECT * FROM (select EmployeeID AS empid,
      MONTH(OrderDate) AS pivot_col,
      1 AS agg_col
    FROM
      ( SELECT * FROM dbo.Orders
      ) AS Query
  ) AS PivotInput
  PIVOT
    ( COUNT(agg_col)
      FOR pivot_col
        IN([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])
    ) AS PivotOutput;

When this code is executed, the injected PRINT statement executes without any problem. I used a harmless PRINT statement just to demonstrate that code can be easily injected here, but obviously the malicious code could be any valid T-SQL code; for example, a DROP TABLE statement, invocation of xp_cmdshell, and so on. In short, it is vital here to take protective measures against SQL injection attempts, as I will demonstrate shortly.

Besides SQL injection attempts, input validation is not performed at all; for example, to verify the validity of input object and column names. The stored procedure also doesn’t incorporate exception handling. I discuss exception handling in Chapter 10, so I won’t demonstrate it here in the revised solution. I will demonstrate input validation, though.

Before presenting the revised solution, first get rid of the existing sp_pivot implementation:

USE master;
GO
IF OBJECT_ID('dbo.sp_pivot') IS NOT NULL
  DROP PROC dbo.sp_pivot;

Example 7-9 shows a suggested revised solution for the task.

Example 7-9. Creation script for the usp_pivot stored procedure

USE Northwind;
GO

IF OBJECT_ID('dbo.usp_pivot') IS NOT NULL
  DROP PROC dbo.usp_pivot;
GO

CREATE PROC dbo.usp_pivot
  @schema_name AS sysname      = N'dbo', -- schema of table/view
  @object_name AS sysname      = NULL,   -- name of table/view
  @on_rows     AS sysname      = NULL,   -- group by column
  @on_cols     AS sysname      = NULL,   -- rotation column
  @agg_func    AS NVARCHAR(12) = N'MAX', -- aggregate function
  @agg_col     AS sysname      = NULL    -- aggregate column
AS

DECLARE
  @object  AS NVARCHAR(600),
  @sql     AS NVARCHAR(MAX),
  @cols    AS NVARCHAR(MAX),
  @newline AS NVARCHAR(2),
  @msg     AS NVARCHAR(500);

SET @newline = NCHAR(13) + NCHAR(10);
SET @object  = QUOTENAME(@schema_name) + N'.' + QUOTENAME(@object_name);

-- Check for missing input
IF   @schema_name IS NULL
  OR @object_name IS NULL
  OR @on_rows     IS NULL
  OR @on_cols     IS NULL
  OR @agg_func    IS NULL
  OR @agg_col     IS NULL
BEGIN
  SET @msg = N'Missing input parameters: '
    + CASE WHEN @schema_name IS NULL THEN N'@schema_name;' ELSE N'' END
    + CASE WHEN @object_name IS NULL THEN N'@object_name;' ELSE N'' END
    + CASE WHEN @on_rows     IS NULL THEN N'@on_rows;'     ELSE N'' END
    + CASE WHEN @on_cols     IS NULL THEN N'@on_cols;'     ELSE N'' END
    + CASE WHEN @agg_func    IS NULL THEN N'@agg_func;'    ELSE N'' END
    + CASE WHEN @agg_col     IS NULL THEN N'@agg_col;'     ELSE N'' END
  RAISERROR(@msg, 16, 1);
  RETURN;
END

-- Allow only existing table or view name as input object
IF COALESCE(OBJECT_ID(@object, N'U'),
            OBJECT_ID(@object, N'V')) IS NULL
BEGIN
  SET @msg = N'%s is not an existing table or view in the database.';
  RAISERROR(@msg, 16, 1, @object);
  RETURN;
END

-- Verify that column names specified in @on_rows, @on_cols, @agg_col exist
IF   COLUMNPROPERTY(OBJECT_ID(@object), @on_rows, 'ColumnId') IS NULL
  OR COLUMNPROPERTY(OBJECT_ID(@object), @on_cols, 'ColumnId') IS NULL
  OR COLUMNPROPERTY(OBJECT_ID(@object), @agg_col, 'ColumnId') IS NULL
BEGIN
  SET @msg = N'%s, %s and %s must'
    + N' be existing column names in %s.';
  RAISERROR(@msg, 16, 1, @on_rows, @on_cols, @agg_col, @object);
  RETURN;
END

-- Verify that @agg_func is in a known list of functions
-- Add to list as needed and adjust @agg_func size accordingly
IF @agg_func NOT IN
  (N'AVG', N'COUNT', N'COUNT_BIG', N'SUM', N'MIN', N'MAX',
   N'STDEV', N'STDEVP', N'VAR', N'VARP')
BEGIN
  SET @msg = N'%s is an unsupported aggregate function.';
  RAISERROR(@msg, 16, 1, @agg_func);
  RETURN;
END

-- Construct column list
SET @sql =
  N'SET @result = '                                    + @newline +
  N'  STUFF('                                          + @newline +
  N'    (SELECT N'','' + '
           + N'QUOTENAME(pivot_col) AS [text()]'       + @newline +
  N'     FROM (SELECT DISTINCT('
           + QUOTENAME(@on_cols) + N') AS pivot_col'   + @newline +
  N'           FROM ' + @object + N') AS DistinctCols' + @newline +
  N'     ORDER BY pivot_col'                           + @newline +
  N'     FOR XML PATH('''')),'                         + @newline +
  N'    1, 1, N''''),'

EXEC sp_executesql
  @stmt   = @sql,
  @params = N'@result AS NVARCHAR(MAX) OUTPUT',
  @result = @cols OUTPUT;

-- Check @cols for possible SQL injection attempt
IF   UPPER(@cols) LIKE UPPER(N'%0x%')
  OR UPPER(@cols) LIKE UPPER(N'%;%')
  OR UPPER(@cols) LIKE UPPER(N'%''%')
  OR UPPER(@cols) LIKE UPPER(N'%--%')
  OR UPPER(@cols) LIKE UPPER(N'%/*%*/%')
  OR UPPER(@cols) LIKE UPPER(N'%EXEC%')
  OR UPPER(@cols) LIKE UPPER(N'%xp_%')
  OR UPPER(@cols) LIKE UPPER(N'%sp_%')
  OR UPPER(@cols) LIKE UPPER(N'%SELECT%')
  OR UPPER(@cols) LIKE UPPER(N'%INSERT%')
  OR UPPER(@cols) LIKE UPPER(N'%UPDATE%')
  OR UPPER(@cols) LIKE UPPER(N'%DELETE%')
  OR UPPER(@cols) LIKE UPPER(N'%TRUNCATE%')
  OR UPPER(@cols) LIKE UPPER(N'%CREATE%')
  OR UPPER(@cols) LIKE UPPER(N'%ALTER%')
  OR UPPER(@cols) LIKE UPPER(N'%DROP%')
  -- look for other possible strings used in SQL injection here
BEGIN
  SET @msg = N'Possible SQL injection attempt.';
  RAISERROR(@msg, 16, 1);
  RETURN;
END

-- Create the PIVOT query
SET @sql =
  N'SELECT *'                                          + @newline +
  N'FROM'                                              + @newline +
  N'  ( SELECT '                                       + @newline +
  N'      ' + QUOTENAME(@on_rows) + N','               + @newline +
  N'      ' + QUOTENAME(@on_cols) + N' AS pivot_col,'  + @newline +
  N'      ' + QUOTENAME(@agg_col) + N' AS agg_col'     + @newline +
  N'    FROM ' + @object                               + @newline +
  N'  ) AS PivotInput'                                 + @newline +
  N'  PIVOT'                                           + @newline +
  N'    ( ' + @agg_func + N'(agg_col)'                 + @newline +
  N'      FOR pivot_col'                               + @newline +
  N'        IN(' + @cols + N')'                        + @newline +
  N'    ) AS PivotOutput;';

EXEC sp_executesql @sql;
GO

This implementation of the stored procedure follows good programming practices and addresses the security flaws mentioned earlier. Keep in mind, however, that when constructing code based on user inputs and stored data/metadata, it is extremely difficult (if at all possible) to achieve complete protection against SQL injection.

The stored procedure usp_pivot is created as a user-defined procedure in the Northwind database with the usp_ prefix. This means that it isn’t as flexible as the previous implementation in the sense that it interacts only with tables and views from Northwind. Note that you can create a view in Northwind that queries objects from other databases, and provide this view as input to the stored procedure.

The usp_pivot stored procedure’s code takes several measures to try and prevent SQL injection attempts:

  • The sizes of the input parameters are limited.

  • Instead of allowing any query as input, the stored procedure accepts only a valid table or view name that exists in the database. Similarly, instead of allowing any T-SQL expression for the arguments @on_rows, @on_cols and @agg_col, the stored procedure accepts only valid column names that exist in the input table/view. Note that you can create a view with any query that you like and serve it as input to the stored procedure.

  • The code uses QUOTENAME where relevant to quote object and column names with square brackets.

  • The stored procedure’s code inspects the @cols variable for possible code strings injected to it through data stored in the rotation column values that are being concatenated.

The code also performs input validation to verify that all parameters were supplied; that the table/view and column names exist; and that the aggregate function appears in the list of functions that you want to support. As I mentioned, I discuss exception handling in Chapter 10.

The usp_pivot stored procedure might seem much less flexible than sp_pivot, but remember that you can always create a view to prepare the data for usp_pivot. For example, consider the following code used earlier to return the sum of value (quantity * unit price) per employee, pivoted by order year:

EXEC Northwind.dbo.sp_pivot
  @query    = N'
SELECT O.OrderID, EmployeeID, OrderDate, Quantity, UnitPrice
FROM dbo.Orders AS O
  JOIN dbo.[Order Details] AS OD
    ON OD.OrderID = O.OrderID',
  @on_rows  = N'EmployeeID AS empid',
  @on_cols  = N'YEAR(OrderDate)',
  @agg_func = N'SUM',
  @agg_col  = N'Quantity*UnitPrice';

You can achieve the same with usp_pivot by first creating a view that prepares the data:

USE Northwind;
GO
IF OBJECT_ID('dbo.ViewForPivot') IS NOT NULL
  DROP VIEW dbo.ViewForPivot;
GO

CREATE VIEW dbo.ViewForPivot
AS

SELECT
  O.OrderID       AS orderid,
  EmployeeID      AS empid,
  YEAR(OrderDate) AS order_year,
  Quantity * UnitPrice AS val
FROM dbo.Orders AS O
  JOIN dbo.[Order Details] AS OD
    ON OD.OrderID = O.OrderID;
GO

Then invoke usp_pivot, as in:

EXEC dbo.usp_pivot
  @object_name = N'ViewForPivot',
  @on_rows  = N'empid',
  @on_cols  = N'order_year',
  @agg_func = N'SUM',
  @agg_col  = N'val';

You will get the output shown earlier in Table 7-13.

If you think about it, that’s a small price to pay compared to compromising the security of your system.

When you’re done, run the following code for cleanup:

USE Northwind;
GO
IF OBJECT_ID('dbo.ViewForPivot') IS NOT NULL
  DROP VIEW dbo.ViewForPivot;
GO
IF OBJECT_ID('dbo.usp_pivot') IS NOT NULL
  DROP PROC dbo.usp_pivot;
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.21.21.47