Indexed Views

Remember that without an index, a view does not have any physical representation of its data–rather, it just has metadata information pointing to the underlying objects. However, SQL Server will physically materialize a view’s data if you create a unique clustered index on the view. SQL Server will keep the indexed view in sync with modifications against the underlying tables. You cannot request to synchronize the view’s contents on demand or on scheduled basis. An indexed view is very much like a table index in terms of data integrity.

Indexed views can give you great performance benefits for queries that retrieve data. Indexed views can substantially reduce the amount of I/O required to return data and the processing time required for expensive calculations. Substantial performance gains can be achieved, for example, for data aggregation queries or expensive joins. However, keep in mind that modifications to an indexed view’s underlying tables will require changes to the indexed (and therefore materialized) view, degrading the performance of your modifications.

There’s a long list of requirements and restrictions for creating an indexed view, often preventing indexed views from being a viable option. The requirement list in SQL Server 2005 did not get much shorter.

The first index that you create on a view must be unique and clustered. After creating a clustered index on a view, you can create additional nonclustered indexes. The view must be created with the SCHEMABINDING option; therefore, you must use the two-part naming convention for object names and explicitly specify column names in the SELECT list. If the view’s query aggregates data, its select list must include the COUNT_BIG(*) aggregate function. COUNT_BIG is the same as COUNT, except that its result type is BIGINT. This count allows SQL Server to keep track of the number of rows that were aggregated in each group, and it is also used to calculate other aggregates. Some SET options in the session must be in a certain state. The list of requirements and restrictions goes on. Please refer to Books Online for the gory details.

As an example, suppose that you want to optimize queries that request aggregated data from the Orders and Order Details tables for employees. One way to do this is to create a materialized view containing the aggregates you expect to request. The following code creates the VEmpOrders indexed view based on a query that joins Orders and Order Details, groups the data by EmployeeID, and calculates the sum of Quantity and the count of rows for each employee:

USE Northwind;
GO
IF OBJECT_ID('dbo.VEmpOrders') IS NOT NULL
  DROP VIEW dbo.VEmpOrders;
GO
CREATE VIEW dbo.VEmpOrders WITH SCHEMABINDING
AS

SELECT O.EmployeeID, SUM(OD.Quantity) AS TotalQty, COUNT_BIG(*) AS Cnt
FROM dbo.Orders AS O
  JOIN dbo.[Order Details] AS OD
    ON OD.OrderID = O.OrderID
GROUP BY O.EmployeeID;
GO
CREATE UNIQUE CLUSTERED INDEX idx_uc_empid ON dbo.VEmpOrders(EmployeeID);
GO

Notice that the view was created with the SCHEMABINDING option, the tables are referenced with two-part names, the COUNT_BIG function is used because it’s a query that calculates aggregates, and the index created on the view is both clustered and unique.

SQL Server doesn’t regenerate the whole index whenever the underlying tables are modified; rather, it maintains the index in a smarter manner. When you insert data, SQL Server will identify the affected row of the view and increment the aggregate values TotalQty and Cnt for that row. When you delete data, SQL Server will decrement these values. When you update data in the underlying tables, SQL Server will update the aggregate values accordingly.

To observe the performance benefit indexed views can give you, run the following query after turning on the STATISTICS IO option and the Include Actual Execution Plan in SSMS:

SELECT EmployeeID, TotalQty, Cnt FROM dbo.VEmpOrders;

The output of this query is shown in Table 5-11, and the execution plan is shown in Figure 5-3.

Table 5-11. Contents of VEmpOrders

EmployeeID

TotalQty

Cnt

1

7812

345

2

6055

241

3

7852

321

4

9798

420

5

3036

117

6

3527

168

7

4654

176

8

5913

260

9

2670

107

Execution plan for a query against the VEmpOrders view

Figure 5-3. Execution plan for a query against the VEmpOrders view

The plan shows that the view’s clustered index was scanned. For this small view, which contains only nine rows, the total I/O was two logical reads. If you’re using the Enterprise edition of SQL Server (or the Developer edition, which is equivalent in its feature set), the query optimizer will consider using the indexed view for queries against the view without specifying any hints, and for queries against the base tables. For example, the following query generates the execution plan shown in Figure 5-4:

SELECT O.EmployeeID, SUM(OD.Quantity) AS TotalQty, AVG(OD.Quantity) AS AvgQty, COUNT_BIG(*)
AS Cnt
FROM dbo.Orders AS O
  JOIN dbo.[Order Details] AS OD
    ON OD.OrderID = O.OrderID
GROUP BY O.EmployeeID;
Execution plan for a query against the Orders and Order Details tables

Figure 5-4. Execution plan for a query against the Orders and Order Details tables

As you can see, the indexed view was used, and again the I/O cost was only two logical reads. Interestingly, the query requested the aggregate AVG(OD.Quantity), which was not part of the view, yet the indexed view was used. Remember that the sum and the count were calculated. If you expand the properties of the Compute Scalar operator in the plan, you will find the following expression, which calculates the average from the sum and the count:

[Expr1005] = CASE WHEN [Northwind].[dbo].[VEmpOrders].[Cnt]=(0) THEN NULL ELSE [Northwind].[
dbo].[VEmpOrders].[TotalQty]/
CONVERT_IMPLICIT(int,[Northwind].[dbo].[VEmpOrders].[Cnt],0) END, [[Northwind].[dbo].[VEmpOr
ders].Cnt] = [Northwind].[dbo].[VEmpOrders].[Cnt]

Note

Note

If you’re not working with the Enterprise or Developer edition of SQL Server, an indexed view will not be considered by default even when you query the view directly. To use the index, you must specify the NOEXPAND hint.

As I mentioned earlier, the requirements and restrictions list for creating an indexed view did not get much shorter in SQL Server 2005. The optimizer was enhanced in the sense that it uses indexed views in more cases. For example, SQL Server 2000 did not use an indexed view when the outer query specified a subinterval of the view’s query filter. That is, if the view’s query had the filter col1 > 5 and the outer query had the filter col1 > 10, the indexed view was not considered at all. SQL Server 2005 now supports subintervals. Similarly, SQL Server 2000 did not consider the index when the outer query used a logically equivalent filter expression as the view’s expression, but not an exact one. For example, if the view’s query had the filter col1 = 5 and the outer query had 5 = col1, SQL Server 2000’s optimizer did not consider using the indexed view. SQL Server 2005’s optimizer does.

More Info

More Info

For detailed information about indexed views in SQL Server 2005 and improvements over SQL Server 2000, please refer to the white paper "Improving Performance with SQL Server 2005 Indexed Views" by Eric Hanson, which can be found at http://www.microsoft.com/technet/prodtechnol/sql/2005/ipsql05iv.mspx.

You can use indexed views for other purposes besides performance. For example, T-SQL’s UNIQUE constraint treats two NULLs as equal. If you create a UNIQUE constraint on a nullable column, the constraint will allow only one instance of a NULL in that column. Suppose that you want to enforce uniqueness for known (that is, NOT NULL) values only, but allow multiple NULLs. You can achieve this by using a trigger, but such a trigger has a cost. If the trigger rolls back, it’s as if the modification that fired it was done twice–or, more accurately, done and then undone. Instead of using a trigger, you can enforce such an integrity rule by using an indexed view. Create an indexed view based on a query that filters only the non-NULL values from the source column. Remember that the clustered index created on a view must be unique. Such an index will prevent duplicate known values from entering the base table, but it will allow multiple NULLs because NULLs are not part of the unique index.

To demonstrate this, run the following code, which creates the table T1 with the column keycol, and an indexed view based on a query that filters only known keycol values from T1:

USE tempdb;
GO
IF OBJECT_ID('dbo.V1') IS NOT NULL
  DROP VIEW dbo.V1;
GO
IF OBJECT_ID('dbo.T1') IS NOT NULL
  DROP TABLE dbo.T1;
GO
CREATE TABLE dbo.T1
(
  keycol  INT         NULL,
  datacol VARCHAR(10) NOT NULL
);
GO
CREATE VIEW dbo.V1 WITH SCHEMABINDING
AS

SELECT keycol FROM dbo.T1 WHERE keycol IS NOT NULL;
GO
CREATE UNIQUE CLUSTERED INDEX idx_uc_keycol ON dbo.V1(keycol);

Next, issue the following INSERT statements:

INSERT INTO dbo.T1(keycol, datacol) VALUES(1,    'a'),
INSERT INTO dbo.T1(keycol, datacol) VALUES(1,    'b'), -- fails
INSERT INTO dbo.T1(keycol, datacol) VALUES(NULL, 'c'),
INSERT INTO dbo.T1(keycol, datacol) VALUES(NULL, 'd'),

Notice that the second attempt to insert a row with the value 1 in keycol fails, but both NULLs are accepted. Query T1 and observe in the output shown in Table 5-12 that both NULLs reached the table:

SELECT keycol, datacol FROM dbo.T1;

Table 5-12. Contents of T1

keycol

datacol

1

a

NULL

c

NULL

d

When done, run the following code for cleanup:

USE Northwind;
GO
IF OBJECT_ID('dbo.VEmpOrders') IS NOT NULL
  DROP VIEW dbo.VEmpOrders;
GO
USE tempdb;
GO
IF OBJECT_ID('dbo.V1') IS NOT NULL
  DROP VIEW dbo.V1;
GO
IF OBJECT_ID('dbo.T1') IS NOT NULL
  DROP TABLE dbo.T1;
GO
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.105.2