Temporary Tables

SQL Server supports two types of temporary tables: local and global. For the most part, I’ll focus on local temporary tables because this is the type you would typically consider in the same situations as table variables and table expressions. I’ll also describe global temporary tables, but these typically have completely different uses than local temporary tables.

Local Temporary Tables

I’ll start with some fundamentals of local temporary tables before showing examples, and I’ll do the same whenever discussing a new temporary object type. When referring to temporary tables in this section, assume that the discussion pertains to local ones.

You create and manipulate a temporary table just as you would a permanent one, for the most part. I’ll point out the aspects of temporary tables that are different from permanent ones, or aspects that are often misunderstood.

tempdb

Temporary tables are created in tempdb, regardless of the database context of your session. They have physical representation in tempdb, although when they’re small enough and Microsoft SQL Server has enough memory to spare, their pages will reside in cache. SQL Server will persist the temporary table’s pages on disk when there is too little free memory. Furthermore, tempdb’s recovery model is SIMPLE and cannot be changed. This means that all bulk operations involved with temporary tables are always minimally logged. Unlike user databases, there’s no recovery process in tempdb. There are many other issues you need to consider with regard to tempdb. I’ll cover some of these issues later in the chapter in a dedicated section.

So one reason to use a temporary table is to take the load off of a user database when you need to persist temporary data. You can also enjoy the fact that tempdb is treated differently from user databases.

Tip

My preferred method for checking whether an object already exists is to use the OBJECT_ID function. If the function returns a NULL, the object doesn't exist. If you want to check whether a temporary table already exists, make sure you specify the tempdb database prefix; otherwise, SQL Server will look for it in the current database, won't find one, and will always return a NULL. For example, to check whether #T1 exists, use OBJECT_ID('tempdb..#T1') and not OBJECT_ID('#T1').

Also, SQL Server 2005 now supports a second argument for OBJECT_ID, where you can specify the object type you're looking for (for example, 'U' for User table). The second argument's value must match the type column in sys.objects. This second argument was available in SQL Server 2000 as well, but it just wasn't documented until now.
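As a minimal sketch of the difference between the two checks (assuming the current database is not tempdb):

```sql
-- Create a local temporary table in the current session
CREATE TABLE #T1(col1 INT);

-- Without the tempdb prefix, OBJECT_ID looks in the current database
-- and returns NULL even though #T1 exists (unless you happen to be
-- in tempdb already)
SELECT OBJECT_ID('#T1')          AS without_prefix,
       OBJECT_ID('tempdb..#T1')  AS with_prefix;

-- The existence check, using the optional type argument
IF OBJECT_ID('tempdb..#T1', 'U') IS NOT NULL
  DROP TABLE #T1;
```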

Scope and Visibility

Temporary table names are prefixed with a number symbol (#). A temporary table is owned by the creating session and visible only to it. However, SQL Server allows different sessions to create a temporary table with the same name. Internally, SQL Server pads the table name with underscores and appends a unique numeric suffix to distinguish between temporary tables with the same name across sessions. For example, suppose that you created a temporary table called #T1. If you query the table sys.objects (dbo.sysobjects in SQL Server 2000) in tempdb looking for a table with a name LIKE '#T1%', you will find a table with a name similar to #T1_______________00000000001E (the underscore padding is much longer in practice, and the suffix will vary). Although this is the table's internal name, you refer to it in your code by the name you used when you created it: #T1.
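You can observe the internal name yourself with a query like the following sketch (the exact name returned will vary per session):

```sql
-- Create a local temporary table, then look up its internal name
CREATE TABLE #T1(col1 INT);

-- Query tempdb's catalog; the result includes the underscore padding
-- and unique numeric suffix SQL Server added behind the scenes
SELECT name
FROM tempdb.sys.objects
WHERE name LIKE '#T1%';

DROP TABLE #T1;
```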

Within the session, the temporary table is visible only to the creating level in the call stack and also inner levels, not to outer ones. For example, if you create a temp table in the session’s outermost level, it’s available anywhere within the session, across batches, and even in inner levels–for example, dynamic batch, stored procedure, and trigger. As long as you don’t close the connection, you can access the temporary table. If it’s created within a stored procedure, it’s visible to the stored procedure and inner levels invoked by that procedure (for example, a nested procedure or a trigger). You can rely on the visibility behavior of temporary tables–for example, when you want to pass data between different levels in your session, or even just signal something to an inner level and that inner level doesn’t support input parameters (for example, a trigger). However, in some cases, you can pass such information through the context_info feature, which is visible across the session. (See SET CONTEXT_INFO in Books Online for details.)
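As a hedged sketch of the context_info alternative mentioned above, you can stamp a session-wide binary value in an outer level and read it from any inner level, such as a trigger:

```sql
-- In the outer level: set the session's context info
-- (a binary value of up to 128 bytes)
DECLARE @ci AS VARBINARY(128);
SET @ci = CAST('signal_to_trigger' AS VARBINARY(128));
SET CONTEXT_INFO @ci;

-- In any inner level of the same session (e.g., inside a trigger),
-- read the value back
SELECT CAST(CONTEXT_INFO() AS VARCHAR(128)) AS session_signal;
```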

Once its creating level gets out of scope (terminates), a temporary table is automatically destroyed. If a temporary table was created in the outermost level, it is destroyed when the session is terminated. If it’s created within a stored procedure, it is automatically dropped as soon as the stored procedure is finished.

Remember that a temporary table is not visible to levels outside of the creating one in the call stack. That’s why, for example, you can’t use a temporary table created in a dynamic batch in the calling batch. Once the dynamic batch is out of scope, the temporary table is gone. Later in the chapter, I’ll suggest alternatives to use when such a need occurs. The next part, regarding the scope, is a bit tricky. You can, in fact, create multiple temporary tables with the same name within the same session, as long as you create them in different levels–although doing so might lead to trouble. I’ll elaborate on this point in the "Temporary Table Name Resolution" section later in the chapter.
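As a quick illustration of this limitation (a sketch; the exact error text may vary by version):

```sql
-- The temporary table is created in a dynamic batch (an inner level)...
EXEC('CREATE TABLE #T1(col1 INT); INSERT INTO #T1 VALUES(1);');

-- ...so by the time control returns to the calling batch, the
-- dynamic batch is out of scope and #T1 has been destroyed.
-- This query fails with an "Invalid object name '#T1'" error.
SELECT * FROM #T1;
```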

The scope and visibility of a temporary table are very different from those of both permanent tables and table variables, and they can be major factors in choosing one type of temporary object over another.

Transaction Context

A temporary table is an integral part of an outer transaction if it’s manipulated in one (with DML or DDL). This fact has consequences for logging and locking. Logging has to support rollback operations only, not roll-forward ones. (Remember, there is no recovery process in tempdb.) As for locking, because the temporary table is visible only to the creating session, less locking is involved than with permanent tables, which can be accessed from multiple sessions.
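A small sketch demonstrating the transaction context: DML against a temporary table participates in the outer transaction and is undone by a rollback, while the table itself (created before the transaction) survives.

```sql
CREATE TABLE #T1(col1 INT);

BEGIN TRAN
  -- This INSERT is part of the outer transaction
  INSERT INTO #T1 VALUES(1);
ROLLBACK

-- The INSERT was rolled back, so the table is empty (count = 0);
-- the table itself still exists because it was created
-- before the transaction started
SELECT COUNT(*) AS cnt FROM #T1;

DROP TABLE #T1;
```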

So, one of the factors you should consider when choosing a temporary object type is whether you want manipulation against it to be part of an outer transaction.

Statistics

The optimizer creates and maintains distribution statistics (column value histograms) for temporary tables and keeps track of their cardinality, much as it does for permanent ones. This capability is especially important when you index the temporary table. Distribution information is available to the optimizer when it needs to estimate selectivity, and you will get optimized plans that were generated based on this information. This is one of the main areas in which temporary tables differ from table variables in terms of performance.

Also, because statistics are maintained for temporary tables, your code will be recompiled when a sufficient number of rows of a referenced table has changed since the last compilation; that is, when the recompilation threshold (RT) is reached. The RT depends on the table type and on n, the table's cardinality when the query plan is compiled. For permanent tables: if n <= 500, RT = 500; if n > 500, RT = 500 + 0.20 * n. For temporary tables: if n < 6, RT = 6; if 6 <= n <= 500, RT = 500; if n > 500, RT = 500 + 0.20 * n. So, for example, after loading six rows into a temporary table, adding a seventh will trigger a recompile, whereas with a permanent table the first recompile would occur much later. If you want queries against temporary tables to use the same recompilation thresholds as queries against permanent ones, use the KEEP PLAN query hint.
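As a minimal sketch of the hint's placement, KEEP PLAN goes in the query's OPTION clause:

```sql
CREATE TABLE #T1(col1 INT NOT NULL);
INSERT INTO #T1 VALUES(1);

-- With KEEP PLAN, this query follows the permanent-table
-- recompilation thresholds, so loading a few more rows into #T1
-- won't by itself trigger a recompile
SELECT col1 FROM #T1 WHERE col1 > 0
OPTION (KEEP PLAN);

DROP TABLE #T1;
```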

Note

Note

SQL Server 2005 uses a statement-level recompilation model, as opposed to the procedure-level model in SQL Server 2000. I urge you to read the white paper "Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005" by Arun Marathe for details about the subject, including relevance to temporary tables. You can link to this paper at http://www.microsoft.com/technet/prodtechnol/sql/2005/recomp.mspx.

The fact that the optimizer maintains distribution statistics for temporary tables and the aforementioned implications are the most crucial aspects of choosing a temporary object type. These factors are especially important when choosing between temporary tables and table variables, for which the optimizer doesn’t create or maintain distribution statistics. Rowcount information is maintained for table variables (in sysindexes in SQL Server 2000, and in sys.partitions in SQL Server 2005) but this information is often inaccurate. Table variables themselves do not trigger recompiles, and recompiles are required to update the rowcount information. You can force a recompile for a query involving table variables in SQL Server 2005 using the RECOMPILE query hint. In short, table variables don’t involve optimality-based recompilations.
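A sketch of forcing such a recompile in SQL Server 2005, so that the optimizer at least sees the table variable's current rowcount (it still gets no distribution histograms):

```sql
DECLARE @T TABLE(col1 INT NOT NULL);
INSERT INTO @T VALUES(1);
INSERT INTO @T VALUES(2);

-- RECOMPILE forces a statement-level recompile at execution time,
-- when the table variable's actual rowcount is known
SELECT col1 FROM @T WHERE col1 > 1
OPTION (RECOMPILE);
```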

There are two main questions you must ask yourself:

  1. Does the optimizer need distribution statistics or accurate cardinality estimations to generate an efficient plan, and if so, what’s the cost of using an inefficient plan when statistics are not available?

  2. What’s the cost of recompilations if you do use temporary tables?

There are cases in which the optimizer doesn't need statistics to figure out an optimal plan: for example, a query requesting all rows from a table, a point query filtering a column on which a unique index is defined, or a range query that utilizes a clustered or covering index. In such cases, regardless of the table's size, there's no benefit in having statistics because you will only suffer the cost of recompilations, so consider using a table variable.

Also, if the table is tiny (say, a couple of pages), the alternatives are 1) using a table variable, resulting in complete scans and few or no recompilations, and 2) using a temporary table, resulting in index seeks and more recompilations. The advantage of seeks over scans may be outweighed by the cost of the recompiles. That's another case in which you should consider using table variables.

On the other hand, if the optimizer does need statistics to generate an efficient plan and you’re not dealing with tiny tables, the cost of using an inefficient plan might well be substantially higher than the cost of the recompilations involved. That’s a case in which you should consider using temporary tables. In the "Table Variables" section, I’ll provide examples related to these scenarios in which I’ll also demonstrate execution plans.

Temporary Table Name Resolution

As I mentioned earlier, technically you’re allowed to create multiple local temporary tables with the same name within the same session, as long as you create them in different levels. However, you should avoid doing this because of name-resolution considerations that might cause your code to break.

When a batch is resolved, the schema of a temporary table that is created within that batch is not available, so resolution of code that refers to the temporary table is deferred to run time. However, if a temporary table name you refer to already exists within the session (for example, it was created by a higher level in the call stack), that name resolves to the existing temporary table. At run time, though, the code always runs against the innermost temporary table with the referenced name.

This resolution architecture can cause your code to break when you least expect it; this can happen when temporary tables with the same name exist in different levels with different schemas.

This part is very tricky and is probably best explained by using an example. Run the following code to create the stored procedures proc1 and proc2:

SET NOCOUNT ON;
USE tempdb;
GO
IF OBJECT_ID('dbo.proc1') IS NOT NULL
  DROP PROC dbo.proc1;
GO
IF OBJECT_ID('dbo.proc2') IS NOT NULL
  DROP PROC dbo.proc2;
GO

CREATE PROC dbo.proc1
AS

CREATE TABLE #T1(col1 INT NOT NULL);
INSERT INTO #T1 VALUES(1);
SELECT * FROM #T1;

EXEC dbo.proc2;
GO

CREATE PROC dbo.proc2
AS

CREATE TABLE #T1(col1 INT NULL);
INSERT INTO #T1 VALUES(2);
SELECT * FROM #T1;
GO

proc1 creates a temporary table called #T1 with a single integer column, loads a row with the value 1, returns #T1’s contents, and invokes proc2. proc2 also creates a temporary table called #T1 with a single integer column, loads a row with the value 2, and returns #T1’s contents. Both #T1 tables have the same schema. Now, invoke proc1:

EXEC dbo.proc1;

The output is what you probably expected:

col1
-----------
1

col1
-----------
2

Both procedures returned the contents of the #T1 table they created. Being oblivious to the resolution process I described earlier doesn’t really affect you in this case. After all, you did get the expected result, and the code ran with no errors. However, things change if you alter proc2 in such a way that it creates #T1 with a different schema than in proc1:

ALTER PROC dbo.proc2
AS

CREATE TABLE #T1(col1 INT NULL, col2 INT NOT NULL);
INSERT INTO #T1 VALUES(2, 2);
SELECT * FROM #T1;
GO

Run proc1 again:

EXEC dbo.proc1;

And notice the error you get in the output:

col1
-----------
1

Msg 213, Level 16, State 1, Procedure proc2, Line 5
Insert Error: Column name or number of supplied values does not match table definition.

Can you explain the error? Admittedly, the problem in the resolution process I described is very elusive, and you might not have realized it after the first read. Try to read the paragraph describing the resolution process again, and then see whether you can explain the error. Essentially, when proc2 was invoked by proc1, a table called #T1 already existed. So even though proc2's code creates a table called #T1 with two columns and loads a row with two values, when the INSERT statement is resolved, proc2's #T1 does not exist yet, but proc1's does. Therefore, SQL Server reports a resolution error: you attempt to load a row with two values into a table that, as far as resolution is concerned, has only one column.

If you invoke proc2 alone, there’s no reason for the code to fail because no other #T1 table exists in the session–and it doesn’t fail:

EXEC dbo.proc2;

You get an output with the row loaded to proc2’s #T1:

col1        col2
----------- -----------
2           2

The execution plan for proc2 now resides in cache. Ironically, if you now run proc1 again, the code will complete with no errors. proc2 will not go through a resolution process again (neither will it go through parsing or optimization); rather, SQL Server will simply reuse the plan from cache:

EXEC dbo.proc1;

And now you get the output you probably expected to begin with:

col1
-----------
1

col1        col2
----------- -----------
2           2

However, if proc2's plan is removed from cache and you then run proc1, your code will break.

In short, hopefully you realize that it's wise to avoid using the same temporary table name in different stored procedures/levels. One way to avoid such issues is to add a unique proc identifier to the names of temporary tables. For example, you could name the temporary table in proc1 #T1_proc1, and the one in proc2 #T1_proc2.

Schema Changes to Temporary Tables in Dynamic Batches

Remember that a local temporary table created in a certain level is not visible to outer levels in the call stack. Occasionally, programmers look for ways around this limitation, especially when working with dynamic execution. That is, you want to construct the schema of the temporary table dynamically and populate it based on some user input, and then access it from an outer level. Frankly, insisting on using local temporary tables in such a scenario is very problematic. The solution involves ugly code, as is the nature of dynamic execution in general, plus recompilations resulting from schema changes and data modifications. You should consider other alternatives to provide for the original need. Still, I want to show you a way around the limitations.

Here’s an initial algorithm that attempts to provide a solution for this request:

  1. In the outer level, create temporary table #T with a single dummy column.

  2. Within a dynamic batch, perform the following tasks:

    • 2.1. Alter #T, adding the columns you need.

    • 2.2. Alter #T, dropping the dummy column.

    • 2.3. Populate #T.

  3. Back in the outer level, access #T in a new batch.

The problem with this algorithm lies in the last item within the dynamic batch. References to #T will be resolved against the outer #T’s schema. Remember that when the batch is resolved, #T’s new schema is not available yet. The solution is to populate #T within another dynamic batch, in a level inner to the dynamic batch that alters #T’s schema. You do this by performing the following tasks:

  1. In the outer level, create temporary table #T with a single dummy column.

  2. Within a dynamic batch, perform the following tasks:

    • 2.1. Alter #T, adding the columns you need.

    • 2.2. Alter #T, dropping the dummy column.

    • 2.3. Open another level of dynamic execution.

      • 2.3.1. Populate #T.

  3. Back in the outer level, access #T in a new batch.

Here’s some sample code that implements this algorithm and generates the output shown in Table 2-1:

-- Assume @schema and @insert were constructed dynamically
DECLARE @schema AS VARCHAR(1000), @insert AS VARCHAR(1000);
SET @schema = 'col1 INT, col2 DECIMAL(10, 2)';
SET @insert = 'INSERT INTO #T42 VALUES(10, 20.30)';

-- In the outer level, create temp table #T with a single dummy column
CREATE TABLE #T42(dummycol INT);

-- Within a dynamic batch:
--    Alter #T adding the columns you need
--    Alter #T dropping the dummy column
--    Open another level of dynamic execution
--      Populate #T
EXEC('
ALTER TABLE #T42 ADD ' + @schema + ';
ALTER TABLE #T42 DROP COLUMN dummycol;
EXEC(''' + @insert + ''')');
GO

-- Back in the outer level, access #T in a new batch
SELECT * FROM #T42;

-- Cleanup
DROP TABLE #T42;

Table 2-1. Output of Code that Dynamically Alters the Schema of a Temporary Table

col1        col2
----------- ------------
10          20.30

Global Temporary Tables

Global temporary tables differ from local ones mainly in their scope and visibility. They are accessible by all sessions, with no security limitations whatsoever. Any session can even drop the table. So when you design your application, you should factor in security and consider whether you really want temporary tables or just permanent ones. You create global temporary tables by prefixing their names with two number symbols (##), and like local temporary tables, they are also created in tempdb. However, because global temporary tables are accessible to all sessions, you cannot create multiple ones with the same name, whether in the same session or across sessions. So typical scenarios for using global temporary tables are when you want to share temporary data among sessions and don't care about security.

Unlike local temporary tables, global ones persist until the creating session–not the creating level–terminates. For example, if you create such a table in a stored procedure and the stored procedure goes out of scope, the table is not destroyed. SQL Server will automatically attempt to drop the table when the creating session terminates, all statements issued against it from other sessions finish, and any locks they hold are released.

I’ll walk you through a simple example to demonstrate the accessibility and termination of a global temporary table. Open two connections to SQL Server (call them Connection 1 and Connection 2). In Connection 1, create and populate the table ##T1:

CREATE TABLE ##T1(col1 INT);
INSERT INTO ##T1 VALUES(1);

In Connection 2, open a transaction, and modify the table:

BEGIN TRAN
  UPDATE ##T1 SET col1 = col1 + 1;

Then close Connection 1. If it weren’t for the open transaction that still holds locks against the table, SQL Server would have dropped the table at this point. However, because Connection 2 still holds locks against the table, it’s not dropped yet. Next, in Connection 2, query the table and commit the transaction:

  SELECT * FROM ##T1;
COMMIT

At this point, SQL Server drops the table because there are no active statements accessing it, and no locks are held against it. If you try to query it again from any session, you will get an error saying that the table doesn’t exist:

SELECT * FROM ##T1;

There’s a special case where you might want to have a global temporary table available but not owned by any session. In this case, it will always exist, regardless of which sessions are open or closed, and eliminated only if someone explicitly drops it. To achieve this, you create the table within a special procedure (using the sp_prefix, created in the master) and mark the stored procedure with the "startup" procedure option. SQL Server invokes a startup procedure every time it starts. Furthermore, SQL Server always maintains a reference counter greater than zero for a global temporary table created within a startup procedure. This ensures that SQL Server will not attempt to drop it automatically.

Here’s some sample code that creates a startup procedure called sp_Globals, which in turn creates a global temporary table called ##Globals.

USE master;
GO
IF OBJECT_ID('dbo.sp_Globals') IS NOT NULL
  DROP PROC dbo.sp_Globals
GO
CREATE PROC dbo.sp_Globals
AS

CREATE TABLE ##Globals
(
  varname sysname NOT NULL PRIMARY KEY,
  val     SQL_VARIANT NULL
);
GO

EXEC dbo.sp_procoption 'sp_Globals', 'startup', 'true';

After restarting SQL Server, the global temporary table will be created automatically and persist until someone explicitly drops it. To test the procedure, restart SQL Server and then run the following code:

SET NOCOUNT ON;
INSERT INTO ##Globals VALUES('var1', CAST('abc' AS VARCHAR(10)));
SELECT * FROM ##Globals;

You probably guessed already that ##Globals is a shared global temporary table where you can logically maintain cross-session global variables. This can be useful, for example, when you need to maintain temporary counters or other "variables" that are globally accessible by all sessions. The preceding code creates a new global variable called var1, initializes it with the character string 'abc', and queries the table. The output of this code is shown in Table 2-2.

Table 2-2. Contents of ##Globals

varname   val
--------- -----
var1      abc
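Continuing this sketch, any session can then read or update such a "variable" with ordinary DML against ##Globals:

```sql
-- Read the current value of the "global variable" var1
SELECT val FROM ##Globals WHERE varname = 'var1';

-- Update it; all sessions immediately see the new value
UPDATE ##Globals
  SET val = CAST('xyz' AS VARCHAR(10))
WHERE varname = 'var1';
```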

When you’re done, run the following code for cleanup:

USE master;
GO
DROP PROC dbo.sp_Globals;
DROP TABLE ##Globals;