Saturday, May 05, 2007

Temporary Tables, Table Variables, Table Expressions - Part 1/4

Temporary Tables, Table Variables, Common Table Expressions - Part 1.

In this series I'm going to clear up some of the misconceptions and confusion surrounding these tempory data structures that are commonly used to materialize data temporarily. Temporary tables, table variables are often 'abused' due to a lack of knowledge about efficient set based programming (myself included). Hopefully I can provide a better understanding of how these temporary structures behave and in which circumstances you should use each.

Local Temporay Tables
Tempory tables are manipulated in the exact same way as permanent tables, however temp tables are created in the tempdb, reguardless of your session's database context (except if they are small enough and sql server has enough free memory, in which case they will reside in the cache - but don't count on it).

Remember that tempdb's recovery mode is SIMPLE and cannot be changed, this means that all bulk operations involved with temporary are always minimally logged - there is no recovery process for tempdb. One reason to use temporary tables is to take load off of the user database when you need persistant temporary data; we can take advantage of the fact that tempdb is treated differently than user databases.

Scope and Visibility
A temporary table is owned by the creating session, and is only visible to it. The scope of the temp table is limited to the session, therefore other sessions may create temp tables using the same name. Because of this SQL Server will generate its own naming scheme for entries in system tables (ie sys.objects), and these names won't directly correlate directly with the name
you have assigned it.

Within the session, the temp table is only visible to the creating level in the call stack as well as the inner levels, not the outer levels. If you create a temp table in the outermost level, its available everywhere within the session, across batches and within the inner levels. As long as you don't close the connection, you'll have access to the temp table. This can be really useful when you want to pass information to inner levels that don't have input parameters such as triggers. when the creating level gets out of scope the temporary table is automatically destroyed. The scope and visibility of temporary tables are much different than table variables and common table expressions, and will likely influence your choice in using one of these objects over the other.

Transaction Context
Temporary tables are likely to be used in transactions and obvisously behave differently than permanent tables in terms of logging and locking. Remember again that tempdb has no recovery process, therefore there will be minimal logging which only ensure that transactions can be rolled back (but not rolled forward). Unlike, permanent tables, temporary tables can only be accessed by the creating session, therefore there will be substantionally less locking involved.

Statistics.
Unlike table variables, The query optimizer creates and maintains distribution statistics for temporary tables in order to keep track of their cardinality, similar to permanant tables. This info is used to estimate selectivity and determine optimized plans. In order to maintain accurate statistics SQL Server must recompile statistics when the recompilation threshold is reached (determining the recompilation threshold for temp tables will be a whbole different article). This propery will likely will affect your choice of temporary data structures. If you are planning on doing a table scan of your data anyway you might not need to accept the overhead involved with keeping these statistics.

Temporary Tables or Table Variables?
In order to determine which structure to use you must understand the answers t the following questions: 1. Does the optimizer need distribution statistics or accurate cardinality estimations to generate an efficint plan, and if so, What's the cost of using an inefficient plan when statistics are not available. 2. What is the cost of recompilations if you do use temporary tables?

If the table is tiny (only a couple pages) the alternatives are to either 1) Use a table variable resulting in complete svans and few or no recompilations, or 2) use a temporary table resulting in index seeks and more recompilations. The advantages of seeks versus scans may be outweighed by the disadvantages of recompiles, or vice versa. I'll talk about table variables more in my next article.

Part 1: Temporary Tables (#Tables)
Part 2: Table Variables (@Tables)
Part 3: Table Expressions (CTEs)

1 comment:

Anonymous said...

Some more info about sql server temporary tables.
http://vipinc007.blogspot.com/2008/08/sql-server-2005-temporary-tables.html