In "Keeping Concurrent," October 2005, I introduced you to one of the most widely advertised new features of SQL Server 2005: the ability to use row-level versioning (RLV) to support snapshot isolation. In "Seek and You Shall Find," September 2005, I began a discussion of SQL Server 2005's new metadata model. Over the past several months, I've also discussed some of the new metadata views available. Of particular interest is the suite of dynamic management objects, which are similar to the sysprocesses and syscacheobjects pseudo system tables. Now, I'm going to combine the topics of snapshot isolation and dynamic management objects and tell you about a couple of the Dynamic Management Views (DMVs) that you can use to observe the impact that RLV has on your system and to perform troubleshooting.
The sys.dm_tran_version_store DMV lets you see the version-store rows that snapshot transactions use to get the committed data that existed when the statement or transaction started. This DMV returns a virtual table that contains all the current records being saved for versioning. For most of my analyses, all I need to know is how many rows the version store contains, so a simple count(*) function is all that's necessary.
The code that Listing 1 shows creates a new database called testsi, which you can use to test and observe some snapshot isolation behavior. Listing 1 also creates a table called dbo.test and populates that table with 26 large rows. After you run the code in Listing 1, run the command
ALTER DATABASE testsi SET allow_snapshot_isolation ON
to alter the testsi database so that it can run in full snapshot isolation. Doing so lets a transaction access the committed data version that was available when the transaction started.(Transactions can read versioned data in this isolation level only after you use the SET command to switch the session to snapshot isolation.) Once you alter the database to allow snapshot isolation, all update and delete operations will save the previous data values in the version store—even when no sessions are attempting to read versioned data.
To observe this behavior, run the script that Listing 2 shows. This script performs a simple update to the testsi database, then looks at the count of rows in the version store. The count(*) function in Listing 2 returns a value of 26 because the script updates all 26 rows in the table. The rows in the version store reflect the previous value of the data. SQL Server had to store these rows, even though no active transactions were running a snapshot transaction, because another session could have started such a transaction after the update operation was finished but before the transaction was committed. Once the transaction is committed, the saved rows in the version store aren't needed so long as no sessions using snapshot transactions are running. However, SQL Server doesn't clean up those unnecessary rows immediately, so if you immediately rerun the update in Listing 2, you'll see twice as many rows in the version store.
A background thread runs every minute and reclaims all reusable space from the version store. Wait just a minute, then run the SELECT statement in Listing 2 to return the count of rows in the version store; you'll see that the value has dropped to 0. If you've configured tempdb for autogrow and tempdb runs out of free space, SQL Server will call the cleanup function before increasing the size of the tempdb files. If the disk fills up and SQL Server can't grow the tempdb files, it stops generating version records for the data modifications. If a snapshot query needs to read an older version of a row that wasn't generated because of space constraints, the query fails with error 3958. The error message (as of the September 2005 Community Technology Preview—CTP—build) reads Transaction aborted when accessing versioned row in the table 'dbo.test' in database 'testsi'. Requested versioned row was not found. Your tempdb is probably out of space. Please refer to BOL on how to configure tempdb for versioning.
Note that the version store contains only committed row versions. If the same data changes multiple times in one transaction, the version store doesn't hold all the intermediate values.
Watch for Trouble
Now let's look at a snapshot transaction running concurrently with the update transaction in Listing 2. (To generate versions, the database need only be set to ALLOW full snapshot isolation or to READ COMMITTED SNAPSHOT. A SELECT statement is considered to be a snapshot transaction if it's reading from the version store.) If you've run the update in Listing 2 twice, the value in the Cost column will be 2.0 in every row. You can open a new session and run the batch that Listing 3 shows to see that committed value.
Run the code in Listing 2 in a different session, then rerun the SELECT statement from Listing 3 in the same connection you ran it in the first time. Even though the cost values have been updated, the SELECT statement will return the original average value because the SELECT statement is running as a snapshot transaction that's still in progress. If you run Listing 2 in its original connection several more times, then run the SELECT statement again, the average cost value still won't change. That doesn't mean that no additional cost exists.
Every time you execute the Listing 2 update, SQL Server generates another version of each of the rows in dbo.test. The most-recent version is always linked to the head of the list, so whenever SQL Server needs to find versions of a row, it starts with the most-recent version. Every subsequent SELECT statement in the same transaction must traverse a longer chain to find the data from the start of that transaction. As long as SQL Server hasn't committed (or rolled back) the SELECT transaction, all the versions for the rows that the transaction accessed need to be kept. The background cleanup thread won't clean up the version store. If you notice that the version store isn't being cleaned out or that your snapshot SELECT statements seem to be taking longer and longer to execute, look at the sys.dm_tran_active_snapshot_database_transactions DMV. This DMV contains one row for each active transaction in each snapshot-enabled database in the SQL Server instance. The DMV doesn't include read-only transactions (e.g., a SELECT statement in auto-commit mode, without explicit BEGIN TRAN and COMMIT TRAN statements) or system transactions.
Next, run the SELECT statement that Listing 4 shows. This statement will return one row, which will contain the session_id value for the connection in which you're running the snapshot transaction. No other active snapshot transactions exist, and the update transaction runs to completion every time you run Listing 2, so it isn't active when you run SELECT from the sys.dm_tran_active_snapshot_database_transactions DMV. SQL Server generates a transaction_sequence_num column for each transaction in a snapshot-enabled database that is either generating versions or running in one of the snapshot isolation levels. The output will show a value of 0 in the first _snapshot_sequence_num column, which means that no snapshot or version-generating transactions were running when the current transaction started. If another running transaction had also generated a transaction_sequence_num column, the lowest-numbered value would be returned in the first_snapshot_sequence_num column.
The max_version_chain_traversed column is also valuable for troubleshooting. Every time you run another update transaction on the dbo.test table then run SELECT from the table again, the version chain will grow longer. If you see this value getting into the double digits in conjunction with an increased elapsed_time_seconds value, you'll probably want to investigate why the transaction is staying open so long. Because you also have the session_id value, you'll be able to determine which process is associated with the long version chain.
Not for Everyone
Snapshot isolation isn't for every organization or every application. You need to be aware of the costs of using this functionality and make sure that the benefits will be worth those costs. First, the version store in tempdb needs to be large enough to hold all the necessary versions of all the rows that any snapshot transaction might need. Second, any data-modification operation in any database that uses snapshot isolation will take longer because old row versions must be written to the version store in tempdb. Third, any snapshot SELECT statements you run might take longer because SQL Server needs to traverse the linked list of versioned rows in the version store to find the data that was committed before the current transaction started.
In future columns, I'll show you more about using DMVs to watch what happens during snapshot transactions. I'll also tell you about other SQL Server 2005 features that use the version store for other operations besides snapshot transactions.
Kalen Delaney ([email protected]) is a principal mentor of Solid Quality Learning, and provides SQL Server training and consulting to clients around the world. Her most recent book is Inside Microsoft SQL Server 2000 (Microsoft Press).