Index Internal Information

Downloads
44572.zip

The sysindexes system table contains information about your index structures that you can't get through information schema views. Every database has a sysindexes table, which contains one row for every index and one row for every column-based statistics on a table. In general, Microsoft recommends that you don't retrieve data directly from this table, but there's no other way to obtain some of the information it contains. You can derive some of the index properties from the INDEXPROPERTY() function, but not all of them, as I mentioned in my December column, "Digging Up the Dirt on Indexes."

One type of information that you can get from the sysindexes table and from no other documented method is certain page numbers that a table or index uses. The sysindexes table contains three columns—first, firstIAM, and root—that represent page locations within a database, as indicated by a file number/page number combination. SQL Server stores each of these page locations in a byte-swapped format. To convert this to a decimal file number and page number, you must first swap the bytes, then convert the values from hexadecimal to decimal. For example, let's find the page address for the first page of the Northwind database's Order Details table. You can run the following query to get the value in hex:

USE Northwind
SELECT first FROM sysindexes
WHERE id = object_id('Order Details') AND indid < 2

As I mentioned in December, a sysindexes.indid value of less than 2 identifies the row for the table itself and not for a separate index on the table.

The preceding query returns the hex value 0x940000000100 for the first page of the table. You now have to convert this number to a file and page address. In hexadecimal notation, each set of two hexadecimal digits represents 1 byte, so each pair of digits has to stay together as a byte. You first "swap" or reverse the bytes, so the last becomes the first and the first becomes the last, to get 00 01 00 00 00 94. The first two groups represent the 2-byte file number, and the last four groups represent the page number. So the file number is 0x0001, which is 1, and the page number is 0x00000094, which is 148 in decimal. This means that the first page of the Order Details table is on file 1, page 148. But in a relational database, there's no implied ordering to table rows, so what does "first" mean? "First" means something only if you have an index, in which case the index's leaf level stores the pages in an order that's based on the index key columns. For a clustered index, such as the one here, the index's leaf level is the table itself.

SQL Server doesn't guarantee that the sysindexes.first column will always tell you the first page of a table. I've found that the value is reliable immediately after I build an index and stays reliable until I perform a lot of data-modification operations on the data in the table. As I mentioned, two other columns in sysindexes use the same byte-swapped hexadecimal format. The root column's value is the page location of the index's root. If the table's row in sysindexes has an indid value of 0, the table is a heap and has no clustered index and consequently no root. In that case, the value in the sysindexes.root column is just a copy of the value in the sysindexes.first column.

The third hexadecimal column, first-IAM, is the page number of the first Index Allocation Map (IAM) for the table or index. You can get lots of information about IAMs in my April 1999 article, "The New Space Management."

You can use the page addresses for the first and root columns as arguments to the DBCC PAGE command, which I described in "Using DBCC Page," March 2000. If you want to know the page numbers for all the pages that make up a table, you could get the first page number from the sysindexes table as I described above, then use DBCC PAGE to look at the header of the page and find the value for m_nextPage. Then, you could use DBCC PAGE to open that next page, find the value for m_nextPage again, and continue to follow the chain of pages for any table that has a clustered index. However, as much as this method might appeal to your inner geek, there are easier ways to get all the page numbers belonging to a table or index. These methods, which I describe next, will always show you the correct page numbers in a table, even after you perform numerous data-modification operations.

First, enable the undocumented DBCC command PGLINKAGE by turning on trace flag 3604, just like you need to do for DBCC PAGE. In order to use this command, you need to know at least one page number belonging to the table. Then you pass the database ID, the file ID, the page ID, and the number of pages you want to see:

DBCC PGLINKAGE (dbid, fileID, 
pagenum, number_of_pages, printopt=\{1|2\} )

Supplying a 0 for number_of_pages will return all the pages from the given page number to the end of the linked list of pages. A printopt value of 1 lists just the file and page numbers; when you give a printopt value of 2, you can see the previous and next page numbers along with each page number. Thus, the following command will list the first 10 pages in the Northwind database's Order Details table:

DBCC TRACEON(3604)
DBCC PGLINKAGE (6,1,148,10, 1)

Figure 1 shows the output that I get back from this command. Of course, your actual page numbers will likely be different when you run this command.

Using DBCC PGLINKAGE is a quick way to get the list of all pages belonging to a table, in sorted order. However, the page numbers all come back as messages, so you can't easily save them and examine them using SQL queries. For this reason, I prefer to use DBCC IND to get all the page numbers for a table. One benefit of DBCC IND is that you don't need to know any page numbers before you use this command; all you need to know is the table name. Also, with just one command, DBCC IND can tell you all the pages that belong to the table for all indexes and returns the results in a tabular format that you can save in a table for analysis and reporting. DBCC IND also tells you all the pages for the table, for IAMs, and for pages containing text or image data.

DBCC IND has three parameters. The first parameter is the database name or the database ID. The second parameter is the object name or object ID within the database. The third parameter is a specific index ID or one of the values 0, -1, or −2, which Table 1 explains. Here's the syntax:

DBCC IND (\{'dbname' | dbid \}, \{ 
'objname' | objid \}, \{ indid | 0 | -1 | -2 \})

To get all the pages for all the indexes for the Order Details table, I'd execute the following command:

DBCC IND ('Northwind', 'Order Details', -1)

Note that, unlike with the DBCC PAGE command, I don't need to enable trace flag 3604 before running DBCC IND. Table 2 lists this command's output columns and the meaning of each.

Although we usually talk about the leaf level of any index as being level 0 and the leaf level of a clustered index as being the data pages, in the DBCC IND output, the level above the data in a clustered index is considered to be level 0. To find the first page in any index level, you can look for the page with a PrevPagePID and a PrevPageFID of 0. However, if you have lots of rows in the output, it can be hard to scroll the output to find the row representing that first page. If you could capture this output into a table, you could then use T-SQL queries to find the rows you're interested in. Fortunately, you can capture the output of any command that gives results in a tabular form.

The first step in this case is to run the code that Listing 1 shows to create a table that will hold the tabular results of the DBCC command. Because the table name starts with sp_, if you create this table in the master database, you'll be able to access it from any database without having to qualify the table name with the database name.

To populate this table with index information from the Order Details table, run the following INSERT statement:

TRUNCATE TABLE sp_index_info
  INSERT INTO sp_index_info
    EXEC ('DBCC IND 
      ('Northwind', 'Order 
        Details', -1) ')

Since the table can be populated from any database, you might want to truncate the table before you insert more data into it, so the TRUNCATE TABLE statement is optional. To find the row for the first data page, you need to find a row that has a page type of 1 and no previous page. You can use the following SELECT statement:

SELECT * FROM sp_index_info
WHERE pagetype = 1 AND 
  prevpagePID = 0 AND prevpageFID = 0

After you have the page numbers for a table, you can use the DBCC PAGE command to examine the data rows. And, now that you have the data in a table, you can organize it in many different ways and ask questions about the indexes on this table. For example, if you want to know how many pages of each type you have, you can run the query that Listing 2 shows.

If you want to know how many pages are in the DBCC IND output for each index, you can run the code that Listing 3 shows. This query joins the new table to the sysindexes table in Northwind to get the name of each index. You can probably come up with many more questions about the indexes on a table, so I'll leave it up to you to come up with the queries to extract that information.

In future columns, I'll tell you about more interesting information you can find in the sysindexes table. I'll also start examining some of the index information that will be available in SQL Server 2005.

Comments

Plain text