5 File and Filegroup Tips

SQL Server 2000 and 7.0 let you create a database that's stored on multiple physical files and organized in one or more filegroups. Because files and filegroups aren't part of the ANSI SQL-92 definition, no information schema views contain information about these organizational structures. In my last T-SQL Admin column, "Accessing File and Filegroup Metadata," May 2002, I discussed SQL Server files and filegroups and presented some code for accessing system-table information about files and filegroups that you can't get by using Microsoft-supplied methods. You might also need to manipulate files and filegroups in ways that Microsoft-supplied methods such as information schema views or Enterprise Manager don't support. Some of these tasks don't necessarily require direct system-table access, but they don't have obvious solutions either. Here are answers to five common questions you might ask about these kinds of tasks. (For information about why you might want to create a database on multiple files or filegroups, see my SQL Server Magazine article "The Road to Recovery," September 2001, and Michael D. Reilly's Web-exclusive Certifiably SQL articles "Backup Strategies," August 21, 2000, and "Restoring Databases," September 24, 2000.)

No command specifically moves a table to a new filegroup. However, recall that a clustered index's leaf level is the table data; so if you create (or recreate) a table's clustered index in a new filegroup, the table will automatically be relocated in the specified filegroup. The most straightforward way to recreate an existing clustered index is to use the CREATE INDEX command's DROP_EXISTING option, as the following syntax shows:

CREATE \[UNIQUE\] CLUSTERED INDEX <index name>
 ON <table name> (columns) 
    WITH DROP_EXISTING ON 
<filegroup name>...

If you use the two commands DROP INDEX and CREATE INDEX (without the DROP_EXISTING option) to recreate a clustered index, SQL Server completely rebuilds all the nonclustered indexes each time you use one of the commands. The row locators SQL Server uses in a nonclustered index are different depending on whether the table has a clustered index. So if you drop a clustered index, SQL Server rebuilds the nonclustered indexes to have the form they need for an underlying heap table (a table without a clustered index). Then when you rebuild the clustered index, SQL Server must rebuild all the nonclustered indexes again.

But if you use the DROP_EXISTING option, as the above example shows, SQL Server realizes that the lack of a clustered index is temporary. In SQL Server 7.0, if you use the DROP_EXISTING option to rebuild a clustered index on the same columns, the nonclustered indexes won't change. However, in SQL Server 2000 release to manufacturing (RTM) and Service Pack 1 (SP1), SQL Server rebuilds the nonclustered indexes once. This behavior, which SP2 fixes, is a bug documented in the Microsoft article "FIX: Nonclustered Indexes Rebuilt For CREATE UNIQUE CLUSTERED INDEX ... WITH DROP_EXISTING Query" (http://support.microsoft.com/default.aspx?scid=kb;en-us;q304519). However, even without the SP2 fix, using the DROP_EXISTING solution is still a more efficient method than the double rebuild needed when you don't use the DROP_EXISTING option. Just remember that the data follows the clustered index: The filegroup that contains the clustered index also contains the data.

In addition to using the DROP_EXISTING option to place tables and indexes in a specific filegroup, you can also use the keyword TEXTIMAGE_ON with the CREATE TABLE command to place the table's large object data (text, ntext, or image data type) into a specific filegroup. SQL Server supports this capability only when you first create a table. SQL Server doesn't provide a way to change the location of the large object data without completely rebuilding the table. For example, you can't just use the ALTER TABLE command to drop the text or image column, then use the ALTER TABLE command again to add it back because ALTER TABLE doesn't support the TEXTIMAGE_ON keyword. Also, because large object data is generally stored on extents that are separate from the rest of the table data, moving the clustered index won't affect the storage location of text or image data.

You can use the sp_help system stored procedure to learn which filegroup the table data is in and which filegroup each index is in, but sp_help won't tell you which filegroup the text or image data is in. For that information, you have to directly access the system tables. Information about the storage of large object data is in a row in sysindexes that has an indid value of 255. The name that sysindexes shows is the name of the table preceded by the letter t.

Let's look at some code samples that show how you can access information about the storage location of large object data. Listing 1, creates a new database that contains a user-defined filegroup called testfg. Listing 2 creates a table called testlob that stores its text data in the filegroup testfg. The SELECT statement in Listing 2 then examines the sysindexes table and reports the name of the filegroup that the text data is stored in. You could easily turn this code into a user-defined function (UDF) in SQL Server 2000, which would give you an easy way to find the filegroup for any table's large object data.

First, carefully consider why you want to move the file. Depending on your reasons, you might be able to accomplish your purpose by using the method described earlier to move all your objects to a new filegroup with its own files. In that case, you wouldn't have to move any files. However, if for some reason you need to use the exact same filename in a different filegroup, you first drop all the objects in the file (which might be all the objects in the filegroup), then use ALTER DATABASE with the REMOVE FILE option to drop the file from the filegroup. Alternatively, you can also use DBCC SHRINKFILE with the EMPTYFILE option, which will remove the data from that file and spread that data across all other files in the same filegroup, then remove the file. After you've dropped the file, you can use ALTER DATABASE with the ADD FILE option to add the file to the necessary filegroup, then recreate all the objects in the new file. This process leads to the next question.

One solution to dropping all the tables in a filegroup involves an undocumented stored procedure called sp_MSforeachtable. Although undocumented, this stored procedure isn't encrypted or hidden; you can read its definition by using the sp_helptext stored procedure. You can do a lot of things with sp_MSforeachtable besides dropping tables from a filegroup. The simplest way to use the procedure lets you perform an action on all the tables in your current database. You must follow the call to sp_MSforeachtable with a string that must resolve into a legal T-SQL batch. The string can contain placeholders, which SQL Server replaces during execution with the names of each user table in the database; the default placeholder is a question mark (?). The string executes once for each table.

Here's a simple example that prints just the name of each user table. Because I want to print a table name as a string, the name must be in quotes. To execute, the whole batch must also be in quotes, so I use two single quotes on each side of the placeholder to mean that, on execution, I want only a single quote:

EXEC sp_MSforeachtable 'PRINT ''?'' '

Although the quotes around the placeholder look like double quotes, be sure to use four single quotes. A final single quote closes the string that's being executed for each table. If you want to use this procedure to find the number of rows in each table, you don't have to use the single quotes around the table name because you never quote a table name in a FROM clause:

EXEC sp_MSforeachtable 'SELECT COUNT(*) FROM ? '

Admittedly, the output you get from the above statement is pretty useless unless you know which table the COUNT() function is referring to, so I would probably add the table name to the SELECT statement:

EXEC sp_MSforeachtable 'SELECT ''?'', COUNT(*) FROM ? '

The contents of the batch that you use with sp_MSforeachtable can be much more complex. For example, Listing 3 creates three tables in the testfg filegroup in testdb, then provides the T-SQL code to drop all the user tables in the filegroup testfg. Before you run Listing 3's code, you need to use the code in Listing 1 to create the testdb database and testfg filegroup. For your purposes, you can replace the filegroup name with one in your database. Note that all the quotes that look like double quotes are really pairs of single quotes. Also, make sure you include the BEGIN and END keywords after the condition in the parentheses. Otherwise, you'll print that you're dropping only the tables from the requested filegroup, but you'll end up dropping all the user tables in the database. Believe me—I speak from bitter experience.

Finally, notice that in Listing 3's code, I'm looking only for rows in sysindexes that have an indid value of 0 or 1, which means I'm interested only in the rows for the tables, not the rows for the nonclustered indexes or large object data. (If I wanted to include nonclustered indexes and large object data, I'd need to write the batch differently. I'll save that solution for another T-SQL Admin column.)

The solution to this question uses the code in Listing 1 that I used to create the testfile file in the testfg filegroup. The code in Listing 4 finds the name of the primary data file from the sysfiles system table and the location of the last slash in the file path. To find the last slash, I use charindex to find the first occurrence of the slash in the reversed string. After I find the last slash, I can use its location with the substring() function to extract the name of the primary file up to and including the last slash; this complete name is the name of the directory that I'll put new files in.

After I have the directory name, I can get a new filename by concatenating the name of the new file to the directory name, and I can use the new filename to dynamically execute the ALTER DATABASE command and add the new file to the database. You can test yourself by figuring out the exact syntax for a dynamically executed ALTER DATABASE command that creates a new file in the same directory as the rest of the data files. I'll show you the solution in a future column.

I just installed a new hard drive and want to place my most heavily accessed tables on it. How can I move a table or index from one filegroup to a filegroup on the new drive?
I have a table that contains several text and image columns that are rarely accessed. I don't want to use space on my fastest drives to store this little-used large object data. Can I place the data on a filegroup that's separate from the rest of the table's data?
How can I move a file to a new filegroup?
Can I use one command to drop all the tables in a filegroup rather than dropping them one by one?
How can I determine where a database's primary file is so that I can create other files in the same directory?

Working on Your Own

As an administrator, you have to perform many tasks that involve SQL Server files and filegroups. You can't accomplish all those tasks by using the Microsoft-supplied tools such as information schema views or Enterprise Manager. And even when you can perform file or filegroup tasks by using Enterprise Manager, you'll sometimes want or need a programmatic solution that you can run as a T-SQL script. You can use the solutions in this article to perform some of those tasks and to figure out more T-SQL solutions on your own.

Comments

Plain text