In "Native Partitioned Tables and Indexes" at http://www.windowsitpro.com/Article/ArticleID/45153/45153.html I began my coverage of native partitioning in SQL Server 2005 and the related T-SQL code. I discussed the motivation for partitioning, the enhancements in SQL Server 2005, and explained how to create partitioned tables by creating a partition function, creating a partition scheme, and creating a partitioned table or index. Here, I continue my coverage with partitioned index considerations and querying metadata information. Run Listing 1 to create and populate a testdb database that has a partitioned Orders table with three years' data and creates a primary key and a couple of indexes. Running the code in Listing 1 will bring you to the point where we stopped last month in "Native Partitioned Tables and Indexes."
Partitioned Index Considerations
When you explicitly create a partitioned index by using the CREATE INDEX statement, or implicitly when you add a PRIMARY KEY or UNIQUE constraint, you simply specify the partition scheme in the statement's last ON clause. For example, the following code that appears at the end of Listing 1 creates a PRIMARY KEY constraint, a unique index, and a non-unique index on the Orders table:
ALTER TABLE Orders ADD CONSTRAINT PK_Orders PRIMARY KEY(orderdate, orderid) ON PS_Orders(orderdate); CREATE UNIQUE INDEX idx_nc_orderid ON Orders(orderid, orderdate) ON PS_Orders(orderdate); CREATE INDEX idx_nc_customerid ON Orders(customerid, orderdate) ON PS_Orders(orderdate);
The code creates all constraints and indexes on the PS_Orders partition scheme, which is the same partition scheme used for the Orders table itself. Remember, a partition scheme is a collection of filegroups used as the physical destination for a partitioned table or index.
There are several things you need to consider when you create partitioned indexes. First, unique partitioned indexes must include the partitioning column explicitly (not necessarily as the first key column), which lets SQL Server check uniqueness by accessing a single partition. Second, non-unique partitioned indexes aren't required to include the partitioning column explicitly, but if you don't specify it explicitly, SQL Server will implicitly include it as a subkey. Third, non-partitioned indexes (unique or non-unique) don't have to include the partitioning column. Fourth, if you don't explicitly specify a partitioning scheme or a filegroup name when you create an index, the index inherits the table's partitioning, making it aligned. Indexes that you create on a partitioned table can be either aligned or unaligned. Aligned indexes are partitioned using the same or equivalent partitioning function as the base table (same datatype and boundaries). Try to think of aligned indexes as indexes that are physically partitioned the same as the base table. Unaligned indexes are partitioned differently than the base table. In a future article, I'll cover partition switching between two tables, which allows fast data load and purging. All indexes on a table must be aligned to use partition switching.
You might need to create unaligned indexes to enforce uniqueness across partitions and you don't want the partitioning column to be included in the unique key. You just need to be aware of the aforementioned implications of using unaligned indexes. Also, you might want the index to participate in collocated joins with other tables that have different partitioning. I'll discuss collocated joins in a future article that will focus on querying, but for now, understand that collocated joins are joins where individual partitions from two inputs can be joined before the results are concatenated (unioned), as opposed to first concatenating the partitions from each input and then joining the unified results, which is less efficient.
Querying Metadata Information
To query metadata information about your partitioned tables and indexes SQL Server provides you with the $PARTITION() function and several catalog views. The $PARTITION() function is used in conjunction with the partition function name ($PARTITION.partition_function_name()). The function's purpose is to return the target partition number of a given input value that represents a partitioning column value. For example, our tables and indexes were partitioned using the PF_RangeByYear() function, which accepts a DATETIME value as input. To find the target partition number for the date 1 July 2001, use the following code:
You'll get 1 as the output because the target partition for the given input date is 1. Note that the input date doesn't necessarily have to represent a row's order date that already resides in the table. You can use the function to find out a row's target partition number with a certain order date before you insert it into the table. You can also use the function as part of a query to check in which partition each row resides as shown below:
SELECT *, $partition.PF_RangeByYear(orderdate) AS partition# FROM Orders;
Or you can return the current row count in each partition by invoking the GROUP BY query:
SELECT $partition.PF_RangeByYear(orderdate) AS partition#, COUNT(*) AS cnt FROM Orders GROUP BY $partition.PF_RangeByYear(orderdate);
Table 1 shows this query's output. You can also query many catalog views to get information about the partitioned tables, indexes, the partition functions, and partition schemes. Here's the list of catalog views available and a brief description of each:
- sys.partition_functions-returns information about partition functions
- sys.partition_parameters-returns information about parameters of partition functions
- sys.partition_range_values-returns information about partition function's boundary values
- sys.partition_parameters-returns information about parameters of a partition function
- sys.partition_range_values-returns information about boundary values of a partition function
- sys.partition_schemes-returns information about partition schemes
- sys.data_spaces-returns information about partition schemes
- sys.destination_data_spaces-returns information about individual partition schemes
- sys.partitions-returns information about partitions
- sys.tables-returns information about partitioning of tables
- sys.indexes-returns information about partitioning of indexes
- sys.index_columns-returns information about partitioning of indexes
For example, the following query gives you partition numbers and an approximate number of rows in each partition of the Orders table:
SELECT partition_number, rows FROM sys.partitions WHERE object_id = OBJECT_ID('Orders') AND index_id
Table 2 shows this query's output. The difference between using this query as opposed to an aggregate query against the Orders table with the $PARTITION() function is that the former gives you the result immediately while the latter actually counts the rows. The row counts in sys.partitions view aren't updated with each modification submitted against a partition. So if you need a rough estimate you can query the sys.partitions view, but if you need an accurate number, you need to use the $PARTITION() function and actually count. Another example of querying catalog views for partitioning information is the following query, which returns the PF_RangeByYear() function's boundary values:SELECT * FROM sys.partition_range_values WHERE function_id = (SELECT function_id FROM sys.partition_functions WHERE name = 'PF_RangeByYear')
Table 3 shows this query's output. I'll let you explore the rest of the catalog views yourself. Most are self explanatory and well documented in SQL Server Books Online (BOL).
I've covered partitioned index considerations and showed you how to query metadata information related to partitioning. You can partition indexes the same way you do the table (aligned) or differently (unaligned). The $PARTITION() function and the catalog views give you a lot of information about partitioning. Next month I'll delve into maintaining partitioned tables and indexes by explaining how to load new data and purge data by splitting, merging, and switching partitions.