Skip navigation

Query Performance On Partitioned Tables

Congratulations to Ahmad Mohamed, a DBA and developer for CGI-AMS in Cleveland, Ohio, and Daniele Pierasco, a DBA for Brain Force Software Italia in Milan, Italy. Ahmad won first prize of $100 for the best solution to the April Reader Challenge, "Query Performance On Partitioned Tables." Daniele won second prize of $50. Here’s a recap of the problem and the solution to the April Reader Challenge.

Problem:

David is a database developer for a company that reports about Web site traffic. The information from the Web server log files is stored in SQL Server 2000 databases. One of the main tables in the database is stored as partitions and each partition contains one week’s worth of Web server log file data. David has scripts that bulk insert data into the partitions daily. The following code is an example of the partitioned tables.

USE tempdb
GO
CREATE TABLE WebHits_W1 ( WebHitId int NOT NULL PRIMARY KEY, WeekStartId smallint NOT NULL CHECK (WeekStartId = 1) )
CREATE TABLE WebHits_W2 ( WebHitId int NOT NULL PRIMARY KEY, WeekStartId smallint NOT NULL CHECK (WeekStartId = 2) )
GO
CREATE VIEW WebHits_Weekly
AS
SELECT WebHitId, WeekStartId FROM WebHits_W1
UNION ALL
SELECT WebHitId, WeekStartId FROM WebHits_W2
GO

A sample file that contains the data for the WebHits_W1 table follows.

-- File: c:\temp\webhits.txt
1,1
2,1
3,1
4,2
5,2
6,2

David uses the following bulk insert statement to insert data into the table WebHits_W1:

BULK INSERT tempdb..WebHits_W1 FROM 'c:\temp\webhits.txt'
WITH (tablock, datafiletype = 'char', fieldterminator = ',', firstrow = 1, lastrow = 3)

BULK INSERT tempdb..WebHits_W2 FROM 'c:\temp\webhits.txt'
WITH (tablock, datafiletype = 'char', fieldterminator = ',', firstrow = 4, lastrow = 6)

A bug in the bulk insert script caused some of the partitions to be modified without checking the constraints, which resulted in poor query performance. Sample queries that perform poorly after the bulk inserts follow.

SELECT *
FROM WebHits_Weekly
WHERE WeekStartId = 1

SELECT top 1 *
FROM WebHits_Weekly
WHERE WeekStartId = 1
ORDER BY WebHitId

SELECT count(*)
FROM WebHits_Weekly
WHERE WeekStartId = 1

Help David identify the performance problem’s cause, query to identify the affected partitioned tables, and correct the problem.

Solution:

David’s query against the partitioned view is accessing all the view’s base tables, thus causing the performance problem. He verifies this by analyzing the statistics I/O output for queries that show I/O against both the WebHits_Weekly partitioned view’s base tables. For example, if David runs the following statements

SET STATISTICS IO ON
GO
SELECT * FROM WebHits_Weekly
WHERE WeekStartId = 1
GO
SET STATISTICS IO OFF
GO 

The statistics I/O output shows that the query is accessing both partitions, even though the check constraint information can be used to eliminate the WebHits_W2 table from the query based on the condition that WeekStartId = 1. Statistics I/O returns the following output:

Table 'WebHits_W2'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.
Table 'WebHits_W1'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.

The problem is that the optimizer can’t use the check constraints to eliminate partitions at run-time. Check constraints—that aren’t considered by the optimizer—are untrusted and David can identify them using the OBJECTPROPERTY() metadata function’s CnstIsNotTrusted property value. The query to identify such constraints using the built-in INFORMATION_SCHEMA views follows:

SELECT ctu.TABLE_SCHEMA, ctu.TABLE_NAME, cc.CONSTRAINT_NAME, cc.CHECK_CLAUSE
FROM INFORMATION_SCHEMA.CHECK_CONSTRAINTS AS cc
JOIN INFORMATION_SCHEMA.CONSTRAINT_TABLE_USAGE AS ctu
ON ctu.CONSTRAINT_NAME = cc.CONSTRAINT_NAME
WHERE objectproperty( object_id(quotename(cc.CONSTRAINT_NAME)), 'CnstIsNotTrusted' ) = 1

David can use the query’s results to make the optimizer trust the check constraints. The check constraints can be enabled again using the ALTER TABLE statement’s WITH CHECK option. The following code will make the untrusted constraints trusted:

DECLARE @tablename nvarchar(300), @constraint nvarchar(300)
DECLARE @constraints cursor
SET @constraints = cursor fast_forward FOR
                   SELECT quotename(ctu.TABLE_SCHEMA) + N'.' + quotename(ctu.TABLE_NAME), quotename(cc.CONSTRAINT_NAME)
                     FROM INFORMATION_SCHEMA.CHECK_CONSTRAINTS AS cc
                     JOIN INFORMATION_SCHEMA.CONSTRAINT_TABLE_USAGE AS ctu
                       ON ctu.CONSTRAINT_NAME = cc.CONSTRAINT_NAME
                    WHERE objectproperty( object_id(quotename(cc.CONSTRAINT_NAME)), 'CnstIsNotTrusted' ) = 1
OPEN @constraints
WHILE (1=1)
BEGIN
	FETCH @constraints INTO @tablename, @constraint
	IF @@fetch_status 

David uses the following code to verify the original query’s statistics I/O.

SET STATISTICS IO ON
GO
SELECT * FROM WebHits_Weekly
WHERE WeekStartId = 1
GO
SET STATISTICS IO OFF
GO

Now, based on the WHERE clause’s information, statistics I/O shows that the query only accesses the WebHits_W1 partition.

Table 'WebHits_W2'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0.
Table 'WebHits_W1'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.

Finally, David uses the BULK INSERT command’s CHECK_CONSTRAINTS to ensure that constraints are checked upon data insertion, which will prevent the check constraints from being marked untrusted.

JUNE READER CHALLENGE:

Now, test your SQL Server savvy in the June Reader Challenge, "Capturing Vital Information About Data Transformations " (below). Submit your solution in an email message to [email protected] by May 19. Umachandar Jayachandran, a SQL Server Magazine technical editor, will evaluate the responses. We’ll announce the winner in an upcoming SQL Server Magazine UPDATE. The first-place winner will receive $100, and the second-place winner will receive $50.

Problem:

Harry develops products that help build data warehouses and have SQL Server 2000 back-end databases. He’s trying to develop a set of stored procedures that will perform data transformations for the product and run long-running batch jobs. Harry wants to modify his stored procedures so he can easily capture vital information and statistics about the data transformations or batch jobs, which will let him monitor a batch job’s progress or provide counters about certain events the application defines. Help Harry provide a mechanism to raise events with an event description and the associated data without considerably modifying the application code. Also, monitor the events outside the database using tools or scripts as needed. The following example shows the data Harry wants to return from his stored procedure, which invokes a series of jobs.

Counter Name Counter Data
------------ ------------
Job Status   Percent value like 10, 20, 30
Step Status  Percent value like 10, 20, 30
App Xfrm 1   Number of Executions 1025

 

Hide comments

Comments

  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
Publish