Finding Your Top 10 SQL Server Queries

Take a little effort to identify your worst-performing queries

ITPro Today

October 29, 2008

4 Min Read
Finding Your Top 10 SQL Server Queries

You’re already capturing SQL trace information through the use of server-side traces, which send results directly to files on a locally attached disk on the server. Server-side traces not only minimize server load during the tracing process but also give you an easy way to parse and analyze the information later. By copying the files to another machine for processing, you can negate any impact on the production server while you analyze the data.

But what if you need to parse the trace information in a way that will let you aggregate the data to find your worst-performing and most-called queries, stored procedures, and so on? One simple solution is to use a T-SQL user-defined function called SQL_ Signature (popularized by Microsoft’s Tom Davidson). There are more efficient but slightly more complicated methods for tackling this problem—I’ll follow up in the SQL Server Magazine Performance Tuning and Optimization forum ( with a few—but this solution should get you started.

Replacing Information

In “Generating Server-Side Traces”, I established that you can use the fn_ trace_gettable() function to read events and columns directly from the trace files. This function will be your primary means of retrieving trace data for direct processing (or placing it in a table for later processing).


The goal of the SQL_Signature function, which Listing 1, shows, is to replace certain pieces of information (which might vary between calls to the database) with a constant—in this case, the # symbol. These pieces of information are typically the values in a search argument of the WHERE clause that vary from one call to the next, while the rest of the statement remains constant.

Replacing these values will let you get a core list of what I call Query Classes and will let you aggregate the data to get meaningful metrics. In the following example, the three statements are effectively the same except for the CustomerID value. If you tried to aggregate these with a simple GROUP BY clause, you would get three separate groups.

SELECT * FROM [dbo].[OrderHeader] WHERE  CustomerID = 397SELECT * FROM [dbo].[OrderHeader] WHERE  CustomerID = 398SELECT * FROM [dbo].[OrderHeader] WHERE  CustomerID = 398

However, if you replace the IDs with a constant, you can group these together to get more meaningful metrics.

SELECT * FROM [dbo].[OrderHeader] WHERE  CustomerID = #

The same is true for stored procedure calls. For example,

EXECUTE [dbo].[cp_test] 70EXECUTE [dbo].[cp_test] 71EXECUTE [dbo].[cp_test] 72


EXECUTE [dbo].[cp_test] #

To use the function, you’d simply pass in the text you want to parse, along with a second parameter to tell it how many characters from the beginning of the statement to use during the parsing.

SELECT dbo.SQL_Signature('select * from  orders where orderid = 10255 AND MyDate  = '20060101',1000)

You can also use the function directly when reading the data from the trace file; to do it properly for many rows, you need to place the data into a table. You can use the following code to create the table, insert the data, and parse the TextData column:

SELECT [EventClass],[TextData],[DatabaseID],[Duration],[Reads],[Writes],[CPU],dbo.SQL_Signature([TextData],1000) AS  [SQLSignature],CAST(0 as INT) AS [HashCode]INTO YourTableFROM ::FN_TRACE_GETTABLE('C:YourTraceFile.  trc', DEFAULT)

Next, to make the aggregations easier, compute a hashed value of the parsed text for each row. You might want to add an index on HashCode after the update:

UPDATE YourTable WITH (TABLOCK)  SET [HashCode] =  CHECKSUM([SQLSignature])

Now, you can aggregate the data:

SELECT [EventClass],SUM([Duration]) AS  [Total Duration],SUM([Reads]) AS [Total Reads],  SUM([Writes]) AS [Total Writes],SUM([CPU]) AS [Total CPU], COUNT(*) AS  [Total Counts],[HashCode], CAST(' AS NVARCHAR(4000))  AS [SQLSignature],CAST(' AS NVARCHAR(4000)) AS [Sample_  Statement]INTO AggTableFROM YourTable GROUP BY  [EventClass],[HashCode]

Finally, add the query class (SQLSignature) and a sample statement for each class, and you’re ready to go.

UPDATE AggTable SET [SQLSignature] =(SELECT TOP 1 a.[SQLSignature] FROM  YourTable AS a WHERE a.[HashCode] =AggTable.[HashCode]),[Sample_Statement] =(SELECT TOP 1 a.[TextData] FROM  YourTable AS a WHERE a.[HashCode] =  AggTable.[HashCode])


Your Call

You can easily add further metrics (e.g., averages, max) to this simplified example as you see fit. Now, to find your top worst performing queries, you can select the aggregated results in any order you want— by Reads, Writes, CPU, or Duration. Also, take into account the number of times the query class was run. For example, if you execute a query once a day, and it has 1,000,000 reads, is that considered one of the top 10 queries you should be concerned with? It isn’t if you have other queries that use 5,000 reads each but are called once per second. Slice and dice it as you want, but if you tackle the top 10 offenders in your system, you’ll probably tackle 90 percent of the overall problems and get the biggest bang for your effort.

Read more about:

Top 10
Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like