Last month, I started telling you about some simple changes you can make to your code that can potentially have an enormous performance benefit. The example I gave you concerned the use of scalar-valued, user-defined functions in your SELECT list. Someone who commented on the article indicated that he wasn’t seeing the “bad” behavior I reported, but someone else said that applying my recommendation made a big difference to his application performance. So, as I tell my students, my suggestions are just things to try. Some of them will work for you and not for other people, and some of my suggestions will work for others and not for you.
This month I’m going to tell you about a simple change that is in regard to behavior that has been a problem in one form or another since the earliest version of SQL Server. Some improvements have been made, but not enough for me to stop being concerned about this behavior. The change has to do with search arguments (SARGs). The loose definition of a SARG is any predicate used in your WHERE clause to specify what rows you’re interested in. The formal definition of a SARG is a predicate in the following form:
WHERE <column> <operator> <value>
So the following predicates are SARGs:
WHERE name = ‘Joe’ WHERE 10 > price WHERE Total BETWEEN 10 and 20
These predicates aren’t SARGs:
WHERE substring(name, 1,2) = ‘Mc’ WHERE Total * 10 BETWEEN 100 and 200
The key is that the column must be all by itself on one side of the comparison operator. So why are SARGs important? If you don’t have a SARG, the optimizer won’t consider using an index seek on the column in the WHERE clause, and in many cases will just ignore the index. Let’s look at an example. In the AdventureWorks database, there’s a table called Production.Product. We’ll make a copy of this table in the dbo schema and build a couple of indexes on it, as the following shows:
USE AdventureWorks; GO SELECT * INTO dbo.Product FROM Production.Product WHERE ListPrice > 0; GO CREATE INDEX fname_index on Product(Name); CREATE INDEX price_index on Product(ListPrice); GO
Now let’s look at some queries. I’ll be capturing only the logical reads that are required for each query and telling you whether the index is used; if you want to look at the plans on your own, that’s great! My dbo.Product table has only nine pages in it, so when I see a logical reads value of 9, I’ll know that SQL Server performed a table scan and did not use an index seek.
USE AdventureWorks; SET STATISTICS IO ON; GO SELECT * FROM Product WHERE Name LIKE 'Half-Finger%'; GO SELECT * FROM Product WHERE SUBSTRING(Name, 1, 11) = 'Half-Finger'; GO
The first SELECT query needed five logical reads and used an index seek. The second needed 9 logical reads and scanned the table. Notice that the LIKE used with a value with the wildcard at the end can be translated into a BETWEEN or into two inequalities, and SQL Server will do this translation automatically. Each of the comparisons is a SARG, and two comparisons on the same column combined with AND are also a SARG.
WHERE Name >= 'Half-Finger' AND Name < 'Half-Finges'
If you used a LIKE comparison and a string with a wildcard at the beginning, there would be no way to translate the comparison into a SARG. Now what about the <value> component? In much older versions of SQL Server, the value had to actually be a constant, but SQL Server has gotten smarter over the years and now the value can be an expression, so the following predicates are SARGs:
WHERE Name = SUBSTRING(‘Half-Fingered Gloves’, 1, 11) WHERE Price > 16/4
But not all expressions are allowed in SARGs. For example, UDFs and many built-in mathematical functions aren’t allowed in SARGs. The following isn’t a SARG:
WHERE Price > sqrt(16)
So try executing the following three queries, all of which return exactly the same results:
SET STATISTICS IO ON; GO SELECT * FROM Product WHERE ListPrice < 4; GO SELECT * FROM Product WHERE ListPrice < 16/4; GO SELECT * FROM Product WHERE ListPrice < sqrt(16); GO
The first two used an index seek on the index on ListPrice and needed four logical reads; the last one used a table scan and needed nine logical reads. Just because your query has a SARG doesn’t guarantee an index will be used for a seek. It just means the optimizer will consider using a seek. It could still turn out that the optimizer decides that using the index would be too expensive, as in the following query:
SELECT * FROM Product WHERE ListPrice < 100; GO
Although there’s a SARG, there are too many rows returned to make the index worthwhile.
So how do you know whether your expression is acceptable as a SARG? There’s no list of what functions you can use in your SARGs and which you can’t. Microsoft is making improvements to the optimizer in every service pack, and a function that isn’t allowed in a SARG today might be allowed tomorrow. So my advice is always to test it and see. If you think SQL Server should be using an index seek on a column and it isn’t, check your SARGs to at least make sure they follow the format described here. Try rewriting the predicates to use different functions and constructs that give the same result. And always make sure your column is isolated; that requirement never changes!