DISTINCT vs. GROUP BY

Q: Should I use DISTINCT or GROUP BY to eliminate duplicates in a result set?

A: DISTINCT and GROUP BY usually generate the same query plan, so the two constructs usually perform the same. For example, suppose you want to generate a unique list of product IDs from the Order Details table in the Northwind sample database. Both of the following queries return the correct answer:

SELECT DISTINCT od.productid
FROM [order details] od

SELECT od.productid
FROM [order details] od
GROUP BY od.productid

Which one is more efficient? Checking execution plans is a simple way to determine the relative efficiency of different queries that generate the same result set. Enable Show Execution Plan in Query Analyzer by pressing Ctrl+K or by selecting Show Execution Plan from the Query menu. Then, execute the above queries. Figure 1 shows that the execution plans of both queries are the same. In most cases, DISTINCT and GROUP BY generate the same plans, and their performance is usually identical.
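
If you prefer a text plan to the graphical one, the same comparison can be sketched with SET SHOWPLAN_TEXT. This is only a minimal sketch: it assumes Northwind is the current database and that you run it from a tool such as Query Analyzer that recognizes GO as a batch separator. While the option is on, SQL Server returns the estimated plan for each statement instead of executing it.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch,
-- hence the GO separators around it.
SET SHOWPLAN_TEXT ON
GO

-- The two queries are not executed here; only their plans are returned.
SELECT DISTINCT od.productid
FROM [order details] od

SELECT od.productid
FROM [order details] od
GROUP BY od.productid
GO

-- Turn the option off again so later queries run normally.
SET SHOWPLAN_TEXT OFF
GO
```

Compare the two plan listings operator by operator; if they match, the optimizer is treating the two queries the same way.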

So, how do you decide which construct to use? GROUP BY is required if you're computing aggregates for each group, but if you aren't aggregating data, DISTINCT is often simpler to write and read. Pick whichever syntax you prefer for your situation.
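
As an illustration of the aggregating case, here is a hedged sketch that counts the Order Details rows for each product. The column alias order_lines is made up for this example and isn't part of Northwind.

```sql
-- One row per product, with the number of order lines that reference it.
-- The per-product aggregate is what makes GROUP BY necessary here.
SELECT od.productid, COUNT(*) AS order_lines
FROM [order details] od
GROUP BY od.productid
```

DISTINCT can't express this query on its own, because the count has to be computed once per productid group rather than over the whole table.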
