Skip navigation

Catch That Bug! - 22 Feb 2006

Downloads
49041.zip

February's Bug Solution: From the Customers and Orders tables that you create and populate by running the code in Listing A, you need to return the details of customers who made orders.You build the following query to return the details:

SELECT custid, custname 
FROM dbo.Customers 
WHERE custid IN(SELECT custid 
  FROM dbo.Orders); 

Given the sample data in the Customers and Orders tables, you probably expect your query to return customers A and B. Instead, you get back all customers, including those who made no orders. Why?

The bug here is elusive. Look carefully at the definition of both tables in Listing A. Notice in the Customers table that the column for customer ID is called custid, but the Orders table calls this column customerid. So, the query just given looks for a column name, custid, that doesn't exist in Orders.

With this discrepancy in column names, why does the query return all customers rather than producing an error that the custid column doesn't exist in Orders? If you run only the inner query

WHERE custid IN(SELECT custid 
  FROM dbo.Orders); 

you'll get the expected error. But SQL Server resolves query attributes beginning from the current level and working outward. So, SQL Server first looks for a column called custid in Orders and doesn't find one. Then, it looks for custid in the outer level (Customers) and finds it. Thus, unintentionally, the query becomes a correlated one, as if you had written:

SELECT custid, custname 
FROM dbo.Customers 
WHERE custid IN(SELECT 
  Customers.custid 
  FROM dbo.Orders); 

Now the reason for the unexpected result is clear: The filter ends up comparing the value to itself, returning TRUE. A best practice that can help you avoid such bugs is to always table-qualify columns in queries, even when the queries are self-contained. For example, if you had written the original query as follows, SQL Server would have trapped the error at resolution time, letting you know that Orders doesn't have a custid column:

SELECT custid, custname 
FROM dbo.Customers AS C 
WHERE custid IN(SELECT O.custid 
  FROM dbo.Orders AS O); 

You could have then changed the query to

SELECT custid, custname 
FROM dbo.Customers AS C 
WHERE custid IN(SELECT 
  O.customerid FROM dbo.Orders 
  AS O); 

This query correctly returns customers A and B.

March's Bug: The following query returns the maximum OrderID from the Orders table in the Northwind database:

SELECT MAX(OrderID) FROM 
  dbo.Orders 

And SQL Server's system stored procedure sp_who returns current users and processes. You use the following code to try to run both statements in the same batch:

SELECT MAX(OrderID) FROM 
 dbo.Orders 
sp_who 

However, you get only the max OrderID from Orders.You don't receive any output from sp_who—not even an error. It seems that only the first line of code ran and that sp_who wasn't invoked at all. Can you find the bug in this code?

TAGS: SQL
Hide comments

Comments

  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
Publish