My company provides pricing software and services for retailers and distributors in the consumer goods industry. One of the company engineers recently asked me to write a T-SQL statement that he could use to find out which stores didn't have any transactions during a one-week period. However, the data in the SQL Server table included only the IDs of the stores and the dates on which each store had a transaction. So, I was faced with the mind-challenging task of writing a T-SQL statement that would produce data that wasn't in the underlying table.
To demonstrate how the T-SQL statement I came up with works, I created the sample data shown in Table 1.
Table 1: Data in the MySalesTable Table
As you can see, it includes a week's worth of transaction dates (October 1-7, 2009) for five stores, which have the IDs of 100, 200, 300, 400, and 500. Only Store 500 has transaction dates for all seven days.
Listing 1 shows the T-SQL statement, GetNonTransactionDaysForStores.sql, I devised.
Listing 1: GetNonTransactionDaysForStores.sql (click to enlarge)
This code begins by using the FROM clause in callout A to produce two derived tables. This subquery first retrieves the unique Store_ID values from MySalesTable and puts them in a derived table named st1. If every store had a transaction on at least one day, the subquery returns every store ID once. The subquery then retrieves the unique TransactionDate values from MySalesTable and puts them in a derived table named st2. If a store has transactions on all seven days, the subquery returns all seven transaction dates.
The code in callout B performs a Cartesian product operation to match all rows in st1 with all rows in st2 in order to get every possible store-transaction date combination. The results in the Cartesian product (st3) are then filtered. Because we're interested in only those dates on which there weren't any transactions, the code looks for dates that are in st3 but not st2 for each store. Note that if none of the stores have a transaction on a certain day, the query won't return any result for that particular day for all the stores. The query assumes that at least one store will have a transaction on any given day in the specified date range.
Table 2 shows the results from running GetNonTransactionDaysForStores.sql.
Table 2: Results from GetNonTransactionDaysForStores.sql
As you can see, it displays only those stores that didn't have transactions (which is why store 500 isn't included) and the dates of those transaction-less days.
You can download GetNonTransactionDaysForStores.sql and the script I used to create and populate the sample table (MySalesTable.Table.sql) by clicking the 102744.zip hotlink. Note that because this code uses Cartesian product operation between the st1 and st2 tables, this solution doesn't scale well when you have more than a few thousand records to consider.