Skip navigation

Reporting Book Sales

Congratulations to Narasimhan Jayachandran, a database management consultant for HTC Global Services in Troy, Michigan, and John Hanson, vice president of operations for MEDePass, Inc.. Narasimhan won first prize of $100 for the best solution to the February Reader Challenge, "Reporting Book Sales." John won second prize of $50. Here’s a recap of the problem and the solution to the February Reader Challenge.

Problem:


Phil generates reports for a sales team in a company that sells books. The company stores publication data for its books in a SQL Server 2000 database. Phil receives sales data updates in a Microsoft Excel file that has the following header labels for columns: stor_id, yr, Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, and Dec. The spreadsheet contains quantity of books sold in each store by year and month. You can generate sample data for the Excel spreadsheet from the Sales table in the Pubs database by using the following code:

SELECT s.stor_id, year(s.ord_date) AS yr,
    SUM(CASE month(s.ord_date) WHEN 1 THEN s.qty ELSE 0 END) AS Jan,
    SUM(CASE month(s.ord_date) WHEN 2 THEN s.qty ELSE 0 END) AS Feb,
    SUM(CASE month(s.ord_date) WHEN 3 THEN s.qty ELSE 0 END) AS Mar,
    SUM(CASE month(s.ord_date) WHEN 4 THEN s.qty ELSE 0 END) AS Apr,
    SUM(CASE month(s.ord_date) WHEN 5 THEN s.qty ELSE 0 END) AS May,
    SUM(CASE month(s.ord_date) WHEN 6 THEN s.qty ELSE 0 END) AS Jun,
    SUM(CASE month(s.ord_date) WHEN 7 THEN s.qty ELSE 0 END) AS Jul,
    SUM(CASE month(s.ord_date) WHEN 8 THEN s.qty ELSE 0 END) AS Aug,
    SUM(CASE month(s.ord_date) WHEN 9 THEN s.qty ELSE 0 END) AS Sep,
    SUM(CASE month(s.ord_date) WHEN 10 THEN s.qty ELSE 0 END) AS Oct,
    SUM(CASE month(s.ord_date) WHEN 11 THEN s.qty ELSE 0 END) AS Nov,
    SUM(CASE month(s.ord_date) WHEN 12 THEN s.qty ELSE 0 END) AS Dec
    FROM Sales AS s
    GROUP BY s.stor_id, year(s.ord_date)

Phil needs to import the data from the Excel file into a SQL Server table called StoreSalesSummary by unpivoting the month columns from the spreadsheet. You can create the StoreSalesSummary table by using the following code:

CREATE TABLE StoreSalesSummary (
    stor_id int NOT NULL,
    qty int NOT NULL,
    yr smallint NOT NULL,
    mn tinyint NOT NULL,
    PRIMARY KEY(stor_id, yr, mn)
    )

Help Phil import only the Excel spreadsheet data into the StoreSalesSummary table, then insert and update each store’s sales from the spreadsheet. Import only the stores with a nonzero quantity value for any month.

Solution:


Phil can use the OPENDATASOURCE() rowset function in SQL Server 2000 to read the Excel file data as a table. If the Excel file contains a worksheet called YearlySales, he can use the following code to read the Excel spreadsheet as a table:

SELECT *
    FROM OpenDataSource( 'Microsoft.Jet.OLEDB.4.0',
    'Data Source="C:\Data\YearlySalesSummary.xls";User ID=Admin;Password=
    ;Extended properties=Excel 5.0')...\[YearlySales$\] s

Phil could also set up a linked server connection to the Excel file or use the OPENROWSET() function to read the data from the Excel file as a table.

Now Phil can manipulate the data from the Excel spreadsheet in a SQL Server table. The following code puts the Excel data into a temporary table called #Sales in the desired format:

SELECT s1.stor_id, s1.qty, s1.yr, s1.mn
    INTO #Sales
    FROM (
    SELECT s.stor_id,
    CASE m.mn
    WHEN 1 THEN s.\[Jan\]
    WHEN 2 THEN s.\[Feb\]
    WHEN 3 THEN s.\[Mar\]
    WHEN 4 THEN s.\[Apr\]
    WHEN 5 THEN s.\[May\]
    WHEN 6 THEN s.\[Jun\]
    WHEN 7 THEN s.\[Jul\]
    WHEN 8 THEN s.\[Aug\]
        WHEN 9 THEN s.\[Sep\]
    WHEN 10 THEN s.\[Oct\]
    WHEN 11 THEN s.\[Nov\]
    WHEN 12 THEN s.\[Dec\]
    END AS qty,
    s.yr,
    m.mn
    FROM OpenDataSource( 'Microsoft.Jet.OLEDB.4.0',
    'Data Source="C:\Data\YearlySalesSummary.xls";User ID=Admin;Password=
        ;Extended properties=Excel 5.0')...\[YearlySales$\] s
    CROSS JOIN (
    SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
    UNION ALL
    SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8
    UNION ALL
    SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12
    ) AS m(mn)
    ) s1
    WHERE s1.qty > 0.0
SELECT * FROM #Sales

The query’s cross join lets Phil generate a row for each month column. Once Phil converts the columns to rows, he can obtain the correct quantity value by using a CASE expression in the SELECT list, based on the pseudo month column, mn. Finally, the code’s WHERE clause filters the rows that have a nonzero qty value for each year and month combination.

Now Phil can use the data from the temporary table #Sales to update the data in the StoreSalesSummary table, as the following code shows:

-- Update the existing rows first:    
    UPDATE s2    
    SET qty = s2.qty + s1.qty    
    FROM #Sales AS s1    
    JOIN StoreSalesSummary AS s2    
    ON s2.stor_id = s1.stor_id    
    AND s2.yr = s1.yr    
    AND s2.mn = s1.mn
-- Add the new rows next:
INSERT INTO storesalessummary
SELECT s1.stor_id, s1.qty, s1.yr, s1.mn
    FROM #Sales AS s1
    WHERE NOT EXISTS(SELECT *
    FROM StoreSalesSummary AS s2
    WHERE s2.stor_id = s1.stor_id
    AND s2.yr = s1.yr
    AND s2.mn = s1.mn)
SELECT * FROM #Sales

Phil could also unpivot the Excel spreadsheet’s month columns by using a UNION ALL query. The following example assumes that Phil has dumped the data from the worksheet into a temporary table called #ExcelSales:

SELECT stor_id, yr, 1 AS mn, Jan AS qty
    FROM #ExcelSales
    WHERE Jan > 0
    UNION ALL
SELECT stor_id, yr, 2 AS mn, Feb AS qty
    FROM #ExcelSales
    WHERE Feb > 0
    UNION ALL
SELECT stor_id, yr, 3 AS mn, Mar AS qty
    FROM #ExcelSales
    WHERE Mar > 0
    UNION ALL
SELECT stor_id, yr, 4 AS mn, Apr AS qty
    FROM #ExcelSales
    WHERE Apr > 0
    UNION ALL
SELECT stor_id, yr, 5 AS mn, May AS qty
    FROM #ExcelSales
    WHERE May > 0
    UNION ALL
SELECT stor_id, yr, 6 AS mn, Jun AS qty
    FROM #ExcelSales
    WHERE Jun > 0
    UNION ALL
SELECT stor_id, yr, 7 AS mn, Jul AS qty
    FROM #ExcelSales
    WHERE Jul > 0
    UNION ALL
SELECT stor_id, yr, 8 AS mn, Aug AS qty
    FROM #ExcelSales
    WHERE Aug > 0
    UNION ALL
SELECT stor_id, yr, 9 AS mn, Sep AS qty
    FROM #ExcelSales
    WHERE Sep > 0
    UNION ALL
SELECT stor_id, yr, 10 AS mn, Oct AS qty
    FROM #ExcelSales
    WHERE Oct > 0
    UNION ALL
SELECT stor_id, yr, 11 AS mn, Nov AS qty
    FROM #ExcelSales
    WHERE Nov > 0
    UNION ALL
SELECT stor_id, yr, 12 AS mn, Dec AS qty
    FROM #ExcelSales
    WHERE Dec > 0

The cross-join technique typically performs better than the UNION ALL query for unpivoting operations. The cross-join involves less code and fewer joins on the main table. You can compare the performance of both solutions by looking at their execution plan costs, I/O statistics, and the time SQL Server takes to execute the queries.

MARCH READER CHALLENGE:


Now, test your SQL Server savvy in the March Reader Challenge, "Restoring a Database" (below). Submit your solution in an email message to [email protected] by February 19. Umachandar Jayachandran, a SQL Server Magazine technical editor, will evaluate the responses. We’ll announce the winner in an upcoming SQL Server Magazine UPDATE. The first-place winner will receive $100, and the second-place winner will receive $50.

Keith is the DBA for a company that runs several SQL Server 2000 data warehouses. He has a crucial database that contains the fact and dimension tables for the data warehouse. Keith performs full backups every week and periodic log backups after the database backup. The company’s development team has requested the latest copy of the database along with any log backups for testing purposes. The team needs to run ad hoc queries against the database at different points in time to collect statistics. Keith uses the following sequence of steps to create the database and log backups:

CREATE DATABASE DW

ALTER DATABASE DW SET recovery bulk_logged
RAISERROR ('-- Performing full backup...', 0, 1) WITH nowait

-- Full backup of database
BACKUP DATABASE DW TO DISK = 'c:\temp\DW.bak' WITH init
GO

-- Create table t1
CREATE TABLE DW..t1 ( i int IDENTITY )
INSERT INTO DW..t1 DEFAULT VALUES

-- Initial log backup
RASIERROR ('-- Initial log backup...', 0, 1) WITH nowait
BACKUP log DW TO DISK = 'c:\temp\DW.trn.1' WITH init

-- Create table t2 for bulk loading
CREATE TABLE DW..t2 ( c char( 8000 ) DEFAULT 'x' )
INSERT DW..t2 DEFAULT VALUES

-- Add new log file on a different volume because of space constraints
ALTER DATABASE DW ADD log FILE ( name = 'DW_TempLog' , filename = 'c:\temp\DW_TempLog.ldf' )

-- Bulk inserts and other operations here

-- Log backup after first ALTER DATABASE command
RAISERROR ('-- Log backup after first ALTER DATABASE...', 0, 1) WITH nowait
BACKUP log DW TO DISK = 'c:\temp\DW.trn.2' WITH init

-- Remove temporary log file
ALTER DATABASE DW REMOVE FILE 'DW_Templog'

-- Log backup after second ALTER DATABASE command
RAISERROR ('-- Log backup after second ALTER DATABASE...', 0, 1) WITH nowait
BACKUP log DW TO DISK = 'c:\temp\DW.trn.3' WITH init
DROP DATABASE DW
GO

Keith also needs to provide the commands for restoring the database (in read-only format) up to and including the latest log backup, DW.trn.3. Help Keith write the script to restore a read-only copy of the database after different backups have been restored on a development server.

Hide comments

Comments

  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
Publish