Puzzled By T-SQL Blog

MERGE Puzzle and a Request for Warning

This is a result of e-mail exchange I had with Manuel Lopez…

Given the following DDL and sample data:

SET NOCOUNT ON;

USE tempdb;

 


IF OBJECT_ID('dbo.CustomersNew', 'U') IS NOT NULL

  DROP TABLE dbo.CustomersNew;

 

IF OBJECT_ID('dbo.DomesticCustomers', 'U') IS NOT NULL

  DROP TABLE dbo.DomesticCustomers;

 

IF OBJECT_ID('dbo.ForeignCustomers', 'U') IS NOT NULL

  DROP TABLE dbo.ForeignCustomers;

 


CREATE TABLE dbo.DomesticCustomers

(

  custid   INT         NOT NULL PRIMARY KEY,

  custname VARCHAR(20) NOT NULL,

  /* ...other attributes... */

);

 


CREATE TABLE dbo.ForeignCustomers

(

  custid   INT         NOT NULL PRIMARY KEY,

  custname VARCHAR(20) NOT NULL,

  /* ...other attributes... */

);

 


INSERT INTO dbo.DomesticCustomers(custid, custname) VALUES

  (1001, 'Cust domestic aaa'),(1002, 'Cust domestic bbb');

 


INSERT INTO dbo.ForeignCustomers(custid, custname) VALUES

  (2001, 'Cust foreign aaa'),(2002, 'Cust foreign bbb');

 


CREATE TABLE dbo.CustomersNew

(

  custtype CHAR(1) NOT NULL CHECK (custtype IN ('D', 'F')),

  custid   INT         NOT NULL PRIMARY KEY,

  custname VARCHAR(20) NOT NULL,

  /* ...other attributes... */

)

 


INSERT INTO dbo.CustomersNew(custtype, custid, custname) VALUES

  ('D', 1001, 'Cust domestic xyz'),

  ('D', 1003, 'Cust domestic ccc'),

  ('F', 2001, 'Cust foreign xyz'),

  ('F', 2003, 'Cust foreign ccc');

 


SELECT * FROM dbo.DomesticCustomers;

 


custid      custname

----------- --------------------

1001        Cust domestic aaa

1002        Cust domestic bbb

 


SELECT * FROM dbo.ForeignCustomers;

 


custid      custname

----------- --------------------

2001        Cust foreign aaa

2002        Cust foreign bbb

 


SELECT * FROM dbo.CustomersNew;

 


custtype custid      custname

-------- ----------- --------------------

D        1001        Cust domestic xyz

D        1003        Cust domestic ccc

F        2001        Cust foreign xyz

F        2003        Cust foreign ccc

 

You’re supposed to write a MERGE statement that deals only with domestic customers (those in CustomersNew with 'D' in the custtype attribute); updating the nonkey attributes of customers that already exist in the target DomesticCustomers table, and inserting new customers that don’t exist. Consider the following MERGE statement, and before you run it, see if you can guess what the output of the subsequent SELECT will be?

 

MERGE INTO dbo.DomesticCustomers AS TGT

USING dbo.CustomersNew AS SRC

  ON SRC.custtype = 'D'

  AND SRC.custid = TGT.custid

WHEN MATCHED THEN

  UPDATE

    SET TGT.custname = SRC.custname

WHEN NOT MATCHED THEN

  INSERT(custid, custname)

    VALUES(SRC.custid, SRC.custname);

 

SELECT * FROM dbo.DomesticCustomers;

 

Scroll down to see the result…

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

 

I bet you didn’t guess the following:

 

custid      custname

----------- --------------------

1001        Cust domestic xyz

1002        Cust domestic bbb

1003        Cust domestic ccc

2001        Cust foreign xyz

2003        Cust foreign ccc

 

But that’s the result I got. It appears that this behavior is documented. From http://msdn.microsoft.com/en-us/library/bb522522.aspx:

 

“Reducing the number of rows in the input stream early in the process by specifying the additional search condition to the ON clause (for example, by specifying ON Source.EmployeeID = Target.EmployeeID AND EmployeeName LIKE 'S%') might seem to improve query performance. However, doing so can cause unexpected and incorrect results. Because the additional search conditions specified in the ON clause are not used for matching the source and target data, they can be misapplied.

The following example demonstrates how incorrect results can occur. The search condition for matching the source and target tables and the additional search condition for filtering rows are both specified in the ON clause. Because the additional search condition is not required to determine source and target matching, the insert and delete actions are applied to all input rows. In effect, the filtering condition EmployeeName LIKE 'S%' is ignored. When the statement is run, the output of the inserted and deleted tables shows that two rows are incorrectly modified: Mary is incorrectly deleted from the target table, and Bob is incorrectly inserted. “

There are simple workarounds like specifying the additional predicate in the WHEN … AND clause, or filtering the data in a table expression, like so:

WITH DomesticCustomersNew AS

(

  SELECT custid, custname

  FROM dbo.CustomersNew

  WHERE custtype = 'D'

)

MERGE INTO dbo.DomesticCustomers AS TGT

USING DomesticCustomersNew AS SRC

  ON SRC.custid = TGT.custid

WHEN MATCHED THEN

  UPDATE

    SET TGT.custname = SRC.custname

WHEN NOT MATCHED THEN

  INSERT(custid, custname)

    VALUES(SRC.custid, SRC.custname);

 

Or like so:

MERGE INTO dbo.DomesticCustomers AS TGT

USING dbo.CustomersNew AS SRC

  ON SRC.custid = TGT.custid

WHEN MATCHED AND SRC.custtype = 'D' THEN

  UPDATE

    SET TGT.custname = SRC.custname

WHEN NOT MATCHED AND SRC.custtype = 'D' THEN

  INSERT(custid, custname)

    VALUES(SRC.custid, SRC.custname);

 

But I bet that many aren’t aware of this behavior.

Manuel Lopez opened a connect item (http://connect.microsoft.com/SQLServer/feedback/details/564676) asking at least for a warning to be issued by SQL Server when you execute a MERGE statement and specify a nonmatching predicate in the ON clause. I think that this is important and added my vote. If you also think it’s important, please add your vote too.

 

Cheers,

BG

 

Hide comments

Comments

  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
Publish