This is a result of e-mail exchange I had with Manuel Lopez…
Given the following DDL and sample data:
SET NOCOUNT ON;
USE tempdb;
IF OBJECT_ID('dbo.CustomersNew', 'U') IS NOT NULL
DROP TABLE dbo.CustomersNew;
IF OBJECT_ID('dbo.DomesticCustomers', 'U') IS NOT NULL
DROP TABLE dbo.DomesticCustomers;
IF OBJECT_ID('dbo.ForeignCustomers', 'U') IS NOT NULL
DROP TABLE dbo.ForeignCustomers;
CREATE TABLE dbo.DomesticCustomers
(
custid INT NOT NULL PRIMARY KEY,
custname VARCHAR(20) NOT NULL,
/* ...other attributes... */
);
CREATE TABLE dbo.ForeignCustomers
(
custid INT NOT NULL PRIMARY KEY,
custname VARCHAR(20) NOT NULL,
/* ...other attributes... */
);
INSERT INTO dbo.DomesticCustomers(custid, custname) VALUES
(1001, 'Cust domestic aaa'),(1002, 'Cust domestic bbb');
INSERT INTO dbo.ForeignCustomers(custid, custname) VALUES
(2001, 'Cust foreign aaa'),(2002, 'Cust foreign bbb');
CREATE TABLE dbo.CustomersNew
(
custtype CHAR(1) NOT NULL CHECK (custtype IN ('D', 'F')),
custid INT NOT NULL PRIMARY KEY,
custname VARCHAR(20) NOT NULL,
/* ...other attributes... */
)
INSERT INTO dbo.CustomersNew(custtype, custid, custname) VALUES
('D', 1001, 'Cust domestic xyz'),
('D', 1003, 'Cust domestic ccc'),
('F', 2001, 'Cust foreign xyz'),
('F', 2003, 'Cust foreign ccc');
SELECT * FROM dbo.DomesticCustomers;
custid custname
----------- --------------------
1001 Cust domestic aaa
1002 Cust domestic bbb
SELECT * FROM dbo.ForeignCustomers;
custid custname
----------- --------------------
2001 Cust foreign aaa
2002 Cust foreign bbb
SELECT * FROM dbo.CustomersNew;
custtype custid custname
-------- ----------- --------------------
D 1001 Cust domestic xyz
D 1003 Cust domestic ccc
F 2001 Cust foreign xyz
F 2003 Cust foreign ccc
You’re supposed to write a MERGE statement that deals only with domestic customers (those in CustomersNew with 'D' in the custtype attribute); updating the nonkey attributes of customers that already exist in the target DomesticCustomers table, and inserting new customers that don’t exist. Consider the following MERGE statement, and before you run it, see if you can guess what the output of the subsequent SELECT will be?
MERGE INTO dbo.DomesticCustomers AS TGT
USING dbo.CustomersNew AS SRC
ON SRC.custtype = 'D'
AND SRC.custid = TGT.custid
WHEN MATCHED THEN
UPDATE
SET TGT.custname = SRC.custname
WHEN NOT MATCHED THEN
INSERT(custid, custname)
VALUES(SRC.custid, SRC.custname);
SELECT * FROM dbo.DomesticCustomers;
Scroll down to see the result…
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
I bet you didn’t guess the following:
custid custname
----------- --------------------
1001 Cust domestic xyz
1002 Cust domestic bbb
1003 Cust domestic ccc
2001 Cust foreign xyz
2003 Cust foreign ccc
But that’s the result I got. It appears that this behavior is documented. From http://msdn.microsoft.com/en-us/library/bb522522.aspx:
“Reducing the number of rows in the input stream early in the process by specifying the additional search condition to the ON clause (for example, by specifying ON Source.EmployeeID = Target.EmployeeID AND EmployeeName LIKE 'S%') might seem to improve query performance. However, doing so can cause unexpected and incorrect results. Because the additional search conditions specified in the ON clause are not used for matching the source and target data, they can be misapplied.
The following example demonstrates how incorrect results can occur. The search condition for matching the source and target tables and the additional search condition for filtering rows are both specified in the ON clause. Because the additional search condition is not required to determine source and target matching, the insert and delete actions are applied to all input rows. In effect, the filtering condition EmployeeName LIKE 'S%' is ignored. When the statement is run, the output of the inserted and deleted tables shows that two rows are incorrectly modified: Mary is incorrectly deleted from the target table, and Bob is incorrectly inserted. “
There are simple workarounds like specifying the additional predicate in the WHEN … AND clause, or filtering the data in a table expression, like so:
WITH DomesticCustomersNew AS
(
SELECT custid, custname
FROM dbo.CustomersNew
WHERE custtype = 'D'
)
MERGE INTO dbo.DomesticCustomers AS TGT
USING DomesticCustomersNew AS SRC
ON SRC.custid = TGT.custid
WHEN MATCHED THEN
UPDATE
SET TGT.custname = SRC.custname
WHEN NOT MATCHED THEN
INSERT(custid, custname)
VALUES(SRC.custid, SRC.custname);
Or like so:
MERGE INTO dbo.DomesticCustomers AS TGT
USING dbo.CustomersNew AS SRC
ON SRC.custid = TGT.custid
WHEN MATCHED AND SRC.custtype = 'D' THEN
UPDATE
SET TGT.custname = SRC.custname
WHEN NOT MATCHED AND SRC.custtype = 'D' THEN
INSERT(custid, custname)
VALUES(SRC.custid, SRC.custname);
But I bet that many aren’t aware of this behavior.
Manuel Lopez opened a connect item (http://connect.microsoft.com/SQLServer/feedback/details/564676) asking at least for a warning to be issued by SQL Server when you execute a MERGE statement and specify a nonmatching predicate in the ON clause. I think that this is important and added my vote. If you also think it’s important, please add your vote too.
Cheers,
BG