The MERGE statement goes beyond the fundamentals I covered in the main article. Here are some additional details, some of MERGE’s more advanced aspects, and some of its restrictions.
Use of the Semicolon
Versions of SQL Server prior to 2005 don’t require statements to end with a semicolon. SQL Server 2005 introduced a few new language elements that require the use of a semicolon to avoid ambiguity. For example, if you have a SQL Server 2005 statement preceding a WITH clause that defines a common table expression (CTE), you have to terminate the previous statement with a semicolon to avoid ambiguity in regards to the meaning of the WITH clause. SQL Server 2008 requires you to terminate MERGE statements with a semicolon; otherwise you’ll get a parsing error.
To be safe, I terminate all my T-SQL statements with a semicolon. Doing so doesn’t make a difference when a semicolon isn’t required, but it saves me grief in cases when a semicolon is required, because the error messages generated by the parser aren’t always clear and explicit. I recommend that you adopt this practice as well, to save yourself grief now and in the future, as the instances in which a semicolon is necessary increase.
Use of Table Expressions
A MERGE statement’s source and target aren’t limited to tables (with a few restrictions); they can also be other table expressions (e.g., view, derived table, inline table function, CTE, rowset function, OPENXML). This capability is important not only in terms of functionality but also for code clarity and maintenance. For example, if you want to represent both the source and the target as CTEs, you can use the following form:
WITH SRC AS (SELECT …), TGT AS (SELECT …) MERGE INTO TGT USING SRC ON … WHEN MATCHED THEN… WHEN NOT MATCHED THEN…;
Remember that if a statement precedes the WITH clause that defines the CTEs, the previous statement must be terminated with a semicolon (even if the previous statement is simply BEGIN TRAN).
As another example, suppose that the source customer data resides in the file C:\data\custs.txt, and you defined the format file C:\data\custs.fmt (just as you define a format file for a BULK INSERT operation). You can use the OPENROWSET function with the BULK rowset provider directly as the source, as follows:
MERGE INTO dbo.Customers AS TGT USING OPENROWSET(BULK 'c:\data\custs.txt', FORMATFILE = 'c:\data\custs.fmt') ON … WHEN MATCHED THEN… WHEN NOT MATCHED THEN…;
All WHEN clauses in a MERGE statement (including the WHEN SOURCE NOT MATCHED clause that I discuss later) support adding a predicate beyond the one specified in the ON clause. For example, suppose that in our CustomersStage-Customers scenario, you want to update existing target customers only if one of their non-key attributes is different than the corresponding attribute in the source row. You can do so by revising the WHEN MATCHED THEN clause to a WHEN MATCHED AND (predicate identifying change) THEN clause, as Listing A shows. Only customers 2 and 5 will be modified; customer 3 doesn’t qualify because all of customer 3’s attributes are the same in the source and target rows.
Note that I’m relying on the fact that all attributes in both the CustomersStage table and the Customers table are defined as NOT NULL. In cases in which the attributes allow NULLs, you need to add expressions to handle those NULLs. For example, supposed the phone attribute allows NULLs. You must revise the expression
TGT.phone <> SRC.phone
( TGT.phone <> SRC.phone OR (TGT.phone IS NULL AND SRC.phone IS NOT NULL) OR (TGT.phone IS NOT NULL AND SRC.phone IS NULL) )
You can have as many as two WHEN MATCHED clauses. If you specify two clauses, the first must be WHEN MATCHED AND (predicate) THEN, and the second must be WHEN MATCHED THEN, as follows:
… WHEN MATCHED AND (<predicate>) THEN <action> …
In such a case, the action specified after WHEN MATCHED AND (predicate) THEN will be applied when both the ON clause and the additional predicate are satisfied. The action specified after WHEN MATCHED THEN will be applied when the ON clause is satisfied but the additional predicate isn’t.
The WHEN MATCHED clauses aren’t allowed to apply an INSERT action. If two WHEN MATCHED clauses are specified, one must apply an UPDATE action and the other must apply a DELETE action.
The WHEN NOT MATCHED clause allows only an INSERT action. Only one WHEN NOT MATCHED clause is allowed, optionally with an additional predicate—WHEN NOT MATCHED AND (predicate) THEN.
So far, I’ve described actions you can take when a source row finds a match in the target and when a source row doesn’t find a match in the target. But what if you want to take a certain action when a target row (in Customers) doesn’t find a match in the source (CustomersStage)? In our case, customers 1 and 4 exist in the target but not in the source. Suppose you want to delete the rows from Customers, or alternatively update an attribute with a value representing the fact that the rows are discontinued. For this purpose, the MERGE statement supports a WHEN SOURCE NOT MATCHED THEN clause. For example, the code in Listing B deletes rows from the target for customers that exist in the target but not in the source. Besides the UPDATE and INSERT changes applied by the previous MERGE statement, this MERGE statement also deletes customers 1 and 4 from the target. The semantics of this MERGE statement are similar to those of a FULL OUTER JOIN, which can produce outer rows from both sides.
Several similarities exist between the WHEN NOT MATCHED clause (i.e., when the target isn’t matched) that I discussed earlier and this WHEN SOURCE NOT MATCHED clause. However, in the WHEN SOURCE NOT MATCHED clause you can’t apply an INSERT action—only UPDATE or DELETE. In addition, in the WHEN SOURCE NOT MATCHED clause, you can apply conditional logic by adding AND (predicate) to the clause, and if you want, you can have two clauses—the first with AND (predicate) and the second without. If two clauses are used, one must apply an UPDATE action and the other a DELETE action.
Determinism of Changes
SQL Server doesn’t allow a MERGE statement to modify the same row more than once. A MERGE statement will fail if it tries to update the same row more than once, or update and delete the same row.
The target table can have enabled triggers. Remember that a trigger in SQL Server executes per statement rather than per row. Only one instance of each relevant trigger (i.e., INSERT, UPDATE, DELETE) will execute per MERGE statement. So if you have two UPDATE actions in your MERGE statement (one for WHEN NOT MATCHED THEN, and one for WHEN SOURCE NOT MATCHED THEN), only one instance of an UPDATE trigger defined on the table will execute, and within the trigger you’ll find all updated rows in the inserted and deleted tables. Finally, a technical implementation restriction is that the target table can’t have enabled triggers if the statement contains an OUTPUT clause without an INTO clause.
In order to demonstrate trigger behavior with the MERGE statement, run the code in Listing C. This code creates INSERT, UPDATE, and DELETE triggers on the Customers table. Then, run all the MERGE statement examples that I provided in the article, and examine the output you get from the triggers. Feel free to revise the MERGE statements to see what kind of output you get for different scenarios.