Preemptive RowGuidCols on Merge Replicated Databases

One thing I’ve started doing more and more frequently with Merge Publications is preemptively adding in a rowguid column to tables that I know will end up being published. Doing so can make a huge difference in the amount of time (and log space) that would occur otherwise.

Non-Preemption (or the Out-of-the-box Way)

When you don’t pre-RowGuid your tables, the process behind sp_addmergearticle checks the table being added and looks to see if there’s already a uniqueidentifier column with the rowguidcol property in place or not.

If a column of rowguidcol is NOT already in place, then as part of the magic of enabling a table to become a merge-capable article, SQL Server has to go in and add a new uniqueidentifier column with the rowguidcol property enabled.

And, of course, the problem with this operation is that it’s a size-of-data operation—meaning that the larger your table is, the longer this will take (and the more log-space it will require). Consequently, when you go ahead and ‘rowguidcol’ a large number of tables in a larger database, you can spend a lot of time waiting around for SQL Server to go to each row in those columns and plunk in a new column and value—while everything is being logged (so that, just in case the operation fails, SQL Server is able to gracefully recover without losing data—something that most of us find very helpful even if it does mean that these kinds of operations take longer).

The Double-Whammy Effect

Where this becomes an even bigger issue though, is when you tear down a publication. Because when you completely remove or destroy a publication—all of that magical rowguid ‘goodness’ comes off. Which, in turn, is another size of data operation.

Consequently, since some publications (especially those with lots of older code/sprocs) need to be created multiple times before they’ll actually initialized correctly, I find that paying the ‘price’ of adding a rowguidcol 1x (in preemptive fashion) is a much better approach.

Especially since I tend to make heavy use of scripts to both create and tear-down my publications—as that’s a much more scalable approach from an administrative perspective.

And, frankly, this also ends up making much more sense from an HA/continuity perspective anyhow. Because anyone who has played around much with replication in a production environment knows that while replication is actually QUITE robust and can really work as advertised, it can and will occasionally run into speed-bumps, network disconnects, or other ugliness where tearing a publication down and rebuilding it actually makes much more sense. Unless, of course, you don’t want to wait around forever while SQL Server goes ahead and removes rowguidcol columns from all articles—only to re-add them later.

Accordingly, if you know that you’re going to be merge-replicating specific databases or tables for the long-haul, preemptively going in and adding a rowguid/rowguidcol column makes a lot of sense because you only end up incurring the size-of-data operation once. And, while I feel that I should mention that adding a rowguid column to all of your tables CAN and WILL potentially break some stored procedures and views where non-best practices such as SELECT * ugliness is going on, the great news is that you were ALREADY going to bump into these problems anyhow when you went ahead and replicated these tables because you’d be adding that column anyway. Likewise, if you’re dealing with large amounts of data or huge tables, adding a rowguid column can take a while—but it was going to take a while ANYHOW—and by taking the ‘preemptive’ approach, you only incur this pain once.

Preemptively RowGuidCol-ing your Tables

When Merge replication goes into your tables and adds rowguidcol columns all’z it’s doing is making sure that there’s a column on the table that is of type uniqueidentifier, and that if it’s there, it’s marked with the rowguidcol property—so that SQL Server can use this rowguidcol to keep tabs on each row across each subscriber (and at the publisher) as a way of ensuring uniqueness all across your replication topology.

As such, it’s not too hard at all to mimic the kind of logic needed to go ahead and add a rowguid/rowguidcol column into all tables.

For my own purposes, I’ve started using the following script—which just uses a simple cursor and then PRINTs out the syntax needed to add columns to specific tables (then I just copy/paste the output of this script into a new window and run it—but where I can more easily monitor the script for any potential hiccups or errors – which I haven’t seen yet after using it on more than 6 databases).

DECLARE rowguider CURSOR FAST_FORWARD
FOR
SELECT o.[object_id]
	,o.[name]
FROM sys.objects o
WHERE type = 'U'
	AND OBJECTPROPERTY(o.object_id, 'IsMsShipped') = 0
	AND o.NAME NOT IN ('sysdiagrams')
ORDER BY NAME

DECLARE @tableId INT
DECLARE @tableName SYSNAME
DECLARE @rowguidcol BIT
DECLARE @sql NVARCHAR(300)

OPEN rowguider

FETCH NEXT
FROM rowguider
INTO @tableId
	,@tableName

WHILE @@FETCH_STATUS = 0
BEGIN
	SET @rowguidcol = 0

	-- see if there's a rowguid col:
	SELECT @rowguidcol = ISNULL(is_rowguidcol, 0)
	FROM sys.columns
	WHERE object_id = @tableId
		AND is_rowguidcol = 1

	IF @rowguidcol = 0
	BEGIN
		SET @sql = 'ALTER TABLE dbo.[' + @tableName + '] ADD rowguid uniqueidentifier rowguidcol CONSTRAINT [DF_' + @tableName + '_RowGuidCol] DEFAULT NEWSEQUENTIALID() NOT NULL'

		PRINT @sql
	END

	FETCH NEXT
	FROM rowguider
	INTO @tableId
		,@tableName
END

CLOSE rowguider

DEALLOCATE rowguider
GO

I’m sure a more elegant script could be easily created for the same purposes, but in my case I’m just looking for a way to avoid needlessly incurring size-of-data operations when working out the kinks in larger publications—and this script works just fine for that.

Comments

Plain text