Migrating Data from Database to Database

Downloads
125441.zip

Most developers work with at least three sets of data—production, test, and development data. The problem with the latter two data sets is that the information in them becomes so mangled from testing and development efforts that you soon lose any semblance of comparison to real data. Moreover, when users perform evaluation testing against these mangled data sets, they become frustrated when the data doesn’t look like what they’ve been working with.

Related: Copying Data with Dependencies

It’s easy to back up and restore production data over test and development versions of data. However, doing so will overwrite new tables, columns, views, and stored procedures. This article describes a data-driven stored procedure, spc_CopyProd2Test, which copies the data from one server to another while leaving existing objects and structures intact. (You can download spc_CopyProd2Test by clicking the Downloads file at the top of the article page.)

Key Idea Behind Stored Procedure

The key idea behind this stored procedure is to determine which table/column combinations exist on both the source database and the target database, and then migrate only the data in those tables and columns. There are several issues that must be dealt with when migrating data in this fashion, especially when foreign key constraints are present. First, you need to turn off the table constraints. This doesn’t remove the constraints; it simply ignores them until they’re turned back on to avoid any further unintended consequences. Likewise, if you have any triggers on your tables, you might want to disable them as well. The final goal is to create a list of the SQL commands that will affect the migration. These commands will be added to a temp table and then executed in sequence.

You’ll also need to decide if any of the tables in the source database shouldn’t be migrated. For example, user permissions tables likely wouldn’t be appropriate to migrate because your permissions on the production database will be far less than those available on the development database. You can prepare a list of these tables by creating a table variable and populating it with the following code:

DECLARE @SkipTables TABLE
(
  TableName varchar(261)
)
INSERT INTO @SkipTables (TableName) VALUES ('\\[dbo\\].\\[User_Permission\\]')

Next, you can use Listing 1 (below) to pull a list of all the constraints from the INFORMATION_SCHEMA view and suspend them by applying the NOCHECK option to them. Then, you’ll need to delete the data in the target tables. Unfortunately, you can’t use the TRUNCATE statement here because these tables might have foreign key constraints and TRUNCATE won’t work in that case. Instead, you’ll need to perform the much slower process of deleting the data, using Listing 2 (below).

You only want to delete data from the tables that exist in both the source and target databases. Tables that don’t exist in both databases are assumed to be in progress and are left alone. Another major consideration here is referential integrity. If there is, for example, a Dictionary table that contains foreign key references to a series of other tables, the other tables must be deleted before the Dictionary table is deleted. Otherwise, a referential integrity error will occur. One way to avoid this error is to use the extended properties of the table objects to establish a sort order value so that objects such as Dictionary tables are cleaned out last.

Note that tables with XML columns are excluded because such tables can’t participate in distributed queries. Even pulling only the non-XML columns isn’t permitted. Such tables can be handled by creating views that pull all but the XML columns. By extracting the data from these views you can successfully perform the migration.

Don't Exceed Allocated Space

Because your database could be quite large, you should speak with your DBA about allocating the required space or turning off the logging that accompanies such massive inserts. You could easily exceed the allocated space for the database, which means your migration will end in the middle of the process. A data migration is typically far too large to wrap a transaction around for a rollback, so sections of your data might be missing. Although the accompanying source code doesn’t show it, you might want to use SQL Server’s email features to send an email alert in the event the migration terminates before it’s completed.

Once the data has been cleaned out of the target database, you can use the code in Listing 3 (below) to create a temp table that includes the tables and column combinations that exist in both databases. Once the temp table is in place, you can iterate through each column in a given table to create a SQL statement that looks like the following:

INSERT INTO \\[TargetServer\\].\\[MyDB\\]. Employees (EmployeeID, LastName, FirstName)
SELECT EmployeeID, LastName, FirstName
FROM \\[SourceServer\\].\\[MyDB\\]. Employees WITH (NOLOCK)

This SQL statement will migrate the data for the common columns between the source database and the target database. If this table has an IDENTITY key, the INSERT SELECT statement will be preceded by SET IDENTITY_INSERT Employees ON and followed by SET IDENTITY_INSERT Employees OFF. This statement will allow the migration of the primary key column. The INSERT SELECT statements must be performed in a certain order to avoid referential integrity conflicts. Tables with all the foreign key pointers, such as the Dictionary example, need to be populated first this time.

Finally, you’ll need to turn all the constraints back on. Now that you have a temp table filled with all the SQL statements needed to perform a data migration, you can execute these SQL statements in sequence. The code in Listing 4 (below) will iterate the SQL commands in the temp table and execute them one by one. For each table, the stored procedure will execute SQL statements similar to those shown in Listing 5 (below).

Because this stored procedure is data driven, it will work with any SQL Server database, regardless of the structure. The only customization required is altering the name of the source server and database and enumerating the tables you want to exclude.

Listing 1: Code to Turn Off Specific Constraints

INSERT INTO #SQLtemp

   SELECT 'ALTER TABLE ' + QUOTENAME(SCHEMA_NAME(schema_id)) + '.' +

   QUOTENAME(OBJECT_NAME(parent_object_id)) +

   ' NOCHECK CONSTRAINT ' + QUOTENAME(OBJECT_NAME(OBJECT_ID))

   FROM sys.objects

   WHERE type_desc = 'CHECK_CONSTRAINT' OR type_desc = 'FOREIGN_KEY_CONSTRAINT'

Listing 2: Code to Delete Data in Target Tables

INSERT INTO #SQLtemp

   SELECT 'DELETE FROM ' + QUOTENAME(t1.TABLE_SCHEMA) + '.' + QUOTENAME(t1.TABLE_NAME)

   FROM INFORMATION_SCHEMA.tables t1

   INNER JOIN \\[SETON-NOTEBOOK\\].CampaignCommander.INFORMATION_SCHEMA.tables t2

                        ON t1.TABLE_SCHEMA = t2.TABLE_SCHEMA

                        AND t1.TABLE_NAME = t2.TABLE_NAME

   WHERE t1.TABLE_TYPE = 'BASE TABLE'

   AND t1.TABLE_SCHEMA + '.' + t1.TABLE_NAME NOT IN (SELECT TABLE_SCHEMA + '.' + TABLE_NAME

                                                                                                                                                             FROM INFORMATION_SCHEMA.columns

                                                                                                                                                             WHERE DATA_TYPE = 'xml')

   AND QUOTENAME(t1.TABLE_SCHEMA) + '.' + QUOTENAME(t1.TABLE_NAME)

            NOT IN (SELECT TableName FROM @SkipTables)

   ORDER BY t1.TABLE_NAME

Listing 3: Code to Create a List of Table and Column NamesI

NSERT INTO #Tabletemp

   SELECT c1.TABLE_SCHEMA, c1.TABLE_NAME, c1.COLUMN_NAME

   FROM \\[SETON-NOTEBOOK\\].\\[CampaignCommander\\].INFORMATION_SCHEMA.columns c1

   INNER JOIN INFORMATION_SCHEMA.tables t1 ON c1.TABLE_SCHEMA + '.' + c1.TABLE_NAME =

            t1.TABLE_SCHEMA + '.' + t1.TABLE_NAME

   INNER JOIN \\[SETON-NOTEBOOK\\].\\[CampaignCommander\\].INFORMATION_SCHEMA.columns c2

            ON c1.TABLE_SCHEMA = c2.TABLE_SCHEMA

            AND c1.TABLE_NAME = c2.TABLE_NAME

            AND c1.COLUMN_NAME = c2.COLUMN_NAME

   WHERE t1.TABLE_TYPE = 'BASE TABLE'

   AND c1.TABLE_SCHEMA + '.' + c1.TABLE_NAME NOT IN (SELECT TABLE_SCHEMA + '.' + TABLE_NAME

                                                                                                                                                             FROM INFORMATION_SCHEMA.columns

                                                                                                                                                             WHERE DATA_TYPE = 'xml')

   AND QUOTENAME(t1.TABLE_SCHEMA) + '.' + QUOTENAME(t1.TABLE_NAME) NOT IN

            (SELECT TableName FROM @SkipTables)

   ORDER BY c1.TABLE_SCHEMA, c1.TABLE_NAME, c1.ORDINAL_POSITION

Listing 4: Code to Execute SQL Statements in Temp Tables

SELECT @Cnt = MAX(ID) FROM #SQLtemp

SET @x = 0

WHILE EXISTS (SELECT TOP 1 SQL

                                      FROM #SQLtemp

                                      WHERE ID > @x

                                      ORDER BY ID)

   BEGIN

      SELECT TOP 1 @SQL = SQL

      FROM #SQLtemp

      WHERE ID > @x

      ORDER BY ID

        BEGIN TRY

         SET @StartTime = GETDATE()

         EXEC(@SQL)

         SET @ElapsedTime = DATEDIFF(SECOND, @StartTime, GETDATE())

         --Write every successfully executed SQL command to SyncLog

         INSERT INTO SyncLog

         (ErrorNumber, Message, SQL, ErrorDate, ElapsedTime)

         VALUES

         (Null, 'OK', @SQL, GETDATE(), @ElapsedTime)        

      END TRY

      BEGIN CATCH

         SET @SQLError = ERROR_NUMBER()

         --If an error was found, write it to the SyncLog table.

         --One of the most common errors will be caused by trying to insert a value from a larger column

         --into that of a smaller column. This will happen if you reduced the size of a column in

         --your target to less than that of your source. In other cases, the data type may have changed

         --and this will throw an error as well.

         IF @SQLError <> 0

            INSERT INTO SyncLog

            (ErrorNumber, Message, SQL, ErrorDate)

            VALUES

            (@SQLError, Error_Message(), @SQL, GETDATE())

      END CATCH

      SET @x = @x + 1

   END

Listing 5: SQL Statements Used to Migrate Data

ALTER TABLE \\[dbo\\].\\[MyTable\\] NOCHECK CONSTRAINT PK_MyConstraint

DELETE FROM \\[dbo\\].\\[MyTable\\]

SET IDENTITY_INSERT \\[dbo\\].\\[MyTable\\] ON (if applicable)

INSERT INTO \\[dbo\\].\\[MyTable\\] (\\[MyColumn1\\],\\[MyColumn2\\]) SELECT \\[MyColumn1\\],\\[MyColumn2\\] FROM \\[MyServer\\].\\[MyDataBase\\].\\[dbo\\].\\[MyColumn1\\]

SET IDENTITY_INSERT \\[dbo\\].\\[MyTable\\] OFF (if applicable)

ALTER TABLE \\[dbo\\].\\[MyTable\\] CHECK CONSTRAINT PK_MyConstraint

Comments

Plain text