Over the years, I created several custom solutions to help identify Data Manipulation Language (DML) changes and track historical changes. While effective, these solutions were sometimes cumbersome to use and maintain. SQL Server 2012 eliminates the need for such custom solutions, making life easier for everyone. In SQL Server 2012, there are two features that can help you track changes the painless way: change data capture and change tracking.
For the most part, the change data capture and change tracking features are interchangeable, with one exception. Both features utilize a synchronous tracking mechanism, which means SQL Server is minimally impacted. Table 1 shows what can be tracked with the two features. As you can see, the main difference between them is historical data tracking. Only the change data capture feature lets you track the actual data that was changed.
|Item Tracked||Description||Change Data Capture Feature||Change Tracking Feature|
|DML changes||Inserts, updates, or deletes||Yes||Yes|
|Historical data||Data changes over time||Yes||No|
|Column changes||Add, alter, or drop columns||Yes||Yes|
|DML type||Inserts, updates, or deletes||Yes||Yes|
Before you can use the change data capture and change tracking features, you need to do the following.
Execute sys.sp_cdc_enable_db. Although the tracking will be taking place at the table level, you need to make the database aware of this capability. You can do so by executing the sys.sp_cdc_enable_db system stored procedure, as shown below:
Execute sys.sp_cdc_enable_table. You need to make the table you want to track aware of the tracking by executing the sys.sp_cdc_enable_table system stored procedure. Here's its syntax:
EXEC sys.sp_cdc_enable_table [ @source_schema = ] 'source_schema', [ @source_name = ] 'source_name', [ @role_name = ] 'role_name' -- Specify the gating role if gating is desired -- or set to NULL if you don't want gating. [,[ @capture_instance = ] 'capture_instance' -- Name instance-specific changes. [,[ @supports_net_changes = ] supports_net_changes ] [,[ @index_name = ] 'index_name' | NULL ] [,[ @captured_column_list = ] 'captured_column_list' | NULL ] -- Use NULL if you want all columns captured. [,[ @filegroup_name = ] 'filegroup_name' ] [,[ @allow_partition_switch = ] 'allow_partition_switch' ]
As you can see, you need to provide information such as the source table's schema and name, who has permission to access the change data, and the gating role member if gating is desired. For more information about the syntax, see "sys.sp_cdc_enable_table (Transact-SQL)" in SQL Server 2012 Books Online (BOL).
After tracking is enabled on the table and database, information about any changes made to that table is written to system tables in that database. The tables will begin with the schema name cdc. In addition, the dbo.systranschemas table is automatically created to track replication schema changes.
Let's Get Started
To introduce you to the types of information that you can capture, I wrote a script that:
- Creates a database named Learn_CDC.
- Creates and populates a table named MyCDCPlay.
- Enables the change data capture feature on the Learn_CDC database.
- Enables the change data capture feature on the MyCDCPlay table.
- Adds data to the MyCDCPlay table, which will be tracked by the change data capture feature.
Listing 1 contains this script, which you can download by clicking the Download the Code button near the top of the page.
CREATE DATABASE Learn_CDC GO USE Learn_CDC GO CREATE TABLE MyCDCPlay (ID INT IDENTITY(1,1) PRIMARY KEY, cdccol1 VARCHAR(10) NOT NULL, cdccol2 CHAR(3) SPARSE NULL, cdccol3 VARCHAR(20) SPARSE NULL); GO INSERT INTO MyCDCPlay (cdccol1,cdccol2, cdccol3) VALUES ('CDC1Play', 'YES', 'Some cool stuff') GO INSERT INTO MyCDCPlay (cdccol1) VALUES ('CDC2Play') GO -- Make sure you have data. SELECT * FROM MyCDCPlay; -- Enable CDC on database. USE Learn_CDC GO EXECUTE sys.sp_cdc_enable_db; GO -- Enable CDC on table. Make sure that -- SQL Server Agent is running. USE Learn_CDC; GO EXECUTE sys.sp_cdc_enable_table @source_schema = N'dbo' , @source_name = N'MyCDCPlay' , @role_name = NULL GO -- Insert more data, which will be tracked. INSERT INTO MyCDCPlay (cdccol1,cdccol2, cdccol3) VALUES ('CDC4Play', 'No', 'Some cool stuff') GO INSERT INTO MyCDCPlay (cdccol1) VALUES ('CDC5Play') GO -- Make sure you have data. SELECT * FROM MyCDCPlay;
After you run the code in Listing 1, you'll see a folder structure similar to that shown in Figure 1.
How to Access the Information About the Tracked Changes
As I mentioned previously, the script added data to the MyCDCPlay table after the change data capture feature was enabled. To see the information that was captured by the change data capture feature, you can query the system tables created by the feature. Note that SQL Server 2012 BOL recommends that you don't query system tables directly. Table 2 provides the stored procedures and functions that SQL Server 2012 BOL recommends you use to query these system tables.
|System Table||Query With|
|cdc.captured_columns||sys.sp_cdc_get_source_columns stored procedure|
|cdc.change_tables||sys.sp_cdc_help_change_data_capture stored procedure|
|cdc.ddl_history||sys.sp_cdc_get_ddl_history stored procedure|
|cdc.index_columns||sys.sp_cdc_help_change_data_capture stored procedure|
|cdc.lsn_time_mapping||sys.fn_cdc_map_lsn_to_time and sys.fn_cdc_map_time_to_lsn system functions|
To start, you can query the cdc.captured_columns table to see the information about the columns that were tracked. Figure 2 shows the results of this query.
Querying the cdc.change_tables table gives you information about the table. Figure 3 shows the results of this query. Note that the capture_instance parameter wasn't specified when the table was enabled with sys.sp_cdc_enable_table, so the table uses the default schema and the default table name.
Querying the cdc.dbo_MyCDCPlay_CT table gives you information about the data changes. As Figure 4 shows, this is the information that was specified during the two INSERT operations.
At this point, let's update the data in the MyCDCPlay table by running the command:
UPDATE MyCDCPlay SET cdccol2 = 'be' WHERE cdccol1 = 'CDC4Play'
Figure 5 shows the results. Notice that the results include both the historical data and new data. You can collate them to know exactly what changed.
Querying the cdc.ddl_history table gives you information about any schema changes. However, in this case, the script didn't make any schema changes, so you wouldn't get any results. Let's make some schema changes to see what happens:
ALTER TABLE MyCDCPlay ADD cdccol5 varchar(30) Sparse NULL ALTER TABLE MyCDCPlay ALTER COLUMN cdccol1 VARCHAR(50) NOT NULL
If you now query the cdc.ddl_history table, you'll get results like that in Figure 6. As you can see, the schema changes are captured, including when the changes occurred.
The last three tables you can query are:
- cdc.index_columns. Querying the cdc.index_columns table provides information about the indexed columns associated with the table that was changed.
- cdc.lsn_time_mapping. Querying the cdc.lsn_time_mapping table provides information about the time mapping for the changes.
- dbo.systranschemas. Querying the dbo.systranschemas table shows you the schema changes and the type of schema that was changed.
There are a few shortcomings with the change data capture and change tracking features. Here are some shortcomings that I've come across:
- The features provide a table-by-table solution. If you have an environment with hundreds tables, it would be cumbersome to implement. This usually pushes people to use third-party tools.
- Information about the user who made the change isn't captured. Microsoft recommends the use of SQL Server's audit features for this capability.
- Synchronous operations contribute to slow data commits.
- Changes to computed columns aren't tracked. In addition, there are limitations when tracking changes in columns containing XML, Sparse, Timestamp, and BLOB data types. For more information about these limitations, see "Track Data Changes (SQL Server)" in SQL Server 2012 BOL.
Document Changes and More
The change data capture and change tracking features let you easily track changes to your tables. You can use these features to not only document changes but also create back-end systems for applications. For example, you could use the features as part of a parcel tracking system that keeps track of where parcels are located at any given point during the shipping process. You could even include a mechanism to evaluate the system and determine where efficiency can be improved.
Leonard Mwangi is a design architect with Alexander Open Systems, concentrating mostly on information worker solutions and applications integration. Leonard has more than 12 years of experience in the IT industry, primarily focusing on SQL Server and SharePoint.