
Using SSIS to Collect Baseline Data for Merge Replication

Continually monitor performance

SQL Server has a vast number of features that help DBAs and database developers ensure tasks are properly designed, implemented, and maintained with great stability. Two features that you might not have considered combining, but that greatly complement each other, are SQL Server Integration Services (SSIS) and merge replication.

Related: SSIS Package Pings Servers

SSIS was introduced in SQL Server 2005 as a replacement for Data Transformation Services (DTS). With SSIS, you gain the ability to implement enterprise solutions for extraction, transformation, and loading (ETL), as well as a robust automation and job services platform. A DBA can implement various SSIS packages to automate high-level preventative tasks and ensure SQL Server instances are always running at their best.

In SQL Server, merge replication has evolved into a true enterprise data synchronization and distribution layer. As with many features in SQL Server, setting it up is easier than maintaining it. An improperly maintained merge replication system can lead to downtime, slower overall performance, and the need for redesigns. The business typically sees the cost of these problems as a pure loss, because it was never accounted for in the initial budget process.

Related: Solving Merge Replication Performance Problems

To prevent such problems, there are steps you can take to improve the processes needed to maintain merge replication. One of the most important steps is obtaining a performance baseline to assist with any problems that might arise later.

Knowing the current state of merge replication on the database server is crucial to being proactive and preventing problems. There are three primary areas that you should concentrate on for baseline information collection:

1. Distribution

2. Publication

3. Subscription

Distribution statistics can be captured by monitoring the execution plan cache and index utilization. This involves monitoring fragmentation and other areas related to query tuning. Because most administrators and DBAs are familiar with this type of monitoring, I'll show only a brief sketch below before moving on to the publication and subscription areas.
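
If you need a starting point, the following sketch shows the kind of plan cache and index utilization queries this monitoring typically involves. It uses only standard dynamic management views (sys.dm_exec_query_stats, sys.dm_exec_sql_text, and sys.dm_db_index_usage_stats); the TOP (10) cutoff is an illustrative assumption, not a recommendation.

-- Most expensive cached plans by average logical reads.
SELECT TOP (10)
  qs.execution_count
  ,qs.total_logical_reads / qs.execution_count AS avg_logical_reads
  ,st.[text] AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY avg_logical_reads DESC;

-- Index utilization (seeks, scans, and updates) in the current database.
SELECT OBJECT_NAME(us.[object_id]) AS table_name
  ,i.[name] AS index_name
  ,us.user_seeks
  ,us.user_scans
  ,us.user_updates
FROM sys.dm_db_index_usage_stats AS us
JOIN sys.indexes AS i
  ON i.[object_id] = us.[object_id]
  AND i.index_id = us.index_id
WHERE us.database_id = DB_ID();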

Both publications and subscriptions can be monitored using a joint set of operations. Several system stored procedures and tables are available to assist in collecting information about how the publications and subscriptions are functioning. For example, in a publisher and pull subscriber landscape, system stored procedures such as sp_replmonitorhelpmergesession and tables such as MSmerge_agents, sysmergesubscriptions, MSreplication_monitordata, and MSmerge_genhistory provide performance indicators for baseline collection, as you'll see in the sections that follow.

Note that the type of subscriptions being used in a replication landscape can affect where the merge replication tables are located and where merge-replication system stored procedures are executed. Pull versus push subscriptions cause differences in where the metadata is located and where work is performed when historical and active sessions are synchronized. For more information about the differences between pull and push subscriptions, see "Implementing Replication Overview" in SQL Server Books Online (BOL).

Gathering the Subscription Header Information

The first step in collecting the baseline data for the publications and subscriptions is gathering the subscription header information. You can use the code in Listing 1 (below) to create the SubscriptionHeader table, which will contain the subscription information for each subscriber to the specified publication. Figure 1 shows the layout of this table.

Figure 1: Layout of the SubscriptionHeader Table

To populate the SubscriptionHeader table, you use MSmerge_agents (the distribution database table) and sysmergesubscriptions (the publication database table). The query in Listing 2 (below) shows how to retrieve the required data. To use this code, you need to insert the name of your publication database before ..sysmergesubscriptions in the code in callout A.

Using the query in Listing 2, you can also periodically perform an incremental update of the SubscriptionHeader table. The query checks to see whether subscriptions have been added, have been removed, or require updates. This is a crucial step. It ensures that the appropriate subscriptions are included in the baseline collection so that the statistics about those subscriptions can be gathered.

For example, suppose a new subscription is added to a publication landscape that covers the entire United States, and each state is set as a region. Each region has the potential to have varying latency based on connectivity, the amount of data (which can be affected by filtering in the publication), and concurrent subscriber connections. Regularly running the query will ensure that the new subscription is added to the baseline collection so that its statistics can be gathered. If the query isn't run regularly, the new subscription's statistics won't be gathered. Consequently, if there's a performance problem, it would be much more difficult to determine whether the new subscription is causing the problem.
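
To make the incremental check concrete, the following hypothetical T-SQL MERGE statement expresses the same add-or-update logic; the article implements this logic in SSIS instead, and #SourceSubs is an assumed temporary table holding the results of the Listing 2 query. (Handling removed subscriptions would add a WHEN NOT MATCHED BY SOURCE clause.)

-- Hypothetical T-SQL equivalent of the SSIS incremental-update logic;
-- #SourceSubs is assumed to hold the output of the Listing 2 query.
MERGE dbo.SubscriptionHeader AS tgt
USING #SourceSubs AS src
  ON tgt.[name] = src.[name]
WHEN NOT MATCHED BY TARGET THEN -- new subscriber
  INSERT ([name], publication, subscriber_db, subscriber_name,
    agent_id, application_name, replnickname, schemaversion)
  VALUES (src.[name], src.publication, src.subscriber_db,
    src.subscriber_name, src.agent_id, src.application_name,
    src.replnickname, src.schemaversion)
WHEN MATCHED AND (tgt.replnickname <> src.replnickname
    OR tgt.schemaversion <> src.schemaversion) THEN
  UPDATE SET -- the subscription was likely removed and reinserted
    replnickname = src.replnickname,
    schemaversion = src.schemaversion;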

Automating the Incremental Updates

Using SSIS is typically the best way to handle incremental updates because they can be implemented quickly and with little development cost. The development cost is an important consideration. To calculate it, compare the hours that would be spent developing the SSIS package against the cost of manually running the incremental updates. For example, would a one-time development cost of 20 hours outweigh a daily 20-minute manual process? In this case, spending 20 hours automating the task with SSIS is better than spending an estimated 87 hours a year (5 days × 20 minutes × 52 weeks ÷ 60) manually running the incremental updates. The ROI is clear, but only because the hours favor SSIS. Make sure the hours are in favor of SSIS before you begin developing your own automated process.

Figure 2 shows the incremental update task in SSIS. Before I describe what's happening in this data flow, you need to know some background information. If a subscription is removed and reinserted, the subscription's name will be the same but its ReplNick name might change. The schema version often changes as well. So, the SubscriptionHeader table includes the replnickname and schemaversion columns to ensure the ReplNick name and schema version are changed when necessary.

Figure 2: Incremental Load for the SubscriptionHeader Table

In the data flow in Figure 2, the first Lookup task determines whether the name returned from the source (i.e., the query in Listing 2) exists in the SubscriptionHeader table. If the name doesn't exist, the Lookup task proceeds with a No Match condition and inserts the row into the SubscriptionHeader table as a new subscriber. If the name exists, the next step is to determine whether the schema version or ReplNick name has changed. To do this, a Conditional Split transformation is placed in the path. The following expressions validate whether the source data is the same as the destination data and ensure the data type comparison is valid:

!ISNULL(Lookup.schemaversion)
&& !ISNULL([Source - Query].schemaversion)
&& (DT_STR,300,1252)Lookup.schemaversion !=
   [Source - Query].schemaversion

!ISNULL(Lookup.replnickname)
&& !ISNULL([Source - Query].replnickname)
&& (DT_STR,300,1252)Lookup.replnickname !=
   (DT_STR,300,1252)[Source - Query].replnickname

If either expression returns True, the appropriate OLE DB command will be executed to update the schemaversion or replnickname column in the SubscriptionHeader table. To learn more about how to set up incremental updates, read the blogs "Backup File Contents with SSIS – INSERT/UPDATE Decisions" and "SSIS Design Pattern – Incremental Loads."

Now that the incremental update task for the subscription information is developed, you need to add a task that validates the SubscriptionHeader table's existence. (You can obtain the code for this task from the SSIS package developed for this article; you can download the Merge Replication Baseline Collector from CodePlex. You'll also find instructions on how to set up the variables and configurations.) Both tasks can then be wrapped in a Sequence container. Sequence containers are used to logically group similar tasks, which can be useful in error handling. Figure 3 shows the Sequence container for loading the subscription header data.
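
The existence-validation task itself can be as simple as the following sketch; the error message and severity are illustrative, and the downloadable package's actual code may differ.

-- Minimal sketch of an existence check for the SubscriptionHeader table.
IF OBJECT_ID(N'dbo.SubscriptionHeader', N'U') IS NULL
  RAISERROR(N'SubscriptionHeader does not exist; run Listing 1 first.', 16, 1);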

Figure 3: Sequence Container for Loading the Subscription Header Data

Gathering Publication Statistics

Next, you need to gather statistics about the subscriptions in the SubscriptionHeader table, including their overall performance, number of replicated events, and status. To collect the subscription statistics, you use sp_replmonitorhelpmergesession, the system stored procedure that Replication Monitor uses to visually monitor subscriptions. It relies heavily on the agent that synchronizes data between the publishing data source and the subscribing data sources. Before you can use sp_replmonitorhelpmergesession, however, you need to collect the values for several of its parameters, including the unique agent ID of each subscription. The importance of the agent ID goes back to the previous container, which obtains and updates the subscription header information: like the ReplNick name and schema version, an agent ID can change if a subscription is removed and reinserted.

The parameters required to execute sp_replmonitorhelpmergesession are:

• @agent_name. This parameter specifies the agent name, which is resolved from the agent_id collected in the subscription header information.

• @hours. By default, this parameter is set to -1 so that all the previous sessions (up to 100) are returned.

• @publisher. This parameter specifies the name of the SQL Server instance hosting the publication.

• @publisher_db. This parameter specifies the name of the database being published.

• @publication. This parameter specifies the name of the publication taking part in the merge events.
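
Here's an illustrative call, run at the distributor against the distribution database; the server, database, publication, and agent names are placeholders, not values from a real environment:

-- Illustrative call with placeholder names.
EXEC distribution.dbo.sp_replmonitorhelpmergesession
  @agent_name = N'MYPUBLISHER-PubDB-MyPublication-MYSUBSCRIBER-1'
  ,@hours = -1 -- return the previous sessions (up to 100)
  ,@publisher = N'MYPUBLISHER'
  ,@publisher_db = N'PubDB'
  ,@publication = N'MyPublication';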

To retrieve the values for these parameters and execute the stored procedure for all the subscribing SQL Server instances, you use the MSreplication_monitordata table in the distribution database. This table holds a cached row for every subscriber that has attempted to synchronize with the publication, whether the merge attempt succeeded or failed.

The information retrieved from MSreplication_monitordata can prove useful for historic retention. For this collection process, the agent_name, publisher, publisher_db, and publication keys will be used to fulfill the parameter expectations of sp_replmonitorhelpmergesession. Before you can run a SELECT statement against MSreplication_monitordata, there needs to be a unique key that allows an iteration of each row in the results -- a primary key, if you will. Using the ROW_NUMBER() function is an ideal way to accomplish this. It can quickly and efficiently number the rows sequentially and add those numbers to a new column. This column can then be used to iterate through the results and perform specific operations given the values returned.
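
For example, a numbered result set might be produced like this; #monitor is a hypothetical temporary table name, and the agent_type filter of 4 (merge agents) is an assumption you should verify in your environment.

-- Number the cached monitor rows so they can be iterated one at a time.
SELECT ROW_NUMBER() OVER (ORDER BY agent_name) AS RowID
  ,agent_name
  ,publisher
  ,publisher_db
  ,publication
INTO #monitor -- hypothetical temporary table
FROM distribution.dbo.MSreplication_monitordata
WHERE agent_type = 4; -- assumption: merge agent rows only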

To pull this all together, the Create_And_Populate_MERGE_STATS.sql script is used. After creating the MERGE_STATS table, this script retrieves the subscriptions found in MSreplication_monitordata, then loops through each row and executes the sp_replmonitorhelpmergesession stored procedure. The results of each execution are inserted into a secondary global temporary table named ##GLOBALSTATS.
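
The loop's general shape is sketched below, building on the #monitor table from the previous snippet. This is an approximation of the script's approach, and it assumes ##GLOBALSTATS has already been created with columns matching the stored procedure's result set.

-- Approximate shape of the collection loop in
-- Create_And_Populate_MERGE_STATS.sql.
DECLARE @row INT = 1, @rows INT, @agent SYSNAME,
  @pub SYSNAME, @pubdb SYSNAME, @publication SYSNAME;

SELECT @rows = COUNT(*) FROM #monitor;

WHILE @row <= @rows
BEGIN
  SELECT @agent = agent_name, @pub = publisher,
    @pubdb = publisher_db, @publication = publication
  FROM #monitor
  WHERE RowID = @row;

  -- Capture the session statistics for this subscription's agent.
  INSERT INTO ##GLOBALSTATS
  EXEC distribution.dbo.sp_replmonitorhelpmergesession
    @agent_name = @agent, @hours = -1, @publisher = @pub,
    @publisher_db = @pubdb, @publication = @publication;

  SET @row += 1;
END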

The data in ##GLOBALSTATS will later be inserted into the MERGE_STATS table, the layout of which is shown in Figure 4. Note that when using a task to create a temporary table in SSIS, you need to set the task's DelayValidation setting to True. This will prevent the task from forcing an error in the development environment and allow the package to be built without the objects pre-existing.

Figure 4: Layout of the MERGE_STATS Table

In SSIS, the query in Create_And_Populate_MERGE_STATS.sql is used as the source for the Data Flow task that retrieves the final results and imports them into the MERGE_STATS table. Figure 5 shows the Sequence container holding this process. As you can see, the global temporary tables are first validated for existence. If they exist, they're dropped to prevent errors later. The last step in the process is to insert the contents of the resulting global temporary table into the MERGE_STATS table.
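
The validate-and-drop step amounts to something like this sketch:

-- Drop the global temporary table if a previous run left it behind.
IF OBJECT_ID(N'tempdb..##GLOBALSTATS') IS NOT NULL
  DROP TABLE ##GLOBALSTATS;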

Figure 5: Sequence Container for Loading the Subscriber Statistics

Gathering Article Statistics

The last step in the collection process is to retrieve the change counts that are occurring for each article in the publication. Although gathering subscription information is important in setting baselines for merge replication, knowing which articles changed and when they changed can be even more valuable. Gathering article statistics helps you determine what is considered a normal replication event so that you can act on the abnormal events.

Significant changes in an article's change count will typically coincide with changes in its performance. For example, suppose an article logs a 300 percent increase in changes; all the subscribers to that article will likely experience poor performance. The baseline collection will uncover these types of spikes for you.

Gathering article statistics lets you react to spikes, prevent them, or warn the subscribing community before the events occur. Luckily, merge replication exposes article change counts fairly easily through the sysmergearticles, sysmergesubscriptions, and MSmerge_genhistory tables. You can use the query in Listing 3 (below) to query all three tables.

Note that MSmerge_genhistory can be an extremely large table to query. It contains a row for every generation that each subscriber is aware of (within the retention period), so it can grow to thousands or millions of rows, depending on the size of the replication environment. Thus, you need to make sure that this table is properly indexed for the collection you're creating. Also note that placing a shared lock on MSmerge_genhistory for long periods of time can cause problems because it's an internal merge replication table and is required for synchronization and updates.
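
As one illustration, a hypothetical supporting index for the query in Listing 3 might look like the following; measure its maintenance overhead on a test system before adding any index to a merge replication system table.

-- Hypothetical index covering the Listing 3 query's filter and columns.
CREATE NONCLUSTERED INDEX IX_MSmerge_genhistory_genstatus
ON dbo.MSmerge_genhistory (genstatus)
INCLUDE (art_nick, subscriber_number, coldate, changecount);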

The results from the query in Listing 3 are used to populate the MERGE_COUNTS table, which is created by the code in Listing 4. Figure 6 shows the layout for that table.

Figure 6: Layout of the MERGE_COUNTS Table

In SSIS, the process to gather the article statistics is wrapped in the Sequence container shown in Figure 7. The query in Listing 3 is used as the source of the Data Flow task. The final step is importing the results into the MERGE_COUNTS table.

Figure 7: Sequence Container for Loading the Change Counts

Reporting Baseline Information

Given all the data you've collected in these three tables, you can use a simple query to monitor the baseline statistics. The query in Listing 5 (below) retrieves information about how many upload events (inserts, updates, and deletes) occurred. You can import the query results into Microsoft Excel, SQL Server Reporting Services (SSRS), or another reporting solution to build a chart that displays the overall events and noticeable spikes. For example, in the chart in Figure 8, the tall bars on 5/22/2012 represent a spike that warrants further investigation. The other spikes might be known spikes that occur in normal operations; if so, the system owner could let the subscribing community know that synchronizing on those specific days of the week or month might take longer.

Figure 8: Chart Showing Spikes

Gain a Deeper Understanding

Collecting baseline data for a feature, system, or task helps you gain a deeper understanding of how it's performing, as well as how it might react and function in specific situations. When you collect baseline information, you naturally start to understand how something like merge replication works internally, which leads to better designs, maintenance, and scalability.

There's another benefit in collecting baseline data. Imagine that a new SQL Server machine will be introduced into a merge replication landscape and act as an additional subscriber. If baseline data has been collected, it'll be much easier to size and predict the hardware and resource needs of this new SQL Server machine. Similarly, implementing other features such as mirroring is much easier. Knowing how the feature, system, or task runs in every unique situation is a powerful tool to use in the current environment and to accommodate future growth.

Listing 1: Code to Create the SubscriptionHeader Table

CREATE TABLE [dbo].[SubscriptionHeader] (
  [name] [nvarchar](100) NOT NULL
  ,[publication] [nvarchar](128) NOT NULL
  ,[subscriber_db] [nvarchar](128) NULL
  ,[subscriber_name] [nvarchar](128) NULL
  ,[agent_id] [int] NOT NULL
  ,[application_name] [nvarchar](128) NULL
  ,[replnickname] [binary](6) NULL
  ,[schemaversion] [varchar](50) NULL
  ) ON [PRIMARY]
GO

 

Listing 2: Query to Populate the SubscriptionHeader Table

 

SELECT agents.[name]
  ,agents.publication
  ,agents.subscriber_db
  ,agents.subscriber_name
  ,agents.id AS [agent_id]
  ,subnames.application_name
  ,subnames.replnickname
  ,subnames.schemaversion
FROM distribution..msmerge_agents AS agents
-- BEGIN CALLOUT A
LEFT JOIN ..sysmergesubscriptions
-- END CALLOUT A
  AS subnames
  ON agents.subscriber_name = subnames.subscriber_server
GROUP BY agents.[name]
  ,agents.publication
  ,agents.subscriber_db
  ,agents.subscriber_name
  ,agents.id
  ,subnames.application_name
  ,subnames.replnickname
  ,subnames.schemaversion;

 

Listing 3: Query to Obtain the Article Change Counts

 

SELECT NAME AS [Object Name]
  ,MSMerge_genhistory.subscriber_number
  ,IsNull(subs.subscriber_server, 'Unknown') [Subscriber Name]
  ,MAX(subs.last_sync_date) [Last Sync Date]
  ,MAX(coldate) [Generation Date]
  ,SUM(changecount) AS [Change Count]
FROM dbo.MSmerge_genhistory WITH (NOLOCK)
INNER JOIN dbo.sysmergearticles WITH (NOLOCK)
  ON art_nick = nickname
LEFT JOIN dbo.sysmergesubscriptions subs WITH (NOLOCK)
  ON MSmerge_genhistory.subscriber_number = subs.subscriber_number
WHERE genstatus = 1
  OR genstatus = 2
GROUP BY NAME
  ,MSMerge_genhistory.subscriber_number
  ,subs.subscriber_server

 

Listing 4: Code to Create the MERGE_COUNTS Table

 

CREATE TABLE [dbo].[MERGE_COUNTS] (
  [Object Name] [nvarchar](128) NOT NULL
  ,[subscriber_number] [int] NOT NULL
  ,[Subscriber Name] [nvarchar](128) NOT NULL
  ,[Last Sync Date] [datetime2](3) NULL
  ,[Generation Date] [datetime2](3) NULL
  ,[Change Count] [int] NULL
  ,[CollectDate] [datetime] NULL
  ) ON [PRIMARY]
GO
ALTER TABLE [dbo].[MERGE_COUNTS]
  ADD DEFAULT (getdate()) FOR [CollectDate]
GO

 

Listing 5: Query to Monitor the Baseline Statistics

 

SELECT hdr.publication
  ,hdr.subscriber_name
  ,stats.delivery_rate
  ,stats.upload_inserts
  ,stats.upload_updates
  ,stats.upload_deletes
  ,counts.TotalChanges
  ,counts.[Object Name]
  ,CONVERT(VARCHAR(10), InsertDateTime, 121) CollectionDate
FROM MERGE_STATS stats
INNER JOIN SubscriptionHeader hdr
  ON stats.agent_id = hdr.agent_id
INNER JOIN (
  SELECT CollectDate
    ,[Subscriber Name]
    ,[Object Name]
    ,[Last Sync Date]
    ,SUM([Change Count]) TotalChanges
  FROM dbo.MERGE_COUNTS
  WHERE [Last Sync Date] > '1899-01-01'
  GROUP BY CollectDate
    ,[Subscriber Name]
    ,[Object Name]
    ,[Last Sync Date]
  ) counts ON hdr.subscriber_name = counts.[Subscriber Name]
GROUP BY InsertDateTime
  ,hdr.publication
  ,hdr.subscriber_name
  ,stats.delivery_rate
  ,stats.upload_inserts
  ,stats.upload_updates
  ,stats.upload_deletes
  ,counts.TotalChanges
  ,counts.[Object Name]
 

 
