Automatic Pilot

Downloads
27523.zip

When you apply snapshot replication to large-scale databases, the step that deletes the target records and the step that bulk copies the source data to indexed target tables can create performance-draining bottlenecks. To eliminate these bottlenecks—and hence optimize the replication—you can add customized steps to the Snapshot and Distribution Agent jobs. In the first article of this three-part series, "Bottleneck Blues," May 2002, InstantDoc ID 24518, I analyzed a snapshot replication's performance and introduced the customized steps. The second article, "Manual Transmission," August 2002, InstantDoc ID 25538, detailed full manual implementation of these steps. In this concluding article, let's look at how you can achieve a fast, reliable, and reproducible customization by using SQL Distributed Management Objects (SQL-DMO) to automate the customization process.

How the Optimization Works

Optimizing snapshot replication improves performance dramatically, but manually customizing Snapshot and Distribution Agent jobs to achieve optimization can be tedious and time-consuming in a large-scale environment that involves many servers. Because you have to manually resynchronize the customization each time you make these changes, the demand on your time is particularly burdensome when you add new publications or reconfigure existing ones.

Manually, the optimization process goes like this. You insert a new step right before the last step of a Snapshot Agent job. The command that this step executes starts a Distribution Agent job. You then insert two new steps into a Distribution Agent job, one before and one after its Run agent step. The before step drops all indexes and key constraints against the subscription tables. The after step recreates the indexes and key constraints that the before step dropped. Figure 1 shows the four steps of the optimized Snapshot Agent job; Figure 2 shows the five steps for each Distribution Agent job. The new customized steps are step 3 in the Snapshot Agent job and steps 2 and 4 in the Distribution Agent job.

As you can see, manual customization in Enterprise Manager involves a fair bit of work: inserting the new steps, entering hard-coded commands or a stored procedure for execution in each new step, and modifying the flow of the new steps and the affected existing steps. What's worse, any mistake or inconsistency that you introduce during customization will break the optimization. Instead of bogging down in these labor-intensive manual steps, you can create an automation tool that makes your customization process efficient, consistent, and reproducible. This automation tool implements the customization tasks through SQL-DMO.

Using SQL-DMO to Automate Customization

SQL-DMO exposes an object model that helps you administer SQL Server programmatically—and therefore automatically. In this model, publication and job objects expose the agent jobs of a publication as properties and provide methods to update the job properties. Let's look at a SQL-DMO application that can programmatically customize any replication agent job on any server. (This application is in the DMO_job.zip file that you can download at http://www
.sqlmag.com, InstantDoc ID 27523. Just click Download the code.) You can add this application to Enterprise Manager's Tools drop-down menu as an external tool. This tool looks and feels like part of Enterprise Manager and lets you perform standard and custom configuration without leaving Enterprise Manager.

Here's how the application works. First, it displays a list of registered servers. For each server the user selects, the application finds and lists the names of all subscriptions to publications created on that server, as Figure 3 shows. In this application, a subscription's listed name is a concatenation of publisher, replication database, publication, subscriber, and destination database names. By using these names, you can find the publication and subscription objects in the SQL-DMO model. In that model, each publication object contains a collection of subscription objects. Each subscription object contains the DistributionJobID property, which you can use to find the subscription object's own Distribution Agent job object. Each publication object also has as an attribute the SnapshotJobID property, which you can use to find the publication object's Snapshot Agent job object. Once you've found these two agent-job objects for each selected subscription, you're ready to insert customized steps into these jobs. Listing 1 shows the code that creates a list of all registered server names. The ListServers() procedure lists registered servers. When a user selects a server, the application records the server name in the sServer variable. Listing 2 concatenates and displays the listed names of all the subscriptions to the publications that were created on the selected server. Figure 3 shows examples of some subscriptions' listed names. To save space, I omitted data-type declarations.

As I mentioned, each subscription's listed name is in the format of server-replication database-publication-subscription. This is a combination of the names of a publishing server, a replication database, a publication, and a subscription, respectively. When a user selects a listed name, the application parses it into server, (replication) database, publication, and subscription names, and saves the parts in the variables sServer, sDB, sPublication, and sSubscription, respectively. For space reasons, I omitted the code that parses a listed name. Listing 3's code uses these saved variables to connect to the publisher and identifies the replication database, publication, and subscription objects in their respective collections. Note that a replication database object has a collection of publication objects named TransPublications as its property, and in turn, each publication object has a collection of subscription objects named TransSubscriptions as its property. Listing 3 shows how to identify a publication object and a subscription object in the TransPublications and TransSubscriptions collections, respectively. Next, the application searches for the selected publication's Snapshot Agent job object and the subscription's Distribution Agent job object by their job IDs, as Listing 4 shows.

After you've located these two agentjob objects, you're ready to begin customizing the jobs. First, you insert a new step into the Snapshot Agent job after the job's second step. This new step becomes the job's third step, and you assign it StepID 3 to indicate its position, as Listing 5 shows. Recall that the command at step 3 is to start one or more Distribution Agent jobs (I show only one job start here). In SQL-DMO, you insert a new step by creating a new job-step object, assigning the step's properties, then adding the new step to the job object's steps. You also have to modify the flow of the second step; in case of failure, its OnFailStep used to flow to the old step 3, but that step is now step 4, as Figure 2 shows. Listing 5 shows the process for creating the third step and modifying the second step.

Next, you customize the selected Distribution Agent jobs, as Listing 6, page 40, shows. Again, I use only one Distribution Agent job. Each job includes two new inserted steps. One new step calls a stored procedure before the original second step to drop indexes and keys on destination tables and save their scripts. After you insert this new step, it becomes step 2. You insert the other new step after the original second step (which is now the third step). The new step calls a remote procedure stored on the subscriber to run the scripts that the first new step (now step 2) saves. This new step becomes step 4.

Mechanically, the code in Listing 6 is similar to the code in Listing 5 (which customizes the Snapshot Agent job), so Listing 6 shows only important lines. You can find the complete listing for this automated SQL-DMO application in the DMO_job.zip file.

Fast Replication, Efficient Production

In a distributed enterprise that involves loosely connected remote locations (which use a T1 or wireless connection) or many servers, a huge synchronized snapshot can easily timeout or run out of off-peak hours. In my tests, the optimization this series describes cuts nearly an hour off the replication time for each server. If your environment can't detect changes and can't drop and recreate production tables easily, snapshot is your only replication choice. With this optimization, you can speed up your snapshot replication by truncating tables, a process that's faster and less destructive for production environments than dropping tables.

Comments

Plain text