Distributed data environments present a constant challenge. The idea of giving users in the field individual copies of data or subsets of data horizontally filtered for their areas of responsibility is easy to grasp but not necessarily easy to implement. Typically, field users must periodically synchronize their data with a central database to maintain an enterprise data set. The ability of field and headquarters users to update the same records in different copies of the database exacerbates the complexity of such a distributed environment. Add slow modem connections from hotel rooms, and you have the potential for some big headaches.
SQL Server 7.0 adds merge replication to SQL Server 6.5's snapshot and transactional replication. Merge replication is more complicated than the other types but is potentially the most useful for distributed data environments. Of the three types, merge replication provides the highest level of autonomy. Publishers and subscribers can work independently and reconnect periodically to merge their results. If users at multiple sites create a conflict by trying to change the same data element, those changes are resolved automatically. These characteristics make merge replication an ideal solution for applications in which users need full read/write access to local replicas of data in a highly disconnected environment. This article describes a real-world solution we developed for a client using merge replication and shows you how to use the replication wizards to configure replication. (For information on how SQL Server incorporates merge replication, see "Behind the Scenes," page 29.)
The Show Must Go On
Our client operates a popular circus and other live entertainment performances. About 40 users are responsible for various engagements, which headquarters assigns. As headquarters changes assignments, users in the field must adjust their data sets to maintain only records for which they are responsible. The client's previous solution used Sybase's SQL Anywhere, which supports SQL code in a trigger that changes the user's subscription based on the assignments stored in a table in the headquarters database. The client decided to move to SQL Server 7.0 but needed to continue supporting the users' distributed data requirements.
The users need only the data they're responsible for. So when they synchronize their data, they might gain new data or give up data, based on their user ID in the field_user table. Therefore, we needed to filter the data in the field_user table based on user ID, and we needed to enable inserting and deleting of rows in related tables. The solution also had to support the synchronization of field and headquarters data modifications. We believed that dynamic and join filtering, combined with the conflict-resolution capabilities of merge replication, could meet these requirements.
To demonstrate the functionality that the client wanted, we used the Pubs sample database to simulate the operational environment. We added a table called field_user, which matches users with events. Screen 1 shows a partial database diagram of our modified Pubs database with the field_user table and related tables. Using this structure, we set out to demonstrate to our client the power of combining dynamic and join filtering in merge replication.
The Mechanics of Merge
The first step was to create the tables in the database. The SQL script in Listing 1 creates these tables. Before you designate a publisher, you need to specify a distributor for that publisher. Either the local server or a remote server can act as the distributor for the publisher. In this case, use the Configure Publishing and Distribution Wizard to configure your server as both a publisher and a distributor. To start the Configure Publishing and Distribution Wizard, select and double-click a server group, then select and double-click a server. On the Tools menu, point to Replication, then click Configure Publishing and Subscribers.
The next step prompts you to specify a location for the distribution database and the log file or accept the default values, which you see in Screen 2. Then click Finish in the next dialog box to enable both the distributor and the publisher. A message box tells you that the configuration process was successful. Note that designating a server as a distributor uses additional server disk space to store the distribution database and the snapshot of the publication. Also, the replication agents running on the server use additional processor time.
After you set up the distributor and publisher, you have a tool called Replication Monitor installed on your system. Using this tool, you can monitor details about the current activity and the task history of each replication agent. Now you're ready to create publications.
To start the Create Publication Wizard, click the Publisher. On the Tools menu, point to Replication, then click Create and Manage Publications. In the databases and publications list, click the database to create a publication from, then click Create Publication to bring up the Create Publication Wizard welcome dialog box. Click Next to proceed. On the Choose Publication Type screen, choose Merge publication, as you see in Screen 3, page 26. The next dialog box lets you specify subscriber types. In this case, all subscribers are SQL Servers. As Screen 4 shows, Specify Articles lets you choose all or a subset of the articles (tables) for publication. Subscribers can subscribe only to a publication, not to individual articles within a publication.
When publishing a table under merge replication, SQL Server changes the database schema: It identifies a unique column for each row in the table being replicated. This column identifies the row uniquely across multiple copies of the table. If the table contains a uniqueidentifier column with the ROWGUIDCOL property, SQL Server automatically uses that column as the row identifier for the replicated table. Otherwise, SQL Server adds a rowguid column (with the ROWGUIDCOL property) to the table. SQL Server adds an index and the rowguid column, as in Screen 5.
After you specify articles, the wizard prompts you to provide a name and description for the publication, as in Screen 6. Next, build the dynamic and merge-filtering properties into the publication. Select Yes to implement filtering. The next screen confirms your decision to filter the data in the publication. We wanted to filter the data to the subscribers based on rep_id, which maps to the Windows NT logon ID. So we selected the dynamic filtering option, as in Screen 7. With the dynamic filter, all field users can subscribe to the same publication. During synchronization, the Merge Agent uses the subscriber's username, which maps to the value in the rep_id column of the field_user table, to connect to the publisher and transmits only the data specific to that username. We used the intrinsic function SUSER_SNAME() to return the login ID name from a user's SID, as you see in Screen 8.
The next screen leads to the creation of the join filters. Join filters let you extend the filtering of rows in one table to rows in related tables. We had already filtered the field_user table horizontally for publication. We also wanted to include in the publication other tables that contain data related to the data in the field_user table. However, we didn't want the publication to include all the data in the related tables. Join filters let you include only those rows in the related table that you define as relevant to the publication. In Screen 9 you can see that the wizard automatically specifies the necessary joins on all related tables.
On the next screen, you can choose to allow anonymous subscribers or enabled subscribers. We chose enabled subscribers because we wanted to limit access to the publications to only qualified field representatives. Next, a dialog box lets you set the schedule for the snapshot process. The wizard automatically creates a job based on the schedule you set. Before a subscriber can participate in the merge replication process, the subscription must have a starting point. The subscriber must have tables with the same schema and data as the publisher tables. The snapshot process generates the initial synchronization by copying the complete current publication from the publisher to the subscriber.
Now click Finish, and the wizard creates the publication based on the options you selected and returns you to the Create and Manage Publications dialog box, where you'll see under the Pubs database an icon displaying the new pubs_merge article that you created. Congratulations! You created your first merge article with dynamic and merge filtering.
The Subscriber's Side
Let's take a look at the other side of the replication process—the subscriber. This stage of replication is generally self-explanatory, and the Pull Subscription Wizard does an excellent job of walking you through the steps. However, in our scenario, it was important to look more closely at two aspects of security planning: publication access lists (PALs) and agent login security.
PALs. SQL Server lets you determine which logins have access to publications. SQL Server creates the PAL with default logins, but you can add or delete logins from the list. You can view or modify the properties of a publication through the Tools menu: Point to Replication, then click Create and Manage Publications. Select the publication and click Properties and Subscription. In the Properties dialog box, click Publication Access List. We added all the field reps' logins to the Publication Access List tab of the Properties dialog box, which Screen 10 shows.
Agent login security. Replication implements login security by requiring a user to have a valid login account and password to connect to a publisher, distributor, or subscriber. Replication agents, which run under SQL Server Agent, use the associated logins and passwords to connect to the various replication objects and to perform their roles in the synchronization process. On Windows 9x platforms, which our client uses on the client side, SQL Server Agent and the replication agents run under the security account of the user logging on to Windows. On NT platforms, the replication agents run under the login or security context of the SQLServerAgent service.
Dynamic filtering criteria based on the login ID works well for Win9x subscribers, but not for NT subscribers. For NT subscribers, you need to modify the Merge Agent's login properties so that it runs under the account of the user who is logging on to NT. You can access this feature by clicking the Pull Subscriptions folder in the database folder, as in Screen 11, then right-clicking the pull subscription icon and selecting Properties from the object menu. At the Pull Subscription Properties dialog box, click the Security tab. Here, you specify the login account the Merge Agent uses to access both the publisher and the distributor. We used the user's login account.
Whether you're supporting a mobile sales team, a data warehouse, or another distributed environment, replication becomes the keystone in carrying data quickly and reliably throughout the enterprise. SQL Server 7.0 provides a variety of scalable replication solutions that can fulfill the broadest spectrum of information requirements.Corrections to this Article:
- Letters (January 2000)- Correction In "Merge Zone" (November), the database diagram contained an error. The correct diagram has an additional table, field_user, that is used for the dynamic filtering criteria. See sidebar in Letters (January 2000).