How to Fix an Unbalanced DAG

It is an undeniable fact of system administration that programs and data operate most neatly just after they are first deployed. Over time, components have a nasty habit of degrading, of drifting into less-than-optimal configurations, or simply not working as well as they might. And so it is with Database Availability Groups (DAGs), which brings us nicely to RedistributeActiveDatabases.ps1, a script that’s provided for your use with the Exchange 2010 kit.

DAGs have proven to be the big success story for Exchange 2010 and are a major motivating factor in the decision that many companies have made to upgrade from previous releases. For the first time, Exchange includes native high availability features that scale past two nodes, a limitation that’s been in place ever since the introduction of the original “Wolfpack” clusters on Windows NT 4.0 with Exchange 5.5 in late 1997. DAGs scale to sixteen nodes, a limitation imposed by the underlying Windows Failover Clustering technology, and can accommodate hundreds of active databases if you run the enterprise edition of Exchange 2010. The standard edition supports DAGs too, but a server can only mount up to five databases at a time rather than the hundred supported by the enterprise edition.

So far, so good. To get back to the point at hand, DAGs operate spiffingly well when they are first deployed and the active databases all run on their preferred server. The Active Manager component, which runs on every server in a DAG, attempts to mount databases on the most preferred server and uses a property of a database copy called the activation preference to help make this decision. For example, if a database copy on server ExServer1 has the highest activation preference (a value of 1, as lower numbers are more preferred), then Active Manager knows that, all things being equal, if it can make that database copy active, it should do so. Those who design DAGs pay attention to the activation preferences assigned to database copies to ensure that active database copies are distributed evenly across the available DAG member nodes.

Of course, things don’t always run smoothly and over time there’s a fair chance that the database copies with the highest activation preferences can’t be brought online for some reason. For instance, a server might be taken offline for maintenance such as the installation of a new service pack or roll-up update. In these cases, Active Manager will do the best that it can by selecting the available database copy with the highest activation preference. If you leave Exchange to do its thing, it’s possible to end up with an unbalanced DAG where some servers host more active database copies than others because of a series of non-optimal activations. You might have one server hosting four active database copies and another hosting ten while a third is not doing too much because it only hosts one. With fifteen active databases spread across three servers, the ideal condition is for each server to host five, as this divides the workload evenly.
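To make the drift concrete, here is a minimal Python sketch of the selection rule described above: for each database, pick the available copy with the most preferred (lowest-numbered) activation preference. The three-database layout and the function names are invented for illustration, and the model ignores everything else that the real Active Manager might weigh when it chooses a copy.

```python
from collections import Counter

# Hypothetical layout: database -> {server: activation preference}.
# Preference 1 is the most preferred copy.
COPIES = {
    "DB1": {"ExServer1": 1, "ExServer2": 2, "ExServer3": 3},
    "DB2": {"ExServer2": 1, "ExServer3": 2, "ExServer1": 3},
    "DB3": {"ExServer3": 1, "ExServer1": 2, "ExServer2": 3},
}


def pick_active(copies, offline):
    """Return {database: server}, choosing the lowest-numbered
    activation preference among servers that are not offline."""
    active = {}
    for db, prefs in copies.items():
        candidates = [(pref, server) for server, pref in prefs.items()
                      if server not in offline]
        if candidates:  # a database with no available copy stays unmounted
            active[db] = min(candidates)[1]
    return active


def actives_per_server(active):
    """Count how many active databases each server hosts."""
    return Counter(active.values())


all_up = pick_active(COPIES, offline=set())
one_down = pick_active(COPIES, offline={"ExServer1"})

print(actives_per_server(all_up))
print(actives_per_server(one_down))
```

With every server available, each one hosts a single active database. Take ExServer1 offline and DB1 lands on ExServer2, which now carries two. Repeat that across a few maintenance windows without moving anything back and the lopsided picture described above is the result.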

Although this is a somewhat unrealistic example (hopefully the system administrator would have noticed that one server was doing twice the work that it was supposed to handle), it’s conceivable that it might happen. Handling situations like this is just what RedistributeActiveDatabases.ps1 is designed to do.

Essentially, the script looks at the available DAG members and the active databases that are currently mounted on each server. It then compares the current state with the best possible state using database activation preference as a guide. The best state is obviously to have the database copies with activation preference of one (1) mounted and active, so that’s what the script will attempt to do by activating the necessary databases.
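The comparison between current and ideal state can be sketched in a few lines of Python. This is a toy model of the idea only, not the logic inside RedistributeActiveDatabases.ps1: the database names, the copy layout, and the plan_moves helper are all assumptions made for the example, which reuses the four/ten/one distribution mentioned earlier.

```python
from collections import Counter

SERVERS = ["ExServer1", "ExServer2", "ExServer3"]


def build_copies():
    """Hypothetical DAG: 15 databases, five with preference 1 on each server."""
    copies = {}
    for i in range(15):
        name = f"DB{i + 1:02d}"
        home = i // 5  # index of the server holding the preference-1 copy
        copies[name] = {SERVERS[(home + k) % 3]: k + 1 for k in range(3)}
    return copies


def plan_moves(copies, current_active, healthy_servers):
    """List (database, from_server, to_server) moves that would return each
    database to its preference-1 copy, skipping unavailable targets."""
    moves = []
    for db in sorted(copies):
        preferred = min(copies[db], key=copies[db].get)
        source = current_active.get(db)
        if source != preferred and preferred in healthy_servers:
            moves.append((db, source, preferred))
    return moves


copies = build_copies()

# Unbalanced starting point: 4 actives on ExServer1, 10 on ExServer2, 1 on ExServer3.
current = {}
for name in copies:
    number = int(name[2:])
    if number <= 4:
        current[name] = "ExServer1"
    elif number <= 14:
        current[name] = "ExServer2"
    else:
        current[name] = "ExServer3"

moves = plan_moves(copies, current, set(SERVERS))

# Apply the plan to a copy of the current state to see the result.
after = dict(current)
for db, _source, target in moves:
    after[db] = target

print(Counter(current.values()))
print(moves)
print(Counter(after.values()))
```

In this model, five moves are enough to get back to five active databases per server. Each of those moves is a database transition that users could notice, which is why the timing of a real rebalancing run matters.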

The script supports a number of parameters to control how it operates, all explained in an informative TechNet article. Even equipped with this knowledge, RedistributeActiveDatabases.ps1 is not a script that you’d run on a whim, just because Friday afternoon has turned out to be boring and you have time on your hands. This is definitely a script to test before you can appreciate it, and then to run at a well-chosen time when the resulting database transitions will exert no baleful influence over the affections of users.

DAG transitions are quick. In fact, they’re so quick that most users don’t realize that a transition has occurred, especially if they run Outlook in cached Exchange mode. But it only takes one user who doesn’t like losing service to complain to ruin your day. If only the user realized that you’re only running the script to rebalance the DAG and put everything back to where it should be. Alas, all they care about is a reliable email service. Where’s the fun in that?

Follow Tony @12Knocksinna
