Active Directory (AD) is an important technology. However, AD presents certain problems that can negate the service’s power and strength. Learn how to troubleshoot common AD annoyances such as time synchronization after hardware replacement, cross-forest authentication, usability of the least privilege model, and 64-bit Windows challenges.
Over the past eight years I’ve helped plan, implement, and operate various Active Directory (AD) infrastructures. And as much as I value AD’s power and strength, I’ve also learned quite a few annoying things about AD that sometimes prevent it from operating as smoothly as possible. In this article I discuss some of these annoyances and explain how to best work around them.
In general, all AD domains are rather tolerant to hardware problems that take down a single domain controller (DC). Of course this is only true if you follow the best practice of implementing more than one DC per domain and if you continuously monitor that they’re replicating the changes amongst themselves. This way, if one DC fails for some reason, clients wanting to authenticate to the domain will leverage DNS to find another DC in the network to connect and authenticate to. For normal operations, no problems occur even if one of the special Flexible Single-Master Operation role-holder DCs goes down for a few hours or even a few days. AD is designed to operate without all the FSMO DCs being available all the time. Obviously, you shouldn’t update your schema or mass-create new objects in your domain when specific FMSO DCs are down. But normal operations, such as users changing their passwords or administrators adding an occasional new object to the domain, will still run. This is one of the key strengths of AD and its multi-master replication model.
But sometimes it isn’t the hardware failure of the DC that causes a problem. Sometimes problems don’t start until you repair the hardware and reboot the DC—especially if the DC is your domain’s PDC emulator. By default, all DCs in an AD domain synchronize their time with the PDC emulator of the respective domain. Computers and servers joined to the domain then synchronize their time with the DC they use for authentication—usually a DC within their AD site. For Kerberos authentication to work, all these clients and DCs must be synchronized in time. (In an AD domain, Windows 2000 Server and later clients and servers leverage Kerberos by default.) If the time skew (difference) is too large between a client and the server it wants to access a resource from, such as a file share, authentication to the resource server fails. The default accepted time skew in an AD forest is five minutes. So even if a user or computer properly authenticates to a domain, it might fail to access a server because of a time difference.
What does all of this have to do with the hardware failure of a DC in your domain, potentially even the PDC? Quite simple: If your hardware repair involves replacing a server’s motherboard, you usually also replace the on-board clock. And it’s highly unlikely that the time set on the new motherboard’s system clock is in sync with the rest of your AD forest. If you then just reboot the PDC while it’s on the network, the other DCs will synchronize their times with the PDC when they see that it’s online again. Thus, you might introduce a time skew on various machines in your environment that’s unacceptable to Kerberos. Although your PDC might have been properly configured to replicate with an external time source, the wrong time has now made its way into your network and will cause problems such as Microsoft Exchange Server servers not being able to leverage Global Catalogs (GCs) in their site for LDAP lookups, or users not being able to access file shares. Your environment might not normalize for hours or days and might even require manual intervention.
The solution to this problem is as simple as the problem itself: If you need to replace a DC’s motherboard, particularly for a broken DC that hosts the PDC emulator FSMO role, remove the network cable before you reboot the DC. After the DC reboots successfully (which might take longer than normal because the DC won’t be able to find other DCs to replicate with), you need to log on locally and update the time on the DC. Afterwards, you can plug the network cable back in. Alternatively, if your PDC is still responsive, you can temporarily transfer the PDC role to another DC and transfer it back after replacing the motherboard. These methods prevent time-synchronization problems that in turn cause trouble with network authentication.
It’s difficult enough for most AD administrators to understand how clients leverage DNS to locate DCs of their own forest or domain within their own network. So before we look at how this process might work across different AD forests and networks, let’s quickly review the DC location process within an AD domain.
In short, a Win2K or later client that has never authenticated to an AD domain will query DNS to ask for any DC that’s responsible for its own domain. The client does so by asking the DNS server to return the list of all DCs that have registered the generic DC locator record (which by default includes all DCs in an AD domain). To retrieve these records, the clients first query for the generic LDAP service records in the DNS hierarchy’s _msdcs zone. For an AD domain called MyCompany.net, these generic records are located in the following DNS hierarchy: _ldap._tcp.dc._msdcs .MyCompany.net.
The client then contacts a few of the DCs in the list returned from the DNS server, notifies them of its intention to be authenticated, and waits for the first DC to respond. However, the DCs are smart enough to understand the situation and therefore check the IP address that the client is using in its request. They see that the client is joined to the domain, and they compare the client’s IP address with the site and subnet data stored in the AD configuration partition. With this data, the DCs determine the client’s proper site, and they tell the client to connect back to the DNS server and query for the actual DC to authenticate to in its own site. The client then requests the proper Kerberos service record.
Let’s assume the client is located in a branch office of the MyCompany.net AD domain and the AD site name is BranchSite. The client would query DNS for the sitespecific Kerberos service records registered for this site. These records are located in the following DNS hierarchy: _kerberos._tcp .BranchSite._sites.dc._msdcs.MyCompany .net. Figure 1 also shows this hierarchy, from the Microsoft Management Console (MMC) DNS snap-in.
The DNS server will then return only DCs that are responsible for the client’s site, which the client in turn leverages to authenticate to the domain. Fortunately, the client stores the information for the last AD site it belonged to in the registry and leverages this information directly the next time it needs to locate a DC. To find the AD site name that a client cached for itself, go to the HKEY_LOCAL_MACHINE\SYSTEM\Current ControlSet\Services\Netlogon\Parameters DynamicSiteName registry subkey.
Even after a user authenticates to his or her proper domain, cross-domain resource access involves a few more steps in multi-domain forests. Part of the DC locator process is repeated when the user accesses resources in another domain in the forest. Although all the domains in a forest trust one another, the Kerberos Ticket Granting Ticket (TGT) that the client received from its own DC at logon is valid only for requesting service tickets that in turn grant access to resources in the client’s own domain. When a user accesses a resource in another domain in the same forest (e.g., a file server), the client again first queries DNS to locate a DC of the file server’s domain to request a TGT that’s valid in this domain. The good thing is that the client will immediately find the correct DC to use. Because the client already knows what site it’s in, it uses a site-specific DNS query (such as the one I described) to locate a DC of the other domain.
One reason this process works so efficiently within a forest without the need to query for the generic DC locator records of the other domain is that all domains in the same forest replicate and use the same AD configuration partition. This partition also contains the site and subnet information, so that DCs from any domains in the forest properly register locator records for the respective AD sites. If no DC exists in an AD site, or if a site doesn’t have a DC for every domain in the forest, the AutoSiteCoverage mechanism ensures that the closest DC will register a locator record in DNS. This means that within its forest, a client can always locate a site-specific DC for any domain in the forest. The client doesn’t have to leverage the generic DC locator records, which might direct the client to a DC on the other side of the world to authenticate with. However, remember that this process assumes that the site-specific DCs are available—if not, the client will fail back to leveraging the generic DC locator records and might therefore experience slow authentication and GPO processing.
Suppose that you’ve just acquired another company that has its own AD forest. To efficiently allow collaboration between the employees of both companies, you decide to establish a trust across both forests. Your plan might be to later consolidate both forests; however, a forest trust is often the first step to allow access to resources in different forests for both parts of a merged company.
Continued on Page 2
The annoying thing about cross-forest authentication is that the client frequently fails to locate the correct DC in a trusted forest and is instead authenticated by a random DC—often not the DC you prefer the client to use. The good news is that you can easily solve this problem.
Similar to the cross-domain DC location process that I already described, when a client accesses a resource across a forest trust, the client also queries the trusted forest’s DNS servers for the correct DCs for authentication purposes. The key thing to understand here is that the client queries DNS again for the site-specific Kerberos service records. The client does so by combining the site name from its own domain (MyCompany.net) with the domain name of the trusted forest’s domain (OtherCompany .net) and would thus query in the following DNS hierarchy: _kerberos._tcp.BranchSite._sites.dc._ msdcs.OtherCompany.net. If no such AD site exists in the trusted forest, which is highly likely for a forest designed by a different AD team, the DNS query fails and the client must request the generic DC locator records. The client might then be authenticated by DCs of the trusted forest that are less than ideal to use, which will effectively slow down the cross-forest access to data and applications.
To prevent this problem, simply ensure that you create AD sites with the same names in both forests. Consider these new sites to be “shadow” sites of the trusted forest. You don’t need to add new IP subnets to these sites for this solution to work, nor do the shadow sites need to contain real DCs. Instead, create a dedicated site link between each “shadowsite- for-MyCompany” and the closest “realsite- of-OtherCompany” connecting the two sites. Make sure no other sites are joined to this site link and that the shadow sites themselves aren’t joined to any other site links. Doing so will guarantee that the proper DC of the OtherCompany.net forest adds the required SRV records to the respective shadow site via AutoSiteCoverage. These shadow sites will now ensure that your clients leverage the correct and closest DCs for authentication when accessing resources across a forest trust.
Enhancing the Usability of
the Least Privilege Model
As companies have become more sensitive to IT security, AD administrators have begun to use at least two accounts with different privileges: one account for logon to clients and to perform regular office tasks (this account doesn’t have any administrative rights in AD) and another account with sufficient privileges in AD to perform administrative tasks such as user and group management. As is often the case, increasing security doesn’t make the processes more user friendly for the end user. In this case the end user is the AD administrator. Handling multiple accounts can be a hassle even when using the different OS features, such as right-clicking the MMC Active Directory Users and Computers snap-in and selecting Run as to start the tool with the administrator account. Although this method works, I find it particularly annoying that the Run as dialog box used to enter the administrative credentials never remembers my AD account—I need to enter it for every situation in which I need to elevate my privileges.
My favorite solution to this challenge is to set up a centrally hosted terminal server where all relevant administration tools are installed. AD administrators can then connect to this terminal server, authenticate with their administrator account, and perform all required administrative tasks through the terminal server session. There’s no requirement to use Run as, and I can even use RDP files on my desktop that are preconfigured with the correct account name. Of course you shouldn’t opt to store the administrator’s account password when saving the RDP files. Administrators can also use a central terminal server to connect from any client to perform their duties—the Windows Server Administration Tools Pack (adminpak.msi) tools don’t have to be installed locally on the client. Many environments, however, might be too small to allow implementation of a terminal server purely for administrative purposes, or other reasons might exist for running the AD management tools from your local machine.
Another option to avoid the tedious and error-prone task of right-clicking the icons and entering your credentials is to create your own shortcuts that execute the Runas utility directly, fully leveraging its command-line options as well as those of the snap-ins themselves. And if your administrator accounts have a clear prefix or postfix added to the normal user account, you can leverage simple things such as the expansion of the USERNAME environment variables to further simplify managing shortcuts for multiple users.
As an example, my normal (unprivileged) user account might be called GUIDOG, whereas my administrative account is ADM.GUIDOG. To allow multiple administrators to use the same shortcut on different machines, I need to expand various environment variables in the shortcut I create, which is achieved by enclosing the respective environment variables between two percent signs. The following is a sample command for a shortcut I’d use to start the Active Directory Users and Computers snap-in with the administrative account and request that it binds to a specific DC in my domain:
%windir%\system32\runas /env /
“mmc %windir%\system32\dsa.msc /
When I create the shortcut for this command, I can further enhance the warning effect and notify myself that I’m switching to a higher privileged account by using the color options of the shortcut itself—as for any command prompt, you can also configure background and font colors for a shortcut. Figure 2 shows the window that appears if I double-click the user GUIDOG shortcut. The command prompt’s red background color warns me that I’m about to elevate my privileges. Because the Run as command in the shortcut appropriately expands %userdomain% ADM.%username%, I no longer have to enter my account name. Instead I’m prompted to enter my password for the CORP\ADM.GUIDOG account right away.
Because Windows shortcuts are real files (with .lnk extensions), you can copy them to an appropriate share that’s available to any other administrator in your AD forest. So if a different user, JOER for example, also has an administrative account using the same naming convention, he can use the same shortcut for launching the Active Directory Users and Computers snap-in and will be asked to authenticate as CORP\ADM.JOER.
64-Bit Windows Challenges
The current trend is a strong move toward 64-bit Windows versions, especially for applications that can benefit from the increased memory space that 64-bit OSs offer. AD is one of those applications—if your AD database doesn’t fit within the memory limitations that it incurs on 32-bit Windows Server versions (which Table 1 shows), upgrading your DCs to 64-bit Windows on hardware with sufficient physical memory can greatly improve AD’s performance, as well as the performance of dependent applications (e.g., Exchange).
I won’t go into the details of when and where you should consider leveraging 64- bit Windows for your DCs. In general, you won’t run into any problems combining 32-bit DCs with 64-bit DCs in your AD forest (e.g., leveraging Windows Server 2003 x64). No 64- bit–specific schema extensions are required, and 64-bit DCs replicate just fine with their 32-bit counterparts. Of course you need to ensure that you have proper versions of your drivers, antivirus software, and monitoring agents that support your 64-bit Windows OS. However, some of the Microsoft ADrelated management tools that used to run on your DCs might no longer function. The most important ones that I know of are the AD Replication Monitor support tool (Replmon), the Group Policy Management Console (GPMC), and the Active Directory Migration Tool (ADMT) Password Export Server (PES) service. Most 32-bit applications run with no problem on a 64-bit Windows OS because they leverage the OS’s Windowson- Windows 64-bit (WoW64) compatibility feature. Replmon, GPMC, and ADMT PES are true exceptions to this rule because they have other dependencies that WoW64 can’t meet. Replmon and GPMC run without a problem on a 32-bit member server and can thus be further leveraged in a pure 64-bit AD forest and connect remotely to the 64-bit DCs. ADMT PES must be installed on a DC, so you either need to have a 32-bit DC that you can leverage for this service, or you need to wait for the upcoming ADMT release—which is slated for release with Windows Server 2008 and will include a 64-bit version of the PES service.
Forewarned Is Forearmed
AD is a powerful directory and authentication service. Its multi-master replication model ensures the high availability that companies request from such an important service. But sometimes the small things that might not work as expected are the ones that cause the most pain for an administrator. Knowing about some of these annoyances ahead of time will hopefully help you avoid them in your own infrastructure.