Ask the Doctor - 30 Mar 2000

SEND US YOUR TIPS AND QUESTIONS.
For answers to more of your Windows 2000 and Windows NT questions, visit our online discussion forums at http://www.win2000mag.com/support.

When I try to create an Emergency Repair Disk (ERD) for my machine, Windows NT formats the disk, then proceeds to copy the Registry files to the disk. However, at the end of the process, a message states that NT couldn't copy all the information to the disk. Why does this error happen?

This message appears when the Registry on the local machine (on which you're creating the ERD) is larger than the capacity of a 3.5" disk (i.e., 1.44MB). Although the ERD-creation process compresses the Registry files that will reside on the ERD, sometimes even the compressed files don't fit. If you're using the /S option with Rdisk, try running Rdisk without it. Omitting /S might help if the SECURITY hive is big enough to push the Registry backup past the 3.5" disk's capacity. However, if the Registry's SOFTWARE hive (which you can't exclude) is causing the problem, you can't create a complete ERD.

In such a situation, the only ways to create a copy of the Registry are to manually copy the files from the \winnt\repair folder (i.e., the source folder that NT creates the ERD from) or to use the Microsoft Windows NT Server 4.0 Resource Kit's Regback utility to make a backup of the Registry. You can then put the backed-up Registry hive files (e.g., SAM, SECURITY, SOFTWARE, SYSTEM) on a different medium, such as a CD-Recordable (CD-R) disc, DVD-RAM disc, Zip disk, or another system's hard disk. However, this workaround isn't as convenient as having the ERD because NT's Setup Repair process (which uses the ERD during recovery) accepts only the 3.5" disk version of the ERD.
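If you want to know ahead of time whether a set of files will fit on a 3.5" disk, a rough check along the following lines might help. This is only a sketch: the function names are my own, and the capacity figure assumes a standard 1.44MB FAT-formatted disk (2,880 sectors of 512 bytes, 33 of which hold the boot sector, the two FATs, and the root directory).

```python
import os

# Assumption: standard 1.44MB FAT12 floppy layout.
SECTOR_BYTES = 512
TOTAL_SECTORS = 2880
RESERVED_SECTORS = 33           # boot sector + two FATs + root directory
CLUSTER_BYTES = SECTOR_BYTES    # one sector per cluster on this format
USABLE_BYTES = (TOTAL_SECTORS - RESERVED_SECTORS) * SECTOR_BYTES


def bytes_on_floppy(size):
    """Round a file size up to whole clusters, as the disk stores it."""
    clusters = -(-size // CLUSTER_BYTES)  # ceiling division
    return clusters * CLUSTER_BYTES


def fits_on_floppy(file_sizes):
    """Return True if files of the given sizes fit on one 1.44MB disk."""
    return sum(bytes_on_floppy(s) for s in file_sizes) <= USABLE_BYTES


def folder_file_sizes(folder):
    """Return {filename: size in bytes} for the regular files in a folder."""
    return {e.name: e.stat().st_size for e in os.scandir(folder) if e.is_file()}
```

On the machine in question, you could pass the sizes from the repair folder, for example fits_on_floppy(folder_file_sizes(r"C:\winnt\repair").values()), adjusting the path to match your installation.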

Windows NT Service Pack 4 (SP4) and later contain the Setprfdc utility, which you use to assign a preferred domain controller for a site, without using an LMHOSTS file. We're using one domain, with backup domain controllers (BDCs) across the country. We need to be sure that Windows 95 machines are also using the local BDC to authenticate the local users, but we don't want to use LMHOSTS files. What can we do?

The Setprfdc utility is NT-centric. You can use other methods to control the secure channel establishment and domain controller validation, including #PRE and #DOM tags in the workstation's LMHOSTS file. However, you mention that you don't want to use this method, so you might consider an alternative approach such as setting M-node resolution on the Win95 clients (e.g., through a DHCP scope option or a config.pol system policy file to modify the corresponding Registry option related to the WINS node type).

By default, a client sends a broadcast to establish a secure channel with a domain controller and authenticate a logon. However, clients don't wait very long for a response. Because of this impatience, a client will often end up using WINS to discover a domain controller, a process that might return the address of a domain controller located across a slow WAN link. However, if you set M-node name resolution—which uses broadcasts first and point-to-point name server resolution second (i.e., the opposite of the default H-node type for WINS-enabled clients)—the client will wait longer for a reply. Therefore, a BDC on the local LAN segment will be more likely to respond to the client authentication request. (For more information about this topic, see the Microsoft article "Secure Channel Manipulation with TCP/IP" at http://support.microsoft.com/support/kb/articles/q181/1/71.asp.) Note one caveat regarding M-node resolution: The increased broadcast traffic that this setting causes makes it more appropriate for smaller branch-office LANs than larger LAN segments with many machines.
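To keep the node types straight, the following sketch lists each NetBIOS node type with the order in which it tries its resolution methods, plus the numeric value that DHCP option 46 (NetBIOS over TCP/IP node type, per RFC 2132) uses for it. The dictionary and function names are illustrative only; they aren't part of any Microsoft tool.

```python
# Node type -> (DHCP option 46 value, resolution methods in the order tried)
NODE_TYPES = {
    "B": (1, ["broadcast"]),
    "P": (2, ["name server"]),
    "M": (4, ["broadcast", "name server"]),
    "H": (8, ["name server", "broadcast"]),
}


def dhcp_option_46(node):
    """Return the DHCP option 46 value for a node type letter."""
    return NODE_TYPES[node.upper()][0]


def resolution_order(node):
    """Return the resolution methods a node type tries, first to last."""
    return list(NODE_TYPES[node.upper()][1])
```

For an M-node scope, dhcp_option_46("M") gives the value 4, and resolution_order("M") shows broadcast ahead of the name server, which is the reverse of H-node.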

I understand that if a primary disk fails, Windows NT's drive mirroring (i.e., RAID 1) capability automatically fails over the system to the shadow disk. However, when one of my servers recently experienced a primary disk failure, the system failed to boot. I reconfigured the shadow disk as the primary disk, but the system still wouldn't boot. When I tried to recreate the boot sector by installing NT on the shadow disk (which I jumpered to be the primary disk), NT told me the disk was a member of a fault-tolerant disk set and wouldn't let me perform the installation. Are these bugs in NT or in my disk subsystem?

The behavior you describe falls more squarely into the category of feature than bug—at least as far as Microsoft is concerned. This feature is an important but highly misunderstood NT behavior, so I'm devoting a good portion of this month's column to your question.

First, I'd be willing to bet that your failed system used IDE disks instead of SCSI. If the disks were SCSI, they probably weren't on SCSI IDs 0 and 1 or they weren't numbered consecutively. I'm not sure about your scenario, so I'll start with the IDE scenario. NT's software-based RAID 1 fault tolerance (which NT's ftdisk.sys driver provides) allows automatic failover only on SCSI disk controllers and only under certain circumstances—specifically, when the primary and shadow disks are on IDs 0 and 1, respectively. Although you might assume that a similar event occurs on IDE disks that you've configured as master (i.e., primary) and slave (i.e., shadow) on the same channel, such is not always the case. IDE devices aren't logically independent, as SCSI devices are. On IDE devices, a master/slave relationship exists between disks on the same channel. The ability of the system to properly recognize the slave typically depends on the presence and proper operation of the master disk. If the master disk isn't present or isn't functioning properly, the slave disk won't be able to function. As a result, to satisfy IDE's physical configuration requirements, you might need to replace the failed primary disk or re-jumper the slave disk as an IDE master (or standalone/single) disk.

General limitations in NT's software-based disk mirroring might also be causing your problems. Microsoft doesn't officially support or recommend relying on NT's software-based RAID to keep the system bootable; mirroring the boot partition protects only the data on that partition, not the system's ability to boot. Although failover might work, many scenarios exist in which it won't work, as you've experienced. Your problems are common because NT's disk mirroring doesn't mirror the primary disk's Master Boot Record (MBR) or partition table entries. On an x86-compatible system, the MBR code is responsible for locating and passing control to the boot sector on the currently active partition. Because the MBR is an essential element of the boot process, you won't be able to boot from a disk without it. (An exception exists: When the shadow disk was at one time a bootable disk with a disk-partitioning scheme similar to that of the primary disk, the shadow disk might already have the MBR code necessary to boot from that disk. However, if you mirrored onto a shadow disk that was originally a clean disk, you're probably out of luck.)
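To make the MBR's role more concrete, here is a minimal sketch that reads the partition table from a 512-byte MBR sector and finds the active partition, which is the lookup the MBR code performs at boot time. It assumes the conventional x86 MBR layout (four 16-byte entries starting at offset 446, followed by the 55 AA signature); the function names are hypothetical, and how you obtain the sector bytes from a disk is left out.

```python
import struct

MBR_SIZE = 512
TABLE_OFFSET = 446      # partition table follows the boot code
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80      # status byte value that marks the active partition


def parse_partition_table(sector):
    """Return the four partition table entries from an MBR sector."""
    if len(sector) != MBR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # status, (CHS start skipped), type, (CHS end skipped), LBA start, length
        status, ptype, lba_start, sectors = struct.unpack("<B3xB3xII", raw)
        entries.append({
            "index": i + 1,
            "active": status == ACTIVE_FLAG,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return entries


def active_partition(sector):
    """Return the first in-use entry flagged active, or None if there is none."""
    for entry in parse_partition_table(sector):
        if entry["active"] and entry["type"] != 0:
            return entry
    return None
```

A shadow disk that started life as a clean disk would typically fail this kind of check, because it has no valid boot code or active-partition entry of its own.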

If this scenario describes your situation, you'll probably need to use a special NT boot disk to regain access to NT. This special bootable 3.5" disk contains a boot.ini file with an Advanced RISC Computing (ARC) pathname that tells the boot loader to start NT from either the primary or shadow disk, so you can boot from the 3.5" disk and access your NT installation on the working shadow disk. (For information about creating an NT boot disk, see the Microsoft article "Creating a Boot Disk for an NTFS or FAT Partition" at http://support.microsoft.com/support/kb/articles/q119/4/67.asp.) Even if you don't have an NT boot disk handy, you can use another system that has a similar disk configuration to create one. When you can successfully boot into NT, your next step is to use Disk Administrator to break the mirror, then reestablish it after you install the replacement disk.
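As an illustration of what such a boot.ini file might contain, the sketch below builds ARC pathnames in the multi() form and assembles a boot.ini with one entry per disk. Treat the details as assumptions: the helper names are mine, the rdisk numbers depend on how your controller enumerates the disks, the WINNT folder name must match your installation, and a SCSI controller whose BIOS is disabled needs the scsi() form of the pathname instead.

```python
def arc_path(rdisk, partition=1, system_root="WINNT"):
    """Build a multi()-style ARC pathname; partition numbers start at 1."""
    return f"multi(0)disk(0)rdisk({rdisk})partition({partition})\\{system_root}"


def boot_ini(entries, default=0, timeout=30):
    """Build boot.ini text from a list of (label, arc_path) pairs."""
    lines = [
        "[boot loader]",
        f"timeout={timeout}",
        f"default={entries[default][1]}",
        "[operating systems]",
    ]
    for label, path in entries:
        lines.append(f'{path}="{label}"')
    return "\r\n".join(lines) + "\r\n"


# Hypothetical two-disk mirror: rdisk(0) is the primary, rdisk(1) the shadow.
EXAMPLE = boot_ini([
    ("Windows NT (primary disk)", arc_path(0)),
    ("Windows NT (shadow disk)", arc_path(1)),
])
```

Putting both entries on one disk gives you a boot menu to choose from; keeping a separate disk for each entry works just as well.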

If your system uses a SCSI-based disk subsystem and won't fail over to the shadow disk after a primary disk failure, you might have an addressing problem. You need to ensure that you configure the primary and shadow disks as SCSI IDs 0 and 1, respectively. Even if the failover occurs successfully, another potential cause for your nonbooting shadow disk exists: differing disk geometry translations of the BIOS for the primary vs. the shadow disk. Whether this problem occurs depends on the particular system BIOS and disk controller in use, but the resolution is typically the same in each case: You need to reconfigure the shadow disk so that it occupies the same logical position as the failed primary disk (e.g., SCSI ID number, master/slave IDE configuration). After you perform this reconfiguration, the BIOS translates the repositioned shadow disk and uses the same geometry as the drive that formerly occupied the same position, and your system will boot properly.

These limitations underscore the advantages of hardware-based RAID solutions. Hardware RAID controllers that provide mirroring render all the previous steps unnecessary—the system automatically fails over to the shadow disk, without requiring user intervention. Most hardware RAID solutions convince NT that the set of mirrored drives is one drive—they essentially abstract the set's individual members from NT. Also, most of these devices mirror the MBR as well as the disk's other data, so the shadow disk is a true image of the primary disk. And boot.ini file pathnames always point to the same drive, so NT boot disks aren't usually necessary during the recovery process. However, you might want to keep one handy, just in case the MBR, boot.ini, NT Loader (NTLDR), or some other essential boot component on the disk ever becomes deleted or damaged. If you decide to use NT's software-based disk mirroring, make sure that you have two NT boot disks handy—one that you've configured to boot from the primary disk and another that you've configured to boot from the shadow disk.

Finally, regarding your inability to install NT to the shadow disk, NT Setup doesn't allow installation to fault-tolerant disk sets. If you want to break the mirror set outside of NT, you can use the Microsoft Windows NT Server 4.0 Resource Kit's Disksave utility. Disksave runs under MS-DOS, and you need to place it onto a bootable DOS disk. After you boot the system from the 3.5" disk, you can use Disksave's Disable fault tolerance on the startup disk (F6) option to manually change the fault-tolerance flag that identifies the volume as a fault-tolerant disk set member. Performing this change causes NT to see the disk as a regular independent disk. Now you can use Setup to install NT on the system. However, if you use the NT boot disk I discussed earlier, you don't need to worry about restoring the MBR onto the shadow disk; you can simply boot the system from the 3.5" disk until you replace the primary disk and reestablish the mirror.

How do I disable the Dr. Watson crash debugger utility and reenable it after another utility (e.g., Symantec's Norton Utilities) replaces Dr. Watson with its own debugger? Also, how can I control the way Dr. Watson creates its log files?

You can control some of Dr. Watson's options through the program's GUI, but you need to control others directly in the Registry. To change options such as the logging directory for the program's primary log file (drwtsn32.log) and the binary crash dump file (user.dmp), simply fire up Dr. Watson (drwtsn32.exe) and set the options in the program's configuration dialog box, which Screen 1 shows. To change options such as which debugger Windows 2000 (Win2K) or Windows NT currently uses, you need to edit the Registry.

In the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug Registry subkey, the Debugger Registry value (type REG_SZ) defines the current debugger on Win2K and NT. By default, the system sets this option to a value of drwtsn32 -p %ld -e %ld -g, which instructs the system to use Dr. Watson as the debugger. However, this value might be different if you've installed an alternative debugger (e.g., WinDbg, NTSD) or a third-party utility (e.g., Symantec's Norton CrashGuard) that replaces Dr. Watson. To restore Dr. Watson as the default system debugger, simply change the value back to the original default value.

Another value under the AeDebug subkey that might interest you is Auto. This value controls whether Dr. Watson runs automatically when an application error occurs. The default value of 1 causes Dr. Watson to run automatically when an application crashes. If you set the value to 0, the system notifies you when the application error occurs, but the debugger doesn't run automatically.
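If you'd rather keep these settings in a file you can reapply, a sketch like the following generates the text of a .reg file for the AeDebug subkey. It rests on a couple of assumptions that you should verify on your own system: that the file uses the REGEDIT4 text format, and that Auto is stored as a string (REG_SZ) value like Debugger. The function names are hypothetical.

```python
AEDEBUG_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT"
    r"\CurrentVersion\AeDebug"
)
DR_WATSON = "drwtsn32 -p %ld -e %ld -g"   # default Debugger value


def reg_quote(value):
    """Quote a string for a .reg file, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def aedebug_reg(debugger=DR_WATSON, auto=True):
    """Return .reg file text that sets the Debugger and Auto values."""
    auto_value = "1" if auto else "0"     # assumption: Auto is a REG_SZ value
    lines = [
        "REGEDIT4",
        "",
        f"[{AEDEBUG_KEY}]",
        f'"Debugger"={reg_quote(debugger)}',
        f'"Auto"={reg_quote(auto_value)}',
        "",
    ]
    return "\r\n".join(lines)
```

Calling aedebug_reg() yields the default Dr. Watson settings, and aedebug_reg(auto=False) yields a file that leaves Dr. Watson registered but stops it from running automatically.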

If you want to disable Dr. Watson, simply delete the AeDebug key. However, I strongly recommend that you first save this key's information; you might want to restore Dr. Watson later. To do so, highlight the AeDebug key, choose Export from the Registry menu, and give the saved key a filename (e.g., aedebug.reg, drwtsn32.reg).

Reenabling Dr. Watson after you've disabled it is a two-step process. First, at a command prompt, type

drwtsn32.exe -i

or run the command from Start, Run. Then, double-click the .reg file you saved.
