\[Editor's Note: Do you have something to share with other Windows NT Magazine readers? We want to know about it. Write for Reader to Reader online, and you can tell others about your NT discoveries, comments, problems, solutions, and experiences. Email your contributions (700 words or less) to [email protected] along with your name and phone number. We edit submissions for style, grammar, and length. If we print your submission, you'll get $100.\]
I recently discovered a major bug involving NT Server 4.0 and SQL Server 6.5. The bug affects the SNMP traps that SQL Server passes to the OS. You would expect two of Microsoft’s most popular and supposedly stable products to have all the major kinks worked out, especially when both incorporate the latest service packs, but I encountered a situation that might make you think otherwise.
I started with a clean install of NT Server without Internet Information Server (IIS). I installed the SNMP Service in the Network Services dialog box, rebooted the machine, and installed Service Pack 3 (SP3); I installed SP4 only after my problem arose. I let the server reboot and then modified the SNMP properties on my server (the traps currently go to an HP OpenView Network Node Manager station). Next, I tested the SNMP Service by stopping it from the Services applet in Control Panel. When I started the SNMP Service again, I saw a cold-start trap on my OpenView box.
Next, I installed SQL Server 6.5 and rebooted the system. I then installed SQL Server 6.5 SP4 and SP5 (I tried each service pack independently and then both sequentially). I rebooted the server and wrote a test SNMP trap statement in ISQL. When I initiated the trap from ISQL, it appeared on my OpenView box.
I then reinstalled the NT service pack, as Microsoft recommends you do after you add other software to an NT server. After I rebooted the system, I went into SQL Enterprise Manager, started ISQL, and raised the trap again. This time, to my surprise, my OpenView box didn't display the trap. To rule out OpenView, I configured my NT and SQL Server system to send its traps to a machine running the snmputil.exe utility from the Microsoft Windows NT Server 4.0 Resource Kit. I started snmputil.exe from a DOS prompt on that machine and sent the trap from my SQL Server box. Still no trap.
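As a rough illustration of the listening test described above, the following Python sketch waits for any UDP datagram and reports whether it looks like an SNMP message. It is not how snmputil.exe works internally and it does not decode traps; the function names `wait_for_datagram` and `looks_like_snmp` are made up for this example. The only protocol facts it relies on are that SNMP traps conventionally arrive on UDP port 162 and that an SNMP message is a BER-encoded SEQUENCE, so its first byte is 0x30.

```python
import socket


def wait_for_datagram(sock, timeout=5.0):
    """Wait up to `timeout` seconds for one UDP datagram on `sock`.

    Returns (data, sender_address), or None if nothing arrived in time.
    """
    sock.settimeout(timeout)
    try:
        data, sender = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return data, sender


def looks_like_snmp(data):
    """Cheap sanity check: SNMP messages start with the BER SEQUENCE tag 0x30."""
    return len(data) > 0 and data[0] == 0x30


if __name__ == "__main__":
    # Assumption: nothing else (such as an SNMP trap service) already holds
    # port 162, and the account is allowed to bind to it.
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listener.bind(("0.0.0.0", 162))
    result = wait_for_datagram(listener, timeout=30.0)
    if result is None:
        print("No datagram received; the trap never reached this machine.")
    else:
        data, sender = result
        kind = "SNMP-like" if looks_like_snmp(data) else "non-SNMP"
        print("Received", len(data), "bytes from", sender[0], "(" + kind + ")")
    listener.close()
```

If a listener like this sees nothing while the trap is raised, the problem lies on the sending side rather than in the management station.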
I even went so far as to use a local network sniffer. At no time did I see an SNMP packet leave the server. I worked with Microsoft support on this problem for a month, but they wouldn't admit that it was a problem. One engineer suggested a workaround that involved writing an elaborate ISQL statement to send the SNMP trap directly to the OpenView box. His solution did work, but it was inconvenient: I would have to write this script for every alarm SQL Server could generate, and it wouldn’t be able to page the DBAs who support the databases if the software encountered a problem. As a result, this solution was not an option. I finally found a Microsoft Support Online article that described this problem and two workarounds, which was surprising considering that the SQL Server support engineers I talked to claimed they’d never heard of the problem. (The first workaround is the one I've already described; I ended up using the second.) After much frustration and conversations with two Microsoft support engineers, I sent Microsoft my broken NT and SQL Server box, built exactly the way I described previously. The lead engineer told me that the server I sent him worked properly with snmputil.exe, which is strange because I had previously tested the system with both HP OpenView and snmputil.exe, and neither had received a trap.
I did find an acceptable solution to the problem, and I have the Microsoft support engineers to thank for it. During our many phone discussions, the engineers pointed me to the Registry key that holds the NT and SQL Server SNMP entries. By then, I had built a new server using the same steps I described previously. However, on this new system, I didn’t reinstall the OS service pack, and this system was sending traps from SQL Server. After comparing the Registry of the new SQL Server system (without the reapplied OS service pack) against that of my old SQL Server system (with the reapplied OS service pack), I realized that the old system wasn’t trapping because the following line was missing from the Registry:
For some reason, when you reapply the OS service pack, the software deletes this Registry entry, which is located under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\SNMP\Parameters\ExtensionAgents. To fix this problem, you must rerun SQL Server setup from the MSSQL\binn directory. From the command prompt, type Setup /t RegistryRebuild = ON. You must type this command exactly as you see it here (the command is case sensitive). This command starts the SQL Server setup process again and inspects and updates the Registry; it doesn’t rewrite your MasterDB or change any SQL Server settings. After the setup program completes, you must reboot your system before SNMP trapping will work. Microsoft hasn’t officially admitted this is a bug, even though I gave the support staff quite a bit of grief about it. After I discovered this fix, I simply dropped the issue with Microsoft support. Let’s hope that Microsoft has these problems fixed for Windows 2000 and SQL Server 7.0.
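The side-by-side Registry comparison that exposed the missing entry can be sketched in Python. This is only an illustration under stated assumptions: the values under ExtensionAgents are assumed to have been copied by hand (or from a Registry export) into two dictionaries, the function name `missing_agents` is invented here, and the paths in the sample data are placeholders, not the real entries for any product.

```python
def missing_agents(working, broken):
    """Return entries found on the working machine but absent on the broken one.

    Each argument maps a value name under ExtensionAgents to its data.
    Entries are matched on data rather than name, because the value names
    are assumed to be simple sequence numbers that can differ between
    machines. The comparison ignores case.
    """
    broken_seen = {data.lower() for data in broken.values()}
    return sorted(
        data for data in working.values() if data.lower() not in broken_seen
    )


# Placeholder data; the vendor and agent names below are hypothetical.
working = {
    "1": r"SOFTWARE\ExampleVendor\AgentA\CurrentVersion",
    "2": r"SOFTWARE\ExampleVendor\AgentB\CurrentVersion",
}
broken = {
    "1": r"SOFTWARE\ExampleVendor\AgentA\CurrentVersion",
}

for entry in missing_agents(working, broken):
    print("Missing on the broken machine:", entry)
```

Running the same comparison after rebuilding the Registry and rebooting should report nothing missing if the entry has been restored.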