If you've ever managed a large network with multiple Windows NT domains and hundreds of NT servers, you know how many hours you must spend gathering data for management reports and performance baselines. You typically spend a lot of time fighting fires as well. You might be looking at servers in the Eastern domain while potential problems are building in the Western domain. You know third-party tools are available to simplify your job, but management can't see the added value and therefore won't approve the purchase. If you're suffering from this type of crisis-mode administration, consider a simple formula that can change the way you manage your servers: datalog.exe + monitor.exe = economical metrics and reporting.
Datalog.exe and monitor.exe are Microsoft Windows NT Server 4.0 Resource Kit tools that let you capture performance data from your servers without feeling the effect of running NT's Performance Monitor. With these two utilities, you won't need to leave a management server or workstation continually logged on and running Performance Monitor to gather metrics. Likewise, you won't need to drag performance data across your network every 30 minutes to a management server or workstation. Finally, you won't need to worry about losing performance data every time your management server or workstation shuts down or hangs.
Datalog.exe lets you run Performance Monitor as a service, and monitor.exe lets you easily control the Datalog service. Become familiar with these tools, and you'll soon be spending your time designing and optimizing instead of troubleshooting and explaining.
Using the Datalog Service
When you use datalog.exe to run Performance Monitor as a service, you can capture performance data at regular intervals without degrading the target server's performance. Microsoft maintains that gathering data in this fashion adds less than 0.5 percent overhead on the server. This reduced overhead is possible because the Performance Monitor GUI is not running. Instead, the Datalog service collects data from the target server at user-specified intervals and writes the data to a local .log file. As a result of this process, the Datalog service doesn't affect network performance because it doesn't need to send the collected data across the network. Whenever possible, configure the Datalog service when you build the server (i.e., before you bring the server into production) so that you'll have performance data available for tweaking and tuning the server.
Unlike Performance Monitor, the Datalog service doesn't require that you log on as a user to gather data. The service is invaluable for troubleshooting network performance problems and for gathering baseline measurements. However, the data you gather and the frequency of collection can vary widely depending on the type of reporting you are performing.
Setting Baselines and Gathering Reporting Data
When you capture data for baseline and reporting purposes, set the collection interval to every 30 minutes or every hour, depending on the maturity and stability of your servers. Baseline data helps you set performance standards so you can evaluate your servers and network performance. After you determine the average performance baseline for an object or instance, you can use that information as a reference point to determine when tuning or upgrades are necessary. Systems administrators who don't set baselines and then make recommendations to customers or management miss a valuable opportunity to demonstrate effective resource management. By regularly tracking and analyzing performance data against the baselines, you can respond quickly and appropriately to changes in your network. You also have reliable data at your fingertips the next time you broach the subject of additional upgrades or servers.
Another important function, and the one most visible to customers and management, is reporting. Service-level agreements and contracts specify that the service provider report information to the customer on a weekly, monthly, quarterly, or annual basis. Both customers and management want to know whether they are taking full advantage of the existing hardware, how well they're managing their hard disk space, and how much bang they're getting for their hardware expenditures. You can add value to your service by proactively informing your customers or management whether they're using the existing hardware to its full extent. Furthermore, you can use the historical data you collect to predict trends for future growth.
Knowing Which Objects to Log
Determining which objects to track depends on the function of each server. For example, a server running SQL Server has many objects, as you see in Screen 1, that don't appear on a file-and-print server or on a Primary Domain Controller (PDC). Be sure to log the following objects on all servers: memory, processor, system, physical disk, paging file, logical disk, network interface, and objects specific to the function of the server.
Use the collected data from the previous objects to establish baselines, troubleshoot your systems, and prepare customer reports. Periodically compare the data you collect with the baseline standards you see in Table 1. These standards are suggested starting points for interpreting performance data. These counters don't mean a lot when you view them individually. To gain any meaning and insight into your server's performance, you must look at the counters as a whole in relation to the performance baselines you obtain.
Setting Up the Datalog Service
After you determine the objects you want to log, you need to set up the Datalog service. Specifically, you need to perform the following steps:
Step 1: Define the performance metrics. Use Performance Monitor to choose the specific objects you want to log. Then, create a .log file with a valid path and filename to contain the performance data. Set the collection interval to reflect the desired period between collection times; NT measures this interval in seconds.
Step 2: Create the .pmw file. After you create the .log file, select Save Workspace from Performance Monitor's File menu to save a .pmw file, as Screen 2 shows. This .pmw file is the workspace from which the Datalog service will draw its monitoring settings. Enter a path and filename for the workspace file (the path you specify must already exist on the local computer).
Step 3: Copy the .pmw file to the target server. Copy the .pmw file you just created to the \winnt\system32 directory on the target server. To save time, you can copy this same .pmw file to all your target servers; just make sure the specified path exists on each server.
Step 4: Install datalog.exe on the target server. Copy datalog.exe from the resource kit to the \winnt\system32 directory on the target server.
Step 5: Install monitor.exe on the monitoring computer. Copy monitor.exe from the resource kit to the \winnt\system32 directory on your monitoring server or workstation.
Step 6: Start collecting data. Use the appropriate monitor.exe commands to start collecting data, as you see in Screen 3. Table 2 lists the commands that are available with monitor.exe. When you finish collecting data and are ready to view the data log, use Server Manager or the monitor.exe stop command at the command line to stop the Monitor service on the target computer. Move, copy, or rename the .log file to another location so that you don't accidentally overwrite this file when you restart the Datalog service.
Step 7: View the data. Open Performance Monitor, and select Log Data From from the Options menu. Enter the path and filename of the .log file you saved. From the View menu, select Chart. Select Add to Chart from the Edit Menu, and select the objects and instances that you want to view in your chart.
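Put together, the monitor.exe portion of the steps above might look like the following sketch, run from the monitoring computer. The server name \\FILESRV1 is hypothetical, and Table 2 remains the authoritative list of commands and syntax:

```shell
REM Install the Datalog service on the target server (run once),
REM pointing it at the .pmw workspace file you copied in Step 3
monitor \\FILESRV1 setup

REM Begin logging with the settings saved in the workspace file
monitor \\FILESRV1 start

REM ...later, when you're ready to examine the log...
monitor \\FILESRV1 stop
```

Because monitor.exe accepts a computer name, you can drive the Datalog service on every target server from one monitoring workstation without logging on to each machine.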
Controlling the Datalog Service with the Monitor Service
Monitor.exe runs from the command line on your local workstation. You use this utility to start, stop, and set up the Datalog service on the target server.
You use the NT Command Scheduler, which comes with the resource kit, and monitor.exe to turn the Datalog service on and off. I suggest that you create a batch file to gather the .log files and write them to a central repository or database. Log files compress easily, so archiving the data isn't resource intensive. By using such a batch file, you can analyze data from many servers and create enterprisewide reports with your database or spreadsheet application, as Figure 1 shows. You can log data from many servers into one file using Performance Monitor, but monitoring and logging multiple sources across the network can seriously degrade network performance. By setting up an automated logging and archiving routine, you create a performance baseline and reporting mechanism that doesn't require a lot of attention.
For example, imagine you want to log data between 7 a.m. and 7 p.m., Monday through Friday. Begin by using the Command Scheduler to give the monitor start command at 7 a.m. every day, as Screen 4 shows. Schedule the corresponding monitor stop command for 7 p.m. every day. You can write a batch file that copies your .log file to a database server, and use the Command Scheduler to run the batch file every Saturday night. Listing 1 shows an example batch file (this file assumes you've mapped a Y drive to your database server). Then on Monday morning, you just sit down with your cup of coffee and a highlighter pen. You simply run a report from your database of collected information, highlight the problem areas, and begin proactive maintenance before those embers become full-fledged fires.
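A minimal sketch of this routine follows. The server name \\FILESRV1, the log filename, and the Y-drive paths are hypothetical stand-ins; Listing 1 shows the batch file this article actually uses:

```shell
REM Command Scheduler (AT) entries on the monitoring computer:
REM start logging at 7 a.m. and stop at 7 p.m., Monday through Friday
at 07:00 /every:M,T,W,Th,F "monitor \\FILESRV1 start"
at 19:00 /every:M,T,W,Th,F "monitor \\FILESRV1 stop"

REM archive.bat, scheduled for Saturday night: copy the week's log
REM to the database server (Y drive), then clear it for a fresh start
copy c:\winnt\system32\perfdata.log y:\logs\filesrv1.log
del c:\winnt\system32\perfdata.log
```

You can repeat the AT entries with a different server name for each target, so one scheduler configuration covers the whole domain.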
Analyzing the Log Files
After you successfully create a .log file, you have several options for reviewing it. You can use Performance Monitor to open the .log file and chart the data. Likewise, you can open the .log file with Performance Monitor and save the data as a .csv or .tsv file that you import into your favorite spreadsheet or database package to create various graphs and reports. Using this last method, you can then combine this information with historical data. For information about a related resource kit utility that logs data directly to .csv and .tsv files, see the sidebar, "PerfLog Performance Tool," page 178.
A full explanation of how to interpret the data from the various logged objects is beyond the scope of this article. For a list of sources that provide excellent assistance about analyzing performance data, see "Performance Data Analysis Resources," page 179.
Troubleshooting a Server with the Datalog Service
When you use the Datalog service to troubleshoot a server, the type of data you monitor and collect will be particular to the problems the server is experiencing. If you are unsure about the nature of the problem affecting the system, capture data from all available objects and counters on the server for approximately 30 minutes to an hour. Keep the collection interval short (i.e., gather data every minute or less).
After you finish gathering the data, analyze the associated logs and pinpoint possible problem areas. Then refine the data capturing process by gathering information that relates only to specific objects and at longer collection intervals. For example, if the preliminary data indicates your server is encountering a processor-related bottleneck, gather data on the Processor object and System object in subsequent logs. Monitor the logs over an extended period at 5-minute or 10-minute intervals. If you use shorter collection intervals, the .log files become large and might become unmanageable.
Evaluating Hard Disk Space Considerations
Make sure you have enough hard disk space available to store the .log file when you are monitoring your systems with the Datalog service. The log files grow more quickly when you monitor and log data for several objects at frequent collection intervals. For example, I recently needed to troubleshoot a problematic server, so I collected information about all objects on this system at 5-minute intervals. The resulting .log file grew by approximately 16MB per day.
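You can estimate growth for your own configuration before committing disk space. The back-of-the-envelope arithmetic below is a sketch: the per-sample size is an assumption derived from the 16MB-per-day figure above, and your own logs will vary with the number of objects you track.

```shell
# Samples collected per day at a 5-minute (300-second) interval
samples_per_day=$((86400 / 300))            # 288 samples/day

# Approximate size of one sample implied by ~16MB (16384KB) per day
kb_per_sample=$((16384 / samples_per_day))  # roughly 56KB per sample

# Projected daily growth if you tighten the interval to 150 seconds
kb_per_day=$(( (86400 / 150) * kb_per_sample ))

echo "${samples_per_day} samples/day, ~${kb_per_sample}KB each"
echo "~${kb_per_day}KB/day at a 150-second interval"
```

The point of the exercise: halving the collection interval doubles the daily growth, so budget disk space before you shorten the interval for troubleshooting.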
To make the best use of space, collect only the data you need to troubleshoot the problem. Archive the .log file daily if you're troubleshooting a server, and consider scheduling an automated archival process to move the .log file to a database server. Avoid filling the available space on your hard disk and causing the server to halt.
An Economical and Effective Solution
Although excellent third-party products such as BMC Software's PATROL and Computer Associates' Unicenter TNG are available for enterprise data gathering and reporting, you can manage most monitoring information using native NT tools, resource kit utilities, and a simple database or spreadsheet package. Datalog.exe and monitor.exe give you an economical and effective alternative that might be right for your environment.