We've established the importance of monitoring your servers, the applications that run on them, and your network devices in the interest of catching and fixing problems before your users notice. Effective network monitoring is also invaluable for the simple reason of understanding exactly what's happening on your network, as well as who's accessing it and when. In Part 1 of this series targeted at the small-to-midsized business (SMB), I talked about these networking benefits, and I identified the most common sources of data (aka telemetry) that you can monitor, including Windows event logs, Syslog, and SNMP. In this follow-up article, I talk about how you can build a barebones monitoring solution by using free or inexpensive tools that are Windows event log-centric.
Specifically, I want to introduce you to three nifty tools that you can add to your network-monitoring arsenal: Log Parser, a free tool from Microsoft; tail, a cool UNIX utility; and Kiwi Syslog Daemon, a utility that comes in a free edition and a more powerful but still inexpensive edition. Even if you already own or plan to buy a tool, I encourage you to keep reading: Most tools on the market provide only alerting and reporting functionality, perhaps with a few canned reports and sample alert rules. You must still understand your environment and define your actual reports and alerts. The design and analysis methods I outline below will come in extremely handy even if you're fortunate enough to have an in-place monitoring application.
Monitoring the Security Log
Do you need a tool that monitors Windows event logs and alerts you (via email or pager) when it finds events that match certain criteria? I recommend that you use VBScript and Windows Management Instrumentation (WMI) to implement custom alerting. In "Use WMI to Monitor Your Web Site for Changes" (InstantDoc ID 25235), I provide a sample script that monitors for specific events and sends an email message to a designated address whenever matching events occur. The script consumes no CPU cycles while waiting for the next matching event, and you can easily modify the WMI query that determines which events the script processes and notifies about. WMI queries use a SQL Select command similar to the commands from which Microsoft Access queries are built. The challenge is determining which events should generate an alert. If you aren't careful, you'll overwhelm your pager or Inbox with more alerts than you'd ever wish to receive.
To begin creating alert criteria, you need to use a reporting tool to gather the information you need from your event logs. You'll find no better free event-log querying tool than Microsoft's Log Parser. Log Parser lets you execute queries against the event log by using a Select statement, which is similar to a WMI query but better for queries against an existing log. I've written several articles about Log Parser that include many sample queries that you can adapt to your needs. Your goal should be to write one or more queries for each type of event log (e.g., Application, Security, System) that filter out events you don't care about but report the important events. I recommend that you query Security logs for the most common and important event IDs, which Table 1 shows. (You can also access a free quick-reference chart from my Ultimate Windows Security site at http://www.ultimatewindowssecurity.com.) Such a query would look like
logparser "select TimeGenerated,EventID,Message from \\mtg1\security where EventID in (675;676;681;642;624;644;617;632;660;636)"
Monitoring the Other Logs
For the other event logs (e.g., Application, System), I recommend that you start your report by looking for warnings and error events and ignoring informational messages. Even after doing so, you'll still have certain warning or error messages that occur in great quantity but that you don't care about. For example, I get many errors from source MRxSmb that I'd rather ignore. So, your next step is to identify those "noise" events and use a Where clause in your Log Parser query to exclude them. In the following command, I've added an expression that excludes event ID 3019 for source MRxSmb and informational events from any source (EventType 4):
logparser "select TimeGenerated,EventID,Message from system where not (EventID = 3019 and SourceName = 'MRxSmb') and EventType <> 4"
Remember that event logs grow very large. Also, EVT files have no index, so there's no fast way to find information without scanning an entire log. As a result, Log Parser can take a while to run reports because it must scan the entire event log each time you run a report. Thankfully, with its checkpoint functionality, Log Parser can remember where it left off scanning the last time it processed an event log. (You can learn more about checkpoints in Log Parser 2.2's Help under Incremental Parsing and Aggregated Data.)
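The idea behind a checkpoint can be sketched in a few lines of Python. This isn't how Log Parser implements the feature; it's a minimal illustration, assuming that each event carries an increasing record number and that the last number processed is saved in a small JSON file. The `RecordNumber` key, the checkpoint file layout, and the function names are all hypothetical.

```python
import json
import os


def load_checkpoint(path):
    """Return the last record number processed, or 0 if no checkpoint exists."""
    if not os.path.exists(path):
        return 0
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)["last_record"]


def save_checkpoint(path, last_record):
    """Persist the last record number processed."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"last_record": last_record}, f)


def new_events(events, checkpoint_path):
    """Return only the events newer than the checkpoint, then advance it.

    `events` is a list of dicts with a hypothetical 'RecordNumber' key.
    """
    last = load_checkpoint(checkpoint_path)
    fresh = [e for e in events if e["RecordNumber"] > last]
    if fresh:
        save_checkpoint(checkpoint_path, max(e["RecordNumber"] for e in fresh))
    return fresh
```

Each run then reports only what arrived since the previous run, which is the behavior you want from a scheduled report against a large log.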
Further compounding the complexity of monitoring Windows event logs is the fact that each computer has its own set of logs, and Windows has no native counterpart to Syslog that lets you collect these events into one place. If you don't have a tool that merges your logs into one central database for alerting and monitoring, you have two options for tackling the problem. If you manage only a handful of servers and devices, you might consider simply setting up alerts and reports for each system. If you want a more centralized approach, you can take advantage of Log Parser's ability to query multiple computers. All you have to do is list each computer's event log in the From clause. For example, the following command queries the System log on server1, server2, and server3.
logparser "select TimeGenerated,EventID,Message from \\server1\system, \\server2\system, \\server3\system"
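If your server list changes often, you might generate the multi-computer command rather than edit it by hand. The following Python sketch only assembles the command string in the form shown above; it doesn't run Log Parser, and the function name and parameters are my own illustration.

```python
def build_logparser_command(servers, log_name,
                            fields=("TimeGenerated", "EventID", "Message")):
    """Build a Log Parser command line that queries one log on several servers."""
    if not servers:
        raise ValueError("at least one server is required")
    # Each source is a UNC-style reference such as \\server1\system.
    sources = ["\\\\" + server + "\\" + log_name for server in servers]
    select_list = ",".join(fields)
    from_clause = ", ".join(sources)
    return 'logparser "select ' + select_list + " from " + from_clause + '"'
```

For example, `build_logparser_command(["server1", "server2", "server3"], "system")` produces the same command as the one in the text.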
After you have one or more reports built for each log type, you can set them up to execute daily or weekly as a scheduled task. Simply redirect Log Parser's output to a text file by adding
> C:\path\file.txt
to the command, then view that file regularly. Better yet, have the scheduled task email the file to you. To do so, add one line to your batch file that calls Blat. (Blat is a public-domain utility that lets you use SMTP to easily email files.)
Now that reports are functioning, it's time to design your alert criteria. It would be nice if all application and device vendors documented the events and messages that their products generate so that you could write informed alert criteria—but life is never easy. Instead, you have to feel your way along. Coming up with alert criteria is similar to putting your reports together, as I've described. However, you want to be even more selective when deciding what gets sent to your pager or Inbox as an alert. Most alert programs let you build both Exclude and Include filters, and then order them in a sequence that weeds out minor events. If you're using WMI as I've described, you can use the Where clause to filter out most of the events you don't want. If you run into some limitations with WMI's Where clause, you can perform additional filtering in the VBScript script.
I recommend using Log Parser to design your alert criteria rather than setting up your alerts and constantly tweaking them until your pager settles down. Using the event data already collected in your logs, start with the same criteria you used for your reports and refine them even further so that you filter out less important events that don't merit an alert. If you have log data for about a month, this approach lets you essentially run simulations of alert processing over an extended period of time to see whether you're going to be overloaded with alerts. Once you're satisfied with your alert criteria in the form of Log Parser reports (or those of another reporting tool, if you have one), you can convert the criteria to WMI queries inside VBScript scripts. To ensure that the alert scripts are always active, you can configure them as startup scripts so that they start automatically at bootup. You can configure startup scripts in any Group Policy Object (GPO) under Computer Configuration\Windows Settings\Scripts (Startup/Shutdown). As startup scripts, the alerts run in the context of the local System account, so they'll have all the authority they need.
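The simulation idea amounts to replaying historical events through a candidate alert rule and counting how many alerts each day would have produced. Here's a minimal Python sketch of that replay, assuming the events have already been exported into dictionaries whose `TimeGenerated` value is a `datetime`. The function names and the per-day alert budget are hypothetical.

```python
from collections import Counter


def alerts_per_day(events, should_alert):
    """Count how many historical events per day would have raised an alert."""
    counts = Counter()
    for event in events:
        if should_alert(event):
            counts[event["TimeGenerated"].date()] += 1
    return counts


def noisy_days(counts, max_per_day):
    """Return the days, in date order, on which alerts exceed the budget."""
    return sorted(day for day, n in counts.items() if n > max_per_day)
```

If `noisy_days` returns more than a few dates for a month of data, the criteria probably need further tightening before you wire them to a pager.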
The key to designing successful report and alert criteria is to think in terms of filtering out what you don't want to see in the report or be alerted on, not what you do want to see. Because of the sheer number of event sources and poor documentation associated with most applications and network devices, there's no way to enumerate every important event or message. If you limit your criteria to looking for known critical events, you run the very real risk that a device or system will develop a problem and report an event that your criteria aren't looking for. To achieve some balance, however, you might want to create some alerts or reports that do look for specific criteria. You would build such alerts and reports for common problems with which you're already familiar. However, in general, you should think exclusion rather than inclusion when you're designing your criteria.
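To make the exclusion mindset concrete, here's a small Python sketch in which every event raises an alert unless an exclusion rule matches it. The two rules mirror the noise examples discussed earlier (informational events and event ID 3019 from MRxSmb). The dictionary keys and the "Information" label are assumptions made for illustration.

```python
def is_noise(event, exclude_rules):
    """Return True if any exclusion rule matches the event."""
    return any(rule(event) for rule in exclude_rules)


def select_alerts(events, exclude_rules):
    """Keep every event that no exclusion rule matches."""
    return [e for e in events if not is_noise(e, exclude_rules)]


# Hypothetical exclusion rules based on the examples in the text.
EXCLUDE_RULES = [
    lambda e: e["EventType"] == "Information",
    lambda e: e["SourceName"] == "MRxSmb" and e["EventID"] == 3019,
]
```

Because the default is to alert, an event from a source you've never seen before still gets through, which is exactly the case an inclusion-only list would miss.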
Monitoring Text Log Files
Most Microsoft server products and Windows components log any important events to either the System or Application logs, but each service and major OS component (e.g., IIS, DHCP, SMTP, Internet Authentication Service [IAS]) tends to write more detailed information to its own text log file, each of which has a proprietary format. If you need to monitor information that resides only in these text log files, Log Parser comes to the rescue again. Log Parser accepts any format of delimited text file, such as tab-delimited and comma-separated value (CSV) formats, and lets you use the same SQL Select command for querying text log files.
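The same select-and-filter idea can be illustrated outside Log Parser. The following Python sketch treats a delimited log as rows keyed by its header line and keeps only the rows that pass a filter. It assumes the first line of the file names the columns, which isn't true of every service's log format, and the column names in the usage example are invented.

```python
import csv
import io


def query_delimited_log(text, delimiter, keep):
    """Return the rows (dicts keyed by the header line) for which keep(row) is true."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [row for row in reader if keep(row)]
```

For a tab-delimited log with hypothetical columns date, client, and status, `query_delimited_log(log_text, "\t", lambda r: r["status"] == "500")` plays the role of a Where clause on the status column.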
So, Log Parser solves the reporting challenge that text-based log files present, but what about real-time alerts for critical events logged to text files? I recommend using a tool ported from the UNIX world called tail. Tail monitors specified text files for new appended lines. As soon as tail detects new data, it sends it to standard output (stdout). You can pipe tail's output into a script that analyzes new records as they're logged and generates alerts as necessary. For instance,
tail /f logfile.txt | LoopOnNewMessages.cmd
watches for new messages appended to logfile.txt and pipes those messages into LoopOnNewMessages.cmd, which Listing 1 shows. LoopOnNewMessages.cmd sends each message to a hard-coded email address, which you should change to an appropriate address in your environment. To prevent the mailing of unimportant messages, you can insert filtering logic into LoopOnNewMessages.cmd. To learn more about tail, see "Toolbox: Tail," InstantDoc ID 47176.
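If you're curious what "following" a file involves, the core step is small: remember how far you've read, and on each pass return only the complete lines written since then. The Python sketch below shows that step. It's an illustration of the technique, not a description of how any particular tail port works, and the function name is my own.

```python
def read_appended_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline so that a half-written line
    # is left in place and picked up on the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

A monitoring loop would call this function every second or so, pass each returned line through its filtering logic, and carry the new offset forward to the next call.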
Monitoring SNMP and Syslogs
Monitoring SNMP and Syslog telemetry sources is easy, thanks to the freeware version of Kiwi Enterprises' Kiwi Syslog Daemon, a Windows Syslog server and SNMP manager that lets you collect all your network-device telemetry into one program. With the tool's GUI, you can configure filters that gather messages matching certain criteria, then specify one or more actions to take in response to the message. You can create filters that discard unwanted messages and specify that remaining messages should generate an alert or insertion into a database for subsequent reporting. Kiwi Syslog Daemon lets you filter messages based on time of day, day of week, facility, level, the IP address of the reporting agent, or strings within the message. Also, the tool can perform a variety of actions—email notification, insertion into an ODBC database, running a program, and more—in response to specified events.
The freeware version of Kiwi Syslog Daemon runs as an interactive program on your desktop, so you must remain logged on to monitor devices. However, the enhanced version of the product runs as a system service and costs only $100 per server. When you enable SNMP trap monitoring with Kiwi Syslog Daemon, you also specify the Facility and Level fields that the tool should use when converting the trap to a Syslog message. For instance, you can specify SNMP traps as Facility Local4 and Level 3 Error. Then, you can build alert rules that deal specifically with SNMP traps by filtering for Local4 messages.
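Facility and Level map to the numeric priority value that the standard Syslog format carries at the start of each message: the facility code multiplied by 8, plus the severity code. The Python sketch below shows that arithmetic for Local4 (facility code 20) and Error (severity code 3). How Kiwi Syslog Daemon represents these values internally isn't covered here, so treat the helper names as illustrative only.

```python
LOCAL4 = 20  # standard Syslog facility code for local4
ERROR = 3    # standard Syslog severity code for Error


def encode_pri(facility, severity):
    """Combine facility and severity into a Syslog priority value."""
    return facility * 8 + severity


def decode_pri(pri):
    """Split a Syslog priority value into (facility, severity)."""
    return divmod(pri, 8)


def is_local4(pri):
    """Return True if the priority value belongs to facility local4."""
    return decode_pri(pri)[0] == LOCAL4
```

A trap converted with Facility Local4 and Level 3 therefore carries priority 163, and any message whose priority decodes to facility 20 can be routed to your SNMP-specific alert rules.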
Buy vs. Build
As you can see, resources exist to help you monitor all of your various sources of telemetry data. Before you go to the effort of building your own monitoring solution, be sure to look at the tools available on the market. They're affordable and feature rich. However, merely buying a tool is never the entire solution. Make sure you design your reporting and alert criteria so that you aren't overwhelmed with noise events—but don't go to the opposite extreme and create such selective criteria that your monitoring solution defeats the very purpose for which you've implemented it. The secret to accomplishing this balance is to design your criteria with the mindset of filtering out the noise events instead of selecting the important events. The only exception to this rule is the Security log, which is much more finite and documented. For a complete database of Security log events and their meanings, see my Security Log Encyclopedia at the Ultimate Windows Security site.
Setting up an effective and comprehensive monitoring process takes some work, but the investment is worthwhile. There's nothing worse than learning about a problem from your users and—upon reviewing your logs—realizing that the system was exhibiting warnings 3 days earlier. The business and legislative worlds are setting much higher expectations with regard to information security and accountability. Security breaches might occur, but you and your company will fare much better if you've been doing your due diligence. There's no substitute for effective monitoring.