Mining the Depths of Exchange Tracking Logs

You can use the Microsoft Exchange Administrator program to perform many tasks easily. However, you can't use the program to gather meaningful statistics about message flow, average message size, or destination domains. You can use Windows NT Performance Monitor for some of these tasks, but sometimes even that utility isn't adequate. For example, neither Exchange Administrator nor Performance Monitor can tell you whether you've distributed your users in a multiserver site so that Exchange delivers 80 percent of the mail traffic locally and 20 percent off server. Nor can these tools tell you how frequently users use a distribution list (DL) each day or which Internet domains send the most inbound Internet mail to your system.

However, you can obtain this information and much more from Exchange Server's tracking logs. Let's look at how tracking data files are organized and how you can use scripting to turn raw log data into usable statistics. For more information about Exchange Administrator's built-in message-tracking functions, see Tony Redmond, "How Message Tracking Works," June 1998. That article describes how to use the message tracking center to see a graphical representation of message flow and the relationship between components as a message travels through your enterprise.

Enabling Message Tracking
To use message tracking, you must enable it on the Information Store (IS), the Message Transfer Agent (MTA), and your connectors, such as the Internet Mail Service (IMS), Microsoft Mail (MS Mail), or Lotus cc:Mail connectors. Message tracking records details about the movement of messages between system components. You don't need to enable message tracking separately on the Site or X.400 connectors because enabling tracking on the MTA automatically logs messages for these components. You activate message tracking for the IS and the MTA by selecting the Enable message tracking check box on the General tab of the components' respective Site Configuration Properties dialog box, as Figure 1 shows.

Enabling message tracking on the IS lets you gather details about messages' travel between server and local recipients. Enabling message tracking on the MTA lets you track information as messages move between servers both within the same site and between sites. Because the MTA also handles the operations related to expanding DLs, tracking the MTA lets you gather details about DL usage. You also need to enable message tracking on the components that provide connectivity to external systems (e.g., the IMS, Lotus Notes, or MS Mail connectors). If you don't use tracking on these objects, you'll have trouble gathering end-to-end information about message traffic, especially traffic that enters your system from the Internet or legacy connectors.

Even if you don't want to use the statistical details that you can gather from the tracking logs, best practice is to enable message tracking and configure the System Attendant to retain logs for 15 to 30 days. You can develop a mechanism to archive the logs beyond the period you defined in the System Attendant's Properties pages. The tracking logs give you a record of message delivery, which is beneficial if you need to verify a message's recipient and delivery time.

The logs also provide you with point-in-time DL membership information. Because messages typically contain only a reference to a DL and a list's membership can change at any time, the logs can help you determine who was a member of a DL when Exchange delivered a particular message. For example, at 10:00 a.m., someone sends a message to the Legal Team DL. At 10:00 a.m., the list members were John, Frank, and Mary. At 10:15 a.m., an administrator removes John and adds Barbara. If you use Microsoft Outlook or Exchange Administrator to look at the DL's membership after 10:15 a.m., you can't tell that John received the message and Barbara didn't. However, if you look at the details of the tracking logs, you can determine exactly who received the message. Because you're working with a log, you can determine the membership even if an administrator deleted a member account from the system rather than simply removing it from the list's membership.

Tracking Log Format
The files that contain the tracking details are in the \exchsrvr\tracking.log directory. The file names are based on the current system date (as of midnight Greenwich Mean Time, or GMT) and are in the form YYYYMMDD.log. The files are tab-delimited ASCII files, and you can easily read them in almost any text editor. I prefer Wordpad over Notepad because the files can be large and Wordpad generally performs better than Notepad. You'll also probably want to turn off line wrapping to make the content more readable on screen. Because the newsletter format doesn't let me turn off line wrapping, in the example below, I've also replaced the tab characters that delimit the fields with arrows (=>) to make the fields' delineations more apparent.
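As a small illustration of the naming rule, the following Python sketch builds the YYYYMMDD.log name from a GMT date. The function name tracking_log_name is hypothetical, and the sketch assumes the rule is exactly as described above (one file per GMT calendar day).

```python
from datetime import datetime, timedelta, timezone


def tracking_log_name(when=None):
    """Return the YYYYMMDD.log name for the GMT date of `when` (default: now)."""
    if when is None:
        when = datetime.now(timezone.utc)
    # Convert to GMT/UTC first, because the file name follows the GMT date.
    return when.astimezone(timezone.utc).strftime("%Y%m%d") + ".log"


# 11:30 p.m. on December 31 in a zone five hours behind GMT is already
# January 1 in GMT, so the event belongs to the next day's file.
local_time = datetime(1999, 12, 31, 23, 30, tzinfo=timezone(timedelta(hours=-5)))
print(tracking_log_name(local_time))  # 20000101.log
```

The practical point is that an event logged late in your local evening may sit in the file named for the following day.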

Table 1 shows an extract of nine lines from a tracking log to give you an idea of the information that you'll see in a typical log. In the first column, I've added line numbers to help you see where one line ends and another begins; you won't see these numbers in a real log file. Each log entry has at least two lines. For example, lines 1 and 2 are one entry, as are lines 5 through 9. I'll explain the relationships of the lines and fields in more detail later. Table 2 describes the tracking log fields. You can obtain most of this information in the Microsoft article "XADM: Tracking Log Field Descriptions" (http://support.microsoft.com/support/kb/articles/q173/2/80.asp).

Each event has 15 field entries. However, lines 1, 3, and 5 in Table 1 show only 13 fields, separated by tab characters (shown as => in the table). Two consecutive tabs signify a blank field. The last two fields (14 and 15) follow on subsequent lines, one line for each recipient. You can see this most clearly by examining lines 5 through 9 from Table 1. Line 5 contains the first 13 fields of information; field 13 contains the number 4, signifying four recipients. Lines 6 through 9 list the distinguished names (DNs) or X.400 proxy addresses of the four recipients. This information is in field 14. In this example, field 15 is blank because the event recorded isn't a report (such as a delivery receipt). For simplicity, I'll refer to the first 13 fields as the primary event data and the lines that follow as the recipient data.
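The layout just described (13 primary fields, then one line per recipient) can be read with a short loop. The following Python sketch is one way to do it, not the code from the article's listings. It assumes each recipient line holds field 14 followed by a tab and field 15. The sample lines are made up, with only fields 1, 2, and 13 filled in.

```python
def parse_tracking_log(lines):
    """Yield (primary_fields, recipients) for each event in a tracking log."""
    it = iter(lines)
    for raw in it:
        fields = raw.rstrip("\r\n").split("\t")
        # A primary event line carries 13 tab-separated fields; field 13 is
        # the recipient count. Skip anything that does not fit that shape.
        if len(fields) < 13 or not fields[12].strip().isdigit():
            continue
        recipients = []
        for _ in range(int(fields[12])):
            rec = next(it, None)
            if rec is None:  # file ended before all recipients were listed
                break
            parts = rec.rstrip("\r\n").split("\t")
            # Assumed layout: field 14 (address), then field 15 (report status).
            recipients.append((parts[0], parts[1] if len(parts) > 1 else ""))
        yield fields[:13], recipients


def sample_line(message_id, event_code, recipient_count):
    """Build a made-up primary line; every other field is left blank."""
    fields = [""] * 13
    fields[0], fields[1], fields[12] = message_id, str(event_code), str(recipient_count)
    return "\t".join(fields)


sample = [
    sample_line("MSG-A", 0, 1),
    "recipient-one\t",
    sample_line("MSG-B", 9, 2),
    "recipient-two\t",
    "recipient-three\t",
]

events = list(parse_tracking_log(sample))
for primary, recipients in events:
    print(primary[0], primary[1], [name for name, _ in recipients])
```

Using the recipient count in field 13 to decide how many lines to consume avoids guessing whether a given line is primary event data or recipient data.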

Event Codes
To use the tracking log information effectively, you need to know what the event codes from field 2 represent. For example, event code 26 is "Distribution List Expanded." You can obtain the list of codes from the Microsoft article "XADM: Tracking Log Event Numbers" at http://support.microsoft.com/support/kb/articles/q173/3/64.asp.

You can see that the message IDs for the events on lines 1 and 3 in Table 1 are the same. The same message ID means that these events are related. Line 1 (event 0) signifies that a message was handed to the MTA (in this case from the IMS). Line 3 (event 9) shows that the message was delivered to a user mailbox; the recipient mailbox's DN appears on the lines following the primary event data. If the message required some other action, such as delivering the message to an alternate recipient or generating a delivery report, other events would follow in the log with the same message ID in field 1.

Turning Data into Information
When you understand how the log files are organized, you can perform simple tasks such as opening the files in Wordpad and gathering information (e.g., DL membership when a message was sent). In this case, you look for event code 26 (DL expansion) and use some other identifying information such as the message ID or DL name and time and date to match the appropriate event.

Figure 2 shows that when this list was expanded, it went to two recipients—Benjamin and Andrew—and two nested DLs. To determine the remaining recipients, you must search for other DL expansion events (event 26) that had the same message ID as that in field 1. To find this information, you might need to search other log files on other servers.
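The search described above can be automated. The following Python sketch collects the recipient lines of every event 26 that carries a given message ID. The names read_events and dl_recipients are hypothetical, the sample data is invented, and the sketch assumes the recipient address is the first tab-separated value on each recipient line.

```python
DL_EXPANDED = "26"  # event code for "Distribution List Expanded"


def read_events(lines):
    """Yield (message_id, event_code, recipient_names) for each event."""
    it = iter(lines)
    for raw in it:
        fields = raw.rstrip("\r\n").split("\t")
        if len(fields) < 13 or not fields[12].strip().isdigit():
            continue  # not primary event data
        names = []
        for _ in range(int(fields[12])):
            rec = next(it, None)
            if rec is None:
                break
            names.append(rec.rstrip("\r\n").split("\t")[0])
        yield fields[0], fields[1].strip(), names


def dl_recipients(lines, message_id):
    """Collect recipients from all DL expansion events for one message ID."""
    found = []
    for msg_id, event_code, names in read_events(lines):
        if msg_id == message_id and event_code == DL_EXPANDED:
            found.extend(names)
    return found


def sample_line(message_id, event_code, recipient_count):
    """Build a made-up primary line; every other field is left blank."""
    fields = [""] * 13
    fields[0], fields[1], fields[12] = message_id, str(event_code), str(recipient_count)
    return "\t".join(fields)


# Invented data: a top-level expansion, an unrelated event, a nested expansion.
sample = [
    sample_line("MSG-1", 26, 3), "Benjamin\t", "Andrew\t", "Nested DL\t",
    sample_line("MSG-2", 9, 1), "Someone Else\t",
    sample_line("MSG-1", 26, 2), "Mary\t", "Frank\t",
]

print(dl_recipients(sample, "MSG-1"))
```

To cover nested DLs that were expanded on other servers, you could pass the lines of several log files to the same function, for example by chaining the open files together.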

To perform more complex tasks such as gathering statistics or generating reports, you need to pull specific information from the logs and then perform some type of statistical or identifying operation such as sorting, counting, or enumeration. You could pull the file into Microsoft Excel, but because logs usually are large and Excel has a limit of 65,536 rows, you usually can't read and manipulate the entire file.

A better approach is to use a scripting language such as VBScript or Perl to read the file line by line. If you don't know how to use a scripting language, now is a good time to learn. Scripts give you an easy and efficient way to accomplish many tasks you want or need to perform to manage Windows and Exchange 2000. In some cases, scripting is the only way to clean up mistakes, apply changes, or save yourself days of repetitive keystrokes and mouse clicks.

Let's assume that you have several servers in a site. You've tried to distribute users over the servers by workgroup, so that the members of a workgroup reside on the same server. Most of their mail then remains on that server, which reduces network traffic and maximizes single-instance storage. Now, several years later, you're trying to determine whether your users are still properly distributed. One way you can obtain this information is to look at the tracking logs to determine how many messages are submitted and how many are delivered locally. Event ID 4 represents message submissions, whether they're bound for a local user or someone off server. Event ID 1000 represents a local delivery operation (i.e., an operation in which the sender and recipient are on the same server). If you count the number of submissions and local deliveries, you can generate some simple ratios:

Local delivery % = (Number of local deliveries / Number of submissions) x 100
Off-server delivery % = 100% - Local delivery %

The script example in Listing 1 generates the local/remote delivery ratio. Listing 1 is in Perl. Listing 2 is VBScript code for accomplishing the same task; you can view Listing 2 on the Exchange Administrator Web site.
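The listings themselves aren't reproduced here. As a rough stand-in, the following Python sketch (not the author's Perl or VBScript code) counts event 4 and event 1000 lines and applies the two ratios above. The function name delivery_ratio and the sample data are invented for illustration. The sketch assumes recipient lines have fewer than 13 tab-separated fields, so they can be skipped by field count.

```python
SUBMISSION = "4"          # event code for a message submission
LOCAL_DELIVERY = "1000"   # event code for a local delivery


def delivery_ratio(lines):
    """Return (local %, off-server %) or None if there are no submissions."""
    submissions = local = 0
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 13:
            continue  # recipient data, not primary event data
        code = fields[1].strip()
        if code == SUBMISSION:
            submissions += 1
        elif code == LOCAL_DELIVERY:
            local += 1
    if submissions == 0:
        return None
    local_pct = 100.0 * local / submissions
    return local_pct, 100.0 - local_pct


def sample_line(event_code):
    """Build a made-up primary line with one recipient and blank other fields."""
    fields = [""] * 13
    fields[0], fields[1], fields[12] = "MSG", str(event_code), "1"
    return "\t".join(fields)


# Invented data: four submissions, three of which were delivered locally.
sample = []
for code in (4, 1000, 4, 1000, 4, 4, 1000):
    sample.append(sample_line(code))
    sample.append("some-recipient\t")

local_pct, remote_pct = delivery_ratio(sample)
print(f"Local delivery: {local_pct:.1f}%")
print(f"Off-server delivery: {remote_pct:.1f}%")
```

Because the function only iterates over lines, you can pass it an open log file directly and it will read the file one line at a time rather than loading it all into memory.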

As you can see, the Perl code is much shorter than the VBScript code. Because VBScript is generally well understood, available with Windows Script Host (WSH), and integrated with Microsoft Office, VBScript is probably one of the easiest scripting languages to learn. Perl is also a good language to know and has advantages such as fast execution and a lot of string-processing functionality. Although Perl code might look more syntactically complex than VBScript, Perl isn't hard to learn. I prefer to use Perl for tasks such as tracking log analysis because it was designed for reporting and text processing, and I've found that it runs much faster. For example, running these two programs on an 11.7MB tracking log file with 153,151 lines took just 6 seconds for the Perl script and 22 seconds for VBScript using WSH. Figure 3 shows the results that the Perl script generates; the VBScript code produces similar output. The Web-exclusive sidebar "What the Perl Script Is Doing" (on the Exchange Administrator Web site) explains the syntax in Listing 1.

Another task you can perform with tracking log data is to determine the typical number of recipients per message by looking at the number of submissions (event 4), summing the recipient count (field 13), then dividing by the number of submissions. You can also gather information about typical message size by looking at field 9, message length.
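Both averages can come from one pass over the file. The following Python sketch is a minimal illustration that uses field 9 for message length and field 13 for recipient count on event 4 lines. The function name submission_averages and the sample values are invented, and the length is reported in whatever unit the log records.

```python
SUBMISSION = "4"  # event code for a message submission


def submission_averages(lines):
    """Return (average recipients, average length) over submission events."""
    count = total_recipients = total_length = 0
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 13 or fields[1].strip() != SUBMISSION:
            continue  # recipient data or a different event
        try:
            length = int(fields[8])        # field 9: message length
            recipients = int(fields[12])   # field 13: recipient count
        except ValueError:
            continue  # skip lines with blank or malformed numbers
        count += 1
        total_length += length
        total_recipients += recipients
    if count == 0:
        return None
    return total_recipients / count, total_length / count


def sample_line(event_code, length, recipient_count):
    """Build a made-up primary line; every other field is left blank."""
    fields = [""] * 13
    fields[0], fields[1] = "MSG", str(event_code)
    fields[8], fields[12] = str(length), str(recipient_count)
    return "\t".join(fields)


# Invented data: three submissions and one local delivery that is ignored.
sample = [
    sample_line(4, 1000, 1), "a\t",
    sample_line(4, 3000, 4), "a\t", "b\t", "c\t", "d\t",
    sample_line(1000, 9999, 1), "a\t",
    sample_line(4, 2000, 1), "b\t",
]

avg_recipients, avg_length = submission_averages(sample)
print(f"Average recipients per message: {avg_recipients:.1f}")
print(f"Average message length: {avg_length:.1f}")
```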

A Double Win
These examples show that you can obtain a wealth of information from tracking log data when you understand how Exchange organizes the information and what the events represent. When you use scripts to manipulate tracking log data, you make the task easier.

An added advantage of using scripts in this way is that having a real-life problem to solve or a goal of building a usable tool makes learning a programming language easier. I recommend that you take the time now to learn about VBScript or Perl. The knowledge will serve you well now and later when you're entrenched in Exchange 2000.
