Managing the many logs generated in a Windows environment is a time-consuming—though necessary—part of systems administration. Each workstation, member server, and domain controller (DC) has a Security, Application, and System log, all of which contain valuable security and system information. Depending on your environment and which Windows components you use, you might also have logs generated by Internet Authentication Service (IAS), Microsoft IIS, RRAS, and URLScan, not to mention the logs generated by application servers such as Microsoft Exchange Server and Microsoft SQL Server. Each of these logs has a different format and structure as well as a hefty amount of noise—activity that you must filter out before you can find important events. Wouldn't it be great to have a tool that could read and execute SQL-like queries against any type of log? This dream is reality in the form of LogParser, a command-line utility that you can use with Windows 2000 and later. LogParser gives you the data-mining power of a SQL database such as Microsoft Access, and you can use the tool to automatically process the megabytes of data that your network's diverse logs generate every day. As I write this article, the most recent version of the tool, LogParser 2.1, is available for download as part of the IIS 6.0 Resource Kit Tools (http://www.microsoft.com/downloads/details.aspx?familyid=56fc92ee-a71a-4c73-b628-ade629c89499& displaylang=en).
How It Works
LogParser has three main parts: input processor, data engine, and output processor. The input processor supports native log formats such as IIS logs and Windows log (.evt) files. LogParser can also read comma-delimited (.csv) files, ODBC databases, and text files delimited by carriage returns. The input processor converts each log type into a uniform format, which the LogParser data engine can then process in much the same way that a database processes tables.
If you're familiar with SQL SELECT statements, you'll probably have little trouble writing LogParser commands. Within a LogParser command, you can specify which log fields you want to include in the tool's output, which event records to include or exclude (according to field-value comparisons that you specify), and how you want LogParser to sort output records. But you can also do much more. The data engine supports advanced query functions such as record counts, averages, and top X events. You can also manipulate text fields and perform date and numeric calculations to customize your output or refine your selection criteria.
After the data engine massages the input data and produces a result set, the output processor takes over and formats the result set into an output table. Like the input processor, the output processor supports many file types, so you can format the output table any way you like—from plain text files to SQL databases to XML files.
A Simple Query
To give you a taste of what you can accomplish with LogParser, let's begin with a sample command and its output. The following command queries the local system's Security log and produces a report of all locked-out accounts.
logparser "SELECT DISTINCT SID FROM security WHERE EventID = 644"
The above command produces the output that Figure 1 shows. (This output supplies only the SIDs of the user accounts that have been locked out; later I show you how the tool can resolve SIDs to user names.) As you can see, LogParser is powerful and its syntax is straightforward. The only mandatory argument is the SELECT statement, which you must enclose in quotes. (Also be aware that LogParser is case sensitive when reading most input information, so, for example, the tool considers "eventid" as different from "EventID".) Depending on the type of logs you're working with, you might also need to include parameters about the input logs or the format of your output files.
To get the most out of LogParser, you must understand how to formulate your queries. Because LogParser natively supports the Windows event logs, writing queries for the Security log is simple and doesn't require you to give LogParser any additional instruction regarding how to parse the log. Therefore, I use the Security log to show you how to write effective queries. You can then apply that knowledge to other types of logs.
LogParser's SELECT statement comprises two mandatory clauses—SELECT and FROM—and several optional clauses:
SELECT clause FROM clause \[TO clause\] \[WHERE clause\] \[GROUP BY clause\] \[HAVING clause\] \[ORDER BY clause\]
The SELECT clause specifies the fields to include in each record of the query's result set. The FROM clause tells LogParser which log or logs to use as input for the query. The TO clause tells LogParser where to direct the output. The WHERE clause lets you specify criteria for filtering records into or out of the query. The GROUP BY and HAVING clauses are advanced clauses that let you analyze groups of similar records, calculate aggregate functions on those groups, and specify criteria for filtering groups into or out of the query. The ORDER BY clause lets you sort the result set by specified fields. Let's take a closer look at the FROM, SELECT, and WHERE clauses.
FROM. To execute a query against the Windows Security log, use the FROM clause
You can replace "security" with "application" or "system" to query the other two standard Windows event logs. To query other types of logs, you must specify the log filename in the FROM clause. (I'll discuss this technique in greater depth in an upcoming article.)
SELECT. After you've decided which logs to query, the next step in building a LogParser command is to write your SELECT clause. This clause specifies a comma-delimited list of the fields from the input log that you want to appear in the query's output.
LogParser has built-in support for the Windows event logs, so it automatically recognizes these logs' field names (i.e., EventLog, RecordNumber, TimeGenerated, TimeWritten, EventID, EventType, EventTypeName, EventCategory, SourceName, Strings, ComputerName, SID, and Message); the sidebar "Event-Log Fields," page 4, briefly explains each field. LogParser also recognizes the field names in the logs that IIS generates. For other input log formats, LogParser can determine field names from the first line of the log file if the first line is a header line. Otherwise, LogParser labels each input field as Field1, Field2, and so on. If you aren't sure of a log's field names, you can determine the names by issuing the statement
SELECT * FROM filename
to cause LogParser to return all fields from the input file. Stop the query after LogParser outputs a few records. Review the records to determine the purpose of each field, then match each field to the name that LogParser has assigned it.
The SELECT clause lets you specify expressions in addition to simple field names. This capability lets you tell LogParser to perform string manipulation and arithmetic on number and date fields during the creation of your output table. For example, perhaps you want your output table to include the base description of each event, without all the lines that follow that description. In the Security log, each event's base description is the first information that appears in the event record's Message field and is followed by a colon (:). Thus, you must tell LogParser to inspect the Message field, find the first colon, and return the text up to that point:
logparser "SELECT SUBSTR(Message, 0, INDEX_OF(Message, ':')) AS Description FROM security"
This command uses two string manipulation functions that LogParser supports: SUBSTR and INDEX_OF. To obtain the beginning of the Message field (up to the first colon), use the SUBSTR function. This function's first parameter specifies the source string to be subdivided—in this case, Message. The second parameter specifies the position within the source string at which to begin—in this example, the first character (position 0). The third parameter specifies the position within the source string at which to end—that is, the colon. To obtain the colon's position, however, you must use the INDEX_OF function to return the first position of the colon character in the Message field. LogParser supports many other functions for manipulating strings, numbers, and dates; you can learn more about these functions by reading the LogParser.doc file in the LogParser installation folder.
To give the query expression a name, use the AS keyword followed by the field name you want to appear in the output table—Description, for example. The AS keyword is useful for naming expressions in your SELECT clause or renaming a field within the SELECT clause. (I provide an example of using AS to rename fields later, when I introduce subqueries.)
Often, you'll run into situations in which LogParser returns duplicate records because multiple records have the same values for the set of fields you include in your SELECT clause. For example, the command
logparser "SELECT EventID FROM security"
will probably return event ID 528 (successful logon) many times, each instance corresponding to a successful logon. If you simply want a list of each event ID that has been logged, you need a way to eliminate duplicates from your result set. LogParser solves this need with the DISTINCT keyword. To eliminate duplicate rows from your result set, simply insert this keyword after SELECT. For example, the command
logparser "SELECT DISTINCT EventID FROM security"
returns a list of unique event IDs representing the set of all event IDs that have occurred in the log at least once. Remember that the DISTINCT keyword applies only to the fields you specify in the SELECT clause, not to fields that you specify in other clauses such as WHERE, ORDER BY, GROUP BY, or HAVING.
WHERE. After you specify the log or logs that you want to query and the fields from which you want to gather information, your next step is to specify filtering criteria. The WHERE clause specifies an expression that resolves to TRUE or FALSE. The expression can be as simple as
"EventID = 529"
or as complex as
"EventID = 529 AND TimeGenerated >= TO_TIMESTAMP('2003-12-21','yyyy-MM-dd')
This expression yields all the instances of event ID 529 (logon failure: bad username or password) that occurred between December 21, 2003, and December 24, 2003. (Note that MM indicates month; mm would indicate minutes.) Multiple simple expressions are linked by the AND operator within this complex expression. You can also use the OR and NOT operators and parentheses, just as you do in SQL or any other programming language.
In the WHERE clause, you can use the standard SQL comparison operators that Table 1 shows to compare fields with values and expressions. In particular, the LIKE operator is extremely useful for performing wildcard comparisons. For example, the commandlogparser -i:EVT -resolveSIDs ON "SELECT SID FROM security WHERE SID LIKE '%smith%'"
uses the LIKE operator to yield a list of all events triggered by users with "smith" in their username (e.g., Smith, Naismith, Smithe). Replacing "%smith%" with "Wri*" returns names such as Wright and Wrigley; "%son" returns names such as Johnson and Albertson. This sample command also uses another handy LogParser option: -resolveSIDs. Because a raw SID isn't particularly useful, you can use -resolveSIDs to tell LogParser to automatically resolve SIDs to the usernames of the associated user accounts. To use this option, simply insert the following in front of the SELECT statement:-i:EVT -resolveSIDs ON
For example, the commandlogparser -i:EVT -resolveSIDs ON "SELECT DISTINCT SID FROM security"
might yield the results that Figure 2 shows. Similarly, you can use LogParser's RESOLVE_SID() function to translate a SID to its corresponding account name. For example, the commandlogparser "SELECT DISTINCT SID, RESOLVE_SID(SID) AS username FROM security"
returns each unique SID and its corresponding username, as Figure 3 shows.
Usually, you'll want to filter output according to a field's value in relation to another value, such as whether the current record's event ID matches 529 or whether TimeGenerated is greater than or equal to a certain date. Sometimes, though, you need to know whether a field exists in a list of multiple values. In such cases, the comparison operator IN comes in handy. For example, the clauseWHERE EventID IN (529, 530, 531, 532, 533, 534, 535, 537, 539)
returns all failed logon events listed in the parentheses. By inserting a NOT in front of IN, you'd reverse the logic and tell LogParser to return records in which the event ID isn't among those listed.
The IN operator has an even more powerful purpose, however. You can formulate a WHERE clause in which LogParser uses the values of one or more fields of the current row to perform a subquery—the results of which will help determine whether LogParser should include the current row in the output. To do so, you can embed a subquery within the parentheses that follow IN. For example, you might want a list of all users whose accounts were locked out and who subsequently had their passwords reset. To obtain this list, you need to find each account lockout event (event ID 644), then determine whether a password reset event (event ID 642) occurred for the same account after the date and time of the lockout event. To perform these tasks, first write the following simple query:"SELECT SID AS username, TimeGenerated AS LockoutTime FROM security WHERE EventID = 644"
This query returns the usernames of all accounts that experienced an account lockout. Next, add the following IN expression:AND username IN (SELECT SID AS subUserName
FROM security WHERE EventID = 642
AND subUserName = username AND TimeGenerated > LockoutTime)
Add the -resolve SIDs option to return actual usernames and use the AS option to rename the folders, and you end up with the following command:logparser i:EVT resolveSIDs ON "SELECT SID AS username, TimeGenerated AS LockoutTime
FROM security WHERE EventID = 644
AND username IN (SELECT SID AS subUserName FROM security
WHERE EventID = 642 AND subUserName = username
AND TimeGenerated > LockoutTime)"
When you run this command, LogParser finds each occurrence of event ID 644, then finds all event ID 642 occurrences that specify the same username and that occurred after the user was locked out. The subquery's SELECT clause should return only one field, otherwise LogParser won't know which field in the subquery's result set to compare against. In this example, LogParser looks for the contents of the username field in the records returned by the subquery. Notice that I used the AS option to rename fields in both SELECT clauses to avoid ambiguity when comparing the user name (username) field in the parent query with the user name (subUserName) field in the subquery.
Simple to Sophisticated
LogParser is a powerful tool that lets you use SQL-like queries to scan any type of log so that you can find the information you need without wading through thousands of irrelevant log entries. In addition to the functions I've described, LogParser offers other functions for manipulating strings, dates, and number fields. You can scan multiple logs with one query and even output your results to multiple files. LogParser also supports aggregation functions that let you perform high-level analysis involving time periods, averages, minimums and maximums, and top X records for given criteria. In upcoming articles, I'll show you how to use such functions to analyze the valuable details and codes within security events' Strings field.