WPP Tracing with !wmitrace and the Debugger

Expanding on the theme of Windows software trace preprocessor (WPP)–style ETW tracing from my September article "WPP Tracing in Visual C++ 2010 Projects," I would like to show you how you can use the !wmitrace extension within WinDbg, which is the most powerful debugger available for Windows included in the Windows Driver Kit (WDK). In that article, I showed you how you could integrate WPP-style tracing into your Visual Studio 2010 projects, giving you a fast, efficient, and platform-supported way of tracing code execution in released products. Using the Event Tracing for Windows (ETW) support built into the OS, one can turn on tracing at any time before or during the execution of the code in question and gather useful traces.

I showed you how you could use tracelog.exe (included in the WDK) to start and stop WPP traces, saving them to an ETL file that you could then collect from the field and later analyze. Tracelog.exe is merely an ETW controller and is one of several available. You could even write your own. However, the ETW controller included in-box with Windows is logman.exe, and its arguments are slightly different from the tracelog.exe arguments. If you are requesting a customer to collect a trace for you, you don't want to require them to download the WDK first, so in this article, I will mostly be using logman.exe as the ETW controller.

But what if you don't have the luxury or opportunity to even create an ETL log file in the first place? What if the machine is bug-checking (blue-screening) during the test repro? In that case, you certainly won't have the opportunity to save the trace to an ETL file. So what can you do? Let's explore the answer to that question.

Tracing and Non-Paged Pool
When you start a trace using logman.exe with the proper arguments, you can instruct ETW to use an in-memory buffer allocated from the non-paged kernel pool. ETW is meant to be efficient as possible, and in achieving that goal, it is allowed to drop events if it cannot log them fast enough. We certainly would not want the machine to bog down simply because it needed to log events. However, there are some steps you can take to mitigate this problem, and that involves instructing ETW on how big it should make its buffers in memory. In this way, you can ensure that the memory is pre-allocated and available for use. But even more important is that if the machine bug-checks and you have it set up to generate kernel or complete dumps, or if you have a kernel debugger attached to it, you can access those buffers directly from the debugger!

Note: I am assuming that you're using Windows 7; however, just about all this material applies directly to Windows Vista as well. Prior to Vista, your mileage will vary since ETW has evolved significantly.

Let's use poolmon.exe (included in the WDK) to demonstrate the buffer use by ETW. Open a WDK build environment command prompt from the Start menu. Since my machine is 64-bit, I opened up the x64 Checked Build Environment command window. You will find poolmon.exe under Tools\Other\platform. Be sure to use the one aligned with your platform.

By default, Poolmon runs in an auto-update mode where it periodically refreshes its display. The ETW buffers allocated from nonpaged pool have a pool tag of EtwB. Therefore, try executing the following Poolmon command:

poolmon -iEtwB

Poolmon will then show you all EtwB pool allocations currently in use, and you may be surprised if you see an ETW allocation in a nonpaged pool even though we haven't started any tracing. This is because, Windows typically sets up some default ETW traces at install time, and they are continuously running.

Now, with Poolmon still running, start a new elevated command prompt and execute the following command to start a new ETW collector:

logman create trace test -p Microsoft-Windows-NDIS -nb 50 50 -bs 1000 -o c:\temp\log.etl -ets

This command creates a new collector named test using the ETW provider Microsoft-Windows-NDIS, but pay close attention to the remaining arguments. The -nb option sets the minimum and maximum amount of nonpaged buffers to use. Since I set both min and max to 50, it means that all 50 buffers will be allocated at once, and I will be less likely to lose events since ETW will not be growing the buffer dynamically. The -bs option with the argument 1000 sets the size of a single buffer to 1MB. Therefore, after executing this command, you should see the Poolmon display reflect the increase in size of 50MB for the EtwB nonpaged allocation. Notice that I provided the -o option, telling it where to save the ETL file, but this step is optional if you plan to use only the in-memory buffers. Additionally, if you plan to let the trace run indefinitely and only want to capture the traces just prior to the point of interest, consider using the -f bincirc option to log to a circular buffer.

To stop the collection of events, execute the following command; you should see Poolmon reflect the buffer deallocation:

logman stop test -ets

Also, you can query which ETW collectors are running at any time by executing the following:

logman query -ets

And finally, you can query specific information about the collector we created earlier by executing the following:

logman query test -ets

The demonstration utilizing the Microsoft-Windows-NDIS provider uses manifest-based ETW logging. That means that each type of event logged by the provider has its type clearly defined in a manifest. This is in stark contrast with the WPP-style ETW tracing, which is more suitable for developers since the format of the event payload is free form, similar to the arguments passed to a printf function call.

One of the easiest ways to view a manifest-based ETW log is to simply open it within the Event Viewer. When you launch Event Viewer, select the Action, Open Saved Log menu item. After you point it to the file collected previously, it will ask you if you want to convert it to the new format. Say yes, then give the log a name and click OK. Once you do so, you should see Event Viewer presenting a friendly view of the event log to you. Event Viewer is consuming the registered manifest for that provider in order to render the log in human language.

Alternatively, you can use tracerpt.exe (also included in-box) to convert the log to XML or CSV format such as the following:

tracerpt log.etl -o log.xml -of XML

And finally, LogParser is a much more powerful tool, letting you even perform queries again the ETL files. You can find it at the Microsoft Download Center.

WPP and !wmitrace
For the rest of this article, I'll assume that you've read "WPP Tracing in Visual C++ 2010 Projects" and have built a sample application as described in that article. Now I want to show you how you can use the kernel debugger (WinDbg) along with !wmitrace to control and examine your WPP traces. For the purposes of this example, you can set up a live kernel debug session where you either debug a physical target or a virtual machine. However, you'll probably want to debug a physical target with the 1394 transport to facilitate siphoning trace data efficiently. Unfortunately, using a serial debug transport also emulates the slow speed of serial as well.

Once you get your live debug set up, the first thing you'll want to do is bring up the !wmitrace help by executing the following:

!wmitrace.help

The help is fairly explanatory once you understand the concepts of ETW. To display all the currently executing collectors, perform the following from the debugger command line:

!wmitrace.bufdump

The output you see is effectively identical to what you see when you execute the following logman.exe query on the target machine itself:

logman query -ets

Not surprisingly, you can actually use !wmitrace from within the kernel debugger as an ETW controller similar to logman.exe and tracelog.exe; however, I won't be covering that capability of !wmitrace within this article.

Accessing ETL from WinDbg
After you have a working sample app running on the target machine using the techniques from "WPP Tracing in Visual C++ 2010 Projects," start a tracing session using logman.exe. What I like to do is use the -pf option, where I can provide a file with the provider GUIDs as well as the flags and levels. Therefore, I created a file named traceguids.txt, which contains the following single line:

\\{28EE579B-CF67-43b6-9D19-8930E7AAA131\\} 0xffffffff 0xf

Note: This GUID is the WPP trace provider GUID I generated when I created the WPP sample application in "WPP Tracing in Visual C++ 2010 Projects." Your GUID should be different. The curly braces surrounding the GUID are important if you want to avoid the cryptic Element not found logman.exe error.

Now you can start a new trace on the target machine with a command similar to the following, where I have created a 10MB buffer in non-paged pool:

logman create trace test -pf traceguids.txt -nb 10 10 -bs 1000 -ets -o trace.etl

Once the trace has started, execute the sample application and do what it takes to generate some trace logs. The example app from "WPP Tracing in Visual C++ 2010 Projects" logs Windows messages, so simply starting it and shutting it down will do. Once you've done this, break into the kernel debugger.

Note: If you don't have a kernel debugger currently attached, you can set up the machine to generate either kernel or full/complete dumps on crash and manually bug-check the machine using the Ctrl+Scroll+Scroll technique or any other available technique for bug-checking the machine. Once you open the dump file in the debugger, you can do most of the same things as I'll describe shortly, except obviously you cannot use any of the modification commands of !wmitrace.

Once inside the kernel debugger, execute a !wmitrace.bufdump command to see the current event collectors. You should see one named test, which was created by the previous Logman command. The output from the command should look similar to that in Figure 1, where I have removed the other collectors present in the list.

You'll notice that many of the !wmitrace commands accept an argument named LoggerId, which for this example is test. So, to save the ETL buffer to an ETL file, simply use!wmitrace.logsave as follows, where I have saved the trace into a file named trace.etl:

!wmitrace.logsave test trace.etl

As with other tools written for developers by us developers, with !wmitrace there is a nuance to be aware of: !wmitrace.logsave does not like spaces in filenames.

After you've saved the ETL in a file, creating human-readable output is the same as described in "WPP Tracing in Visual C++ 2010 Projects." You can use either tracefmt.exe from the WDK or tracerpt.exe included in-box. Once you're finished, be sure to turn off tracing on the target by executing

logman stop test -ets

Other Techniques Using !wmitrace
If you don't want to go through the extra step of saving the ETL to a file from within the debugger, and if you know where the TMF files are located (which you typically do), then you can use !wmitrace.logdump to list the buffer contents from within the debugger. To do so, use a command similar to the following, where I have pointed directly to my TMF file:

!logdump test -tmf C:\temp\x64\Debug\79f68268-b174-1e2b-afa7-499d2f05bce1.tmf

Alternatively, you can use the !wmitrace.searchpath command to augment the debugger's TMF search path, or can also set the TRACE_FORMAT_SEARCH_PATH environment variable prior to starting the kernel debugger.

As a developer, sometimes you want to see your WPP trace messages in real time within the debugger. To do so, after informing the debugger of where your TMF files are, you can start a trace session using the following tracelog command with -kd and -rt options:

tracelog -start test -guid #28EE579B-CF67-43b6-9D19-8930E7AAA131 -flag 0x00ffffff -level 0xf -kd -rt

The -kd option coupled with the -rt option sets up the kernel debugger filter so that your trace messages will display in the debugger output window. You'll notice some slight latency of those messages caused by the timing of the buffer flush to the kernel debugger.

Finally, you may have noticed some familiar names in the list of collectors in the !wmitrace.bufdump output. Notice the output in Figure 2 showing the Security, Application, and System event log buffers in memory.

If you're debugging a kernel dump from an end user's machine and want to know if anything catastrophic was logged to the event log prior to crashing, you may be able to find it in these buffers. I like to save them to disk using !wmitrace.logsave, then open them in the Event Viewer. The Event Viewer will require the manifests for the events to be registered on the machine from which you're viewing. However, if your OS version and the OS version of the target machine are the same, you should have relatively little trouble reading the event logs.

Another Useful WPP Tool
This series of articles up to this point have demonstrated the usefulness of ETW and WPP. By using the !wmitrace debugger extension, we bring the debugger into the fold so it can take its rightful place in the WPP toolbox. With !wmitrace, you don't have to worry about losing your trace in event of a catastrophic bugcheck of the target machine. As long as the machine is set up to generate kernel or full/complete memory dumps, you'll be able to access the trace logs during dump analysis. Happy debugging!

Trey Nash is a senior escalation engineer at Microsoft working on the Windows OSs and various other products and is the author of Accelerated C# 2010 (Apress). When not working feverishly within the bowels of the OS, he is delivering training on .NET platform and kernel-mode debugging or playing ice hockey.

Comments

Plain text