Profiling and Stack Tracing with Event Tracing for Windows

In a previous article, "WPP Tracing in Visual C++ 2010 Projects," I described how you can use Event Tracing for Windows (ETW) via Windows software trace preprocessor (WPP) tracing to insert free-form tracing that you can put into the release build of your application. In some cases, though, it makes sense to use manifest-based ETW, which was introduced in Windows Vista. Instances of manifest-based ETW are also known as Crimson events, which is the term I will use in this article. The important differences between ETW via WPP and Crimson events include the following:

Crimson event schemas are published via XML and are registered on the target machine, allowing users to easily decode them, whereas decoding of WPP tracing requires data from private symbols. Therefore, WPP is more suitable if the trace messages contain proprietary information. Regardless, avoid tracing any customer's private data.
On Windows 7 and later versions, you can capture stack traces each time a Crimson event fires, but you cannot create such useful data by using WPP tracing.
You can use Crimson events to implement targeted code profiling. Although you can perform profiling with WPP events too, the stacks gathered with Crimson events are more useful.
You can use the Event Viewer to view Crimson events, but not WPP events. Surfacing Crimson events is done via a technology called channels, which I will not cover in this article. I encourage you to read more about channels at tinyurl.com/3c4589t.
WPP events are arguably more versatile during the coding phase because they are free form and use a PRINTF-formatted string, whereas each Crimson event must be clearly defined in a manifest.

You can use Crimson events for profiling as well as for capturing stack traces at specific points in your code. Stack trace capturing is a useful tool for anyone who creates library code or any other code that others consume. Library code writers know their code may be invoked in a variety of unpredictable ways.

In my article, "Sample Profiling with Xperf," I describe how you can use Xperf as a simple sample-based profiler. Sample profilers are good tools for determining where your code is generally spending its time, but are not very useful for timing functions or loops. Instead, you can use Crimson events to perform both of those profiling tasks.

The Demo Code
To demonstrate Crimson events in this article, I created a new MFC project in Visual Studio 2010 and accepted all the default settings in the new project wizard. If you want to follow my steps here, create a new MFC project now.

You create an XML manifest file to define your events. You can read more about how to write a manifest file here. Writing raw XML, even with IntelliSense, can be an error-prone affair. Therefore, you may want to start by using the ECManGen.exe graphical editor provided by the Windows SDK, which is also bundled with Visual Studio. After you generate an XML file, you may find it easier to tweak the file in the Visual Studio editor. In my project directory, I created a subdirectory named man where I saved my manifest file as My-Test-Diagnostics-Provider.man.

Define Your Provider
The first thing you need to do is define your provider. This important step enables the various ETW controllers to control your provider. Defining a provider is easy: In ECManGen.exe, select the Events section, then select New Provider in the far-right column. To declare a provider, follow these steps:

1. Name your provider. I used the name My-Test-Diagnostics-Provider.

2. Create a GUID to uniquely identify your provider.

3. Provide a symbol name for your provider. (This will appear in the generated code.) I named mine MyDiagnosticsProvider.

4. Provide a full path to your binary, which will contain the generated resources in both the Resources and Messages fields in ECManGen.exe. I'll provide more information about this later in the article.

After performing these steps and saving the file, the provider section of the manifest file appears as follows:

Define Your Events
After you create your provider, you need to define your events. However, you must first create what are called templates for any events that accept arguments. A template describes a collection of arguments. An event can then declare that it uses one of the defined templates, thereby indicating that it accepts arguments. In this case, I will create a pair of events for monitoring function entry and exit, each of which accepts a Unicode string representing the function name as an argument. With the provider selected in ECManGen.exe, select New Template to create the new argument template. I named the new template T_UnicodeFunctionName, and I added a field named FunctionName that had an InType of win:UnicodeString and an OutType of xs:string.

Next, we're ready to define the events themselves. Select your provider in the ECManGen.exe application, then select New Event. I created an event named FunctionEntry with an event ID of 101. Additionally, I selected the T_UnicodeFunctionName template, which defines the arguments for this event. I then created a second event named FunctionExit with an event ID of 102 that uses the same template as FunctionEntry.

Finally, I added a PRINTF format message string to my events, which I use to reference the parameters within the template, using substitution specifiers starting with %1. The strings themselves are referenced from a string table in the XML, which also supports localization for international support. When you save your manifest, ECManGen.exe generates a string table that is localized as EN-US.

After declaring my events and saving the file from ECManGen.exe, my complete manifest file looks like Figure 1.

Compile Your Manifest File
Message Compiler (mc.exe) processes your manifest file to generate the headers and resource files that your project consumes. There are several ways to work this into a Visual Studio project. Starting with Visual Studio 2010, C++ projects use the MSBbuild engine, which includes well-defined extension points. But for this example, I used the prebuild event hook point. (In the project's Property Pages, navigate to Configuration Properties, BuildEvents.) I added the following commands to the project:

if not exist "$(ProjectDir)\generated" md "$(ProjectDir)\generated"
mc.exe -um "$(ProjectDir)\man\My-Test-Diagnostics-Provider.man" -z MyDiagnosticEvents -h "$(ProjectDir)\generated" -r "$(ProjectDir)\generated"

Notice that I am saving my generated files into a subdirectory that is named generated. If the generated subdirectory under the project directory does not exist, I create it, and then I start mc.exe. The -um argument tells mc.exe that I want user-mode code (Crimson events can also be written from kernel mode), the -z argument provides the base name of the generated files, the -h argument specifies where to put the generated C++ header file, and the -r argument specifies where to put the .rc file that contains the generated resources.

Register the Provider at Runtime
The first thing you must do within your code is register the provider at runtime. This is simple, given the generated functions and macros provided by compiling the manifest. Based on my manifest, the two macros generated for me are EventRegisterMy_Test_Diagnostics_Provider() and EventUnregisterMy_Test_Diagnostics_Provider().

Because my demo application is an MFC application, a convenient place to perform this registration and unregistration work is within the constructor and destructor of the CWinAppEx derived class. I named my sample MFC application TestApp, so the methods I need to modify are members of the CTestAppApp class. I invoked EventRegisterMy_Test_Diagnostics_Provider() at the very beginning of CTestAppApp::CTestAppApp. And by default, a destructor was not generated for CTestAppApp by the MFC Project Wizard, so I added one and invoked EventUnregisterMy_Test_Diagnostics_Provider() from within CTestAppApp::~CTestAppApp. Finally, for this to work, I had to #include the generated header MyDiagnosticEvents.h at the beginning of TestApp.cpp (your header will be named differently, depending on what you named your demo application).

Don't Forget the Resources
Mc.exe generates an .rc file when it processes your events manifest. You must compile this .rc file into your project at an appropriate location. Because my MFC demo application already has an .rc file (named TestApp.rc), I used #include at the bottom of the file to include MyDiagnosticEvents.rc in the generated .rc file.

The resources are important when it comes time to post-process or render your events. ETW logs events in a very compact and efficient way, so that you must use a rendering tool such as Tracerpt or Xperf to convert them into human-readable events. If ETW cannot find your resources during the rendering phase, your events will look rather generic and arguments to your events will be displayed as a binary blob of data. Terefore, the resources contain information that the rendering engine needs in order to transform binary data into the arguments for the events that you defined in the templates.

In my demo application, TestApp.exe, I built the resources into the .exe file. However, notice that in your manifest, you must provide the complete path of the file that contains the resources. For the sake of the demo, I assumed my demo application would be in the C:\test directory on my machine. This poses a problem because most application installers allow users to install the application anywhere they please. You don't want your installer to have to doctor your manifest file each time someone installs your application! Therefore, it's a good practice to build your resources into a resource-only .dll file and to install that file in the \Windows\System32 directory on the boot volume.

Fire Events As Needed
Now that everything is in place, it's easy to include the generated MyDiagnosticEvents.h file and then invoke the generated eventing macros at strategic points of the code. The two events I created are called FunctionEntry and FunctionExit. If you look in the generated header file, you will see macros named EventWriteFunctionEntry() and EventWriteFunctionExit() that both accept one parameter named FunctionName—which, not surprisingly, is the name of the single Unicode string field within the T_UnicodeFunctionName template in the manifest file.

I was curious to see how long would take to execute the CMainFrame::OnCreate() method in my test application. Therefore, at the beginning of the method, I added the following line:

EventWriteFunctionEntry( L"CMainFrame::OnCreate" );

And at the end of the function, just prior to the return statement, I added the following line:

EventWriteFunctionExit( L"CMainFrame::OnCreate" );

Register Your Manifest
After you compile the test application and are ready to test, you must register your manifest on the machine from which you want to gather events. This is something you can have your application installer do. To do this, use the Wevtutil.exe file. You use the im argument (mode) to install your manifest, and you use the um argument (mode) to uninstall. Therefore, to install my manifest, I used the following command:

wevtutil im My-Test-Diagnostics-Provider.man

After you register your provider, it will show up when you perform a provider query, such as in the following command:

logman query providers | findstr "My-Test"

Collect Events
You can use any ETW controller to collect your events. This includes logman.exe (which is installed with the operating system), tracelog.exe (which is included in the WDK), xperf.exe, and even the system Event Viewer if you have also defined a channel for your events. You can even write your own ETW controller that calls the appropriate API functions as necessary to start and stop tracing. Which one you choose depends on the situation and on how difficult it is to put the tools on the machine from which you need to collect events.

I will demonstrate how to use the Xperf tool, but first understand that there is a huge overlap in functionality between all the ETW controllers. To capture events for the provider that we created using Xperf, run the following command:

xperf -start MySession -on My-Test-Diagnostics-Provider -f MySession.etl

The -start argument gives the event collection session a name, the -on argument tells ETW we want to collect events from our provider in this session (there can be multiple -on arguments), and the -f argument specifies the file in which to save those events. The test application does not have to be already started to initiate an event collection session. In fact, you can initiate event collection on startup if you create an AutoLogger session, as described here.

Now, start and then close the MFC test application, which should generate the two events bracketing CMainFrame::OnCreate(). To stop event collection by using Xperf, run the following command:

xperf -stop MySession

This will stop the session and save any events to the MySession.etl file.

Note: You can get really fancy with buffering in ETW. Xperf and the other ETW controllers allow you to control the size of the buffer, whether it is in paged or nonpaged memory, whether it is linear or circular, and so on. For example, if you want to set up a session so that you can see the latest events logged before a crash generates a kernel dump, you can set up a nonpaged circular buffer. Then, after you load a kernel dump in WinDbg, you can use the !wmitrace extension to access the ETW buffers in memory. You can inspect them in the debugger or save the buffer to a file where it becomes just like any other .etl file.

Render the Events
Now that we have some events in MySession.etl, we need to turn it into something that is human readable. If you consider XML to be human-readable, you can use the tracerpt command, as shown here:

tracerpt MySession.etl

This is where it is very important that ETW be able to access the resources I mentioned earlier. When you run the Tracerpt command, a Summary.txt file and a Dumpfile.xml file are created. If you look at these two files, you should see our two events in the XML, near where the parameters and the message strings appear.

Alternatively, you can run Xperf with MySession.etl to display the Crimson events in a graph named Generic Events. In that graph, you can filter the event providers if you traced multiple providers in the session. In the graph, if you rest the mouse pointer over the diamond representing the event, you see an overview of the event, including the event data and other information.

Collect Stack Traces on Windows 7
The ability to capture stacks when Crimson events fire on Windows 7 is extremely beneficial. Following the Xperf -on argument, you can add flags, levels, and the optional keyword "stack," separated by colons, as shown in the following command (which does not have any specific flags or levels):

xperf -start MySession -on My-Test-Diagnostics-Provider:::'stack' -f MySession.etl

If you then re-run the test application, stop event collection, and render, you will see the stack data. However, the stacks will not be human-readable; they will contain only the raw function addresses.

Xperf is the best tool to use to render the stack data because you can point it to your symbols and to the Microsoft public symbol store. However, even Xperf cannot render these stacks. That is because Xperf is a state engine that is missing some crucial data necessary to translate the addresses to readable symbol names. We need to provide Xperf with the location at which the modules are loaded into memory. Fortunately, we can start a Kernel Logger session concurrently with our test provider and thereby collect module load events to give Xperf the data it needs. I also like to include process thread information, so that I can see which thread was executing when an event was fired.

Note: Be careful about which events you collect in a Kernel Logger session! For example, CSWITCH events generate a lot of events, and they'll fill a log file quickly. Collect only what is necessary.

To start both sessions at the same time, run the following command:

xperf -on LOADER+PROC_THREAD -start MySession -on My-Test-Diagnostics-Provider:::'stack' -f MySession.etl

This starts two sessions event though there is only one -start argument. This is because the Kernel Logger is the default session for Xperf and you don't need to name it. Therefore, the first -on argument is associated with the Kernel Logger, and the second -on argument is associated with MySession because it follows the -start argument for MySession. After starting the sessions, rerun the test application.

You must stop the sessions in the reverse order of how they are listed in the previous Xperf command, as follows:

xperf -stop MySession -stop -d merged.etl

The first -stop argument stops MySession and the second -stop argument (which has no session name) stops the Kernel Logger. Then, the -d argument merges the events from both sessions into one ETL file named Merged.etl. Note that the MySession events are still saved in MySession.etl.

Now, you can load the Merged.etl file into Xperf, set up the symbol paths in the Trace menu, and load the symbols. Be sure to point the symbol path to the Microsoft public symbol server and to the path containing the symbols for your MFC test application. The syntax for the symbol path is the same as the syntax we used in WinDbg.

Then, in the Generic Events graph, you can disable all providers except My-Test-Diagnostics-Provider. Next, right-click in the graph, then click Summary Table. Enable the stack column if it is not already enabled, and drag it to the left of the gold aggregation bar in the summary table view. When you expand the stack for our test provider, you should see the stack in all its glory. On my machine, the stack appears as it does in Figure 2.

The stack is displayed in the reverse order of what you usually see in the debugger. However, you can find the eventual call to ZwTraceEvent (which actually logs the event), and underneath that, you see the entire stack leading up to it. Looking at these stacks can help you understand how GUI apps work in Windows because you can see the stack bouncing back and forth between user mode and kernel mode as the window is created.

And finally, with the two events logged, you can compute the differential between the time stamps in the Generic Events graph to determine how long it took to execute CMainFrame::OnCreate().

Note: Manipulating summary tables takes some time to get used to. Experiment with moving things over to the left side of the gold aggregation bar to see what effect this has. Additionally, column ordering on the left side is important because items are sorted from left to right by each column's data type. For example, if you have the process column to the left of the stack column, events are first sorted by process and then by stack.

Overall Benefits of Crimson Events and ETW
Crimson events are an excellent tool for creating an optimally supportable product. They were introduced in Vista, partly to redefine the outdated system eventing mechanism. Moreover, ETW is fast and efficient, which lets you leave your event-generating code in the release builds of all but the most timing-intensive code. That is to say, the cost of ETW is measurable, but it is usually very miniscule. You can use Crimson events to perform diagnostic tracing, which is useful for administrators and developers. For example, if you produce a library that other developers use, and you want to help them troubleshoot their applications while protecting your intellectual property and your source code, ETW may be the solution for you.

Although I have demonstrated their use in a native MFC application, you can log Crimson events just as easily in managed .NET applications, and the process is essentially the same. To learn more, check out the managed tracing sample in the Windows SDK, which you can find in the \Samples\winbase\Eventing\Provider\Simple\CSharp directory. Arguably, if it were not for ETW's efficiency and enabling of measurement, Windows 7 would not be the amazing product that it is today, and there is no reason your products cannot benefit as well.

Trey Nash ([email protected]) is a Microsoft senior escalation engineer working on the Windows OS and other products. He is the author of Accelerated C# 2010 (Apress). Trey also delivers training on the .NET platform and kernel-mode debugging.

Comments

Plain text