Stop Non-Database Deadlocks
Understand why you're finding "deadlock" messages in your Application log - and what you can do about them.
By Don Kiely
Deadlocking has long been a problem with databases. Two processes each try to get an exclusive lock on a resource that the other already holds exclusively. Each sits waiting patiently for the other process to release the resource it needs, which never happens. Modern database engines typically handle the situation by detecting the standoff and picking one process as the loser, often the one whose work is cheapest to undo. If the work is within the context of a transaction, finished but uncommitted work is rolled back.
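To make the standoff concrete, here is a minimal Python sketch of how a deadlock can be spotted as a cycle of processes waiting on one another. It is an illustration only, not how any particular database engine works: the function names, the one-edge-per-process "wait-for" map, and the rollback-cost figures used to pick a loser are all assumptions made for the example.

```python
from typing import Dict, List, Optional


def find_deadlock(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return the processes that form a wait cycle, or None if there is none.

    waits_for maps each blocked process to the process that holds the
    resource it needs (a simplified wait-for graph, one edge per process).
    """
    for start in waits_for:
        path: List[str] = []
        current = start
        # Follow the chain of "who is waiting on whom" until it ends or repeats.
        while current in waits_for and current not in path:
            path.append(current)
            current = waits_for[current]
        if current in path:
            # The chain looped back on itself: everything from that point is the cycle.
            return path[path.index(current):]
    return None


def pick_victim(cycle: List[str], rollback_cost: Dict[str, int]) -> str:
    """Choose a loser for the cycle (hypothetical policy: cheapest to roll back)."""
    return min(cycle, key=lambda process: rollback_cost[process])


if __name__ == "__main__":
    # Process A waits on B while B waits on A: the classic two-party deadlock.
    cycle = find_deadlock({"A": "B", "B": "A"})
    if cycle is not None:
        print("deadlock among:", cycle)
        print("victim:", pick_victim(cycle, {"A": 10, "B": 3}))
```

Once the victim's work is rolled back, the other process can acquire the resource it was waiting for and carry on.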
ASP.NET developers and admins usually are surprised to find a message in the Web server's Application log that looks something like this:
"aspnet_wp.exe (PID: 9999) was recycled because it was suspected to be in a deadlocked state"
where 9999 is the Windows process ID of the affected ASP.NET worker process. In the event log, it is an Error type entry with a source of ASP.NET 1.0.3705.0 (or whatever version of the framework you are using). What can be confusing is that although a deadlock on a database used by an application can cause this error - and is probably the most likely cause - the message itself has nothing to do with processes competing for resources. Instead, this travesty is a byproduct of one of ASP.NET's new features that makes the framework as robust and reliable as it is.
In ASP classic, if an application hung - whether because of some long database work or some other lengthy process - it stayed hung. You pretty much had to stop IIS and restart it, which generally is not considered a good thing on a production server. And to be on the safe side, you probably rebooted the server.
To solve this problem, ASP.NET includes a new feature that recycles the worker process automatically if it suspects the application is hung: It shuts down the process and restarts it. This isn't good for your application, but at least it affects only a small part of the server rather than every Web application running. This feature makes the service a lot more robust, but it also can be triggered by processes that take a long time to generate the requested page. When it recycles the process, it puts the Error event message in the Application log.
Some of the .NET marketing materials seem to imply that it uses some kind of magic, psychic power to "suspect" whether a process is deadlocked, but actually it is a simple timeout setting. In most cases, you have two solutions. One - and this is usually the best - is to simplify the page processing so it doesn't take as long. Sometimes this isn't feasible, such as for long, complicated database queries, but usually there are ways around it. For example, you could recode a stored procedure to do its work in chunks, returning bits of data back to the server periodically.
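The "suspicion" can be pictured as nothing more than a clock check. The Python sketch below is a simplified model, not ASP.NET's actual code: the WorkerStatus type, the suspect_deadlock function, and the rule that requests must be pending are assumptions for illustration, and the 180-second default simply borrows a typical figure for this kind of timeout.

```python
from dataclasses import dataclass


@dataclass
class WorkerStatus:
    """Hypothetical snapshot of a worker process, as a watchdog might see it."""
    pending_requests: int
    seconds_since_last_response: float


def suspect_deadlock(status: WorkerStatus, timeout_seconds: float = 180.0) -> bool:
    """Return True when requests are pending and none has been answered in time.

    No psychic powers involved: a slow page and a truly hung page look
    exactly the same to a check like this.
    """
    return (
        status.pending_requests > 0
        and status.seconds_since_last_response >= timeout_seconds
    )


if __name__ == "__main__":
    slow_report = WorkerStatus(pending_requests=1, seconds_since_last_response=200.0)
    idle_worker = WorkerStatus(pending_requests=0, seconds_since_last_response=900.0)
    print(suspect_deadlock(slow_report))  # a long-running page trips the check
    print(suspect_deadlock(idle_worker))  # an idle worker does not
```

Seen this way, both remedies are obvious: make the page answer sooner, or allow it more time.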
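The other option is to change the executionTimeout setting in the <httpRuntime> element hidden deep in the bowels of machine.config. It's probably set to 180 seconds, although framework updates seem to change that to 90 seconds. A related setting in the same file, the responseDeadlockInterval attribute of the <processModel> element, defaults to three minutes and governs how long the worker process can leave queued requests unanswered before it is recycled. You can increase the timeout to longer than the process normally takes, but doing so will increase it for all ASP.NET apps on that machine. If an ASP.NET app really does freeze or die, it'll take that much longer for ASP.NET to recycle the worker process.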
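If you want to check the current value without scrolling through the whole file, a few lines of script will do. The following Python sketch reads and changes the executionTimeout attribute in a trimmed, made-up fragment shaped like machine.config; the SAMPLE_CONFIG text and both function names are assumptions for illustration, and a real file contains far more than this.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# A trimmed, hypothetical fragment; a real machine.config is much larger.
SAMPLE_CONFIG = """<configuration>
  <system.web>
    <httpRuntime executionTimeout="90" />
  </system.web>
</configuration>"""


def get_execution_timeout(config_xml: str) -> Optional[int]:
    """Return the executionTimeout value in seconds, or None if it is not set."""
    root = ET.fromstring(config_xml)
    node = next(root.iter("httpRuntime"), None)
    if node is None or node.get("executionTimeout") is None:
        return None
    return int(node.get("executionTimeout"))


def set_execution_timeout(config_xml: str, seconds: int) -> str:
    """Return a copy of the XML with executionTimeout set to the given seconds."""
    root = ET.fromstring(config_xml)
    node = next(root.iter("httpRuntime"), None)
    if node is None:
        raise ValueError("no <httpRuntime> element found")
    node.set("executionTimeout", str(seconds))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print("current timeout:", get_execution_timeout(SAMPLE_CONFIG))
    updated = set_execution_timeout(SAMPLE_CONFIG, 300)
    print("new timeout:", get_execution_timeout(updated))
```

Treat this as a way to inspect the setting rather than a deployment tool: re-serializing the file this way discards its comments, so make the real change by hand and keep a backup copy.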
In general, you want this setting to be as short as possible. If all your ASP.NET applications do only simple page processing - not hitting a database or other limited resource and performing only the simplest calculations - you might even want to shorten the setting from the default. This way, if an ASP.NET app deadlocks, the worker process will be recycled more quickly.
If you find the deadlock event in your Application log, the first step is to explore your ASP.NET app and try either or both of these options. But there's a complication: Sometimes the ASP.NET worker process recycles unexpectedly. In other words, it recycles when no deadlock state exists. Microsoft Knowledge Base article Q321792, FIX: ASP.NET Worker Process (Aspnet_wp.exe) Is Recycled Unexpectedly, identifies this as a bug resolved in Service Pack 2 for version 1.0 of the Microsoft .NET Framework (read the article at http://support.microsoft.com/default.aspx?scid=kb;en-us;321792).
If you need to explore the issue further, the Knowledge Base article suggests two ways to confirm or rule out a deadlocked state. The first is to open the Windows Performance Monitor and add the Requests Executing counter for the ASP.NET Applications performance object. If the number of requests executing is greater than zero at the time of the recycle, you are experiencing a deadlock. The second is to attach a native debugger to the Aspnet_wp.exe process and dump out the threads. If any of the threads are processing a request, you are experiencing a deadlock. In this case, examine the thread processing a request to determine what is causing the request to stop responding (hang).
The moral of this story is nothing new to seasoned developers and admins: Check your event logs regularly (or use a third-party tool to keep on top of them) and apply service packs as they come out. But neither should be done indiscriminately.
Don Kiely is senior technology consultant for Information Insights, a business and technology consultancy in Fairbanks, Alaska.