Practical SQL Server
traffic orange hazard lights

Tracking Live SQL Server File Stalls

Given that SQL Server workloads are commonly I/O bound, a great way to get a sense for how a given server is keeping up with its current workload is to evaluate how long the I/O subsystem is taking to respond to requests by SQL Server to read or write data.

Related: I/O I/O It's Why My Server's Slow

File Stalls: A Quick and Dirty Overview

A common mechanism to retrieve this information is simply to ask SQL Server for this info (which it keeps tabs on—thanks to the fact SQL Server's own, internal, scheduler keeps tabs on how long I/O requests take to be satisfied by the OS)—which you can do with a query like the following:

	DB_NAME(vfs.database_id) [db_name], 
    io_stall_read_ms / NULLIF(num_of_reads, 0) avg_read_latency, 
    io_stall_write_ms / NULLIF(num_of_writes, 0) avg_write_latency,
    physical_name [file_name], 
    io_stall / NULLIF(num_of_reads + num_of_writes, 0) avg_total_latency
	sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs 
    JOIN sys.master_files AS mf 
		ON vfs.database_id = mf.database_id AND vfs.FILE_ID = mf.FILE_ID 
	avg_total_latency DESC;

From such a query (which will run instantaneously on your server), you'll be able to evaluate what are commonly referred to as 'file stalls'—or the amount of elapsed time (expressed in milliseconds) that SQL Server spends waiting on (stalls against) the I/O subsystem before individual read/write operations are complete.

NOTE: The query above is quite simplifiedas there are a large amount of other details that COULD be (and commonly ARE) evaluated to get a better idea of what's going on with IO requests. For more info see Books Online.

In terms of evaluating file stall metrics, an ideally suited server (or I/O subsystem) will show latencies of 0-8ms for reads and writes against data files, and 0-4ms for writes and reads against log files. And while 0ms might seem to be an impossible feat, the reality is that an I/O subsystem with plenty of spindles AND cache can very regularly hold its own at 0-2ms even over long periods of time.

Many production systems, however, don't quite find themselves in the range of the ideal 0-8ms and 0-4ms specified above. And, to that end, servers running in/around 10-20ms can commonly 'handle themselves' just fine in MANY (not all) cases. Otherwise, as numbers start climbing up and above the, say, 20ms range on a repeated basis it's either time to start tuning indexes and code as needed or throwing more hardware at your workload in many cases.

Caveat: Aggregate Numbers Can be Misleading

All that said, I've got some clients with what I'd call finely tuned servers whose file stall metrics commonly climb above 100ms, 500ms, and even—occasionally, 2000ms (or two full seconds) across multiple databases on the same server—yet I'm NOT worried about disk performance on their systems at all.

On the surface it sounds crazy for me to say that average stalls of a tenth-of-a-second, a half-of-a-second, or even two full seconds against multiple databases would be 'fine.' But, when you consider that these are aggregate numbers—or metrics collected over long periods of time, this means that IO saturation at certain points or times during the day can dramatically inflate stall numbers. (Note that I'm NOT saying that the numbers are incorrect or invalid—LONG stalls are, indeed, accurately inflating these metrics. Instead, I'm stating that when carefully analyzed, these numbers MIGHT not be a problem in SOME environments.)

So, for example, with one of these clients in question, a large number of maintenance routines (both in terms of backups, index defrags/rebuilds, stats maintenance, and then a HOST of aggressive 'business jobs' running late at night (on a heavily-used multi-tenant system)) cause some decent, and ugly, bottlenecks against the underlying I/O subsystem. But, in this case, the bottlenecks are all against the data files (as there's a lot of row-by-row processing in larger tables as well as a gob of deletions from larger and older auditing tables) and so the sluggishness really isn't a problem. At all.

Of course, if the jobs took too long to complete or weren't completing within their specified 'maintenance window' then this would require some additional effort or hardware. But until then, if the IO subsystem is getting 'worked' fairly hard every night but stays fast and responsive during the day, that happens to work fine for this particular workload.

Checking on Live File Stall Metrics

To that end, the point of this post is just to call out that looking at aggregate metrics can be misleading—or might not tell the whole story. As such, whenever I find above-ideal metrics for file stalls, I always like to take a peek at 'real time' stalls—to get a better feeling for what the numbers are like when the system is under load or running during 'peak' periods of the day (where 'peak' is defined as when the highest number of end-users are using and depending upon the system and when it needs to be as responsive as possible). And to do that there are two tools that can be used. First and foremost is Performance Monitor, or perfmon, which can be used to look at Physical Disk Avg sec/Read and Avg sec/Write—across all drives. And since this counter tracks milliseconds it's pretty easy to get a quick feel if you are, indeed, seeing 'peak' stalls > 0-4ms or 0-8ms across your drives/workloads.

Then, if you are running into any IO problems (or if you'd prefer to not drop into perfmon), the following query can be very valuable as a means of tracking file stalls over a short amount of time:

DECLARE @Reset bit = 0;
IF NOT EXISTS (SELECT NULL FROM tempdb.sys.objects 
WHERE name LIKE '%#fileStats%')  
	SET @Reset = 1;  -- force a reset

IF @Reset = 1 BEGIN 
	IF EXISTS (SELECT NULL FROM tempdb.sys.objects 
	WHERE name LIKE '%#fileStats%')  
		DROP TABLE #fileStats;

		io_stall_write_ms, io_stall
	INTO #fileStats 
	FROM sys.dm_io_virtual_file_stats(null, null);

	DB_NAME(vfs.database_id) AS database_name, 
	--vfs.database_id , 
	vfs.FILE_ID , 
	(vfs.io_stall_read_ms - history.io_stall_read_ms)
	 / NULLIF((vfs.num_of_reads - history.num_of_reads), 0) avg_read_latency,
	(vfs.io_stall_write_ms - history.io_stall_write_ms)
	 / NULLIF((vfs.num_of_writes - history.num_of_writes), 0) AS avg_write_latency ,
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs 
		JOIN sys.master_files AS mf 
			ON vfs.database_id = mf.database_id AND vfs.FILE_ID = mf.FILE_ID 
		RIGHT OUTER JOIN #fileStats history 
			ON history.database_id = vfs.database_id AND history.file_id = vfs.file_id
ORDER BY avg_write_latency DESC;

Because, as you can see, this query just uses a simple 'trick' to record file stall metrics as they are now (or over time), and then 'deltas' subsequent requests against that number so that you can more or less watch file stalls in real time. Accordingly, I like to use this query to watch what file stalls look like for a few minutes, or even 10-of-minutes during peak load, and then even reset the deltaing process by setting @Reset = 1, to determine if there are any usage patterns that would clearly define which databases might be causing excessive I/O requests and so on.

Related: Troubleshooting Slow Servers

Hide comments


  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.