Don't Panic: Busting a File Space Myth

The Issue

Recently I had someone reach out to me concerned they were about to incur an auto-grow event on a volume where they're already running low on space. They only had megabytes of space remaining on a 5 terabyte-sized data file. This was not just a possibility - it was an inevitability.

Or was it?

A Quick Aside About Data Files and Filegroups in SQL Server

A Microsoft SQL Server database is composed of at least one data file and one transaction log file. Before we get to the discussion at hand I want to reinforce the 1:1 ratio between databases and transaction log files. Do not create multiple transaction log files in your databases.

Have I stressed that enough?

Good...

Moving on then: It's completely reasonable (and customary if not recommended) to have multiple data files associated with a database. The data files can be on the same or multiple logical volumes which may or may not be different physical volumes. When a data file is created it is assigned to a filegroup. A filegroup is nothing more than a logical storage construct to which one or more data files can be assigned. Multiple file groups can exist in a database.

SQL Server filegroups employ a proportional fill strategy. This means that by default data is written to all files in a filegroup proportionately to the amount of free space they have. However there is a valuable exception to this behavior and it's the crux of the solution I offered up to my friend in need.

"Don't Panic"

That is what I told my friend when he first came to me concerned that he was about to bring the production database to a standstill because the data file was full. I asked if the server hosting the database had connectivity to any other volumes. Finding out she did we were all set for a quick fix. I simply told her to create another file on the available volume and be sure to assign it to the same filegroup.

"But Tim" you may say, "with proportional fill you're still going to have at least some data getting written to the old file!"

Well that is where the second part of this solution comes into play: turn off autogrowth on the existing data file as soon as the new file is added. This ensures that data is still being written to disk and that there will be no new data written to the old, full file. Let's examine how this looks:

The first thing I'm going to do is to create a very small sample database. I'll already create the database with two data files and assign them both to the PRIMARY filegroup, which is created by default when a new database is born:

CREATE DATABASE [Fill_Er_Up] ON  PRIMARY 
	( 
		NAME = N'Fill_Er_Up', 
		FILENAME = N'C:\Data\MSSQL12.MSSQLSERVER\MSSQL\DATA\Fill_Er_Up.mdf',
		SIZE = 4MB, 
		FILEGROWTH = 1MB 
	),
	( 
		NAME = N'Fill_Er_Up_2', 
		FILENAME = N'C:\Data\MSSQL12.MSSQLSERVER\MSSQL\DATA\Fill_Er_Up_2.ndf', 
		SIZE = 1MB, 
		MAXSIZE = 1MB, 
		FILEGROWTH = 1MB 
	)
 LOG ON 
	( 
		NAME = N'Fill_Er_Up_log', 
		FILENAME = N'C:\Data\MSSQL12.MSSQLSERVER\MSSQL\DATA\Fill_Er_Up_log.ldf', 
		SIZE = 1MB, 
		FILEGROWTH = 10240KB 
	)
GO

ALTER DATABASE [Fill_Er_Up] SET RECOVERY SIMPLE 
GO

Next I'll create a table in the database and ensure it's created on the default, PRIMARY filegroup:

CREATE TABLE [Fill_Er_Up].dbo.Filling
	(
	Beefy char(5120) NOT NULL
	)  ON [PRIMARY]
GO

At this point I have everything I need for the test but before I start flooding the table with data I want to have a snapshot of before and after states of the files and their applicable sizes. This is the code I'm using for that and what the results look like while that table is empty:

USE [Fill_Er_Up];
GO

SELECT 
	[file_id],
	[name] AS [logical_name],
	physical_name,
	type_desc,
	CAST(CAST(size AS BIGINT)*8/1024 AS bigint) AS [size_mb],
	CASE is_percent_growth
		WHEN 0 THEN CAST((growth*8/1024) AS varchar(30)) +' Mb' 
		ELSE CAST(growth AS varchar(30)) + ' %' 
	END AS growth,
	CASE max_size
		WHEN -1 THEN 'Unlimited'
		ELSE CAST(CAST(max_size AS BIGINT)*8/1024 AS varchar(30)) + ' Mb' 
	END AS max_size,
	CASE max_size
		WHEN -1 THEN NULL
		ELSE (CAST(max_size AS BIGINT)*8/1024) - (CAST(size AS BIGINT)*8/1024)			--+ ' Mb' 
	END AS mb_before_full,
FROM sys.master_files
WHERE database_id = DB_ID('Fill_Er_Up')
ORDER BY 9 ASC
GO

State of files before loading data

Pay attention to the size_mb column and mb_before_full column here and at the end of this article. First lets load some data into the table. Enough that we should see proportional filling of the files:

INSERT INTO [Fill_Er_Up].dbo.Filling(Beefy)
VALUES	('halloween candy')
GO 10000

In order to load the data in I'm taking advantage of a little known secret about the GO batch identifier. If you place a numeric value after GO the batch that is bookended by GO commands will execute that number of times. In this case this will perform that INSERT command 10000 times. Enough to show growth in that table. What we should see is growth in the data files associated with the PRIMARY filegroup:

State of files after inserting 10,000 rows

This is where the myth of proportional fill comes into play: there was no change in size for the smaller, size-capped data file. SQL instead applied all data added to the data file that was not constrained to a maximum size.

In Conclusion

In the end my friend was happy that she was not going to have to explain to her boss that she wasn't keeping an eye on the database until it was too late. She also learned about options for separating IO and for scaling out the database for growth. The other important thing she learned was a key skill for being a successful enterprise DBA: don't panic.

Comments

Plain text