Fill Factor is a configuration setting applied to Microsoft SQL Server indexes that dictates how full an index page is going to be packed when initially created and periodically rebuilt. There are many factors that go into the value one should assign to this setting, and often it comes down to a solid educated guess. Considerations such as how frequently the underlying columns in the index change due to inserts, updates or deletes is the most critical. However, since most databases are created well before any real usage occurs--when you're probably relying on internal testing at best to provide insight into usage patterns--it's not uncommon to experience the negative consequences of an inaccurate Microsoft SQL Server Fill Factor setting when the database is “in the wild.”
While this setting is also configurable for each index independently, there is also a global setting for Fill Factor that can be found in the Server Properties dialog that is used when a SQL Server Fill Factor has not been explicitly set for an index. While it’s great that there is this over-arching configuration option that will (pardon the pun) “fill” in when there is no explicit value at the index level, it’s completely wrong to expect that there is a single value that is appropriate for each index in a database. Unfortunately, it’s quite common to find Fill Factor configured this way in most indexes because many novice and intermediate database administrators tend to use the configured value from the server settings as a crutch or catch-all because they’re unaware of the ramifications of an incorrect setting.
Benefits of a Good Fill Factor Setting
Setting a SQL Server Fill Factor that’s appropriate for the usage pattern and underlying configuration of a table and its columns leads to index pages that allow room for changes in the underlying columns through INSERT, UPDATE and DELETE requests submitted by end users. At the same time, an appropriate setting will not allow for too much room, which conflates the size of the index and leads to increased I/O activity because of the need to have the same amount of information spread across more index pages--all of which will need to be traversed when a query needs to be satisfied.
Any action that needs to be applied to data in Microsoft SQL Server needs to be done in memory. This means the pages for any index being used need to be read from disk into buffers in memory, processed and, if changes are made, written back to disk. If we consider an index with 500,000 pages and a Fill Factor of 100% (and each SQL Server index page being 8kb) a full scan of an index would require approximately 4gb of reads from disk and reservation in memory. If you don’t have enough RAM to support that read, then on top of that you'll encounter a spill to tempdb to process the query. Now, imagine that same index, but with a Fill Factor of 70%, which is not an uncommon value for Fill Factor. Now you’re looking at closer to 6gb. That’s significant even if the index is called infrequently. Now imagine that this is on your most-traversed table, and you can start to see how a single setting can have big ramifications on performance in your environment.
One Sure Bet Every Time for Fill Factor
When it comes to setting the SQL Server Fill Factor correctly for performance and underlying table configuration, there is at least one setting that you should implement every time and be on the watch for when it’s not configured properly. If you’ve got a single column index built on a column that only increments upward, such as an IDENTITY column or perhaps a datetime column that is used as a timestamp of some activity based upon the current system time, you should set the Fill Factor for the index at 100%. This ensures that you’re completely packing the index page optimally, since any new values are always going to be added to the end of the last index page in the index b-tree. Once that page is full, a new index page will be provisioned and start filling in sequence from there.
If you don’t assign a 100% value to the Fill Factor setting for such an index--and let’s use the previously mentioned example of a 70% global Fill Factor setting--the behavior will not change except that a new page will be provisioned when the last index page reaches 70% full or upon insertion of the next record into the index. This would guarantee wasting space on disk, increased IO and increased memory consumption, as explained earlier.
Identifying Candidates for 100% Fill Factor
The most common candidate indexes tend to be those built on IDENTITY columns. These are columns assigned the IDENTITY property, which will auto-populate a column starting with a seed value that is assigned when the table is originally created and increment at a rate also assigned upon table creation when a new record is added to the table. Take as an example the following table definition:
CREATE TABLE packed (ID INT IDENTITY(1,1), col1 CHAR(4) NOT NULL)
In this case, the ID column’s first value will be 1, and each time a new row is added to the table it will increment by 1. This construct is typical when using surrogate keys to uniquely identify a row in a table. If we create a clustered index on this column, we will want to completely pack each index page full as can be. There is no chance that the column values will fall out of order or the index become fragmented by the sheer nature of the table’s (and this column’s) definition. Using this as an example, the code block below will identify any single column indexes build on an IDENTITY column. The results return back a listing of those candidate indexes and their current Fill Factors. Note that in Microsoft SQL Server a value of 0% and 100% are identical and interchangeable.
Identify Single Column Indexes Built on IDENTITY Columns
Techoutbound SQL Cruise / B-Side Consulting
Use at Own Risk
SELECT object_name(I.[object_id], db_id()) AS [object_name]
, I.name AS index_name
, C.name AS column_name
, I.fill_factor AS current_fill_factor
FROM sys.indexes AS I
INNER JOIN sys.index_columns AS IC
ON I.[object_id] = IC.[object_id]
AND I.index_id = IC.index_id
INNER JOIN sys.all_columns AS C
ON IC.[object_id] = C.[object_id]
AND IC.column_id = C.column_id
/* These are single column indexes */
SELECT I1.[object_id], I1.index_id
FROM sys.index_columns AS IC1
INNER JOIN sys.indexes AS I1
ON I1.[object_id] = IC1.[object_id]
AND I1.index_id = IC1.index_id
INNER JOIN sys.objects AS O1
ON I1.[object_id] = O1.[object_id]
WHERE I1.index_id = 1
AND O1.is_ms_shipped = 0
GROUP BY I1.[object_id], I1.index_id
HAVING COUNT(I1.index_id) = 1
) AS clust_single_column
ON I.[object_id] = clust_single_column.[object_id]
AND I.index_id = clust_single_column.index_id
WHERE C.is_identity = 1
ORDER BY I.fill_factor, 1;
The results would look something like this with the candidate indexes listed first:
Using the above results, you could alter the database’s Fill Factor to 100 (aka 0) and rebuild the index. From that point forward it will use the explicitly set Fill Factor and remove any negative overhead that was the result of a poorly configured Fill Factor.