I've previously mentioned the tremendous amount of information available about SQL Server's features, functionalities, and best practices. However, there are SQL Server features for which there’s little information available. For example, as I was browsing through the January SQL Server 2008 Books Online (BOL) update I stumbled across a new stored procedure. I thought it was strange that a new stored procedure would show up in a documentation refresh without a corresponding service pack, but then I realized the procedure had been in the product since SQL Server 2008's release to manufacturing (RTM) and Microsoft was just now documenting it.
There are really two related procedures I’m talking about: sp_clean_db_free_space and sp_clean_db_file_free_space. Actually, the first procedure (sp_clean_db_free_space) just calls the second procedure (sp_clean_db_file_free_space) for every file in a database, as you can see if you look at the definition of the first procedure (i.e., EXEC sp_helptext ‘sys.sp_clean_db_free_space’). If you don't see these stored procedures in your version of BOL, make sure you have updated to the latest version at www.microsoft.com/downloads/details.aspx?familyid=765433F7-0983-4D7A-B628-0A98145BCB97&displaylang=en or look at the online documentation about these two procedures at http://msdn.microsoft.com/en-us/library/dd408732.aspx.
So what gets cleaned when you clean a database or a database file? BOL states that these procedures "Remove residual information left on database pages because of data modification routines in SQL Server." The documentation is referring to the fact that SQL Server frequently won't physically remove data from a page when you delete a row or update a row causing it to move. The old data isn't visible using SELECT operations, but it’s visible using operations such as DBCC PAGE. If the data is sensitive, the fact the data is visible using certain operations could be a security concern. SQL Server will typically run a background thread that cleans up the data left on the pages because of your data modification operations, but the sp_clean_db_* procedures let you force it to happen more quickly.
It turns out that there's another benefit of using these procedures. SQL Server 2005 introduced a new feature called fast file initialization (or instant file initialization) that lets SQL Server create new databases or add new files to existing databases quickly because it doesn't go through the entire file and zero out all the pages. Fast file initialization is the default behavior, so if you don't want SQL Server to zero out all the pages you have to jump through a few hoops. For details about what those hoops are, and to see an awesome example of why fast file initialization isn't always a good thing, take a look at Kimberly Tripp's blog post "Instant Initialization - What, Why and How?" So if you created a database with fast file initialization and then realized that wasn't what you wanted, you can run the sp_clean_db_free_space procedure.
It turns out that these procedures are available in SQL Server 2005, they just aren't documented. You can verify their existence using the following code:
select * from sys.system_objects where name like 'sp_clean_db%'
These procedures are also available in SQL Server 2000, even though it doesn't offer fast file initialization, because there was the problem of SQL Server 2000 not cleaning up page space after a data modification operation. It's likely these procedures are in SQL Server 7.0 as well, but I don't have that version easily available for verification right now.
Sometimes cleaning up can help us find useful things we didn't know we had, but here's a useful tool that will help with cleaning. By taking a few minutes to peruse BOL or the metadata list of supplied procedures, views, and functions, you can discover hidden treasures on your own!