When a Delete isn’t Really a Delete…

Q: When I delete lots of data from a table with a clustered index using a DELETE statement, the table size isn’t reduced like I would expect it to be. The DELETE transaction commits without errors. It seems like the space from the deleted records is freed up randomly. Can you explain this behavior?

A: What you’re experiencing is the mechanism SQL Server uses to delete records—it’s called ghosting. When a record in a table with a clustered index is deleted, it isn’t actually deleted right away. (As an aside, ghosting also happens for tables without clustered indexes when snapshot isolation is enabled, and it also happens for all non-clustered indexes.)

What happens is that the record is simply marked as deleted in the record header—the record becomes a “ghost record.” This is a performance optimization that allows DELETE transactions to roll back quickly.

Imagine if the DELETE operation physically removed the record and the space on the database page was reused. If the DELETE transaction rolled back, the rows would have to be inserted again in their entirety. By simply marking the record as deleted, a transaction rollback simply becomes the act of un-marking the records as deleted again.

This is obviously much more efficient, but what happens when the DELETE transaction commits, and the record(s) still exist? The answer is that they’ll hang around, taking up space in the table until they’re cleaned up by a background process called the ghost cleanup task. If the ghost cleanup task doesn’t remove the deleted records, the space they occupy won’t be freed up. This is what you’re seeing.

So when do the deleted records get removed? The act of deleting a record doesn’t put that record on the queue of work for the ghost cleanup task. It’s not until the database page on which the deleted record resides is read again by the SQL Server Storage Engine that the page will be queued up for definite ghost record removal.
The ghost cleanup task will look at its queue of things to do, and if it is empty, it will process part of a database on an instance to see if the database has ghost records to be removed. The only way to make sure that the space used by a ghost record is freed up quickly is to do something such as a table scan or an index rebuild so that all pages in the table are accessed by the Storage Engine and ghost records are queued for removal.

This two-step process can cause significant performance problems on some systems, so you might want to turn it off. You can do this using trace flag 661, but be aware that you’ll need to cause the removal of ghost records using ALTER INDEX REBUILD or ALTER INDEX REORGANIZE.

You can read about ghost records and the ghost cleanup task in more depth in my blog posts “Inside the Storage Engine: Ghost cleanup in depth” and “Ghost cleanup redux.”

Comments

Plain text