If It's Instant, It Isn't A Backup

Pretty much every storage vendor supports the concept of snapshots.  Vendors often present these storage snapshots to companies as a way to take SQL Server backups without having to worry about eating large amounts of space.  Some companies will even use them instead of tape backups.  Because of the way these snapshots work, they do not make reliable backups.

Related: Should the disks that you are backing up be aligned?

Remember for a moment what causes the large majority of database corruption: a problem with the storage, where the array either writes a block to the disk incorrectly or a physical disk fails and the data is lost.  Snapshots use a technique called “copy on first write”: the first time the disk array writes to a block after the snapshot is taken, the original block is copied to another location and stored as part of the snapshot.  The snapshot starts out containing no blocks at all, and only ever holds the original versions of blocks that have since been overwritten, so it isn’t a full copy of the volume and therefore isn’t a full copy of the database.  Because the snapshot isn’t a full copy of the disks, it can’t be trusted to be a valid backup.  If the source disks were to fail and the data on the production disk (where the database is stored) was lost, the snapshots would be useless, as the blocks which hadn’t been written to yet would be lost along with the production disk.
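To make the copy on first write behavior concrete, here is a minimal sketch in Python. It is a toy model, not any vendor's implementation: the `Volume` and `Snapshot` classes, the four single-letter blocks, and the use of `None` to mark a lost block are all assumptions made for illustration.

```python
class Volume:
    """A toy production volume: a list of blocks, where None marks a lost block."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def write(self, index, data):
        # Copy on first write: before overwriting, preserve the original
        # block in every snapshot that has not already saved this block.
        for snap in self.snapshots:
            if index not in snap.saved:
                snap.saved[index] = self.blocks[index]
        self.blocks[index] = data

    def fail(self):
        # Simulate losing the production disks entirely.
        self.blocks = [None] * len(self.blocks)


class Snapshot:
    """A toy snapshot: holds only the blocks that changed after it was taken."""

    def __init__(self, volume):
        self.volume = volume
        self.saved = {}  # starts out empty
        volume.snapshots.append(self)

    def read(self, index):
        # Changed blocks come from the snapshot's own store; unchanged
        # blocks must still be read from the production volume.
        if index in self.saved:
            return self.saved[index]
        return self.volume.blocks[index]


vol = Volume(["a", "b", "c", "d"])
snap = Snapshot(vol)
vol.write(1, "B")  # only block 1 is copied into the snapshot

before = [snap.read(i) for i in range(4)]  # full point-in-time view
vol.fail()                                 # production disks are lost
after = [snap.read(i) for i in range(4)]   # only the copied block survives

print(before)  # ['a', 'b', 'c', 'd']
print(after)   # [None, 'b', None, None]
```

While the production volume is healthy the snapshot can present a complete view of the data, but once the volume is gone only the one block that happened to be overwritten can still be read.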

The same would apply if a block which hasn’t been written to before, and therefore does not exist in the snapshot, were to become corrupt, whether through a cluster on the disk going bad or the array writing data to the disks incorrectly.  That block wouldn’t be available to restore from, and the backup would not be valid.

This risk can be reduced, but not eliminated, by synchronously replicating the LUN we want to snapshot to another storage array, then taking the snapshots on that replica array.  This, however, may not completely protect you.  If the production array (the one that the SQL Server is talking to directly) were to write a bad block, that bad write could be replicated to the remote site.
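The replication caveat can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the `replicate_write` function is hypothetical, each array is modeled as a plain list of blocks, and the string `"garbage"` stands in for a bad write.

```python
def replicate_write(primary, replica, index, data):
    """Synchronous replication: the same data lands on both arrays."""
    primary[index] = data
    replica[index] = data


primary = ["a", "b", "c"]
replica = list(primary)

# The production array writes a bad block; replication copies it faithfully.
replicate_write(primary, replica, 0, "garbage")

replica_has_good_copy = replica[0] == "a"
print(replica)                # ['garbage', 'b', 'c']
print(replica_has_good_copy)  # False
```

Replication protects against losing the production array, but it has no way to tell a bad write from a good one.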

The next problem with only using snapshots for backup is that you only have a finite amount of space which can be used for backups, as SAN storage is very expensive.  In the long run tape is much cheaper for long term storage, as the tapes can be sent offsite and kept effectively forever.  Having the same data retention policy on active storage simply isn’t a cost-effective solution for most people, because each snapshot can eventually grow to the same size as your source disk.  For a 100 Gig database with snapshots taken daily and one year’s worth of snapshots kept, you can end up with tens of terabytes of space tied up in the snapshot backups, which could cost hundreds of thousands of dollars in disk space.  Tapes, on the other hand, are very inexpensive and can hold hundreds of gigs each, and they can be sent offsite to another location (usually via an offsite storage company) for long term safe keeping.
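The space arithmetic is easy to check with a short Python sketch. The 100 GB database and 365 daily snapshots come from the example above; the worst case assumes every snapshot eventually grows to the full size of the source disk. The 400 GB tape capacity is an assumed figure standing in for "hundreds of gigs per tape", and both function names are made up for illustration.

```python
import math


def worst_case_snapshot_space_gb(db_size_gb, snapshots_kept):
    # Worst case: every retained snapshot has grown to the size of the source.
    return db_size_gb * snapshots_kept


def tapes_needed(total_gb, tape_capacity_gb):
    # Round up, since a partly filled tape is still a whole tape.
    return math.ceil(total_gb / tape_capacity_gb)


space_gb = worst_case_snapshot_space_gb(100, 365)
space_tb = space_gb / 1000
tapes = tapes_needed(space_gb, 400)  # 400 GB per tape is an assumption

print(space_gb)  # 36500
print(space_tb)  # 36.5
print(tapes)     # 92
```

Under those assumptions a year of daily snapshots could tie up 36.5 TB of SAN storage, while the same volume of full backups would fit on fewer than a hundred tapes.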
