Without a doubt, the single biggest problem with SSD storage is that the devices wear out over time. Each time that data is written to a NAND cell, the cell is slightly degraded. Given enough write operations, a NAND cell will eventually fail. With that said, there is no universal consensus on when an SSD will likely fail. Examining the realistic life expectancy of SSD storage will help organizations determine when the storage format does--and doesn't--make sense.
Most people would agree that SSDs are reliable enough for mainstream use, but I have on occasion read blog posts claiming that today’s SSDs are far more durable than even the best HDD. At the same time, however, I have met at least two or three people in the last year who have told me that they would not use an SSD because of its lack of durability. I can understand why people have such differing opinions on SSD storage durability. After all, there are different classes of drives, with differing characteristics. An SLC disk, for example, will typically have far better durability than a QLC disk. Similarly, the way that a disk is used plays a huge role in its durability. Write-intensive applications will degrade an SSD far more quickly than an application that performs only occasional writes.
The first step in determining how long an SSD will last is to accept that the number of program/erase cycles the device is subjected to is usually what determines its lifespan. With that in mind, manufacturers provide a few metrics that can help you to predict a drive’s longevity. Even so, it is impossible to know with any certainty when a disk will fail.
Two of the more important factors to examine are the total terabytes written over time and the drive writes per day. The disk manufacturer publishes an estimate of the total number of terabytes that can be written to the drive over its lifetime. For example, the computer that I am using right now contains a 1 TB Western Digital SSD (WDS100T2BOA). According to the manufacturer, the disk has a Terabytes Written (TBW) rating of 500. This means that over the life of the drive, I can expect to be able to write about 500 TB to it, which on a 1 TB drive works out to about 500 full-drive writes. This does not necessarily mean that each of the disk’s NAND cells supports 500 program/erase cycles. Most SSD manufacturers include extra NAND cells on the disk. These extra cells can take over for cells that are heavily worn.
The drive writes per day (DWPD) value is an estimate based on how you use the drive. For example, if you are writing 2 TB per day to a 1 TB drive, then you are performing about two drive writes per day. So, if this particular disk supports a total of 500 TBW and you are writing about 2 TB to it per day, then the drive might be expected to last roughly eight months (500 TBW / 2 TB per day = 250 days).
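The arithmetic above is simple enough to capture in a few lines of Python. The following is only a rough sketch: the function name and parameters are my own, made up for illustration, and the calculation assumes that the daily write volume stays constant and that the drive lasts exactly as long as its TBW rating suggests, neither of which is guaranteed in practice.

```python
def estimated_lifetime_days(tbw_rating_tb: float, capacity_tb: float, dwpd: float) -> float:
    """Estimate how many days it takes to exhaust a drive's TBW rating.

    tbw_rating_tb: the manufacturer's Terabytes Written rating, in TB
    capacity_tb:   the drive's capacity, in TB
    dwpd:          drive writes per day for your workload (an estimate)
    """
    if capacity_tb <= 0 or dwpd <= 0:
        raise ValueError("capacity_tb and dwpd must be greater than zero")
    # One drive write equals the full capacity of the drive.
    tb_written_per_day = capacity_tb * dwpd
    return tbw_rating_tb / tb_written_per_day


# The example from the text: a 1 TB drive rated for 500 TBW at two drive writes per day.
days = estimated_lifetime_days(tbw_rating_tb=500, capacity_tb=1, dwpd=2)
print(f"Estimated lifetime: {days:.0f} days")
```

Plugging in a lighter workload, such as 0.1 drive writes per day, shows how strongly the estimate depends on how the disk is used.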
Keep in mind that this is an estimate, not an exact value. For one thing, cell durability is usually expressed as an expected range. A TLC disk with 3D NAND, for example, is expected to sustain anywhere from 1,500 to 3,000 write cycles. Another reason a disk’s calculated longevity should be treated as a rough estimate is that disks are rarely used evenly. Even with wear leveling, there may be some areas of the disk that are written to more frequently than others.
Another factor to consider is the disk’s Mean Time to Failure rating. The Mean Time to Failure is a value provided by the manufacturer based on the results of endurance testing. The Western Digital drive that I mentioned earlier, for example, has a Mean Time to Failure rating of 1.75 million hours.
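It is worth noting that this rating does not mean that the disk’s average life expectancy is roughly 200 years (1,750,000 hours). Instead, Mean Time to Failure is defined as “the number of total hours of service of all devices divided by the number of devices.” That definition assumes every device runs until it fails. In a test of limited duration, the total hours of service are divided by the number of devices that failed during the test.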
The Mean Time to Failure takes into account the total number of devices being tested, the number of hours that the test lasted and the number of devices that failed. I have no idea what the specifics were for the Western Digital tests, but here is an example of where the 1.75 million hour Mean Time to Failure value could have come from.
Let’s suppose for a moment that Western Digital decided to test 17,500 SSDs, and that the test lasted for 5,000 hours (which is roughly seven months). Let’s also suppose that during that test, 50 drives failed. Here is how the math would work out:
(17,500 drives * 5,000 hours of testing each) / 50 failures = 1,750,000 hours
This calculation would yield a Mean Time to Failure value of 1.75 million hours. So, as you can see, the Mean Time to Failure value is not a direct indication of how long you should expect an SSD to last. Even so, disks with a higher Mean Time to Failure value are probably more reliable than those with a lower value.
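The same hypothetical test can be worked through in Python. This is only a sketch of the arithmetic shown above, not Western Digital's actual methodology, and the function and parameter names are invented for illustration. It assumes every drive ran for the full duration of the test.

```python
def mean_time_to_failure_hours(devices_tested: int, hours_per_device: float, failures: int) -> float:
    """Compute Mean Time to Failure as total device-hours divided by failures.

    Assumes each device accumulated the full test duration, as in the
    hypothetical example in the text.
    """
    if failures <= 0:
        raise ValueError("failures must be greater than zero")
    total_device_hours = devices_tested * hours_per_device
    return total_device_hours / failures


# Hypothetical figures from the text: 17,500 drives, 5,000 hours each, 50 failures.
mttf = mean_time_to_failure_hours(devices_tested=17_500, hours_per_device=5_000, failures=50)
print(f"Mean Time to Failure: {mttf:,.0f} hours")
```

Notice that no single drive in this scenario ran for anywhere near 1.75 million hours. The figure is a statistic about the whole population of drives under test.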
Ultimately, there is no formula that you can use to determine exactly how long an SSD will last. Even so, the disk’s TBW and Mean Time to Failure ratings should give you some idea of what to expect.