[Design View / Design Solution]
Overcome Barriers To Broad-Based SSD Adoption In The Enterprise
Designers must solve poor write endurance and write performance, high error rates, and security issues before enterprise server and laptop makers embrace solid-state drives on a wide scale.
At first glance, solid-state drives (SSDs) appear to be a no-brainer for makers of storage systems for enterprise servers and laptops. After all, SSDs promise higher read/write performance, higher reliability, and lower power consumption compared to hard-disk drives (HDDs). But in practice, SSD adoption has been held back not only by a higher cost per gigabyte, but also by real-world issues that prevent them from achieving their performance and reliability promises.
Last year saw a proliferation of SSD product announcements for enterprise-class servers and laptops, beginning with 32- and 64-Gbyte devices. In servers, we’ve seen SSDs used in so-called “tiered storage” systems, in which the SSD acts as a higher-speed intermediary between system RAM and HDD storage. Fujitsu and other vendors have also begun to use SSDs in enterprise-class laptops, touting their ruggedness and higher read performance. With SSDs now in the marketplace, two key trends have emerged:
Declining costs: As NAND flash manufacturers have continued to advance technology and densities, the price per gigabyte of NAND flash has dropped approximately three orders of magnitude over the last decade, from thousands of dollars in the 2000 timeframe to today’s commodity price levels of around $1 per gigabyte (for MLC-based, or multi-level cell, technologies). Continued price declines are expected for years to come.
Increasing performance: At the same time, advances in flash memory as well as techniques such as DRAM caching are driving input/output operations per second (IOPS) higher, with today’s fastest SSDs sporting tens of thousands of read IOPS.
SSD CHALLENGES
Despite these advances, most analysts predict a very slow ramp-up toward broad-based adoption of SSDs in enterprise-class servers and laptops. One key reason is the relatively high cost per gigabyte for SSDs (compared with HDDs). Today’s SSDs mainly use single-level cell (SLC) memory due to its higher life expectancy and reliability.
The cost of SLC memory is roughly four times that of MLC memory due to two factors. First, MLC memory stores two bits per cell and therefore provides twice the storage per square millimeter of silicon (the main cost of the memory). Second, MLC accounts for roughly 90% of all NAND flash volume, further increasing the economies of scale in its production. Unfortunately, MLC flash memory isn’t yet deemed reliable or durable enough for widespread enterprise use.
Nevertheless, MLC flash is clearly the way forward due to its ability to rapidly reduce the cost per gigabyte. Still, several challenges must be overcome when using MLC flash in its current implementation.
For example, MLC flash offers poor write endurance. NAND flash memory can only be written a certain number of times to each block (or cell). SLC memory generally sustains 100,000 program/erase (P/E) cycles, while MLC memory generally sustains 10 times fewer, at about 10,000 cycles. Once a block (or cell) is written to its limit, the block starts to forget what is stored.
Today’s SSDs are different from HDDs when it comes to data storage. HDDs can take the data directly from the host and write it to the rotating media. In contrast, SSDs can’t write a single bit of information without first erasing and then rewriting very large blocks of data at one time (also referred to as P/E). In addition, to maximize the life of the flash memory, a technique to level the wear across all blocks equally forces the SSD controller to constantly move data around on the flash memory.
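The erase-before-write constraint can be illustrated with a toy model. The sketch below is an assumption for illustration only, not how any particular controller works: the class name, the 64-page block geometry, and the naive in-place update are all hypothetical. It counts how many pages are physically programmed when the host changes a single page.

```python
# Toy model (an assumption, not a real controller): a flash block must be
# erased as a whole before any page in it can be rewritten.

PAGES_PER_BLOCK = 64  # hypothetical block geometry


class FlashBlock:
    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK
        self.erase_count = 0       # P/E cycles consumed by this block
        self.pages_programmed = 0  # pages physically written to flash

    def update_page(self, index, data):
        """Naive in-place update: copy the block, erase it, program it back."""
        snapshot = list(self.pages)
        snapshot[index] = data
        self.pages = [None] * PAGES_PER_BLOCK  # erase the whole block
        self.erase_count += 1
        self.pages = snapshot                  # program every valid page again
        self.pages_programmed += sum(p is not None for p in snapshot)


block = FlashBlock()
for i in range(PAGES_PER_BLOCK):
    block.pages[i] = f"old-{i}"  # pre-existing data, not counted as new writes

host_pages_written = 1
block.update_page(3, "new")

# Ratio of pages physically programmed to pages the host asked to write.
write_amplification = block.pages_programmed / host_pages_written
print(block.erase_count, block.pages_programmed, write_amplification)
```

In this deliberately pessimistic model, a one-page change costs a full block erase and a full block of programming. Real controllers typically redirect updates to already-erased pages and reclaim stale ones later, so actual ratios depend heavily on the controller design and workload.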
These factors and other differences from HDDs give rise to write amplification, which can rise to a factor of 100 times the amount of user data actually being stored. Consequently, these factors also limit the life expectancy of the SSD. Figure 1 shows the basic life expectancy formula that affects all SSDs. Figure 2 shows the details of the formula. A typical MLC drive might have the characteristics shown in Figure 3, where:
Capacity = 128 Gbytes
P/E cycles = 10,000
Write speed from the host = 125 Mbytes/s
Duty cycle (when the drive is accessed for reads or writes) = 40% of the time
Read:write ratio (percentage of time an access to the drive is a write, versus a read) = 33% of the time
Write amplification (assuming a conservative number) = 40
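The figures themselves are not reproduced in this text, so the following is only a sketch of one plausible reading of the life expectancy formula: total data the flash can absorb (capacity times P/E cycles) divided by the average rate at which data lands on the flash (host write speed times duty cycle times write fraction times write amplification). The function name, the perfect-wear-leveling assumption, and the 1 Gbyte = 1024 Mbytes convention are assumptions, not taken from the figures.

```python
# Hypothetical reconstruction of the life-expectancy estimate. The formula is
# an assumption inferred from the listed parameters, not copied from the figures.

MB_PER_GB = 1024         # assumption: 1 Gbyte = 1024 Mbytes
SECONDS_PER_DAY = 86_400


def life_expectancy_days(capacity_gb, pe_cycles, host_write_mb_s,
                         duty_cycle, write_fraction, write_amplification):
    """Days until every block has used up its rated P/E cycles."""
    # Total data the flash can absorb, assuming wear is spread perfectly evenly.
    total_flash_writes_mb = capacity_gb * MB_PER_GB * pe_cycles
    # Average rate at which data is physically written to the flash.
    flash_write_mb_s = (host_write_mb_s * duty_cycle * write_fraction
                        * write_amplification)
    return total_flash_writes_mb / flash_write_mb_s / SECONDS_PER_DAY


days = life_expectancy_days(
    capacity_gb=128,
    pe_cycles=10_000,
    host_write_mb_s=125,
    duty_cycle=0.40,
    write_fraction=0.33,
    write_amplification=40,
)
print(f"{days:.1f} days")
```

With the parameters listed above, this reading gives a result of roughly 23 days.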
Clearly, a life expectancy of about 23 days is too short to deploy in an enterprise environment. To overcome the endurance problem, SSD manufacturers use one or more of these five techniques:
Combining MLC and SLC flash on the same device, which extends endurance by storing more active data on the higher-endurance SLC memory, while still lowering the total cost by using some MLC memory.
Over-provisioning, which extends endurance by making more flash available. For example, an SSD with twice as much actual storage as its stated capacity would have twice the endurance of a drive in which flash and capacity had a 1:1 ratio (no over-provisioning). Of course, this over-provisioning would also double the cost.
DRAM caches, which extend endurance by aggregating some writes before sending them to the flash memory and by handling other housekeeping in DRAM (rather than in the flash memory). Naturally, the DRAM also adds costs.
Daily write limitations, which extend the life of the drive by restricting the number of writes to the flash each day. For instance, one vendor’s warranty specifies a limit of 20 Gbytes per day written from the host, which can be reached in less than five minutes on that same drive.
Reduced warranties (less than five years), which account for lower endurance by simply reducing the guaranteed life of the drive.
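To see how over-provisioning and a daily write limit change the picture, the sketch below applies a simple endurance model (flash capacity times P/E cycles, divided by the volume physically written per day) to the example MLC drive. Several things here are assumptions: the model itself, the function names, applying the 20-Gbyte daily limit to the 128-Gbyte example drive, holding write amplification fixed at 40 even when extra flash is added, and the 1 Gbyte = 1024 Mbytes convention.

```python
# Hypothetical sketch of two techniques from the list above. The endurance
# model and all function names are assumptions made for illustration.

MB_PER_GB = 1024  # assumption: 1 Gbyte = 1024 Mbytes


def life_days_at_daily_volume(flash_gb, pe_cycles, host_gb_per_day,
                              write_amplification):
    """Days of life when the host writes a fixed volume every day."""
    total_flash_writes_gb = flash_gb * pe_cycles
    flash_gb_per_day = host_gb_per_day * write_amplification
    return total_flash_writes_gb / flash_gb_per_day


def seconds_to_reach_limit(limit_gb, host_write_mb_s):
    """Time to use up a daily write allowance at a sustained write speed."""
    return limit_gb * MB_PER_GB / host_write_mb_s


# Daily write limit of 20 Gbytes on the 128-Gbyte example drive.
capped_days = life_days_at_daily_volume(128, 10_000, 20, 40)

# Same limit, but with twice as much physical flash (2:1 over-provisioning).
overprovisioned_days = life_days_at_daily_volume(256, 10_000, 20, 40)

# How quickly the daily allowance is consumed at the full 125-Mbytes/s rate.
burst_seconds = seconds_to_reach_limit(20, 125)

print(f"capped: {capped_days / 365:.1f} years")
print(f"over-provisioned: {overprovisioned_days / 365:.1f} years")
print(f"limit reached after {burst_seconds / 60:.1f} minutes")
```

Under these assumptions, the daily limit stretches the drive's life from weeks to a few years, doubling the flash doubles it again, and the daily allowance is used up within a few minutes of sustained writing.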