Storage Soup

Nov 20 2008, 3:11 PM GMT

Storage vendors debate Flash as cache

By Beth Pariseau

By now it’s clear that all major storage vendors will support flash in their systems. But the debate rages on over whether flash is better used as cache or as persistent storage.

Earlier this month, NetApp revealed plans to support solid-state flash drives as both cache and persistent storage in its FAS systems beginning next year. The cache model will come first.

“We believe the optimal use case initially lies in cache,” says Patrick Rogers, NetApp VP of solutions marketing. NetApp has developed wear-leveling algorithms that will be incorporated into its WAFL file system. WAFL’s awareness of access frequency and other block characteristics will allow it to use both DRAM and flash, with flash serving as the “victim cache” — a landing spot for blocks displaced from the primary DRAM cache.
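To make the victim-cache idea concrete, here’s a minimal sketch of the general technique in Python: a small DRAM tier backed by a larger flash tier that catches the DRAM tier’s evictions. This is a toy model under assumptions of my own, not NetApp’s WAFL code, and all of the names in it are hypothetical.

    from collections import OrderedDict

    class VictimCache:
        # Toy two-tier read cache: blocks evicted from a small DRAM tier
        # land in a larger flash "victim" tier instead of being discarded.
        # A sketch of the general technique only, not NetApp's WAFL code.
        def __init__(self, dram_blocks, flash_blocks):
            self.dram = OrderedDict()   # hot tier, strict LRU order
            self.flash = OrderedDict()  # victim tier for DRAM evictions
            self.dram_cap = dram_blocks
            self.flash_cap = flash_blocks

        def read(self, block_id, fetch_from_disk):
            if block_id in self.dram:             # DRAM hit
                self.dram.move_to_end(block_id)
                return self.dram[block_id]
            if block_id in self.flash:            # flash hit: promote back to DRAM
                data = self.flash.pop(block_id)
            else:                                 # miss on both tiers: go to disk
                data = fetch_from_disk(block_id)
            self._insert_dram(block_id, data)
            return data

        def _insert_dram(self, block_id, data):
            if len(self.dram) >= self.dram_cap:
                victim_id, victim = self.dram.popitem(last=False)
                if len(self.flash) >= self.flash_cap:
                    self.flash.popitem(last=False)    # drop the coldest victim
                self.flash[victim_id] = victim        # demote rather than discard
            self.dram[block_id] = data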

Why not just use DRAM? “If you have a very large amount of data, and you can’t accommodate it entirely in [DRAM] cache, flash offers much higher capacities,” Rogers says.

EMC’s Barry Burke responded about a week later with a post on his blog, The Storage Anarchist, asking some detailed questions about Flash as cache. To wit:

  1. What read hit ratios and repetitive reads of a block are required to overcome the NAND write penalty?
  2. How will accelerated cell wear-out be avoided for NAND-based caches?
  3. What would be required to use NAND flash as a write cache – do you have to implement some form of external data integrity verification and a means to recover from a damaged block (e.g., mirroring writes to separate NAND devices, etc.)?
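Burke’s first question is at bottom an arithmetic one, and it’s worth roughing out. The latencies below are illustrative assumptions of mine, not measured figures for any vendor’s hardware:

    # How many re-reads must a cached block attract before flash caching
    # pays back the cost of writing it into NAND? Illustrative numbers only.
    disk_read_us = 5000     # assume ~5 ms for a random HDD read
    flash_read_us = 200     # assume a NAND page read plus controller overhead
    flash_write_us = 800    # assume the NAND program cost to populate the cache

    saving_per_hit_us = disk_read_us - flash_read_us    # time saved by each hit
    breakeven_hits = flash_write_us / saving_per_hit_us
    print(f"hits needed per cached block: {breakeven_hits:.2f}")    # ~0.17

With numbers like these, even a block that is re-read only once comfortably earns its keep; narrow the disk-to-flash latency gap, or raise the write cost, and the required hit ratio climbs.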

I asked Burke to answer his own questions when it came to Flash as persistent storage, which is EMC’s preference so far. He answered me in an email:

  1. Overcoming the Write penalty – not an issue, because storage arrays generally always buffer writes, notify the host that the I/O is completed and then destage the writes to the flash drives asynchronously. Plus, unlike a cache, the data doesn’t have to be read off of disk first – all I/O’s can basically be a single direct I/O to flash: read what you need, write what’s changed. As such, reads aren’t deferred by writes – they can be asynchronously scheduled by the array based on demand and response time.
  2. Accelerated Wear-out – not an issue, for as I noted, the write speed is limited by the interface or the device itself, and the drives are internally optimized with enough spare capacity to ensure a predictable lifespan given the known maximum write rate. Also, as a storage device, every write into flash is required/necessary, whereas with flash [as cache], there likely will be many writes that are never leveraged as a cache hit – cache will always be written to more than physical storage (almost by definition).
  3. Data Integrity – again, not an issue, at least not with the enterprise drives we are using. This is one of the key areas that EMC and STEC collaborated on, for example, to ensure that there is end-to-end data integrity verification. Many flash drives don’t have this level of protection yet, and it is not inherent to the flash technology itself. So anyone implementing flash-as-cache has to add this integrity detection and recovery or run the risk of undetected data corruption.
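Burke’s second answer – that spare capacity plus a bounded write rate yields a predictable lifespan – is easy to sanity-check with rough numbers. Everything below is an illustrative assumption on my part, not a STEC or EMC specification:

    # Back-of-the-envelope endurance math for a flash drive used as storage.
    usable_gb = 73              # assumed advertised capacity
    spare_factor = 1.3          # assumed extra raw NAND reserved for wear leveling
    pe_cycles = 100_000         # assumed program/erase cycles per cell (SLC-era)
    write_amplification = 1.5   # assumed internal writes per host write
    host_write_mb_s = 80        # assumed interface-limited sustained write rate

    raw_bytes = usable_gb * spare_factor * 1e9
    write_budget = raw_bytes * pe_cycles / write_amplification
    seconds = write_budget / (host_write_mb_s * 1e6)
    print(f"lifespan writing flat-out 24x7: {seconds / (3600 * 24 * 365):.1f} years")

That works out to roughly two and a half years of writing at the worst-case rate around the clock; real workloads write far less, which is how a vendor can promise a multi-year service life.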

I also asked NetApp for a response. So far there has been no formal response to Burke’s specific questions, but several NetApp blog posts address the company’s plans for flash deployments, and one links to a white paper with more specifics.

For the first question, according to the white paper, “Like SSDs, read caching offers the most benefit for applications with a lot of small, random read I/Os. Once a cache is populated, it can substantially decrease the average response time for read operations and reduce the total number of HDDs needed to meet a given I/O requirement.”

Not as specific an answer as you could hope for, but it’s a start. NetApp also appears to have a tool called Predictive Cache Statistics (PCS) that customers can use to determine which applications in their environment might benefit from flash as cache.
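NetApp hasn’t published how PCS works internally, but one standard way to predict hit ratios for a cache you haven’t bought yet is a “ghost cache”: replay the block access trace against an LRU directory that stores only block IDs, no data. The sketch below shows that general idea; whether PCS does anything like this is purely an assumption on my part.

    from collections import OrderedDict

    def ghost_cache_hit_ratio(block_trace, cache_blocks):
        # Estimate the read hit ratio a cache of a given size would achieve
        # by replaying a trace against an ID-only LRU directory. A generic
        # cache-sizing technique, not a description of PCS itself.
        ghost = OrderedDict()
        hits = 0
        for block_id in block_trace:
            if block_id in ghost:
                hits += 1
                ghost.move_to_end(block_id)
            else:
                if len(ghost) >= cache_blocks:
                    ghost.popitem(last=False)
                ghost[block_id] = None
        return hits / len(block_trace) if block_trace else 0.0

    # Replay the same trace against several candidate cache sizes:
    trace = [1, 2, 3, 1, 2, 4, 1, 5, 2, 1] * 100
    for size in (2, 4, 8):
        print(size, round(ghost_cache_hit_ratio(trace, size), 3))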

As to the second question, according to the white paper, “NetApp has pioneered caching architectures to accelerate NFS storage performance with its FlexCache software and storage acceleration appliances. FlexCache eliminates storage bottlenecks without requiring additional administrative overhead for data placement.”

Another precinct was heard from in the vendor blogosphere on these topics, via a comment on Chuck Hollis’s blog late last week. With regard to the write penalty, Fusion-io CTO David Flynn argued that the bandwidth problem can be compensated for with parallelism, i.e., using an array of NAND chips in a flash device.

He added:

Latency, on the other hand, cannot be “fixed” by parallelism. However, in a caching scheme, the latency differential between two tiers is compensated for by choice of the correct access size. While DRAM is accessed in cache lines (32 bytes if I remember correctly), something that runs at 100 times higher latency would need to be accessed in chunks 100 times larger (say around 4KB).

Curiously enough, the demand page loading virtual memory systems that were designed into OS’s decades ago do indeed use 4KB pages. That’s because they were designed in a day when memory and disk were only about a factor of 100 off in access latency – right where NAND and DRAM are today.
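Flynn’s rule of thumb reduces to one line of arithmetic. Taking his figures at face value (most modern CPUs actually use 64-byte cache lines, but the principle is unchanged):

    # Flynn's sizing argument with his own round numbers, all illustrative:
    # to hide a ~100x latency gap between tiers, fetch chunks ~100x larger.
    dram_line_bytes = 32   # the cache-line size Flynn cites
    latency_ratio = 100    # assumed NAND-to-DRAM latency gap, order of magnitude
    print(dram_line_bytes * latency_ratio)   # 3200 bytes, on the order of a 4KB page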

This is an extension of the debate that has been going on all year about the proper place for solid-state media. Server vendors such as Hewlett-Packard argue that flash used as persistent storage behind a controller puts the network bottleneck between the application and the speedy drives, defeating the purpose of adding them to boost performance. And round and round we go… at this rate, the discussion could outlast the disk vs. tape argument.

2 Comments on this Post

 
  • The discussion can actually be rather short if conducted by people without an agenda who know what they’re talking about. As usual, EMC’s spin machine is in high gear; as usual, NetApp is right on target technically. The answers to Barry’s questions:

    1. NAND write performance is irrelevant for flash used as a NetApp read cache, because NetApp’s architecture allows it to cache reads initially in its stable DRAM cache and then destage them to flash as convenient, freeing the DRAM cache for new entries. (That’s likely what NetApp meant when it characterized the flash as a ‘victim’ cache, though its choice of victims might not be the conventional one if, for example, it decided to evict the data *most* likely to be read again in order to maximize flash longevity by minimizing the flow through it.) Interestingly, leveraging the existing stable DRAM cache was Barry’s own answer when applied to EMC’s use of flash for drives rather than cache – but not only did he ignore its applicability to NetApp’s case, he also ignored the fact that his original question seemed to be about *read* caching, which EMC’s drive-only flash approach does not support at all with flash: if you want the benefits of flash-level access performance from EMC, *all* the data that needs it must reside on its flash drives.

    2. I just offered one hypothetical way that flash cache wear-out might be alleviated above. Beyond that, the victims can be destaged to flash cache in large contiguous blocks (in a manner very similar to that used by NetApp to write data efficiently to disk – see Patrick’s reference to wear-leveling algorithms above), further minimizing flash cache write activity. And for a read cache, the worst that can happen if a portion of the flash *does* wear out is that the data must then be fetched from disk, since the NetApp architecture is already set up to detect any such corruption even if the flash itself does not. Contrast this with conventional approaches such as EMC’s, in which data must be destaged in chunks far smaller than the block-erase size (at least in the kinds of small-request-intensive environments under discussion here) to specific logical locations in the flash, relying wholly upon the flash’s internal wear-leveling algorithms to mitigate the resulting heavy erase/rewrite activity. In some situations, this could result in faster wear-out for the backing flash storage than NetApp’s use of it as a cache would.

    3. It’s not clear why NetApp would want to use flash as a write-back cache, given how well its existing DRAM cache plus bulk-write technology already handles writes. It would be interesting, however, to know exactly how EMC professes to deal with the problem of data integrity in its flash *drives*, since it’s difficult to see how it can guard against the kinds of problems Barry envisions for write-back flash cache use without performing the same kind of mirroring and read-after-write verification that would be required of a flash write-back cache.

    If one doesn’t take Barry’s comments (as you apparently did above) as being specifically directed at NetApp’s use of flash cache, one can perhaps be at least slightly more forgiving of them: they could (though it’s hardly clear that they do) apply to some other vendor’s use of flash as cache if it were implemented sloppily. (Indeed, that’s a specialty of EMC spin: hypothesize the worst and let the opposition defend against it rather than make any attempt to evaluate competing products on their actual merits.) However, the assumption that *anyone* will treat flash as an extension of ECC RAM, rather than as a caching tier that needs to be treated with the same skepticism as other storage tiers (even if it’s directly connected to the system bus rather than out in the storage stack), seems pretty far-fetched to me.

    Flash only modifies the existing set of trade-offs between price and performance in storage. Just as DRAM SSDs have had an enterprise niche for years, flash drives will have one for applications that need all the performance they can get at almost any price (remove that ‘almost’ and they’ll still use DRAM SSDs). Just as large amounts of DRAM have served effectively as (usually read) cache, so will even larger amounts of flash allow more mainstream applications to leverage the performance advantages of cache without the cost of DRAM (though where DRAM can be shared flexibly between caching and other system uses, it has added value for its higher cost). As long as flash costs over an order of magnitude more per GB than disk storage, only a minority of back-end storage applications will exist for flash in large quantities (though in small installations, where its cost is a relatively small proportion of the total, it should enjoy increasing acceptance). So NetApp’s use of it for cache during this period (however long it may last) makes perfect sense. – bill
  • Beth, hope you don’t mind if we add to the SSD debate, because there’s a cost factor in addition to the performance issue. By pairing SSDs in the drive capacity with automated tiered storage, users only purchase the hardware they need to house their active data. They get better performance and utilization benefits than by using SSDs for cache. Automated tiered storage increases performance by keeping frequently accessed blocks of data on tier 0 SSD storage and moving less frequently used blocks to less expensive tiers like SATA. Without automated tiered storage, users would be required to purchase SSDs for entire volumes, driving up their costs without fully taking advantage of the performance benefits.
