Gary Orenstein

VP of Products, Fusion-io

Speeding up Writes Using Read Caching

Posted: 10/19/2011

As caching with server-side flash becomes more prevalent, including in our own upcoming ioTurbine software, it is important to keep a few big-picture themes in mind.

Read caching offers a number of advantages such as:

  • Retaining existing disk infrastructure as-is;
  • Retaining a single source of truth by only persisting data in one location;
  • Simple scenarios for data-in-flight;

and…

  • Better write performance.

This last one is not always intuitive. When a read cache relieves the underlying storage system of having to serve reads, that system frees up considerable resources to handle writes. In many cases this avoids the need for write caching while retaining an elegant and simplified data path.

Consider a common 70% read, 30% write workload. If you remove the reads, the writes that previously got only 30% of the storage system's capacity can take up the full 100%, which roughly triples write performance (100/30 ≈ 3.3x). So, while read caching directly speeds up reads, it indirectly speeds up writes as well. And it does this without changing the "persistence model" of high availability, disaster recovery, backups, or any other data management task.
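The arithmetic behind that claim can be sketched in a few lines of Python. This is only a back-of-the-envelope model, not a description of how ioTurbine works: it assumes the array has a fixed IOPS budget and that a read costs the array the same as a write. The function name, the `cache_hit_rate` parameter, and the 6,000 IOPS figure are illustrative assumptions.

```python
def array_write_iops(array_iops, read_fraction, cache_hit_rate):
    """Write IOPS the array sustains when a read cache absorbs some reads.

    Assumes a fixed array IOPS budget and equal cost for reads and writes.
    """
    if not 0.0 <= read_fraction < 1.0:
        raise ValueError("read_fraction must be in [0, 1)")
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be in [0, 1]")
    write_fraction = 1.0 - read_fraction
    # Share of application I/Os that still reach the array:
    # all writes, plus the reads that miss the cache.
    array_share = write_fraction + read_fraction * (1.0 - cache_hit_rate)
    # Writes make up write_fraction / array_share of what the array serves.
    return array_iops * write_fraction / array_share


if __name__ == "__main__":
    ARRAY_IOPS = 6000  # illustrative budget for a mid-range array (assumption)
    before = array_write_iops(ARRAY_IOPS, 0.70, 0.0)  # no cache
    after = array_write_iops(ARRAY_IOPS, 0.70, 1.0)   # every read served from cache
    print(f"writes without cache:         {before:.0f} IOPS")
    print(f"writes with all reads cached: {after:.0f} IOPS")
    print(f"speedup:                      {after / before:.2f}x")
```

With a partial hit rate the gain is smaller: under the same assumptions, caching half of the reads leaves 65% of the I/Os on the array, so writes improve by about 1.5x rather than 3.3x.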

But with some storage arrays, the improvement on writes is even better than 3x. Systems like NetApp ingest data extremely well because they use a streaming-write, or log-structured, data layout when doing pure writes, so the write improvement can be as much as 10x. Here's why: if you remove the contention for the disk heads that random reads cause, the heads can stay in one place, streaming data continuously to the log. This results in very fast writes, because even random writes are sequentialized to the log and do not require disk head movement. Note that this is not the case for traditional performance-oriented arrays that use "update-in-place" writing methodologies, where the disk head moves for random writes.
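The head-movement argument above can be illustrated with a toy model. The sketch below is a deliberate simplification and not a model of any real array: it assumes a single disk head, measures cost purely as distance traveled in blocks, places the log at the start of the disk, and lets reads go to the block's original address. All names are hypothetical.

```python
import random


def seek_distance(ops, log_structured):
    """Total head travel, in blocks, on a toy single-head disk.

    ops is a list of ("r" or "w", block_address) pairs.
    """
    head = 0      # current head position
    log_tail = 0  # next free block of the append-only log
    travel = 0
    for kind, addr in ops:
        if kind == "w" and log_structured:
            target = log_tail  # the write is appended wherever the log ends
            log_tail += 1
        else:
            target = addr      # reads and in-place writes go to the block's address
        travel += abs(head - target)
        head = target + 1      # the head ends up just past the block it served
    return travel


if __name__ == "__main__":
    rng = random.Random(0)
    blocks = 1_000_000
    # 70% reads, 30% writes, all at random addresses.
    mixed = [("r" if rng.random() < 0.7 else "w", rng.randrange(blocks))
             for _ in range(10_000)]
    writes_only = [op for op in mixed if op[0] == "w"]
    for name, ops in (("mixed 70/30", mixed), ("writes only", writes_only)):
        print(name,
              "| log-structured:", seek_distance(ops, True),
              "| update-in-place:", seek_distance(ops, False))
```

In this model, a log-structured layout serving only writes never seeks at all, while the same layout serving a mixed workload pays for a long seek every time a read pulls the head away from the log. The update-in-place layout keeps seeking even when the reads are gone, which is the distinction the paragraph above draws.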

The chart below shows this principle in action when using our own ioTurbine software and flash memory within a server, combined with a NetApp storage array. At the midway point, the caching features of ioTurbine software are enabled, whereupon writes on the storage system skyrocket and reads drop to zero.


Figure 1: Introduction of Read Cache Improves Storage System Write Performance by 10X

  • Setup: IOMeter benchmark running in 8 VMs
  • Each VM has:
      • 1 vCPU
      • 1 IOMeter thread, 32 outstanding I/Os
      • 4GB dataset, 4GB cache
      • 70% reads, 30% writes
  • NetApp primary with ~6K IOPS performance
  • Start with no cache, then add cache

Without a log-structured layout, most SSDs that rely on embedded micro-controllers to emulate disk drive protocols and on basic Flash Translation Layers (FTLs) also suffer when doing blended reads and writes. They can do pure reads fast and they can do pure writes okay, but when you ask them to do both, performance drops through the floor. The performance forms a "bath-tub" curve as shown below:

Figure 2: Most SSDs Cannot Support Real World Mixed Workloads Well

Fusion ioMemory avoids this behavior by not depending on choke-point embedded micro-controllers and by using a streaming, or log-structured, data layout. The ioDrive products achieve fast random reads because flash's seek time is a mere tens of microseconds, and they achieve fast random writes because we use a log structure that does not depend on micro-controllers. This is actually why we get more random write IOPS than read IOPS: all writes are sequentialized.

It is interesting to note that hard disk drive-based storage arrays face the same challenge as flash. Arrays that update in place, like those from EMC and 3PAR, are similar to SSDs that use basic FTLs, and cannot write as fast. However, on storage arrays like NetApp that sequentialize writes, if you offload the read contention for the disk heads, you free up the underlying system to reach its full write potential.
