Advantages of Cloud Storage Acceleration Layer (CSAL)

Solidigm leverages CSAL write shaping for improved TCO with high-density QLC SSDs

 Cloud storage acceleration layer depicted with a data point overlay on a real-world intersection.

Data center real-estate and power budgets are struggling to keep pace with unprecedented data growth and with the storage performance needed to feed that data back to users or to AI training models. As a result, the industry is looking to high-density storage to make the most of existing data center space.

While the HDD segment is working hard to find ways to increase density and performance, Solidigm’s 3D NAND QLC SSDs have already achieved both and, in fact, have been in production since 2018. 

With the newest 61.44TB D5-P5336 3D NAND QLC SSD, Solidigm has launched its largest capacity NAND drive yet, which comes with a cost and performance benefit that can help data centers achieve higher density.

CSAL + high-density drives for mixed workloads

Read-intensive applications can already benefit from high-density QLC SSDs like the Solidigm D5-P5336, but what about mixed workloads and data placement applications? 

To extend the benefits of high-density Solidigm 61.44TB SSDs to mixed workloads and to emerging NVMe data placement technologies such as streams, flexible data placement (FDP), and zoned namespaces (ZNS), the Solidigm team is leveraging CSAL, the Cloud Storage Acceleration Layer. CSAL is a new open-source, cloud-scale, shared-nothing storage software layer, implemented as a block device (bdev) in the Storage Performance Development Kit (SPDK).

CSAL gives designers the flexibility to tune SSD endurance across the entire platform. The solution maintains high application write performance through emerging storage class memory (SCM) SSDs, such as Solidigm’s first-generation D7-P5810 SSDs, and optimizes TCO by leveraging low-cost, high-density QLC storage while taking advantage of the TLC-equivalent read performance offered by Solidigm’s QLC SSDs. [1] Figure 1 shows the storage tiering hierarchy, with the slowest tier at the bottom of the pyramid and the fastest tier at the top.

Graphic of the tiering hierarchy from archive storage to QLC, TLC, SLC, CXL, DRAM, to CPU cache

Figure 1. D7-P5810 SSD and QLC SSD in a storage tiering hierarchy

Write cache architecture overview

In a traditional cache architecture, high-performance storage, such as a storage class memory (SCM) SSD, is placed in front of primary storage such as a QLC SSD. Rather than going directly to primary storage, writes are acknowledged to users or applications as soon as the data lands in the cache tier. The data is later written back to the capacity tier.

Traditional caches can help high-density NAND media to maintain write performance per TB and boost endurance for high-temporal-locality workloads. For example, a high-performance, high-endurance SCM tier can absorb frequently updated writes without sending them to the QLC NAND tier. 
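
As a minimal sketch of the write-back behavior described above, the following Python model shows how a small fast tier can absorb repeated updates to hot blocks. The class name, the eviction policy (oldest written block first), and the block counts are illustrative assumptions, not details of any Solidigm or SPDK implementation.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy write-back cache: a small fast tier in front of a capacity tier."""

    def __init__(self, capacity_blocks):
        self.capacity_blocks = capacity_blocks
        self.cache = OrderedDict()  # lba -> dirty data held in the fast tier
        self.backing = {}           # lba -> data stored on the capacity tier
        self.backing_writes = 0     # writes that actually reached the capacity tier

    def write(self, lba, data):
        # The write is considered acknowledged once it is in the cache tier.
        if lba in self.cache:
            self.cache.move_to_end(lba)  # overwrite is absorbed by the cache
        self.cache[lba] = data
        if len(self.cache) > self.capacity_blocks:
            self._evict_oldest()

    def _evict_oldest(self):
        # Write the least recently written block back to the capacity tier.
        lba, data = self.cache.popitem(last=False)
        self.backing[lba] = data
        self.backing_writes += 1

    def read(self, lba):
        # Serve from the cache when present, otherwise from the capacity tier.
        if lba in self.cache:
            return self.cache[lba]
        return self.backing.get(lba)

    def flush(self):
        # Write every remaining dirty block back to the capacity tier.
        while self.cache:
            self._evict_oldest()


cache = WriteBackCache(capacity_blocks=4)
for i in range(100):
    cache.write(lba=i % 2, data=f"v{i}")  # two hot blocks, rewritten 100 times
cache.flush()
print(cache.backing_writes)  # prints 2
```

In this toy run, 100 application writes to two hot blocks result in only two writes to the capacity tier, which is the endurance benefit the text attributes to high-temporal-locality workloads.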

How CSAL improves platform performance and endurance

The key strategy of CSAL is to use an SCM SSD as a cache that compacts and shapes random user writes into SSD-friendly writes. The goal of a CSAL design is to minimize system-level write amplification and wear on the NAND SSDs, thereby improving the overall performance and system endurance of NAND-based primary storage.
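
Write amplification can be read as the ratio of bytes written to the NAND media to bytes written by the application. The sketch below shows only that arithmetic; the function name and the byte counters are hypothetical values chosen for illustration, not measurements of CSAL or of any Solidigm drive.

```python
def write_amplification(media_bytes_written, host_bytes_written):
    """Ratio of bytes written to the NAND media to bytes written by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return media_bytes_written / host_bytes_written


GIB = 2**30

# Hypothetical counters, assumed purely to illustrate the calculation.
host_bytes = 100 * GIB            # data written by the application
unshaped_media_bytes = 400 * GIB  # assumed NAND writes for raw random writes
shaped_media_bytes = 120 * GIB    # assumed NAND writes after write shaping

waf_unshaped = write_amplification(unshaped_media_bytes, host_bytes)
waf_shaped = write_amplification(shaped_media_bytes, host_bytes)
print(f"unshaped: {waf_unshaped:.1f}, shaped: {waf_shaped:.1f}")
```

A lower ratio means fewer NAND program operations per byte of application data, which is why reducing write amplification translates into both endurance and performance headroom.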

CSAL improves on traditional cache technologies in three ways:

  1. CSAL uses an ultra-fast write buffer (SCM) to “sequentialize” I/O writes to the QLC device for higher performance and endurance at the system level.
  2. CSAL absorbs and compacts large quantities of user writes in the cache tier, further extending the endurance and lifespan of the capacity tier (the QLC NAND SSDs).
  3. CSAL guarantees that data in the cache tier can be written back to the capacity tier in a predictable amount of time.

Figure 2 below shows the major differences between a traditional write cache and a write shaping cache.

Read-write flows in traditional write cache vs CSAL write shaping cache

Figure 2. Differences between a traditional write cache and CSAL design

Cloud storage acceleration layer (CSAL) architectural overview

CSAL is implemented in SPDK for high-performance storage systems. SPDK offers a full-stack storage system, from logical volumes and a generic block layer down to an NVMe driver. CSAL is implemented in the SPDK block layer and exposed as a virtual block device that consists of two physical block devices:

  • P5810 SSD as the cache tier 
  • QLC SSD as the capacity tier

Storage applications, such as NVMe-oF (NVMe over Fabrics), can use this virtual block device as a generic block device. 
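
For readers who want to see how such a virtual block device might be assembled, the sketch below builds a JSON-RPC request for the SPDK FTL bdev. It assumes the `bdev_ftl_create` method with `name`, `base_bdev`, and `cache` parameters as described in the SPDK JSON-RPC documentation; these names should be checked against the SPDK release in use. The helper function and the bdev names are hypothetical, and the request is only constructed here, not sent to an SPDK target.

```python
import json


def build_ftl_create_request(name, base_bdev, cache_bdev, request_id=1):
    """Build a JSON-RPC 2.0 request asking SPDK to create an FTL bdev.

    Assumption: the method and parameter names follow the SPDK JSON-RPC
    documentation for bdev_ftl_create; verify them for your SPDK version.
    """
    return {
        "jsonrpc": "2.0",
        "method": "bdev_ftl_create",
        "id": request_id,
        "params": {
            "name": name,            # name of the resulting virtual bdev
            "base_bdev": base_bdev,  # capacity tier (QLC SSD)
            "cache": cache_bdev,     # cache tier (P5810 SSD)
        },
    }


# Hypothetical bdev names, for illustration only.
request = build_ftl_create_request("ftl0", "qlc0n1", "scm0n1")
print(json.dumps(request, indent=2))
```

Once created, the resulting bdev would be addressed by its name like any other SPDK block device, which is what lets applications such as an NVMe-oF target use it without changes.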

Graphic of CSAL with QLC + SLC architecture

Figure 3. Diagram of a write shaping cache block

Figure 3 shows the overall architecture of CSAL. There are several key points to highlight:

  1. CSAL is a generic SPDK block device (bdev) that supports NVMe-oF targets natively through SPDK.
  2. Application reads and writes go through the SPDK generic bdev layer first, and then into the CSAL bdev.
  3. The CSAL bdev layer is a virtualized flash translation layer (FTL) device that shapes a random workload into a sequential workload by leveraging the P5810 SSD as a Persistent Write Buffer together with the logical-to-physical (L2P) table.
  4. The FTL records user write IOs to the Persistent Write Buffer as FIFO logs on the Solidigm P5810, and the L2P table is then updated to point to the P5810 LBA.
  5. When the cache capacity reaches a certain threshold, the FTL background compaction process will kick in to:
    • Read FIFO logs from the P5810 SSD
    • Evict invalid logs
    • Merge and write valid logs as large sequential IOs to the QLC SSD
    • Update the L2P table to point to the QLC LBA. 
  6. Data is written to the QLC and P5810 SSDs via the standard SPDK bdev again.
  7. Like an SSD, the FTL device needs housekeeping: defragmentation maintains free space for new writes.
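
The write path in steps 4 and 5 above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the SPDK FTL code: the class, its parameters (`compaction_threshold`, `chunk_blocks`), and the in-memory lists standing in for the P5810 and QLC devices are all hypothetical. Housekeeping of stale capacity-tier data (step 7) is deliberately left out.

```python
class WriteShapingFTL:
    """Toy write-shaping cache: FIFO log on the cache tier, an L2P table,
    and compaction into large sequential capacity-tier writes."""

    def __init__(self, compaction_threshold, chunk_blocks):
        self.compaction_threshold = compaction_threshold
        self.chunk_blocks = chunk_blocks
        self.log = {}                # cache addr -> (lba, data); dict order is FIFO
        self.next_cache_addr = 0
        self.qlc = []                # append-only capacity tier; index = QLC address
        self.sequential_writes = []  # size in blocks of each sequential QLC write
        self.l2p = {}                # lba -> ("cache", addr) or ("qlc", addr)

    def write(self, lba, data):
        # Step 4: append to the FIFO log and point the L2P entry at the cache.
        addr = self.next_cache_addr
        self.next_cache_addr += 1
        self.log[addr] = (lba, data)
        self.l2p[lba] = ("cache", addr)
        # Step 5: compaction starts once the log reaches the threshold.
        if len(self.log) >= self.compaction_threshold:
            self.compact()

    def compact(self):
        # A log entry is valid only if the L2P table still points at it;
        # overwritten (invalid) entries are dropped.
        valid = [(lba, data) for addr, (lba, data) in self.log.items()
                 if self.l2p.get(lba) == ("cache", addr)]
        self.log.clear()
        # Merge valid entries and write them out as large sequential chunks.
        for start in range(0, len(valid), self.chunk_blocks):
            chunk = valid[start:start + self.chunk_blocks]
            for lba, data in chunk:
                self.l2p[lba] = ("qlc", len(self.qlc))  # repoint to the QLC address
                self.qlc.append(data)
            self.sequential_writes.append(len(chunk))

    def read(self, lba):
        tier, addr = self.l2p[lba]
        return self.log[addr][1] if tier == "cache" else self.qlc[addr]


ftl = WriteShapingFTL(compaction_threshold=8, chunk_blocks=4)
for i, lba in enumerate([7, 3, 7, 1, 3, 7, 9, 5]):
    ftl.write(lba, f"v{i}")
print(ftl.sequential_writes)  # prints [4, 1]
```

In this run, eight random writes contain three overwrites, so only five blocks reach the capacity tier, and they arrive as sequential chunks rather than as scattered small writes.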

To carry out the data movement described in the steps above, CSAL manages four key components:

  • Logical to physical address table
  • Persistent Write Buffer
  • Compaction worker
  • Garbage collection (GC) worker
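
The article does not describe how the GC worker chooses what to clean. As one possible illustration, the sketch below assumes the capacity tier is divided into fixed regions (called bands here, an assumed name) and applies a greedy policy that reclaims the region holding the fewest valid blocks. The function names, the band layout, and the policy itself are assumptions for illustration only.

```python
def pick_victim(bands):
    """Return the id of the band with the fewest valid blocks (greedy policy)."""
    return min(bands, key=lambda band_id: len(bands[band_id]))


def garbage_collect(bands, l2p, free_bands, write_band):
    """Relocate valid blocks out of one victim band and free that band.

    bands:      band_id -> {lba: data}, holding only still-valid blocks
    l2p:        lba -> band_id, an L2P table at band granularity
    free_bands: list of band ids available for new writes
    write_band: band id currently open for writes (relocation target)
    """
    # The band currently being written is never chosen as the victim.
    candidates = {b: blocks for b, blocks in bands.items() if b != write_band}
    victim = pick_victim(candidates)
    for lba, data in bands[victim].items():
        bands[write_band][lba] = data  # rewrite valid data into the open band
        l2p[lba] = write_band          # repoint the L2P entry
    del bands[victim]
    free_bands.append(victim)          # the victim can now take new writes
    return victim


# Hypothetical state: band 0 is mostly invalid, band 1 is mostly valid.
bands = {0: {10: "a"}, 1: {20: "b", 21: "c", 22: "d"}, 2: {}}
l2p = {10: 0, 20: 1, 21: 1, 22: 1}
free_bands = []
victim = garbage_collect(bands, l2p, free_bands, write_band=2)
print(victim, free_bands)  # prints 0 [0]
```

Choosing the band with the least valid data keeps relocation traffic low, since only one block has to be rewritten here to recover a whole band.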

CSAL open-source deployment 

The CSAL software architecture is not tied to any specific hardware architecture and can be deployed on a variety of server platforms, including Intel, AMD, ARM, IPU/DPU, and GPU. We are eager to see the open-source community’s involvement and the organic growth of support for various architectures.

CSAL: The bottom line

CSAL is a write-shaping cache that unleashes the value of high-density NAND flash media. By leveraging a host-side FTL, CSAL preserves the existing software interface while transforming any write workload into a sequential write workload. Furthermore, CSAL reduces the number of writes that reach the QLC SSDs by caching frequently updated or temporary data on P5810 SSDs.

With these two strategies, CSAL enhances the endurance of the entire platform and delivers application performance. CSAL is a software-defined, flexible storage architecture for next-gen media and data placement technologies. It is easy to scale out in data centers and can readily be tuned to various performance and TCO requirements.


References 

[1] https://www.solidigm.com/products/data-center/d5/p5336.html

Additional reading

Achieving Optimal Performance and Endurance on Coarse-grained Indirection Unit SSDs

IDC Global DataSphere Forecast, May 2022

Open-CAS / standalone-linux-io-tracer

About the Authors

Sarika Mehta is a Senior Storage Solutions Architect at Solidigm with over 15 years of storage experience throughout her career at Intel’s storage division and now at Solidigm. Her focus is to work closely with Solidigm customers and partners to optimize their storage solutions for cost and performance. She is responsible for tuning and optimizing Solidigm’s SSDs for various storage use cases in a variety of storage deployments ranging from direct-attached storage to tiered and non-tiered disaggregated storage solutions. She has a diverse storage background in validation, performance benchmarking, pathfinding, technical marketing, and solutions architecture.

Kapil Karkra is a Sr. Principal Engineer and the Chief Storage Platform Architect at Solidigm, responsible for the architecture of the Cloud Storage Acceleration Layer (CSAL), a host-based FTL. His current focus is to define a turnkey Reference Storage Platform (RSP), both software and hardware, that helps develop insights about cloud use cases and speeds high-density NAND SSD development and adoption. Kapil has over 25 years of storage experience and over 20 patent filings/grants. Kapil holds a bachelor’s degree in electrical engineering from National Institute of Technology (NIT) in India and an MBA from Arizona State University.

Wayne Gao is a Principal Engineer and storage solution architect who has worked on CSAL from pathfinding (PF) through its commercial release at Alibaba. Wayne has over 20 years of storage development experience, including previous work on the DellEMC ECS all-flash object storage team, and has four US patent filings/grants and one published EuroSys paper.