Object storage has become so popular that it is now considered the first-tier storage solution for AI workloads, encompassing not only inference but also AI training and VectorDB use cases. There is an increasing demand to proxy S3 with a file system that can seamlessly integrate with AI training and inference frameworks like PyTorch and CUDA. To address this, the industry offers the open-source project s3fs-fuse, enabling users to bridge S3 storage with file systems.
Solidigm’s Cloud Storage Acceleration Layer (CSAL) group proposes an innovative architecture designed to provide customers with greater flexibility in defining performance SLAs and managing DRAM usage. By leveraging cutting-edge technologies like the SK hynix CXL memory module, Solidigm D7-P5810 SLC drive, Solidigm D7-PS1010 Gen5 TLC drive, and Solidigm D5-P5336 high-density QLC drive, this design ensures:
This architecture offers a forward-looking solution to address the limitations of existing open-source options and enable efficient AI training and inference workflows.
File system operations initiated by the user are first handled by the kernel’s FUSE driver, which redirects them to a user-mode service. In this architecture, the proof-of-concept S3 FUSE service developed by Solidigm manages core file system operations such as GetAttr, ReadDir, Read, and Write.
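The request flow above can be illustrated with a minimal in-memory sketch. It does not use a real FUSE binding and is not Solidigm's service; the `ToyFuseService` class, the `Node` structure, and the `dispatch` helper are hypothetical names chosen only to show how a user-mode service might route GetAttr, ReadDir, Read, and Write.

```python
import stat
from dataclasses import dataclass, field


@dataclass
class Node:
    """One file or directory in a toy namespace (hypothetical structure)."""
    is_dir: bool
    data: bytearray = field(default_factory=bytearray)
    children: dict = field(default_factory=dict)


class ToyFuseService:
    """Stand-in for a user-mode file system service; illustration only."""

    def __init__(self):
        self.root = Node(is_dir=True)

    def _lookup(self, path):
        node = self.root
        for part in [p for p in path.split("/") if p]:
            node = node.children[part]  # KeyError if the path does not exist
        return node

    def create(self, path):
        parent, _, name = path.rpartition("/")
        self._lookup(parent).children[name] = Node(is_dir=False)

    def getattr(self, path):
        node = self._lookup(path)
        mode = (stat.S_IFDIR | 0o755) if node.is_dir else (stat.S_IFREG | 0o644)
        return {"st_mode": mode, "st_size": len(node.data)}

    def readdir(self, path):
        return [".", ".."] + sorted(self._lookup(path).children)

    def read(self, path, size, offset):
        return bytes(self._lookup(path).data[offset:offset + size])

    def write(self, path, buf, offset):
        data = self._lookup(path).data
        if len(data) < offset:
            data.extend(b"\0" * (offset - len(data)))  # zero-fill any hole
        data[offset:offset + len(buf)] = buf
        return len(buf)

    def dispatch(self, op, *args):
        # The kernel driver would forward each operation; here we route by name.
        handlers = {
            "GetAttr": self.getattr,
            "ReadDir": self.readdir,
            "Read": self.read,
            "Write": self.write,
        }
        return handlers[op](*args)


svc = ToyFuseService()
svc.create("/a.bin")
svc.dispatch("Write", "/a.bin", b"hello", 0)
print(svc.dispatch("Read", "/a.bin", 3, 1))
print(svc.dispatch("ReadDir", "/"))
```

In a real deployment the handlers would be registered with a FUSE library and backed by the cache and S3 rather than by an in-memory tree.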
In the cache module of the Solidigm S3 FUSE cache, we chose the FIFO (First-In, First-Out) eviction algorithm. FIFO not only achieves a higher hit rate (as demonstrated in the comparison tests with the LRU eviction algorithm in [3]) but is also more compatible with the underlying NAND SSD. Working in conjunction with Solidigm's CSAL software, it significantly reduces write amplification on the NAND SSD. In real-world testing, we observed a WAF (Write Amplification Factor) of essentially 1.
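As a rough illustration of FIFO eviction, the sketch below keeps entries in insertion order and always evicts the oldest one, regardless of how recently it was read. The `FifoCache` class and its `evicted` list are hypothetical names for this example; the sketch is in-memory only and does not model CSAL or the SSD layout.

```python
from collections import OrderedDict


class FifoCache:
    """Minimal FIFO cache: eviction order equals insertion order."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()
        self.evicted = []  # record of evicted keys, kept only for illustration

    def get(self, key):
        # Unlike LRU, a hit does not move the entry, so reads cause no reordering.
        return self._items.get(key)

    def put(self, key, value):
        if key in self._items:
            self._items[key] = value  # update in place; position is unchanged
            return
        if len(self._items) >= self.capacity:
            old_key, _ = self._items.popitem(last=False)  # drop the oldest entry
            self.evicted.append(old_key)
        self._items[key] = value


cache = FifoCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # reading "a" does not protect it from eviction
cache.put("c", 3)   # evicts "a", the oldest insertion
print(cache.evicted)
```

Because entries are only ever appended at one end and removed at the other, a log-structured store built this way issues sequential writes and never needs to rewrite live data in place, which is the property the text credits for the low write amplification.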
These requests are processed by Metadata Core, which utilizes the proof-of-concept S3-FIFO Key-Value store developed by Solidigm. For detailed insights into Solidigm's CSAL append cache and S3-FIFO mechanisms, please see links 1-3 in the reference section.
The metadata core is built upon the proof-of-concept KV cache store developed by Solidigm, which employs the proof-of-concept S3-FIFO cache algorithm. The FIFO and CSAL append cache mechanisms are optimized for SSDs, making them highly efficient. The Solidigm D7-PS1010 Gen5 NVMe SSD delivers 9.4 GB/s write bandwidth and 14 GB/s read bandwidth. Because the software is designed around FIFO writes, the SSD's Write Amplification Factor (WAF) is reduced to approximately 1, so nearly the entire bandwidth is available for user data. Consequently, the KV store’s primary queue can operate at roughly 10 GB/s.
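The link between WAF and usable bandwidth can be made explicit with a simplified first-order model. The definition of WAF is standard; the bandwidth relation is an approximation assumed here for illustration, and the WAF of 2 is a hypothetical comparison point, not a measured value.

$$
\mathrm{WAF} = \frac{\text{bytes written to NAND}}{\text{bytes written by the host}},
\qquad
B_{\text{host}} \approx \frac{B_{\text{NAND}}}{\mathrm{WAF}}
$$

Taking the drive's 9.4 GB/s write bandwidth as $B_{\text{NAND}}$:

$$
\mathrm{WAF} = 1 \;\Rightarrow\; B_{\text{host}} \approx 9.4\ \text{GB/s},
\qquad
\mathrm{WAF} = 2 \;\Rightarrow\; B_{\text{host}} \approx 4.7\ \text{GB/s}
$$

Under this model, every unit of WAF above 1 spends write bandwidth on internal data movement instead of user data.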
The small queue of the KV store can be extended using additional DRAM or SK hynix CXL memory modules, which offer up to 96GB per unit with latencies comparable to DRAM.
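To make the small-queue and primary-queue roles concrete, here is a simplified in-memory sketch loosely following the publicly described S3-FIFO design: new keys enter a small FIFO queue, keys read more than once are promoted to the main queue, and a ghost queue remembers recently evicted keys. It is not Solidigm's proof-of-concept KV store. The `S3FifoSketch` class, the queue ratio, and the frequency thresholds are assumptions for illustration, and placing the queues on DRAM, CXL memory, or SSD is not modeled.

```python
from collections import OrderedDict


class S3FifoSketch:
    """Simplified S3-FIFO-style cache: small queue, main queue, ghost queue."""

    def __init__(self, capacity, small_ratio=0.1):
        if capacity < 2:
            raise ValueError("capacity must be at least 2")
        self.capacity = capacity
        self.small_cap = max(1, int(capacity * small_ratio))
        self.main_cap = capacity - self.small_cap
        self.small = OrderedDict()  # key -> [value, freq]
        self.main = OrderedDict()   # key -> [value, freq]
        self.ghost = OrderedDict()  # evicted keys only, no values

    def get(self, key):
        for queue in (self.small, self.main):
            if key in queue:
                entry = queue[key]
                entry[1] = min(entry[1] + 1, 3)  # capped frequency counter
                return entry[0]
        return None

    def put(self, key, value):
        for queue in (self.small, self.main):
            if key in queue:
                queue[key][0] = value
                return
        while len(self.small) + len(self.main) >= self.capacity:
            self._evict()
        if key in self.ghost:
            del self.ghost[key]
            self.main[key] = [value, 0]   # seen recently: admit to main queue
        else:
            self.small[key] = [value, 0]  # first sighting: admit to small queue

    def _evict(self):
        if len(self.small) >= self.small_cap:
            self._evict_small()
        else:
            self._evict_main()

    def _evict_small(self):
        key, (value, freq) = self.small.popitem(last=False)
        if freq > 1:
            self.main[key] = [value, 0]   # read more than once: promote
        else:
            self.ghost[key] = None        # remember the key only
            while len(self.ghost) > self.main_cap:
                self.ghost.popitem(last=False)

    def _evict_main(self):
        while self.main:
            key, (value, freq) = self.main.popitem(last=False)
            if freq > 0:
                self.main[key] = [value, freq - 1]  # second chance at the tail
            else:
                return


cache = S3FifoSketch(capacity=4, small_ratio=0.5)
cache.put("hot", 1)
cache.get("hot")
cache.get("hot")
for k in ("a", "b", "c", "d"):
    cache.put(k, 0)   # one-off keys pass through the small queue
print("hot" in cache.main, "a" in cache.ghost)
```

The design choice worth noting is that one-off keys are filtered out in the small queue and never reach the main queue, so the larger queue holds only objects that have shown reuse.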
These requests rely on the Metadata Core to retrieve the file system’s chunk mapping. Data is then written to or read from large FIFO-based chunks, allowing users to choose QLC storage based on capacity requirements.
If file information and data are available locally, the system directly returns them to the user. Otherwise, it fetches the source data from S3.
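The read path described above can be sketched as a read-through cache: a chunk map locates locally stored chunks, and any chunk not found locally is fetched from the object store and appended to a local chunk log. The `ReadThroughChunkCache` class, the `fetch_from_s3` callable, and the tiny chunk size are hypothetical; a real service would call an S3 client and store chunks on SSD, and eviction is omitted here.

```python
class ReadThroughChunkCache:
    """Serve reads from local chunks; fall back to the object store on a miss."""

    def __init__(self, fetch_from_s3, chunk_size=4):
        # fetch_from_s3(key, offset, length) -> bytes is an assumed callable
        # standing in for a ranged GET against the object store.
        self.fetch_from_s3 = fetch_from_s3
        self.chunk_size = chunk_size
        self.chunk_map = {}   # (key, chunk_index) -> slot in chunk_log
        self.chunk_log = []   # append-only local chunk store
        self.remote_fetches = 0

    def _chunk(self, key, index):
        slot = self.chunk_map.get((key, index))
        if slot is None:  # not available locally: fetch the source data
            data = self.fetch_from_s3(key, index * self.chunk_size, self.chunk_size)
            self.remote_fetches += 1
            self.chunk_log.append(data)
            slot = len(self.chunk_log) - 1
            self.chunk_map[(key, index)] = slot
        return self.chunk_log[slot]

    def read(self, key, offset, length):
        if length <= 0:
            return b""
        first = offset // self.chunk_size
        last = (offset + length - 1) // self.chunk_size
        blob = b"".join(self._chunk(key, i) for i in range(first, last + 1))
        start = offset - first * self.chunk_size
        return blob[start:start + length]


# A bytes object stands in for the remote S3 object in this example.
remote_object = b"abcdefghij"
cache = ReadThroughChunkCache(lambda key, off, n: remote_object[off:off + n])
print(cache.read("obj", 2, 5), cache.remote_fetches)  # first read fetches two chunks
print(cache.read("obj", 2, 5), cache.remote_fetches)  # second read is served locally
```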
Performance was benchmarked using microbenchmark tests [5] with C++ map and RocksDB configurations on a Solidigm D7-PS1010 Gen5 TLC SSD. The results demonstrate significant efficiency and scalability advantages.
This architecture emphasizes optimized SSD utilization, scalability for AI workloads, and flexible integration options, making it a robust solution for modern AI storage challenges.
The test results for Test_KV provide the following key trends and observations:
RocksDB was tested with minimal tuning for this analysis. We utilized a memory write buffer and disabled compression and block cache to better control memory usage. Further fine-tuning could yield additional performance improvements.
For the S3 FUSE cache, the Solidigm D5-P5336 61.44TB QLC drive offers exceptional performance and scalability. For checkpoint writes, the PCIe 5.0 Solidigm D7-PS1010 delivers world-class write performance.
Wayne Gao is a Principal Engineer and Solution Storage Architect at Solidigm. He has worked on Solidigm’s Cloud Storage Acceleration Layer (CSAL) from pathfinding to commercial release. Wayne has over 20 years of storage developer experience, has four U.S. patent filings/grants, and is a published EuroSys paper author.
Yi Wang is a Field Application Engineer at Solidigm. Before joining Solidigm, he held technical roles with Intel, Cloudera, and NCR. He holds "Cisco Certified Network Professional," "Microsoft Certified Solutions Expert," and "Cloudera Data Platform Administrator" certifications.
Li Bo serves as a senior storage solutions architect at Solidigm. With over two decades of experience in system design and development across multiple organizations, he specializes in optimizing the performance of networked and storage solutions. In recent years, he has concentrated his efforts on advancing the industry-wide adoption of non-volatile storage technologies.
Sarika Mehta is a Senior Storage Solutions Architect at Solidigm with over 15 years of storage experience throughout her career at Intel’s storage division and now at Solidigm. Her focus is to work closely with Solidigm customers and partners to optimize their storage solutions for cost and performance. She is responsible for tuning and optimizing Solidigm’s SSDs for various storage use cases in a variety of storage deployments ranging from direct-attached storage to tiered and non-tiered disaggregated storage solutions. She has a diverse storage background in validation, performance benchmarking, pathfinding, technical marketing, and solutions architecture.
Jie Chen is a Technical Marketing Architect at Solidigm, responsible for ecosystem enabling for cloud customers, especially in data placement modes and storage for AI. Prior to joining Solidigm, Jie held various technical roles, including Application Engineer, Quality & Reliability Engineer, Product Development Engineer, and Program Manager, for various Flash memory and Persistent memory products.
The code referenced in the article is test code and has not gone through typical validation. This code is used to validate the architecture and is not production ready. User assumes responsibility for risk by using this code in their environments.
All product plans, roadmaps, specifications, and product descriptions are subject to change without notice.
Nothing herein is intended to create any express or implied warranty, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, or any warranty arising from course of performance, course of dealing, or usage in trade.
The products described in this document may contain design defects or errors known as “errata,” which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Contact your Solidigm representative or your distributor to obtain the latest specifications before placing your product order.
For copies of this document, documents that are referenced within, or other Solidigm literature, please contact your Solidigm representative.
All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
Solidigm may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Solidigm reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase.
Performance results are based on testing as of dates shown in the configurations and may not reflect all publicly available updates. See configuration disclosure for details. No product or component can be absolutely secure.
Solidigm or Intel optimizations, for Solidigm or Intel compilers or other products, may not provide optimized performance to the same degree for non-Solidigm or Intel products. Solidigm or Intel technologies may require enabled hardware, software, or service activation.
Your costs and results may vary.
Solidigm does not control or audit third-party data. You should consult other sources to evaluate accuracy.
Some results have been estimated or simulated using internal Solidigm analysis or architecture simulation or modeling, and provided to you for information purposes only. Any differences in your system hardware, software or configuration may affect your actual performance.
© Solidigm 2025. SOLIDIGM and the Solidigm “S” logo are trademarks of SK hynix NAND Product Solutions Corp (d/b/a Solidigm), registered in the United States, People’s Republic of China, Taiwan, Hong Kong, Singapore, the European Union, the United Kingdom, Mexico, and other countries.