Kingsoft Cloud is a multibillion-yuan independent cloud service provider in China.[1] The company provides a highly secure, reliable distributed cloud storage service to deliver large storage capacity at a low cost.
The world has changed and the AI revolution has pushed boundaries, placing new demands on storage architectures. For years, Kingsoft has been a leader in the industry, developing a comprehensive suite of cloud computing and software services, including Kingsoft Cloud for cloud storage platforms and WPS Office for office software. Kingsoft Cloud chose Solidigm SSDs for its latest object storage solution, named KS3 Extreme Speed. The bandwidth of the new KS3 Extreme Speed scales dynamically with data volume: the larger the SSD capacity, the more bandwidth the system can offer.
To keep up with today’s demanding workloads, Kingsoft customers like WPS Office need faster access to their applications. To address this, Kingsoft expanded its storage architecture in both performance and capacity. By replacing HDDs with Solidigm SSDs, Kingsoft improved bandwidth by more than 100x, to over 1 terabit per second (Tbps) per petabyte.[2] This is a significant benefit for workloads such as AI-generated content (AIGC), animation rendering, and high-performance computing (HPC).
Figure 1. Evolution of Kingsoft Cloud's storage architecture
Figure 1 compares Kingsoft’s previous architecture with its new architecture. In the old design, a file system cache was deployed in front of the S3 service because the S3 service alone could not support the high throughput needed for intensive applications such as AI. Kingsoft needed a new, more efficient architecture that removed this bottleneck. With the new all-flash design, Kingsoft clients can connect directly to S3 object storage because the object lifetime is set inside S3. This new design offers a better balance of capacity, performance, and cost.
Figure 2. Kingsoft Cloud S3 vs KS3 Extreme Speed
Today’s AI workloads use larger data sets and create larger models. To make AI simple to deploy and manage, Kingsoft has created an out-of-the-box solution to address a variety of AI workloads.
In specific AI instances, high I/O throughput is crucial for training large models. Faster storage is critical in efficiently training AI models as these systems require high input/output operations per second (IOPS) to process vast amounts of data and perform various calculations in real time.
Take a large 175-billion-parameter model as an example, with an assumed training data volume of 40TB. Using standard object storage with a throughput capacity of 20 Gbps per petabyte, loading all the training data would take a minimum of 535 minutes.
With KS3 Extreme Speed object storage, which offers a throughput capacity of 1 Tbps per petabyte, all the data could be loaded in as little as 11 minutes,[3] a 48.6x improvement. This is just one example of the benefits.
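The arithmetic behind this comparison can be sketched in a few lines of Python. This is a rough illustration only: it assumes decimal units (1 TB = 8,000 gigabits), a full petabyte provisioned so that the quoted per-petabyte throughput applies, and no protocol or pipeline overhead. The function name `load_time_minutes` is hypothetical. The idealized times it yields are lower than the quoted 535 and 11 minutes, which presumably account for overheads not detailed here, while the ratio of the two quoted times reproduces the 48.6x figure.

```python
def load_time_minutes(data_tb: float, throughput_gbps: float) -> float:
    """Idealized time, in minutes, to stream data_tb terabytes
    at a sustained throughput of throughput_gbps gigabits per second."""
    data_gigabits = data_tb * 1000 * 8  # TB -> GB -> gigabits (decimal units)
    return data_gigabits / throughput_gbps / 60


# Figures from the text: 40TB of training data, 20 Gbps vs 1 Tbps (1000 Gbps).
standard_minutes = load_time_minutes(40, 20)
extreme_minutes = load_time_minutes(40, 1000)

# Idealized speedup from raw throughput, and the speedup implied by the quoted times.
ideal_speedup = standard_minutes / extreme_minutes
quoted_speedup = 535 / 11

print(f"standard object storage: {standard_minutes:.1f} min (idealized)")
print(f"KS3 Extreme Speed:       {extreme_minutes:.1f} min (idealized)")
print(f"idealized speedup: {ideal_speedup:.1f}x, quoted speedup: {quoted_speedup:.1f}x")
```

Because load time is inversely proportional to throughput, the speedup depends only on the throughput ratio, not on the size of the data set.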
The data pressure from emerging services such as AI makes it imperative for Kingsoft Cloud’s hardware to remain up to date. Kingsoft’s original plan for improving storage I/O performance, which involved replacing SATA SSDs and SATA HDDs, proved feasible, but further scrutiny determined that it was not the most cost-effective or efficient approach. Instead, by fully transitioning to TLC NVMe SSDs, Kingsoft could meet its I/O performance requirements.
However, after additional research by the Solidigm team, Kingsoft found an even better storage solution in QLC SSDs. With 33% more bits per cell than TLC, Solidigm QLC SSDs enable 3x8 storage consolidation, leading to lower total operational costs. Solidigm offers QLC SSDs ranging from 7.68TB to 60.72TB, with the same endurance and performance as TLC SSDs.
“We had multiple rounds of in-depth communication with Solidigm to understand each other's system characteristics, which provided us a better understanding of the value of all-flash storage. We now can reduce our write amplification factor (WAF), and improve overall throughput and stability,” says Hongxing Gan.
The collaboration between Kingsoft Cloud and Solidigm produced meaningful results. Both Solidigm TLC and QLC SSDs have been shown to improve the capabilities of Kingsoft’s object storage services and help reduce its operational costs. Solidigm also takes quality and reliability to the next level, with a customer care team that provides Kingsoft with more effective support overall.
“Kingsoft Cloud will continue to strengthen its technical and product capabilities based on all-flash media, combined with the development of Solidigm QLC technology, focusing on cost to create high-performance and cost-effective object storage products, and delivering greater value to users in various civil sectors,” says Hongxing Gan.
Jeniece Wnorowski, Product Marketing Manager at Solidigm, has over 14 years of experience in data center storage solutions. Jeniece got her start in technical marketing at Intel Corporation, then joined Solidigm where she continues to evangelize data center SSD innovations with a variety of companies and partners. Outside of work, Jeniece enjoys spending time with her kids, training for jiu jitsu, and exploring the outdoors.
Wayne Gao is a Principal Engineer and Solution Storage Architect at Solidigm. He has worked on Solidigm’s Cloud Storage Acceleration Layer (CSAL) from pathfinding to commercial release. Wayne has over 20 years of storage developer experience, has four U.S. patent filings/grants, and is a published EuroSys paper author.
[1] https://www.macrotrends.net/stocks/charts/KC/kingsoft-cloud-holdings/total-assets