There are data storage experts, and there are AI workload experts. If you’ve looked for one company to provide both, you might have come away disappointed.
That’s where the new Solidigm AI Central Lab comes in.
Built on the latest technology for model training and inference, the Solidigm lab combines storage and AI expertise to conduct cutting-edge research and, alongside key collaborators, deliver better bottom-line results that move both industries forward.
The lab features high-performance GPUs like the NVIDIA B200 and H200, 800Gbps Ethernet networking and, of course, lots of Solidigm SSDs. Hosted in a facility across the street from the Solidigm headquarters in Rancho Cordova, California, the lab was developed with AI infrastructure provider FarmGPU and enables rapid spin-up of new hardware and software for testing.
The lab is designed on reference architectures that mirror what hyperscalers and enterprises are deploying in data centers across the world, making the findings broadly applicable to customer environments.
In addition to running real-world AI workloads, the lab can collect telemetry that shows in detail how system resources are used and where bottlenecks exist. This will allow Solidigm and its collaborators to recommend optimizations that improve performance and power efficiency.
In the words of Greg Matson, Solidigm SVP and Head of Products and Marketing, “Running storage tests isn’t enough anymore.”
Although new, the lab already boasts several accomplishments. In the most recent round of MLPerf Storage testing, which is designed to measure a storage subsystem’s ability to keep GPUs busy during an AI model training run, FarmGPU submitted results on an AI Central Lab single-node cluster featuring 24 Solidigm™ D7-PS1010 SSDs. Powered by our flagship performance drives, the cluster achieved 116 GB/s, the highest per-node throughput ever measured on the test. That architecture is easily scalable to multiple nodes for future tests.
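For a rough sense of scale, the node-level result can be divided across the drives. The sketch below assumes the load was spread evenly over all 24 SSDs, which the submission summary here does not state, so treat the per-drive figure as an illustrative average rather than a measured value. The variable names are ours.

```python
# Illustrative arithmetic only: assumes an even split across drives,
# which is an assumption and not a reported measurement.
NODE_THROUGHPUT_GBPS = 116   # GB/s per node, as quoted above
DRIVES_PER_NODE = 24         # D7-PS1010 SSDs in the single-node cluster

avg_per_drive = NODE_THROUGHPUT_GBPS / DRIVES_PER_NODE

print(f"Average per drive: {avg_per_drive:.2f} GB/s")
```

Under that assumption, each drive would be sustaining a little under 5 GB/s.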
The lab also hosts what we believe to be the densest storage test cluster ever built. Outfitted with 192 Solidigm D5-P5336 SSDs, each with a massive 122TB of storage, this cluster packs 23.4PB into just 16U of rack space. That’s the equivalent of more than 300 years of non-stop HD video or about 5 billion songs.
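The capacity figure follows directly from the drive count and per-drive capacity. Here is a quick check, assuming decimal units (1PB = 1,000TB) and the rounded 122TB per drive quoted above. The per-rack-unit numbers are derived here for illustration and are not figures published by the lab.

```python
# Back-of-the-envelope check of the density figures quoted above.
# Assumes decimal units (1 PB = 1,000 TB) and the rounded 122 TB per drive.
DRIVES = 192
TB_PER_DRIVE = 122
RACK_UNITS = 16

total_tb = DRIVES * TB_PER_DRIVE        # raw capacity in TB
total_pb = total_tb / 1000              # convert to PB (decimal)
pb_per_u = total_pb / RACK_UNITS        # derived density per rack unit
drives_per_u = DRIVES / RACK_UNITS      # derived drive count per rack unit

print(f"Total capacity: {total_pb:.1f} PB")
print(f"Density: {pb_per_u:.2f} PB per U ({drives_per_u:.0f} drives per U)")
```

This works out to 23.4PB in total, or roughly 1.46PB and 12 drives in every rack unit.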
These and other configurations in the lab will enable Solidigm to push the frontiers of storage performance and density in AI applications, opening new paths for future innovations and optimizations.
A major driving force behind the lab’s development was the goal of hosting hardware and software from various organizations for testing, unlocking new levels of joint innovation. So far, the response has been as expected.
“Interest in the lab’s unique combination of AI and storage expertise is off the charts,” Matson says, with testing and pathfinding agreements forming or underway with a variety of companies across the AI landscape.
Early findings are already helping to redefine how great storage can improve AI outcomes. Earlier this year, Solidigm and Metrum AI, a provider of AI workload analysis and enterprise solutions, published a white paper about retrieval-augmented generation (RAG). The companies conducted extensive research into the benefits of moving RAG data from system memory to SSDs, which included reducing DRAM usage by up to 57% while maintaining performance and accuracy.
Work with other world-class collaborators is underway and additional announcements are imminent.
Ace Stryker is the director of market development at Solidigm, where he focuses on emerging applications for the company’s portfolio of data center storage solutions.