AI Field Day 8: Solidigm Makes the Case for Storage as the Foundation of AI Infrastructure

When the conversation around AI infrastructure happens, it tends to orbit around the GPU. At AI Field Day 8, recorded live in San Jose, California on May 14, 2026, Solidigm returned to the delegate table with an increasingly urgent message. Storage is no longer an afterthought you bolt on once the compute is sized. It is a day-zero design decision that shapes latency, throughput, reasoning quality, power consumption, and ultimately the economics of running AI at scale.

Across three presentations led by Kapil Karkra, Sr. Principal Engineer for AI Solutions and Software, and Scott Shadley, Director of AI Infrastructure Marketing and Strategy, Solidigm walked delegates from the lifecycle of a single AI prompt, through a new framework for measuring intelligence itself, and into the physical hardware innovations making it all possible. Here is a high-level recap of each session.

1. The Anatomy of a Prompt with Solidigm

In the opening session, Kapil Karkra takes the audience inside a single user prompt to reveal just how much storage sits behind a response that feels instantaneous. Joined by Scott Shadley, the session reframes storage as the silent assembler of every prompt an LLM ever sees then decomposes the metrics that AI architects actually care about.

Read our white paper by Jeff Harthorn covering this topic here: Anatomy of a Prompt.

Items discussed in this section:

  • Token amplification: A small user prompt balloons into tens of thousands of tokens as the AI agent layers in domain rules, retrieved context, tool definitions, and session history. Storage is what assembles this comprehensive prompt.
  • Decomposing "time to first token": The key responsiveness metric is broken into prompt assembly, network transmission, provider queuing, and the GPU-intensive prefill and decode phases.
  • Turning compute into reads: By caching stable portions of context, storage converts the expensive O(n²) cost of GPU recomputation into far faster O(n) reads, freeing GPU cycles and reducing power consumption.
  • Two workloads, two storage profiles: RAG demands excellent P99/P99.9 tail latency for small random reads, while KV cache offload is bandwidth-oriented for large block reads.

2. Scaling Intelligence Through the Memory Hierarchy with Solidigm

Kapil Karkra returns to tackle a provocative question. What if memory capacity is the third axis of AI intelligence, sitting right alongside model size and raw compute? This session introduces a measurement framework and backs it with benchmark data showing how extending memory beyond the GPU transforms both performance and the quality of reasoning itself.

Items discussed in this section:

  • The "CRAFT" framework: Solidigm defines AI intelligence across five dimensions— Comprehension, Recall, Adaptability, Fluency, and Tenacity — and measures the impact of memory capacity on each.
  • Capacity as a scaling axis: Extending memory from HBM out to system DRAM and NVMe SSDs improves inference efficiency and prevents costly recomputation.
  • Reasoning needs scratch space: In an AIME 2024 math test (Tenacity), more output token capacity let the model deliberate longer and score higher, linking storage capacity directly to answer quality.
  • Dramatic, measurable gains: Extending KV cache to NVMe delivered up to 4x throughput and 21x better inter-token latency, with a "needle in a haystack" comprehension test reading up to 78x faster.
  • The tiered memory hierarchy: Automatic caching across HBM, DRAM, and NVMe avoids GPU stalls and lets organizations balance performance against cost for larger, multi-agent, long-context workloads.

3. AI-Optimized SSD and Hardware Solutions with Solidigm

For the closing session, Scott Shadley brought the discussion down from architecture to the physical drives — and the very real constraints of power, sustainability, and footprint that every data center now faces. Using a memorable analogy that storage is the "dough" of a great pizza, Scott makes the case that you cannot simply add more bytes to existing boxes and call it an AI strategy.

Items discussed in this section: 

  • Storage tiers, end to end: A holistic map from HBM and DRAM (G1, G2) through local NVMe (G2.5, G3), in-rack DPU-led NVMe (G3.5), to network-attached NVMe and hard drives (G4), with NVMe adoption critical throughout.
  • Real-world infrastructure constraints: Power availability, sustainability directives, and physical footprint limits are now the defining challenges for AI buildouts.
  • Liquid-cooled innovation: Liquid-cooled E1.S form factor SSDs deliver substantially lower power and better thermal management, co-designed with partners like NVIDIA for optimal integration.
  • An ecosystem, not a part number: A portfolio spanning high-performance cache to durable high-capacity storage, matched to workloads through direct customer consultation rather than one-size-fits-all selling.

Conclusion

Taken together, the three AI Field Day 8 sessions tell a single story; the path to better, faster, more affordable AI runs straight through the storage tier. From the journey of a single prompt to a new framework for scaling intelligence through memory, to the liquid-cooled hardware making dense AI deployments viable, Solidigm demonstrated that the "unsung hero" of AI infrastructure deserves a seat at the design table from day zero.