AI, Storage & the NYSE: Behind the World's Fastest Markets

GTC 2025 video interview hosted by Roger Corell

Watch as Roger Corell of Solidigm talks with Anand Pradhan, head of the AI Center of Excellence at Intercontinental Exchange, about how the New York Stock Exchange uses AI in its data centers. Anand explains that storage plays a large part in securing and aggregating the data behind the company's more than 700 billion transactions per day.

Want to explore more about the role of storage in AI?

Video Transcript

This transcript has been edited for clarity and conciseness.

Roger Corell: Hi, I'm Roger Corell. I head up our AI and Leadership Marketing at Solidigm, and with me I've got Anand.

Anand Pradhan: Hey, I'm Anand Pradhan. I am the head of the AI Center of Excellence for Intercontinental Exchange.

Roger: So, Anand, tell us a bit about how you use technology to deliver some of the services you provide.

Anand: Just to give some perspective, Intercontinental Exchange operates exchanges, including the largest stock exchange in the world, the New York Stock Exchange.

Roger: Right.

Anand: [Also] clearing houses, and then we have mortgage technology as well as data services. So there are three big segments of technology that drive our company. Many people know about the New York Stock Exchange. We do more than 700 billion transactions every single day across our exchanges, so we are super-focused on building the technology the right way. We have a super-optimized, or hyper-optimized, stack. We like to understand, soup to nuts, everything that goes into moving terabytes of data through our network and through our data centers. So we build sophisticated data centers. Everything is optimized, from network to server to CPU to all the drivers and things like that. And we are a regulated industry, so we have a redundancy model for everything.

On top of that, we are one of the largest aggregators of data from across the world. We provide reference data so that trading customers can take our data sets and build their statistical analysis on top of them to trade in our venues. And right now we have mortgage and real estate technology as well. We gather data from across the US and build different data products from it. We provide loan origination and servicing platforms for the housing market, so we are focused on optimizing that also.

Roger: You talked about terabytes of data. So can you talk a little bit more about the role of storage and maybe some of the storage challenges and trends you see emerging?

Anand: Yeah, I mean, look, one of the systems that I have led and built processes 700 billion transactions every single day to do data analysis on top of them. You can think of our data sets in our exchanges as having nanosecond granularity. Everything is time-series data, and all the analytics need to extract information out of multiple different data sources. Everything needs to line up so that we can provide the output to our analysts to look at whether somebody is trying to game the market, right? In order to do that, we deal with around 10 to 12 terabytes of nanosecond-granularity, binary, compressed data. And then writing to the disk, as well as getting data out of the disk, plays a huge role, because we understand the compute need for that, but the storage has to be optimized so that the compute nodes are not starving.

Roger: [You need to] keep them optimized, keep them utilized.

Anand: Because of the IO bottleneck and things like that. So we spend quite a bit of time, my team [and] our infra-solutions architecture team, testing everything. And then, on top of that, the storage needs to be flexible and highly scalable so that we can deal with any failure that happens. How do we replace it? How do we handle checkpointing? And the whole nine yards around storage.
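The interview doesn't describe ICE's actual tooling, but the nanosecond time-series lineup Anand mentions can be sketched with pandas. Everything below, including the feed names and columns, is a hypothetical illustration of aligning two event streams so analytics see one consistent view.

```python
# Minimal sketch: aligning two nanosecond-granularity feeds before analysis.
# Illustrative only; the feed names, columns, and values are hypothetical,
# not ICE's actual schema.
import pandas as pd

# Two event streams with nanosecond timestamps (made-up data).
quotes = pd.DataFrame({
    "ts": pd.to_datetime([1_000, 2_500, 4_000], unit="ns"),
    "bid": [99.98, 99.99, 100.00],
})
trades = pd.DataFrame({
    "ts": pd.to_datetime([2_600, 4_100], unit="ns"),
    "price": [100.00, 100.01],
})

# For each trade, attach the most recent quote at or before its timestamp.
# Both frames must already be sorted by the merge key.
aligned = pd.merge_asof(trades, quotes, on="ts", direction="backward")
print(aligned)
```

Surveillance-style queries then scan frames like this end to end, which is why read throughput from storage matters as much as compute.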

Roger: Okay, so you mentioned checkpointing. When you say checkpointing, I think of AI. And here we are at the largest AI conference in the world. Talk about AI. Talk about your company. What models are you deploying now? [What] value are you seeing, and maybe how do you see it evolving moving forward?

Anand: Yeah. So, [the] majority of our workload is inference-based, in the sense that we have deployed all the open-source models, so we are focused on optimizing the stack: what it takes to deploy a model in our data center, in our CPU/GPU cluster. And we are focused on understanding each of the layers that go from the chip to the microservice layer. So that's what we do. We also use some managed services for some of the “low-hanging fruit” kind of deployments. So we have been reliant on some of the cloud providers who are providing managed services, and we have enterprise license agreements in place for our data protection, and so on and so forth.

For example, data aggregation is one of the big things that we do. We extract information, [turning] unstructured data into structured data. We normalize it, and we sell it. We use AI for doing that. That is one of our biggest workloads. Now think about images: photographs of documents, photographs of houses, rooms to roofs, and things like that. You need to extract information out of those to figure out what the artifact in each picture is, whether it is a red door or a kitchen or anything else. We are using AI for doing that, and this is at a large scale, because we are dealing with photographs from across the US. And we are providing services to our MLSs [multiple listing services].
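Anand doesn't name the models involved, but the kind of image-attribute extraction he describes can be sketched with an open-source zero-shot classifier. The model choice, candidate labels, and file path below are illustrative assumptions, not ICE's production pipeline.

```python
# Minimal sketch: extracting a structured attribute from a property photo
# with an open-source vision model. Model, labels, and path are hypothetical.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-image-classification",
    model="openai/clip-vit-base-patch32",
)

labels = ["red front door", "kitchen", "roof", "scanned document"]
results = classifier("house_photo.jpg", candidate_labels=labels)

# Keep the best-scoring label as a structured attribute for the listing.
best = max(results, key=lambda r: r["score"])
print(best["label"], round(best["score"], 3))
```

At the scale Anand describes, the same call runs over millions of photographs, which is where storage read bandwidth starts to dominate.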

Roger: So, [given] your AI usage and some of the future uses that you see, how is this shaping your thinking on the role of storage? What are you looking for in storage as you increase your adoption of AI?

Anand: To me, AI is no different from what we have done for HPC [high-performance computing], right? At the end of the day, you need storage where you can store any kind of data: photographs, videos, flat files, and things like that. And it needs to be readily accessible. Once you start scaling, let's say you are dealing with millions of files across any of the dimensions that we need, we need faster access to them. And then, once we extract information, we need to write it to the disk, and retrieval becomes a key point. So we are invested in that. And it needs to be horizontally scalable [for] when our storage needs increase, [and] easily accessible.
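One way to picture the horizontal scalability Anand is asking for is a consistent-hash ring: file keys map to storage nodes, and adding a node remaps only a fraction of the keys. This is a generic technique sketch under stated assumptions, not a description of ICE's storage; the node names and file key are made up.

```python
# Minimal sketch of horizontally scalable placement via consistent hashing.
# Purely illustrative; node names and the file key are hypothetical.
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=64):
        # Place several virtual points per node on the ring for balance.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(s):
        return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

    def node_for(self, file_key):
        # First point on the ring at or after the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(file_key)) % len(self._keys)
        return self._ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("photos/tx/house-123.jpg"))  # hypothetical file key
```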

Roger: We really appreciate your time today, Anand. I know your schedule's busy. It's a super active conference and it's tough to find time to get together, so we appreciate it. If you want to share, where can viewers find more information?

Anand: Thank you for having me. I'm on LinkedIn. You can look me up [with this link].

Roger: All right, cool. Thank you.