Enterprise Compute Moving to the Edge

AI generated image of man standing at the edge of data collection for AI inferencing.

AI adoption is exploding – and the fastest way to keep pace is to push compute closer to the data at the edge. This trend is driving massive investment in edge data centers: Research and Markets estimates the market will grow at a compound annual growth rate (CAGR) of about 17%, from roughly $15 billion in 2024 to nearly $40 billion in 2030.
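As a quick sanity check on those figures, the arithmetic below is a minimal sketch that simply assumes the cited ~$15 billion 2024 base and ~17% CAGR over six years:

```python
# Rough check of the cited market figures (assumes ~$15B in 2024 and ~17% CAGR).
base_2024_usd_b = 15.0      # approximate 2024 market size, in $ billions
cagr = 0.17                 # approximate compound annual growth rate
years = 2030 - 2024         # six compounding periods

projected_2030 = base_2024_usd_b * (1 + cagr) ** years
print(f"Projected 2030 market: ~${projected_2030:.1f}B")  # ~$38.5B, i.e., "nearly $40 billion"
```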

IT spent the last two decades consolidating compute in the cloud. The next decade will be spent bringing compute back to the data.
Rita Wouhaybi, Solidigm AI Fellow

Edge data centers are, by definition, located closer to the customers, partners, and devices that access an organization’s services. They can range from a few servers in a retail closet to racks of IT equipment in a colocation center (“colo”), and everything in between. In contrast to IoT devices, which rely on microcontrollers and real-time software, edge data centers offer full-stack application processing capabilities, albeit in a smaller footprint.

The reasons for the transition to edge processing include transaction responsiveness, service scalability, and data sovereignty. Let’s explore these in a bit more detail.

Why move compute to the edge?

For many organizations, primary data centers are running into constraints. Those include power, cooling, space, and system architectural restrictions. Edge data processing offers an end run around most of those impediments by distributing computation away from main data centers and closer to IT service consumers.

Here are some reasons why edge processing makes sense for today’s IT organizations:

  • Responsiveness and latency considerations: As response times increase, customer satisfaction plummets. And for safety-critical systems (e.g., robots, self-driving cars, and drones), making decisions quickly enough to avoid obstacles can be vital. One sure way of reducing latency is to move computation and storage to where customers, partners, and devices use your services. Doing so eliminates the round-trip networking time and cost of transferring requests and replies to and from a core data center.
  • Scalability considerations: In many cases, data processing at primary data centers runs into system or architectural limits, both software and hardware related. It’s easy to see how power, cooling, and space constraints can be relieved by deploying edge processing, and system-architectural restrictions can often be eased just as well by distributing some work out to the edge. In the long run, this can lead to faster scaling of an organization’s IT services.
  • Data reduction considerations: Sending all data – the good, the bad, and the ugly – back to the core is often unnecessary and, in some cases, can be avoided entirely. For example, satellite imagery could show landforms, earth infrastructure, and crops – and sometimes nothing but cloud tops obscuring all these more important scenes. Sending all this imagery to ground stations for storage, processing, and analysis can be wasteful. Any data reduction done at the edge can increase cost efficiency by eliminating bad or redundant data at the source, and by doing so, reduce further data transmission and processing (see the brief filtering sketch after this list).
  • Data privacy considerations: Data privacy requirements increasingly influence where and how data is processed. In jurisdictions with strict data residency rules, edge computing offers a practical architectural approach: by processing data locally—closer to where it is generated—organizations can more easily determine which personal data to retain within borders. Edge deployments can support infrastructure strategies by limiting data movement and enabling more localized control over data access and retention.
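To make the data-reduction example above concrete, here is a minimal, hypothetical sketch of edge-side filtering. The brightness-based cloud heuristic and the 0.8 threshold are illustrative assumptions only, not a description of any specific deployment; a real system would use a purpose-built model or sensor metadata.

```python
# Hypothetical edge-side filter: keep only satellite frames worth sending upstream.
# The brightness heuristic and the 0.8 threshold are illustrative assumptions.
import numpy as np

def cloud_fraction(frame: np.ndarray) -> float:
    """Very rough proxy: fraction of near-white pixels in an 8-bit grayscale frame."""
    return float(np.mean(frame > 240))

def frames_to_transmit(frames: list[np.ndarray], max_cloud_fraction: float = 0.8) -> list[np.ndarray]:
    """Drop mostly-obscured frames at the edge so they never consume backhaul or core storage."""
    return [f for f in frames if cloud_fraction(f) <= max_cloud_fraction]

# Example: one clear frame, one fully overcast frame -> only the clear one is transmitted.
clear = np.full((512, 512), 90, dtype=np.uint8)
overcast = np.full((512, 512), 255, dtype=np.uint8)
print(len(frames_to_transmit([clear, overcast])))  # 1
```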

These are just some examples of why it may be beneficial to move data processing to the edge. Depending on your organization’s needs, you may find other benefits.

AI inference is accelerating edge growth

As mentioned above, anything that can relieve system constraints, reduce response times, and eliminate unneeded data transmission and processing is worth considering for the edge. One good example is AI inferencing.

Previously, we discussed centralized data centers’ power and cooling constraints. For most organizations, AI training and inferencing infrastructure is the primary reason they are approaching those limits. While organizations need large, centralized clusters of GPU-accelerated servers to do most AI model training, the infrastructure hurdles associated with deploying AI inference solutions can be much lower. By moving inferencing out to the edge, organizations can reduce their core data center power and cooling requirements to something much more sustainable. Moreover, scaling AI infrastructure to handle inferencing loads is much easier to accomplish across multiple edge data centers than at the core alone.
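As a purely illustrative sketch of that distribution (the site names and latency figures below are hypothetical), one simple approach is to route each inference request to the lowest-latency healthy edge site and fall back to the core only when no edge site is available:

```python
# Illustrative request routing across edge sites; site names and latencies are hypothetical.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    rtt_ms: float      # measured round-trip time from the client or device
    healthy: bool = True

def pick_inference_site(edge_sites: list[Site], core: Site) -> Site:
    """Prefer the lowest-latency healthy edge site; fall back to the core data center."""
    candidates = [s for s in edge_sites if s.healthy]
    return min(candidates, key=lambda s: s.rtt_ms) if candidates else core

sites = [Site("edge-colo-west", 8.0), Site("edge-retail-12", 3.5, healthy=False)]
print(pick_inference_site(sites, core=Site("core-dc", 42.0)).name)  # edge-colo-west
```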

Furthermore, moving AI inferencing closer to where services are consumed can help reduce response times. Two very different inference profiles live at the edge, each with distinct storage needs:

  • Stateless LLM/Text Chat: CPU-centric, latency-sensitive, and largely read-only once the model is loaded. Storage needs center on fast sequential reads and cost-effective capacity; write endurance is a non-issue—making dense QLC SSDs a strong fit.
  • Vision + Sensor Analytics: GPU/NPU-driven pipelines ingesting continuous video, lidar, or IoT telemetry. These workloads hammer storage with random reads plus constant small writes (metadata, feature vectors, rolling buffers). They crave high mixed-IOPS and superior write endurance to survive 24 × 7 capture cycles.

Why it matters for Solidigm: pairing the right SSD with each profile maximizes $/performance. The QLC-based Solidigm™ D5-P5336 drives shine for massive LLM repositories, while TLC-based Solidigm D7-Series drives cover vision workloads that punish NAND with ≥3 DWPD write rates, providing longevity without over-provisioning.
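A minimal sketch of that pairing logic is shown below; the workload attributes and the 10% write threshold are illustrative assumptions, not a sizing tool:

```python
# Illustrative drive-class selection; the attributes and thresholds are assumptions.
from dataclasses import dataclass

@dataclass
class WorkloadProfile:
    name: str
    write_fraction: float   # share of I/O that is writes
    random_io: bool         # True for random-heavy access, False for mostly sequential reads

def suggest_drive_class(p: WorkloadProfile) -> str:
    """Read-mostly sequential profiles favor dense QLC capacity; write-heavy mixed I/O favors high-endurance TLC."""
    if p.write_fraction < 0.10 and not p.random_io:
        return "QLC capacity tier (e.g., D5-P5336 class)"
    return "TLC endurance tier (e.g., D7-Series class)"

llm_chat = WorkloadProfile("stateless LLM chat", write_fraction=0.02, random_io=False)
vision = WorkloadProfile("vision + sensor analytics", write_fraction=0.40, random_io=True)
print(suggest_drive_class(llm_chat))  # QLC capacity tier
print(suggest_drive_class(vision))    # TLC endurance tier
```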

Beyond performance, data reduction is another advantage of work done at the edge. AI inferencing is increasingly a principal means by which systems distinguish good data from bad or redundant data, allowing low-value data to be eliminated at the source without impacting system capabilities.

Similarly, regulations may require personal data to be processed in-country or within an institution. Doing inferencing at the edge can help meet those data residency requirements.

AI inferencing is not the only IT workload that can help organizations meet their system goals when deployed at the edge. But it is a great illustration, showcasing many of the reasons enterprises cite to justify deploying edge processing.

The Solidigm advantage

Distributed edge processing makes a lot of sense for organizations that have users far from core data centers. Responsiveness is often a main driver for introducing edge processing, but it’s not the only one.

By using edge processing to reduce and eliminate extraneous data at the source, data transmission, processing, and storage costs can all be reduced considerably. Moreover, scaling systems often run into software and hardware limits at central data centers, but by deploying edge processing, organizations can mitigate these constraints. 

AI inferencing is a good example of where edge processing can pay extensive dividends. As AI applications become ever more ubiquitous in enterprise IT, doing inferencing at the edge can be a make-or-break decision across several dimensions.

Making smart data infrastructure choices is a key part of that process. For more information on the Solidigm product portfolio and our recommendations for edge AI, see www.solidigm.com/ai.


About the author

Ace Stryker is the director of market development at Solidigm, where he focuses on emerging applications for the company’s portfolio of data center storage solutions.

Disclaimers

Nothing herein is intended to create any express or implied warranty, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, or any warranty arising from course of performance, course of dealing, or usage in trade.

The products described in this document may contain design defects or errors known as “errata,” which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Solidigm does not control or audit third-party data. You should consult other sources to evaluate accuracy.

Contact your Solidigm representative or your distributor to obtain the latest specifications before placing your product order. 

SOLIDIGM and the Solidigm “S” logo are trademarks of SK hynix NAND Product Solutions Corp. (d/b/a Solidigm), registered in the United States, People’s Republic of China, Japan, Singapore, the European Union, the United Kingdom, Mexico, and other countries.