Emerging Use Cases for Data Center SSDs from Core to Edge

Three emerging use cases: SSDs in smart ag, SSDs in autonomous vehicles, and SSDs in edge and data center.

We are generating and consuming data like never before. Across the world, and in every industry, data never sleeps. Experts project that by 2025,¹ we will generate, consume, and analyze 181 zettabytes of data per year. With a compound annual growth rate (CAGR) of 23%, we are observing a massive surge in data generation and consumption. The question becomes: where are we going to store all that data?

The scope of the need for data storage

When we talk about data storage locations, there are three main categories to consider: endpoint, edge, and core. Endpoint data resides on the device itself, such as a laptop, mobile device, wearable tech, vehicle, or local server. From these endpoints, data flows either to the edge (enterprise-hardened locations like regional offices and small data centers) or to the core (aka the cloud).

Imagine:

Every minute, we upload 66K Instagram pics
Every minute, we generate $500,000 worth of Venmo or AWS transactions
Every minute, we perform 6 million Google searches

While we have been producing more data year over year, usage has changed in the last few years. Instead of holding data in temporary caches and then overwriting it, we are now storing new data at unprecedented rates. Most of that data will be stored in the cloud or at the edge.²

Cloud-native data is the future of content at the core

The majority of this data is going to the cloud. Gartner estimates more than 95% of new digital initiatives will be cloud-native by 2025.³ Data centers housing this content will need to be cooled. In fact, as much as 40% of data center energy cost can be attributed to cooling. And with advanced AI models growing at a rate of 10,000x every few years,⁴ we need to keep data center drives performant while reducing the power they consume. Multi-story data centers make those cooling and power challenges even more pronounced.

Data at the edge is driven by the Internet of Things

At the same time, the data storage phenomenon referred to as “edgeification” is real. The Internet of Things (IoT) is impacting data creation, which in turn impacts data storage and retrieval. With about 14.5 billion connected IoT devices by the end of 2023 and local edge SSDs growing at a 50% CAGR,⁵,⁶ the edge presents unique locality challenges, including weight, form factor, and ruggedness.

Solidigm answers the need for data storage and sustainability

Now more than ever, we need SSDs with higher density that provide the right endurance, with an eye on sustainability for the real world. One of the key challenges of sustainability is footprint reduction. By increasing unit-level density and providing the right endurance, Solidigm storage solutions go a long way toward improving sustainability metrics: disposal cost reduction, rack-level consolidation, and total power reduction.

Now, let’s look at three of the emerging use cases spanning core and edge to see how density and endurance requirements are shaping up in real-world applications.

1. Autonomous driving data usage 

For advanced driver-assistance systems (ADAS), commonly referred to as autonomous driving, a massive amount of data logging and detection work needs to take place. SSDs are the right choice for this edge storage application because of their superior shock and vibration specifications, which bumps and road imperfections demand. These systems can have a fill rate requirement as high as 19TB/hour. While it is not a 24/7 duty cycle, a driving profile of 17,600 minutes per year will generate roughly 5PB of data per year.⁷
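As a quick sanity check on those numbers, the annual volume can be reproduced with a few lines of Python. This is a minimal sketch, not a sizing tool: the function name is hypothetical, and it assumes the peak 19TB/hour fill rate is sustained for every driving minute, so the result is an upper bound.

```python
# Figures quoted in the text; everything else here is illustrative.
FILL_RATE_TB_PER_HOUR = 19         # peak logging rate
DRIVING_MINUTES_PER_YEAR = 17_600  # driving profile, not a 24/7 duty cycle


def annual_data_pb(fill_rate_tb_per_hour: float, minutes_per_year: float) -> float:
    """Return data logged per year in petabytes (decimal units: 1 PB = 1000 TB).

    Assumption: the fill rate is sustained for the whole driving time.
    """
    hours_per_year = minutes_per_year / 60
    return fill_rate_tb_per_hour * hours_per_year / 1000


yearly_pb = annual_data_pb(FILL_RATE_TB_PER_HOUR, DRIVING_MINUTES_PER_YEAR)
print(f"{yearly_pb:.2f} PB/year")  # 5.57 PB/year
```

Under that assumption the result is about 5.6PB/year, in line with the roughly 5PB/year figure cited for this driving profile.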

Multiple SSDs in the Solidigm portfolio can meet the endurance requirement of smart driving applications. More importantly, QLC product lines can provide both the performance and the endurance required for other emerging storage use cases, including smart agriculture and precision agriculture, which improve crop yields while optimizing resources such as labor, water, and fertilizer. These processes drive significant data needs at the edge and are primarily read-dominated, since the data is used for decision making.⁸

2. Robots and drones data usage

Robotic systems, drones, and satellite image analysis create real-time data. For example, the See and Spray feature from John Deere takes 20 images per second per crop while the vehicle is moving. These images are then compared against 1 million stored images to identify the area that needs to be sprayed. This is assisted by an onboard camera capturing images continuously, accounting for 6TB/day of storage needed per sprayer.⁹

3. Object storage data use case

In core data centers, object storage is the data storage solution for unstructured data. Imagine a user needing to expand to cover a 5PB data pool. Dell EMC’s F600 or F900 systems are excellent installations for that. The Dell F900 can house almost a petabyte of storage. According to Dell’s own field trace data analysis, some of the deployed D5-P5316 drives could deliver 14 years of drive life.¹⁰

Real-world data storage conclusions

Beyond these real-life use cases, we can look at drive-level analysis and fleet data. For example:

A University of Toronto study shows that a majority of the workload in an enterprise deployment was 80% read and 20% write.¹¹
Storage Review calculated pi to 100 trillion digits using 19 Solidigm D5-P5316 QLC drives and measured the extrapolated endurance, which far exceeded the 5-year drive warranty period, meaning endurance is not an issue for these high-performance computing (HPC) applications. (You can read about that analysis here.)
NetApp performed a study of SSD operational characteristics with their large-scale deployment in the field. While there are many things to glean from this study, one of the most important findings (as noted in §3.1.2) was that “the vast majority of the population (~95%) could move toward QLC without wearing out prematurely” if those QLC SSDs can provide 3K P/E cycles or above. Solidigm has already achieved this with our QLC offerings.¹²

Choosing the right storage solution

There is an increasing need for dedicated swim lanes for storage use cases. Which SSD you choose for your application will depend on your target drive writes per day (DWPD) as well as your workload: write-heavy, read-heavy, or some mixture of both. One size does not fit all, and the best results will come from weighing multiple factors as you plan your storage solution build. Figure 1 can help you get started choosing the right storage solution for the application you want to run.
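To make the DWPD target concrete, here is a minimal sketch of how it can be estimated from a workload. The function name and the 30.72TB drive capacity are assumptions for illustration only. The 6TB/day figure is the per-sprayer storage need quoted earlier, treated here as if it were all written to a single drive each day.

```python
def required_dwpd(daily_host_writes_tb: float, drive_capacity_tb: float) -> float:
    """Drive writes per day = data written per day / usable drive capacity."""
    if drive_capacity_tb <= 0:
        raise ValueError("drive_capacity_tb must be positive")
    return daily_host_writes_tb / drive_capacity_tb


# Hypothetical example: 6 TB/day of writes landing on one 30.72 TB drive.
dwpd = required_dwpd(daily_host_writes_tb=6, drive_capacity_tb=30.72)
print(f"Required endurance: {dwpd:.2f} DWPD")
```

For the same daily write volume, a larger drive needs a lower DWPD rating, which is one reason capacity and endurance have to be weighed together.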

QLC vs SLC usage examples

At Solidigm, we provide a number of endurance and performance levels dedicated to a wide range of applications. As you see from Figure 1, the Solidigm D7-P5810 (SLC NAND based) provides the highest endurance, measured in terms of DWPD. At the same time, the QLC-based SSDs with 0.5+ DWPD, by virtue of their massive capacity (up to 61.44TB), deliver the highest petabytes-written (PBW) capability of the Solidigm family.
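The relationship between DWPD, capacity, and petabytes written can be sketched as follows. This is an illustrative calculation rather than a published specification: the function name is hypothetical, and the 5-year period is an assumption taken from the warranty period mentioned earlier in this article.

```python
DAYS_PER_YEAR = 365


def petabytes_written(dwpd: float, capacity_tb: float, years: float) -> float:
    """Lifetime host writes in PB (decimal units: 1 PB = 1000 TB)."""
    total_tb = dwpd * capacity_tb * DAYS_PER_YEAR * years
    return total_tb / 1000


# 0.5 DWPD and 61.44 TB are the figures quoted above; 5 years is an assumption.
pbw = petabytes_written(dwpd=0.5, capacity_tb=61.44, years=5)
print(f"{pbw:.2f} PBW")  # 56.06 PBW
```

Because capacity multiplies directly into the total, a modest DWPD rating on a very large drive still translates into tens of petabytes of lifetime writes.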

The following figure shows how Solidigm’s drives can be utilized for a range of data center workloads based on their relative write endurance capability. As we embark on use cases from core to the edge, understanding the right requirement mix of density, endurance, and performance will be key to making sure you have the right storage solution for your use case.

Target applications for DWPD SSD endurance with SSD usage examples.

Figure 1. Endurance swim lanes with target applications and usage

How storage use cases are evolving

Some of the key use cases for SSDs have been evolving as solid-state storage technology has expanded in multiple directions.

1. Endurance

First-generation SSDs were expected to meet very high write endurance requirements, in some cases 10 to 20 DWPD. Write amplification factor (the ratio of data actually written to NAND to data written by the host) has gone down since then due to better SSD firmware architectures, and SLC, TLC, and QLC NAND endurance levels in terms of program/erase (P/E) cycles have somewhat standardized.

Standards such as JESD219 have provided much-needed clarity on the type of workload used for endurance measurement. The upper section of Figure 1 is an example of how the workload landscape can be fitted to the unique offering of endurance defined by each of the swim lanes of products.
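To show how write amplification and P/E cycles interact, here is a simplified back-of-the-envelope model. It is a sketch under stated assumptions, not how any vendor rates its drives: it ignores over-provisioning, retention, and workload effects, the function names are hypothetical, and the WAF of 2 and the 5-year period are assumed values. The 3,000 P/E cycle figure is the threshold cited from the NetApp study above.

```python
DAYS_PER_YEAR = 365


def write_amplification_factor(nand_writes_tb: float, host_writes_tb: float) -> float:
    """WAF = data physically written to NAND / data the host asked to write."""
    if host_writes_tb <= 0:
        raise ValueError("host_writes_tb must be positive")
    return nand_writes_tb / host_writes_tb


def dwpd_from_pe_cycles(pe_cycles: float, waf: float, years: float = 5) -> float:
    """Simplified estimate: each P/E cycle allows one full-capacity NAND write,
    so the host gets pe_cycles / waf full drive writes over the drive's life."""
    if waf <= 0:
        raise ValueError("waf must be positive")
    return pe_cycles / (waf * DAYS_PER_YEAR * years)


# Hypothetical counters: 200 TB written to NAND for 100 TB of host writes.
waf = write_amplification_factor(nand_writes_tb=200, host_writes_tb=100)
print(f"WAF: {waf:.1f}")
print(f"Estimated DWPD: {dwpd_from_pe_cycles(3000, waf):.2f}")
```

In this model, halving the WAF doubles the DWPD available from the same NAND, which is why firmware improvements matter as much as raw P/E cycle counts.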

2. Performance with impact

With generational changes happening in solid-state storage and use case evolution, there are areas beyond bandwidth and IOPS that are gaining more attention. Consistency of IOPS after transitional workloads, latency response after TRIM operations, and performance at low-to-medium queue depth (QD) are of interest.

With drive sizes getting larger, newer techniques are being implemented at the NAND and drive firmware level to take advantage of multi-tenancy applications, where IOs on one tenant don’t cause latency impact on the other. In the future, granular controls with Flexible Data Placement (FDP) mode will help the host place data without incurring the endurance and performance loss that can be caused by the drive firmware’s internal garbage collection.

3. Form factor

Gone are the days of a one-size-fits-all story for data center SSDs built on an HDD-inherited 2.5” form factor. EDSFF has provided different form factors with better signal integrity and more robust hot-plug connectors. A common connector design has enabled multiple long, short, and tall form factors for deployment in cloud and data center platforms.

4. Data centric features

First-generation SSDs deployed for data centers had two must-have features: 1) power loss data protection, and 2) the ability to shield the data end-to-end with ECC as it traverses from temporary buffer to the NAND media.

Advanced features like out-of-band management, telemetry, and the ability to track latency and drive health on the fly are must-have features in modern-day SSDs. With the advent of computational storage and AI, we expect to see these technologies used in future deployments to predict failure of the drive itself. One key area of future development is the sustainable use of SSDs for longer life: reuse, repurpose, and re-provision for that extra mile of usage.

Key parameters for SSD use cases that are impacting future SSD development.

Figure 2. Use cases dictating SSDs of the future

Conclusion

With data growth and the maturity of SSDs, usage models are no longer tied to traditional enterprise applications. Cloud service providers (CSPs) revolutionized the deployment of SSDs at a massive scale. CSPs also helped shape the industry by embracing new features, encouraging new form factors, and taking advantage of differentiated portfolios based on NAND media and endurance tier. Emerging use cases at the edge, along with the need for storage to propel the AI revolution, will impose further requirements on next-generation SSDs.

Gone are the days of fitting one density and one form factor to the needs of the overall market. Innovation for the future will be led and defined by the workload needs of the storage use case landscape. Solidigm SSDs are ready and nimble enough to take those on.

Notes

[1] https://explodingtopics.com/blog/data-generated-per-day

[2] https://www.red-gate.com/blog/database-development/whats-the-real-story-behind-the-explosive-growth-of-data

[3] https://www.datacenterdynamics.com/en/opinions/the-five-big-trends-powering-tomorrows-data-center/

[4] https://pages.dataiku.com/report-idc-2023

[5] https://iot-analytics.com/number-connected-iot-devices/

[6] https://www.idc.com/getdoc.jsp?containerId=US50673423

[7] https://www.solidigm.com/products/technology/inonet-used-solidigm-qlc-drives-for-duration-cost-accuracy-of-test-drive-results.html and https://www.visualcapitalist.com/network-overload/

[8] https://www.sciencedirect.com/topics/earth-and-planetary-sciences/precision-agriculture#:~:text=Precision%20agriculture%20(PA)%20is%20the,of%20fertilizers%20and%20irrigation%20processes.

[9] https://www.deere.com/en/sprayers/see-spray-ultimate/

[10] https://www.storagereview.com/review/dell-powerscale-benefitting-from-qlc-ssd-economics-and-performance

[11] https://www.usenix.org/conference/fast22/presentation/maneas

[12] https://www.usenix.org/system/files/fast22-maneas.pdf

About the Author

Tahmid Rahman is the Data Center Director of Product Marketing at Solidigm. His primary responsibilities include product positioning, benchmarking, and customer requirement integration for current and future products. He has a bachelor's degree from Bangladesh University of Engineering and Technology and a master's degree in Electrical Engineering from Texas A&M. He also holds an MBA from University of California, Davis. He loves outdoor activities including sightseeing and hiking.
