Co-Design of Memory-Storage Systems for Workload Awareness with Interpretable Models

2026-03-16

Hardware Architecture, Machine Learning
AI summary

The authors explain how they use machine learning to help design the error management systems in solid-state drives (SSDs) that use NAND memory. Their approach models how different workloads and memory behaviors affect reliability and performance together, rather than separately. By analyzing data from thousands of SSDs across multiple generations, they create a method to improve SSD design continuously. This framework also helps understand how error management interacts with various tasks, guiding better architectural choices.

NAND memory, Solid-state drive (SSD), Error management, Machine learning, Firmware, Memory architecture, Workload modeling, Flash translation layer, Representation learning, Datacenter storage
Authors
Jay Sarkar, Vamsi Pavan Rayaprolu, Abhijeet Bhalerao
Abstract
Solid-state storage architectures based on NAND or emerging memory devices (SSD) are fundamentally architected and optimized for both reliability and performance. Achieving these simultaneous goals requires co-design of memory components with firmware-architected Error Management (EM) algorithms for density- and performance-scaled memory technologies. We describe a Machine Learning (ML) for systems methodology and modeling approach for co-designing the EM subsystem together with the natural variance inherent to the scaled silicon process of the memory components underlying SSD technology. The modeling analyzes NAND memory components and EM algorithms interacting with a comprehensive suite of synthetic (stress-focused and JEDEC) and emulation (YCSB and similar) workloads across Flash Translation abstraction layers, by leveraging a statistically interpretable and intuitively explainable ML algorithm. The generalizable co-design framework evaluates several thousand datacenter SSDs spanning multiple generations of memory and storage technology. Consequently, the modeling framework enables continuous, holistic, data-driven design towards generational architectural advancements. We additionally demonstrate that the framework enables Representation Learning of the EM-workload domain for enhancement of the architectural design-space across a broad spectrum of workloads.
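
The abstract does not name the interpretable ML algorithm it uses. As a minimal sketch of what "statistically interpretable" can mean in this setting, the example below assumes a plain linear model as a stand-in (it is not the authors' method). The feature names `pe_cycles_k` and `read_fraction`, the target, and all numbers are hypothetical and synthetic, chosen only so that the fitted weights can be read directly as per-feature effects.

```python
# Illustrative only: synthetic data and hypothetical feature names,
# not the authors' model, algorithm, or measurements.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_model(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    design = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical workload/memory features: program-erase cycles (thousands)
# and the fraction of read operations in the workload.
FEATURES = ["pe_cycles_k", "read_fraction"]
rows = [(pe, rf) for pe in (0.5, 1.0, 2.0, 3.0) for rf in (0.1, 0.5, 0.9)]

# Synthetic, noise-free target standing in for an error-management metric.
targets = [0.5 + 2.0 * pe - 1.0 * rf for pe, rf in rows]

weights = fit_linear_model(rows, targets)
for name, w in zip(["intercept"] + FEATURES, weights):
    print(f"{name}: {w:+.3f}")
```

Each fitted weight is the change in the synthetic metric per unit change in one feature with the other held fixed, which is the sense of interpretability the sketch is meant to convey. A real study would replace the synthetic grid with measured device and workload data.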