Performance Models for a Two-tiered Storage System

📅 2025-03-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address inefficient data migration and inaccurate performance prediction in heterogeneous storage systems (NVMe cache + HDD backend), this paper designs and implements a distributed two-tier storage system. It proposes an online reinforcement learning–based dynamic data tiering scheduling algorithm and develops an end-to-end performance model that integrates queuing network theory with fine-grained device behavior modeling. Its key contribution is the first scalable, fine-grained device behavior modeling method tailored for heterogeneous storage, which enables adaptive tiering management and precise performance prediction under high-concurrency I/O workloads in multi-core clusters. Experimental evaluation on multi-node clusters demonstrates an average model prediction error of less than 8%, a 27% improvement in I/O throughput, and a 34% reduction in average access latency. The framework provides a reusable modeling and optimization foundation for two-tier storage systems.

📝 Abstract
This work describes the design, implementation, and performance analysis of a distributed two-tiered storage system. The first tier functions as a distributed software cache implemented on solid-state devices (NVMes), and the second tier consists of multiple hard disks (HDDs). We describe an online learning algorithm that manages data movement between the tiers. The software is hybrid, i.e., both distributed and multi-threaded. The end-to-end performance model of the two-tier system was developed using queuing networks and behavioral models of storage devices. We identified the significant parameters that affect storage device performance and created a behavioral model for each device. The performance of the software was evaluated on a many-core cluster using non-trivial read/write workloads. The paper provides examples to illustrate the use of these models.
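To make the idea of a queuing-network model for a two-tier system concrete, the following is a minimal sketch and not the paper's actual model. It assumes each tier behaves as a single M/M/1 queue, that every read first visits the cache tier, and that only misses continue to the HDD tier. The names `Device`, `mm1_response_time`, and `two_tier_read_latency`, as well as the service rates used below, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """A storage tier modeled as a single M/M/1 queue (a simplifying assumption)."""
    name: str
    service_rate: float  # requests per second the tier can complete


def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system (queueing plus service) for an M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)


def two_tier_read_latency(arrival_rate: float, hit_ratio: float,
                          cache: Device, backend: Device) -> float:
    """Mean read latency for a cache tier in front of a backend tier.

    All requests visit the cache; a fraction (1 - hit_ratio) also visits
    the backend, so the backend sees a thinned arrival stream.
    """
    cache_time = mm1_response_time(arrival_rate, cache.service_rate)
    miss_rate = (1.0 - hit_ratio) * arrival_rate
    backend_time = mm1_response_time(miss_rate, backend.service_rate)
    return cache_time + (1.0 - hit_ratio) * backend_time


# Hypothetical numbers, not measurements from the paper.
nvme = Device("nvme-cache", service_rate=100_000.0)
hdd = Device("hdd-backend", service_rate=200.0)
latency = two_tier_read_latency(1_000.0, 0.9, nvme, hdd)
print(f"mean read latency: {latency * 1e3:.3f} ms")
```

Under these assumptions the backend dominates the mean latency even at a 90% hit ratio, which is why the hit ratio achieved by the tier-management policy matters so much to end-to-end performance. A behavioral device model, as described in the abstract, would replace the fixed service rates with rates that depend on workload parameters.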
Problem

Research questions and friction points this paper is trying to address.

Design and analyze a two-tiered storage system
Develop online learning for data tier management
Evaluate performance using queuing and behavioral models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distributed two-tiered storage using NVMe and HDD
Online learning algorithm for data tier management
Performance modeling with queuing and behavioral analysis
Aparna Sasidharan
Computer Science Dept, IIT, Chicago, USA
Xian-He
Computer Science Dept, IIT, Chicago, USA
Jay F. Lofstead
Computer Science, Sandia National Lab, New Mexico, USA
Scott Klasky
Oak Ridge National Laboratory
Computer Science, Physics, High Performance Computing, data science