Cleaning up the Mess

📅 2025-10-17
🤖 AI Summary
This paper identifies critical methodological flaws in the memory system simulation results reported in the "Mess paper": the Mess paper's Ramulator 2.0 results are incorrect and, at the time of its publication, irreproducible due to configuration and usage errors, while its DAMOV results rely on simulation statistics unrelated to the simulated DRAM performance; the Mess paper's artifact repository also lacks the sources needed to fully reproduce all of its results. The authors systematically reproduce and rectify these errors via parameter calibration, output validation, and source-code auditing, and demonstrate that Ramulator 2.0, when properly configured, models real-world memory system behavior well, contradicting a key claimed contribution of the Mess paper. All corrected code, experimental scripts, and validation data are publicly released. The study both corrects the scientific record and underscores the necessity of rigorous simulation validation, prompting critical reflection on peer-review and artifact-evaluation practices and reproducibility standards in computer architecture research.

📝 Abstract
A MICRO 2024 best paper runner-up publication (the Mess paper) with all three artifact badges awarded (including "Reproducible") proposes a new benchmark to evaluate real and simulated memory system performance. In this paper, we demonstrate that the Ramulator 2.0 simulation results reported in the Mess paper are incorrect and, at the time of the publication of the Mess paper, irreproducible. We find that the authors of the Mess paper made multiple trivial human errors in both the configuration and usage of the simulators. We show that by correctly configuring Ramulator 2.0, Ramulator 2.0's simulated memory system performance actually resembles real system characteristics well, and thus a key claimed contribution of the Mess paper is factually incorrect. We also identify that the DAMOV simulation results in the Mess paper use wrong simulation statistics that are unrelated to the simulated DRAM performance. Moreover, the Mess paper's artifact repository lacks the necessary sources to fully reproduce all the Mess paper's results. Our work corrects the Mess paper's errors regarding Ramulator 2.0 and identifies important issues in the Mess paper's memory simulator evaluation methodology. We emphasize the importance of both carefully and rigorously validating simulation results and contacting simulator authors and developers, in true open source spirit, to ensure these simulators are used with correct configurations and as intended. We encourage the computer architecture community to correct the Mess paper's errors. This is necessary to prevent the propagation of inaccurate and misleading results, and to maintain the reliability of the scientific record. Our investigation also opens up questions about the integrity of the review and artifact evaluation processes. To aid future work, our source code and scripts are openly available at https://github.com/CMU-SAFARI/ramulator2/tree/mess.
Problem

Research questions and friction points this paper is trying to address.

Correcting the incorrect Ramulator 2.0 simulation results in the Mess paper
Identifying flaws in the Mess paper's memory simulator evaluation methodology
Addressing irreproducible results and simulator configuration errors in the benchmark evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Corrected Ramulator 2.0 configuration and usage errors
Identified flaws in the Mess paper's memory simulator evaluation methodology
Released open-source code and scripts for reproducibility
Haocong Luo, ETH Zürich
Ataberk Olgun, ETH Zürich (Computer Architecture, Memory Systems, Computer Security, Reliability)
Maria Makeenkova, ETH Zürich
F. Nisa Bostanci, ETH Zürich
Geraldo F. Oliveira, ETH Zürich (Computer Architecture)
A. Giray Yaglikci, ETH Zürich
Onur Mutlu, ETH Zürich