PCF Learned Sort: a Learning Augmented Sort Algorithm with O(n log log n) Expected Complexity

📅 2024-05-12

🏛️ Trans. Mach. Learn. Res.

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

Existing learned sorting algorithms lack theoretical complexity guarantees. This paper proposes PCF Learned Sort, the first learning-augmented sorting algorithm with rigorous theoretical guarantees. It models the underlying data distribution using piecewise constant functions (PCFs), integrates empirical distribution learning with adaptive bucketing, and proves an expected time complexity of $O(n \log \log n)$. This bound provides the first theoretically grounded explanation for why learned sorting can surpass the classical $\Omega(n \log n)$ lower bound for comparison-based sorting. Both theoretical analysis and empirical evaluation, across synthetic and real-world datasets, consistently validate the $O(n \log \log n)$ scaling behavior, demonstrating significant improvements over traditional sorting algorithms. The core contribution is a provably efficient, data-aware sorting framework that formally bridges distributional assumptions with algorithmic performance guarantees.
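To make the idea of PCF-based bucketing concrete, here is a minimal Python sketch, not the authors' implementation. It fits a step-function (piecewise constant) CDF estimate from a random sample, uses it to assign each element to a bucket, and recurses on the buckets. The function name `pcf_bucket_sort_sketch`, the square-root choices for the number of pieces and buckets, the sample size, and the base-case threshold are all assumptions made for illustration; the paper's actual parameters and fallback rules may differ.

```python
import random


def pcf_bucket_sort_sketch(values, base_case=32, depth=0, max_depth=8):
    """Illustrative learned bucket sort using a piecewise constant CDF estimate.

    All parameter choices here are assumptions for the sketch, not the paper's.
    """
    n = len(values)
    if n <= base_case or depth >= max_depth:
        return sorted(values)  # small or deep inputs: use a standard sort

    lo, hi = min(values), max(values)
    if lo == hi:
        return list(values)  # all elements equal, already sorted

    num_pieces = max(2, int(n ** 0.5))   # assumed number of PCF intervals
    num_buckets = max(2, int(n ** 0.5))  # assumed number of buckets
    scale = num_pieces / (hi - lo)

    def piece(x):
        # Index of the equal-width interval containing x (monotone in x).
        return min(int((x - lo) * scale), num_pieces - 1)

    # "Training": count how many sampled elements fall in each interval.
    sample = random.sample(values, min(n, 4 * num_pieces))
    counts = [0] * num_pieces
    for x in sample:
        counts[piece(x)] += 1

    # cdf_before[i] = fraction of the sample lying in intervals below i.
    # It is constant inside each interval, so the model is a step function.
    cdf_before = []
    running = 0
    for c in counts:
        cdf_before.append(running / len(sample))
        running += c

    # Distribute elements; the bucket index is non-decreasing in x,
    # so concatenating sorted buckets gives a sorted result.
    buckets = [[] for _ in range(num_buckets)]
    for x in values:
        b = min(int(cdf_before[piece(x)] * num_buckets), num_buckets - 1)
        buckets[b].append(x)

    out = []
    for bucket in buckets:
        if len(bucket) == n:
            out.extend(sorted(bucket))  # model made no progress: fall back
        else:
            out.extend(pcf_bucket_sort_sketch(bucket, base_case, depth + 1, max_depth))
    return out
```

Correctness in this sketch does not depend on how accurate the model is: because the bucket index is monotone in the element value, a poor model only produces unbalanced buckets and slower running time, never a wrong ordering.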

📝 Abstract

Sorting is one of the most fundamental algorithms in computer science. Recently, Learned Sorts, which use machine learning to improve sorting speed, have attracted attention. While existing studies show that Learned Sort is empirically faster than classical sorting algorithms, they do not provide theoretical guarantees about its computational complexity. We propose Piecewise Constant Function (PCF) Learned Sort, a theoretically guaranteed Learned Sort algorithm. We prove that the expected complexity of PCF Learned Sort is $\mathcal{O}(n \log \log n)$ under mild assumptions on the data distribution. We also confirm empirically that PCF Learned Sort has a computational complexity of $\mathcal{O}(n \log \log n)$ on both synthetic and real datasets. This is the first study to theoretically support the empirical success of Learned Sort, and provides evidence for why Learned Sort is fast. The code is available at https://github.com/atsukisato/PCF_Learned_Sort.
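As rough intuition for where a $\log \log n$ factor can come from, consider the following illustrative recurrence. It is a simplified sketch under assumed bucket sizes, not the paper's proof: suppose each level spends linear time distributing $n$ elements into buckets of size about $n^{c}$ for some constant $0 < c < 1$ (the constant $c$ and the perfectly balanced buckets are assumptions).

$$
T(n) = n^{1-c}\, T(n^{c}) + O(n)
$$

After $k$ levels each subproblem has size $n^{c^{k}}$, and the total work per level stays $O(n)$. The recursion bottoms out at depth $d$ when

$$
n^{c^{d}} \le 2 \iff c^{d} \log_2 n \le 1 \iff d \ge \log_{1/c} \log_2 n ,
$$

so $d = O(\log \log n)$ and $T(n) = O(n \log \log n)$. The expected-case analysis has to account for buckets that are not this well balanced, which is where the assumptions on the data distribution enter.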

Problem

Research questions and friction points this paper is trying to address.

Learned Sort lacks theoretical complexity guarantees despite empirical speed

PCF Learned Sort provides an O(n log log n) expected complexity guarantee

The algorithm bridges theoretical foundations with empirical performance of Learned Sort

Innovation

Methods, ideas, or system contributions that make the work stand out.

PCF Learned Sort models the data distribution with piecewise constant functions (PCFs) and uses the learned model to drive adaptive bucketing

The algorithm achieves O(n log log n) expected complexity under mild assumptions on the data distribution

Provides theoretical guarantees for learned sorting performance

🔎 Similar Papers

Learning-Augmented Search Data Structures