DaiSy: A Library for Scalable Data Series Similarity Search

📅 2026-03-29
🤖 AI Summary
Existing methods for similarity search over data series are fragmented and tied to specific execution environments, with no unified and efficient cross-platform solution. This work proposes and open-sources DaiSy, the first unified library for exact similarity search that supports disk-based, in-memory, GPU-accelerated, and distributed settings, and that handles both data series and vector data. DaiSy integrates multiple state-of-the-art algorithms and provides both C++ and Python interfaces, which improves scalability and deployment flexibility in large-scale scenarios.
📝 Abstract
Exact similarity search over large collections of data series is a fundamental operation in modern applications, yet existing solutions are often fragmented, specialized, or tailored to specific execution environments. In this paper, we present DaiSy, a unified library for exact data series similarity search that integrates multiple state-of-the-art algorithms within a single, coherent framework. DaiSy is the first library to support exact similarity search across diverse execution environments, including implementations for disk-based, in-memory, GPU-accelerated, and distributed scalable similarity search. Although designed for data series, DaiSy is also directly applicable to exact similarity search over vector data, enabling its use in a broader range of applications. The library supports interfaces in both C++ and Python, enabling users to easily integrate its functionality into a variety of tasks. DaiSy is open-sourced and available at: https://github.com/MChatzakis/DaiSy.
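To make the term "exact similarity search" concrete, here is a minimal sketch of the baseline the abstract refers to: a linear scan that is guaranteed to return the true nearest neighbor. It is not DaiSy's API or one of its algorithms. The function names `z_normalize` and `exact_nn` are hypothetical, and the choice of Euclidean distance over z-normalized series is an assumption that reflects common practice in the data series literature. Setting `normalize=False` treats the inputs as plain vectors, which illustrates why the same search applies to vector data.

```python
import math


def z_normalize(series):
    """Rescale a non-empty series to zero mean and unit variance."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0.0:
        # A constant series has no shape; map it to all zeros.
        return [0.0] * n
    return [(x - mean) / std for x in series]


def exact_nn(query, collection, normalize=True):
    """Return (index, distance) of the exact nearest neighbor of `query`.

    Hypothetical helper: a brute-force scan with early abandoning.
    Every candidate is examined, so the answer is exact, not approximate.
    """
    q = z_normalize(query) if normalize else list(query)
    best_idx, best_sq = -1, math.inf
    for idx, cand in enumerate(collection):
        c = z_normalize(cand) if normalize else cand
        sq = 0.0
        for a, b in zip(q, c):
            sq += (a - b) ** 2
            if sq >= best_sq:
                # Early abandoning: this candidate cannot beat the best so far.
                break
        else:
            # The loop finished without abandoning, so sq < best_sq.
            best_idx, best_sq = idx, sq
    return best_idx, math.sqrt(best_sq)


collection = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 1, 2, 2]]
idx, dist = exact_nn([10, 20, 30, 40], collection)
print(idx, dist)
```

The scan costs time proportional to the collection size for every query. Index-based, GPU-accelerated, and distributed methods of the kind the abstract lists aim to return the same exact answer while examining far less data or spreading the work across hardware.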
Problem

Research questions and friction points this paper is trying to address.

data series
similarity search
scalable
exact search
unified library
Innovation

Methods, ideas, or system contributions that make the work stand out.

data series
similarity search
scalable
GPU-accelerated
distributed
Francesca Del Gaudio
Université Paris Cité, LIPADE, F-75006, Paris, France
Manos Chatzakis
Université Paris Cité, LIPADE, F-75006, Paris, France
Gayathiri Ravendirane
Université de Bordeaux, Bordeaux, France
Botao Peng
Chinese Academy of Sciences, Beijing, China
Themis Palpanas
Distinguished Professor, Université Paris Cité, French University Institute (IUF)
data management, data science, data/time series, anomaly detection, high-dimensional similarity search