SEAnet: A Deep Learning Architecture for Data Series Similarity Search

📅 2026-03-01
🤖 AI Summary
This work proposes Deep Embedding Approximation (DEA), a family of data series summarization techniques based on deep neural networks, to address the weaker performance of SAX-based indexes in similarity search over high-frequency, weakly correlated, or highly noisy data series. To learn DEA, it introduces the SEAnet architecture, which builds the Sum of Squares preservation property into the network design, and further enhances it with the SEAtrans encoder. Two sampling strategies, SEAsam and SEAsamE, allow SEAnet to train effectively on massive datasets. Experiments on seven synthetic and real datasets verify that DEA learned with SEAnet provides high-quality data series summarizations and similarity search results.

📝 Abstract
Similarity search is a key operation in the analysis of massive data series collections. According to recent studies, SAX-based indexes offer state-of-the-art performance for similarity search tasks. However, their performance lags on datasets that are high-frequency, weakly correlated, or excessively noisy, or that have other dataset-specific properties. In this work, we propose Deep Embedding Approximation (DEA), a novel family of data series summarization techniques based on deep neural networks. Moreover, we describe SEAnet, a novel architecture especially designed for learning DEA, which introduces the Sum of Squares preservation property into the deep network design. We further enhance SEAnet with the SEAtrans encoder. Finally, we propose novel sampling strategies, SEAsam and SEAsamE, that allow SEAnet to train effectively on massive datasets. Comprehensive experiments on 7 diverse synthetic and real datasets verify the advantages of DEA learned using SEAnet in providing high-quality data series summarizations and similarity search results.
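
For context on why SAX-style summaries can struggle on noisy data, the sketch below illustrates Piecewise Aggregate Approximation (PAA), the segment-mean summary that SAX builds on. It is not SEAnet or DEA, and the abstract does not define the Sum of Squares preservation property, so nothing here should be read as the authors' method. The function names (`z_normalize`, `paa`, `paa_lower_bound`, `retained_energy`) and the series length and segment count are assumptions chosen for illustration. The sketch shows two things: the scaled distance between PAA summaries never exceeds the true Euclidean distance, and a PAA summary keeps only a small share of the sum of squares of a white-noise series compared with a smooth random walk.

```python
import math
import random


def z_normalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))
    if std == 0:
        return [0.0] * len(series)
    return [(v - mean) / std for v in series]


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-length segment."""
    if len(series) % segments != 0:
        raise ValueError("series length must be divisible by the segment count")
    width = len(series) // segments
    return [sum(series[i * width:(i + 1) * width]) / width for i in range(segments)]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def paa_lower_bound(summary_a, summary_b, series_length):
    """Distance between PAA summaries, scaled so it never exceeds the true distance."""
    scale = math.sqrt(series_length / len(summary_a))
    return scale * euclidean(summary_a, summary_b)


def sum_of_squares(values):
    return sum(v * v for v in values)


def retained_energy(series, segments):
    """Share of the z-normalized series' sum of squares kept by its PAA summary."""
    z = z_normalize(series)
    total = sum_of_squares(z)
    if total == 0:
        return 0.0
    width = len(z) // segments
    return width * sum_of_squares(paa(z, segments)) / total


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    noise = [rng.gauss(0, 1) for _ in range(256)]  # high-frequency, uncorrelated
    walk, level = [], 0.0
    for _ in range(256):  # smooth, strongly correlated
        level += rng.gauss(0, 1)
        walk.append(level)

    zn, zw = z_normalize(noise), z_normalize(walk)
    print("true distance:   ", euclidean(zn, zw))
    print("PAA lower bound: ", paa_lower_bound(paa(zn, 16), paa(zw, 16), 256))
    print("retained (noise):", retained_energy(noise, 16))
    print("retained (walk): ", retained_energy(walk, 16))
```

When a summary retains little of a series' sum of squares, distances computed on summaries tend to be loose lower bounds, which gives an index less information for pruning. This is one plausible reading of the difficulty the abstract describes for noisy and high-frequency data.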
Problem

Research questions and friction points this paper is trying to address.

similarity search
data series
SAX-based indexes
noisy data
high-frequency data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep Embedding Approximation
SEAnet
Sum of Squares preservation
SEAtrans
Similarity Search