Canonical Correlation Patterns for Validating Clustering of Multivariate Time Series

📅 2025-07-22
🤖 AI Summary
Problem: Existing clustering validity indices were designed for Euclidean data, and their effectiveness for correlation-based clustering of multivariate time series has not been systematically evaluated. Correlations live in a continuous space that, unlike Euclidean clustering with its geometric shapes, offers no discrete reference patterns to validate against. Method: The paper proposes a validation framework tailored to correlation patterns. Its core idea is to discretise the infinite correlation space into a finite set of interpretable canonical correlation patterns that serve as mathematically defined validation targets. The L1 norm is used for mapping to these patterns, and the silhouette width criterion and Davies-Bouldin index are computed with an L5-norm dissimilarity. Results: On synthetic datasets with known ground truth, the canonical patterns provide reliable validation targets, with the L1 norm for mapping and the L5 norm for the two validity indices performing best. The methods are robust to distribution shifts and detect correlation structure degradation, which supports rigorous validation of correlation-based clustering in high-stakes domains such as health and finance.

📝 Abstract
Clustering of multivariate time series using correlation-based methods reveals regime changes in relationships between variables across health, finance, and industrial applications. However, validating whether discovered clusters represent distinct relationships rather than arbitrary groupings remains a fundamental challenge. Existing clustering validity indices were developed for Euclidean data, and their effectiveness for correlation patterns has not been systematically evaluated. Unlike Euclidean clustering, where geometric shapes provide discrete reference targets, correlations exist in continuous space without equivalent reference patterns. We address this validation gap by introducing canonical correlation patterns as mathematically defined validation targets that discretise the infinite correlation space into finite, interpretable reference patterns. Using synthetic datasets with perfect ground truth across controlled conditions, we demonstrate that canonical patterns provide reliable validation targets, with L1 norm for mapping and L5 norm for silhouette width criterion and Davies-Bouldin index showing superior performance. These methods are robust to distribution shifts and appropriately detect correlation structure degradation, enabling practical implementation guidelines. This work establishes a methodological foundation for rigorous correlation-based clustering validation in high-stakes domains.
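The mapping step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a canonical pattern is a vector of pairwise correlation coefficients with each entry in {-1, 0, 1}, and that the empirical correlation is Pearson's; the abstract does not define the pattern set or the correlation estimator, so both choices and the helper names `canonical_patterns` and `map_to_pattern` are hypothetical.

```python
import itertools
import numpy as np

def canonical_patterns(n_vars=3):
    """Enumerate candidate reference patterns (illustrative assumption):
    one coefficient per variable pair, each taken from {-1, 0, 1}."""
    n_pairs = n_vars * (n_vars - 1) // 2
    return np.array(list(itertools.product([-1.0, 0.0, 1.0], repeat=n_pairs)))

def map_to_pattern(segment, patterns):
    """Map one time-series segment (rows = time steps, columns = variables)
    to the closest reference pattern under the L1 norm."""
    corr = np.corrcoef(segment, rowvar=False)      # Pearson, an assumption
    upper = np.triu_indices(corr.shape[0], k=1)    # pairs (0,1), (0,2), (1,2), ...
    coeffs = corr[upper]
    l1 = np.abs(patterns - coeffs).sum(axis=1)     # L1 distance to every pattern
    best = int(np.argmin(l1))
    return best, patterns[best], coeffs

# Usage: variables 0 and 1 move together, variable 2 is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
segment = np.column_stack([x, x + 0.05 * rng.normal(size=500), rng.normal(size=500)])
index, pattern, coeffs = map_to_pattern(segment, canonical_patterns(3))
print("empirical coefficients:", np.round(coeffs, 2))
print("nearest pattern:", pattern)
```

The sketch does not check whether each enumerated pattern corresponds to a valid (positive semi-definite) correlation matrix; a real implementation would likely need such a filter.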
Problem

Research questions and friction points this paper is trying to address.

Validating distinct clusters in multivariate time series
Evaluating effectiveness of existing Euclidean-based validity indices
Defining interpretable reference patterns for correlation space
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces canonical correlation patterns for validation
Uses L1 and L5 norms for superior performance
Robust to distribution shifts and structure degradation
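To make the L5-norm idea concrete, here is a minimal sketch of a silhouette width computed with a Minkowski distance of order 5 between per-segment correlation vectors. It assumes each segment has already been summarised as a vector of pairwise correlation coefficients; the function names and the toy data are hypothetical, and this is not claimed to be the authors' implementation.

```python
import numpy as np

def minkowski_distances(X, p=5):
    """Pairwise Minkowski (Lp) distances between the rows of X."""
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (diff ** p).sum(axis=2) ** (1.0 / p)

def silhouette_width(X, labels, p=5):
    """Mean silhouette width using an Lp distance (p=5 by default).
    Requires at least two clusters; singletons score 0 by convention."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    D = minkowski_distances(X, p)
    clusters = np.unique(labels)
    scores = np.zeros(len(labels))
    for i in range(len(labels)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue
        # Mean distance to the other members of the same cluster
        # (D[i, i] is 0, so dividing by n_own - 1 excludes the point itself).
        a = D[i, own].sum() / (n_own - 1)
        # Smallest mean distance to any other cluster.
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = 0.0 if denom == 0 else (b - a) / denom
    return scores.mean()

# Usage: four segments summarised by three pairwise correlations each.
corr_vectors = np.array([
    [0.95, 0.02, -0.01],
    [0.90, -0.05, 0.03],
    [-0.92, 0.01, 0.04],
    [-0.97, -0.03, 0.00],
])
print(silhouette_width(corr_vectors, [0, 0, 1, 1]))  # well-separated grouping
print(silhouette_width(corr_vectors, [0, 1, 0, 1]))  # mismatched grouping
```

A higher order such as p=5 weights the largest coefficient difference more heavily than the L1 norm does, which is one plausible reason it could be more sensitive to a single changed relationship; the listing above does not state the authors' rationale.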
Authors

Isabella Degen
EPSRC Doctoral Impact Fellow, University of Bristol
AI validation, machine learning, unsupervised learning, time series, type 1 diabetes
Zahraa S Abdallah
School of Engineering Mathematics and Technology, University of Bristol
Kate Robson Brown
College of Engineering and Architecture, University College Dublin
Henry W J Reeve
University of Bristol
Statistics & Machine Learning