Forest Proximities for Time Series

📅 2024-10-04

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies forest-based proximity measures for time series classification. It proposes PF-GAP, which extends RF-GAP proximities, originally defined for random forests, to Proximity Forests, an accurate and efficient time series classification model. The forest proximities are combined with multidimensional scaling (MDS) to produce vector embeddings of univariate time series, and these embeddings are compared with those obtained from time series distance measures such as DTW and Euclidean distance. The proximities are also used alongside Local Outlier Factor (LOF) to examine the relationship between misclassified points and outliers, in comparison with nearest neighbor classifiers that use time series distance measures. The results suggest that the forest proximities exhibit a stronger connection between misclassified points and outliers than the nearest neighbor classifiers do.

📝 Abstract

RF-GAP has recently been introduced as an improved random forest proximity measure. In this paper, we present PF-GAP, an extension of RF-GAP proximities to proximity forests, an accurate and efficient time series classification model. We use the forest proximities in connection with Multi-Dimensional Scaling to obtain vector embeddings of univariate time series, comparing the embeddings to those obtained using various time series distance measures. We also use the forest proximities alongside Local Outlier Factors to investigate the connection between misclassified points and outliers, comparing with nearest neighbor classifiers which use time series distance measures. We show that the forest proximities seem to exhibit a stronger connection between misclassified points and outliers than nearest neighbor classifiers.
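As a rough illustration of the embedding step described in the abstract, the sketch below turns a proximity matrix into a dissimilarity matrix and embeds it with classical (Torgerson) MDS. Several parts are assumptions not taken from the paper: the conversion `1 - proximity` after symmetrizing by averaging, the use of classical MDS rather than whichever MDS variant the authors used, the function names, and the toy proximity values.

```python
import numpy as np


def proximity_to_dissimilarity(P):
    """Convert a proximity matrix to a dissimilarity matrix.

    Assumption: proximities lie in [0, 1] and may be asymmetric, so we
    symmetrize by averaging and take 1 - proximity.
    """
    P = np.asarray(P, dtype=float)
    S = (P + P.T) / 2.0
    D = 1.0 - S
    np.fill_diagonal(D, 0.0)  # each point is at distance 0 from itself
    return D


def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS on a square dissimilarity matrix."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[order], 0.0, None)  # drop negative eigenvalues
    return eigvecs[:, order] * np.sqrt(lam)


# Hypothetical proximities for four time series (illustrative values only).
P = np.array([
    [0.0, 0.6, 0.3, 0.1],
    [0.5, 0.0, 0.4, 0.1],
    [0.2, 0.3, 0.0, 0.5],
    [0.1, 0.1, 0.8, 0.0],
])
D = proximity_to_dissimilarity(P)
embedding = classical_mds(D, n_components=2)  # one 2-D vector per series
```

Each row of `embedding` is a vector representation of one time series; the same function can be applied to a matrix of time series distances to produce the embeddings used for comparison.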

Problem

Research questions and friction points this paper is trying to address.

Extends RF-GAP to proximity forests for time series classification

Compares vector embeddings obtained from forest proximities with those obtained from time series distance measures

Investigates the connection between misclassified points and outliers using forest proximities

Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends RF-GAP to proximity forests (PF-GAP)

Uses Multi-Dimensional Scaling for time series embeddings

Links misclassified points to outliers via Local Outlier Factors
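As a rough illustration of the outlier-scoring step listed above, the sketch below computes Local Outlier Factor scores from a precomputed dissimilarity matrix, following the standard LOF definition with ties ignored. The function name `lof_scores`, the neighborhood size `k`, and the toy Euclidean distances are assumptions for illustration. In the paper's setting the dissimilarities would be derived from the forest proximities, and the authors' exact procedure may differ.

```python
import numpy as np


def lof_scores(D, k=2):
    """Local Outlier Factor from a square dissimilarity matrix.

    Scores near 1 indicate inliers; larger scores indicate outliers.
    """
    D = np.array(D, dtype=float)         # copy so the input is not modified
    n = D.shape[0]
    np.fill_diagonal(D, np.inf)          # a point is not its own neighbor
    nn = np.argsort(D, axis=1)[:, :k]    # indices of the k nearest neighbors
    rows = np.arange(n)
    k_dist = D[rows, nn[:, -1]]          # distance to the k-th neighbor
    # Reachability distance from i to neighbor j: max(k_dist(j), d(i, j)).
    reach = np.maximum(D[rows[:, None], nn], k_dist[nn])
    # Local reachability density; small constant avoids division by zero.
    lrd = 1.0 / (reach.mean(axis=1) + 1e-10)
    return lrd[nn].mean(axis=1) / lrd


# Toy data (hypothetical): five clustered points and one distant point.
points = np.array([
    [0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5],
    [10.0, 10.0],
])
dists = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(-1))
scores = lof_scores(dists, k=2)
most_outlying = int(np.argmax(scores))   # index of the highest LOF score
```

To relate outliers to misclassifications, one could compare `scores` for misclassified and correctly classified points, for example by checking whether misclassified points tend to have higher LOF values.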
