Soft-ECM: An extension of Evidential C-Means for complex data

📅 2025-07-17
🤖 AI Summary
Problem: Existing evidential c-means (ECM) algorithms rely on barycenter-based centroid constructions in Euclidean space, which limits their applicability to data that are not naturally Euclidean, such as mixed-type (numerical and categorical) data and time series. Method: The paper proposes Soft-ECM, a reformulation of ECM that positions the centroids of imprecise clusters using only a semi-metric, that is, a non-negative, symmetric dissimilarity that need not satisfy the triangle inequality (e.g., DTW). Because it builds on belief functions, Soft-ECM can represent both uncertainty and imprecision in cluster membership while working with numerical, mixed, and time-series data. Contribution/Results: Experiments show that Soft-ECM gives results comparable to conventional fuzzy clustering approaches on numerical data, can handle mixed data, and benefits from combining fuzzy clustering with semi-metrics such as DTW on time series.

📝 Abstract
Clustering based on belief functions has been gaining increasing attention in the machine learning community due to its ability to effectively represent uncertainty and/or imprecision. However, none of the existing algorithms can be applied to complex data, such as mixed data (numerical and categorical) or non-tabular data like time series. Indeed, these types of data are, in general, not represented in a Euclidean space, and the aforementioned algorithms make use of the properties of such spaces, in particular for the construction of barycenters. In this paper, we reformulate the Evidential C-Means (ECM) problem for clustering complex data. We propose a new algorithm, Soft-ECM, which consistently positions the centroids of imprecise clusters while requiring only a semi-metric. Our experiments show that Soft-ECM presents results comparable to conventional fuzzy clustering approaches on numerical data, and we demonstrate its ability to handle mixed data and its benefits when combining fuzzy clustering with semi-metrics such as DTW for time series data.
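To illustrate what "requiring only a semi-metric" can mean in practice, here is a minimal pure-Python sketch of a textbook DTW cost with an absolute-difference local cost. It is not taken from the paper; the function name `dtw` and the three example series are assumptions made for illustration.

```python
def dtw(a, b):
    """Textbook dynamic-time-warping cost between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best alignment cost of a[:i] and b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])  # local cost (illustrative choice)
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical series chosen to expose the missing triangle inequality.
x, y, z = [0], [1, 2], [2, 2, 2]
print(dtw(x, y), dtw(y, z), dtw(x, z))  # 3.0 1.0 6.0
```

For these series, dtw(x, z) = 6 exceeds dtw(x, y) + dtw(y, z) = 3 + 1 = 4, so the triangle inequality fails even though the measure is non-negative and symmetric. This is the kind of dissimilarity the abstract has in mind when it asks for only a semi-metric.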
Problem

Research questions and friction points this paper is trying to address.

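Existing belief-function clustering algorithms, including Evidential C-Means, cannot be applied to complex data
They rely on properties of Euclidean space, in particular to construct barycenters, which mixed numerical and categorical data do not offer
Non-tabular data such as time series are compared with semi-metrics like DTW, which those algorithms cannot use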
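To make the mixed-data setting concrete, the sketch below computes a Gower-style dissimilarity between two records with numerical and categorical fields. This summary does not state which dissimilarity the paper uses for mixed data, so the choice of measure, the function name `mixed_dissimilarity`, the `numeric_ranges` parameter, and the sample records are all assumptions.

```python
def mixed_dissimilarity(x, y, numeric_ranges):
    """Gower-style dissimilarity between two records given as dicts.

    numeric_ranges maps each numeric feature name to its (assumed known)
    value range; every other feature is treated as categorical.
    """
    total = 0.0
    for name in x:
        if name in numeric_ranges:
            # Range-normalised absolute difference, in [0, 1].
            total += abs(x[name] - y[name]) / numeric_ranges[name]
        else:
            # Simple matching: 0 if the categories agree, 1 otherwise.
            total += 0.0 if x[name] == y[name] else 1.0
    return total / len(x)


# Hypothetical records with one numeric and one categorical feature.
a = {"age": 30, "colour": "red"}
b = {"age": 40, "colour": "blue"}
d_ab = mixed_dissimilarity(a, b, numeric_ranges={"age": 50})
```

The result depends only on pairwise comparisons of feature values, so no barycenter of a "red" and a "blue" record ever has to be defined.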
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reformulates the ECM problem so that centroids of imprecise clusters are positioned using only a semi-metric
Uses belief functions for uncertainty representation
Employs semi-metrics like DTW for time series
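As a small illustration of how belief functions represent uncertainty and imprecision in cluster membership, the sketch below stores one object's mass function over subsets of two clusters and derives belief and plausibility from it. The helper names and the numeric masses are hypothetical; only the standard definitions of belief and plausibility are used, with mass on the empty set read as outlier evidence as in ECM.

```python
def belief(mass, subset):
    """Total mass of the non-empty focal sets contained in `subset`."""
    return sum(m for focal, m in mass.items() if focal and focal <= subset)


def plausibility(mass, subset):
    """Total mass of the focal sets that intersect `subset`."""
    return sum(m for focal, m in mass.items() if focal & subset)


# Hypothetical membership of one object over clusters {c1, c2}.
mass = {
    frozenset(): 0.1,              # empty set: the object may be an outlier
    frozenset({"c1"}): 0.5,        # evidence for c1 specifically
    frozenset({"c2"}): 0.1,        # evidence for c2 specifically
    frozenset({"c1", "c2"}): 0.3,  # imprecise cluster: "c1 or c2"
}
c1 = frozenset({"c1"})
bel_c1 = belief(mass, c1)        # only the mass on {c1}
pl_c1 = plausibility(mass, c1)   # mass on {c1} plus mass on {c1, c2}
```

Here the belief in c1 is 0.5 and its plausibility is 0.8; the gap between the two comes from the mass assigned to the imprecise cluster {c1, c2}, which a purely fuzzy membership vector cannot express.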