Topolow: Force-Directed Euclidean Embedding of Dissimilarity Data with Robustness Against Non-Metricity and Sparsity

📅 2025-08-03

🤖 AI Summary

This paper addresses low-dimensional Euclidean embedding of pairwise dissimilarity data that may be non-metric, sparse, or censored, without requiring the input to satisfy metric or Euclidean axioms. The proposed method, Topolow, performs maximum-likelihood estimation under a Laplace error model and optimizes it through a force-directed particle-system analogy in which sequential, stochastic pairwise interactions update the layout. Because no global gradient is computed, the optimization is gradient-free, which the authors report makes it more robust against convergence to local optima, especially for sparse data. The likelihood formulation is robust to outliers and heterogeneous errors and handles censored observations. The authors report better geometric reconstruction than standard multidimensional scaling (MDS) on both non-Euclidean and sparse datasets. The method is available as the general-purpose function `Euclidify` in the open-source R package `topolow`.

📝 Abstract

The problem of embedding a set of objects into a low-dimensional Euclidean space based on a matrix of pairwise dissimilarities is fundamental in data analysis, machine learning, and statistics. However, the assumptions of many standard analytical methods are violated when the input dissimilarities fail to satisfy metric or Euclidean axioms. We present the mathematical and statistical foundations of Topolow, a physics-inspired, gradient-free optimization framework for such embedding problems. Topolow is conceptually related to force-directed graph drawing algorithms but is fundamentally distinguished by its goal of quantitative metric reconstruction. It models objects as particles in a physical system, and its novel optimization scheme proceeds through sequential, stochastic pairwise interactions, which circumvents the need to compute a global gradient and provides robustness against convergence to local optima, especially for sparse data. Topolow maximizes the likelihood under a Laplace error model, robust to outliers and heterogeneous errors, and properly handles censored data. Crucially, Topolow does not require the input dissimilarities to be metric, making it a robust solution for embedding non-metric measurements into a valid Euclidean space, thereby enabling the use of standard analytical tools. We demonstrate the superior performance of Topolow compared to standard Multidimensional Scaling (MDS) methods in reconstructing the geometry of sparse and non-Euclidean data. This paper formalizes the algorithm, first introduced as Topolow in the context of antigenic mapping in (Arhami and Rohani, 2025) (open access), with emphasis on its metric embedding and mathematical properties for a broader audience. The general-purpose function Euclidify is available in the R package topolow.
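To make the idea of sequential, stochastic pairwise interactions concrete, here is a minimal Python sketch. It is not the `topolow` implementation. The function names (`pairwise_embed`, `mean_abs_error`), the spring-like update, the step size, the cooling schedule, and the one-sided treatment of censored pairs are all assumptions made for illustration. Missing entries are encoded as NaN and skipped, and no global gradient is ever formed.

```python
import numpy as np


def mean_abs_error(X, D):
    """Mean absolute error between embedded distances and measured entries of D."""
    iu, ju = np.triu_indices(D.shape[0], k=1)
    keep = ~np.isnan(D[iu, ju])  # ignore missing measurements
    dist = np.linalg.norm(X[iu[keep]] - X[ju[keep]], axis=1)
    return float(np.mean(np.abs(dist - D[iu[keep], ju[keep]])))


def pairwise_embed(D, dim=2, epochs=200, step=0.5, cooling=0.99,
                   censored=None, seed=0):
    """Illustrative pairwise relaxation (hypothetical, not the package's rule).

    D        : (n, n) symmetric dissimilarities, NaN where unmeasured.
    censored : optional (n, n) boolean mask; True means D[i, j] is only a
               lower bound on the true dissimilarity (assumed convention).
    """
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    X = rng.normal(size=(n, dim))  # random initial particle positions

    iu, ju = np.triu_indices(n, k=1)
    keep = ~np.isnan(D[iu, ju])
    pairs = np.column_stack([iu[keep], ju[keep]])

    for _ in range(epochs):
        rng.shuffle(pairs)  # visit measured pairs in random order
        for i, j in pairs:
            delta = X[i] - X[j]
            r = np.linalg.norm(delta) + 1e-12  # avoid division by zero
            err = r - D[i, j]  # positive: too far apart, negative: too close
            # A lower-bound pair only pushes apart; it never pulls together.
            if censored is not None and censored[i, j] and err >= 0:
                continue
            # Move both particles along their connecting line, half each.
            move = 0.5 * step * err * delta / r
            X[i] -= move
            X[j] += move
        step *= cooling  # gradually damp the motion
    return X
```

A typical call would be `X = pairwise_embed(D, dim=2)` followed by `mean_abs_error(X, D)` to check the fit on the measured pairs. Each update touches only two particles, which is why sparsity costs nothing extra: unmeasured pairs are simply never visited.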

Problem

Research questions and friction points this paper is trying to address.

Embedding non-metric dissimilarity data into Euclidean space

Robust optimization for sparse and non-Euclidean datasets

Quantitative metric reconstruction without global gradient computation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Physics-inspired gradient-free optimization framework

Stochastic pairwise interactions avoid global gradients

Laplace error model gives robustness to outliers and heterogeneous errors and handles censored data
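The link between a Laplace error model and a robust objective can be written out in a few lines. The derivation below is a generic textbook sketch under simplifying assumptions that are not taken from the paper: independent errors, a single common scale $b$, and $\Omega$ denoting the set of measured pairs. The paper's own formulation, in particular its treatment of heterogeneous errors and censored entries, may differ.

```latex
% Assumed model: D_{ij} = \lVert x_i - x_j \rVert + \varepsilon_{ij},
% with \varepsilon_{ij} \sim \mathrm{Laplace}(0, b) independently for (i,j) \in \Omega.
\begin{align}
p(D_{ij} \mid X) &= \frac{1}{2b}
  \exp\!\left( -\frac{\bigl| D_{ij} - \lVert x_i - x_j \rVert \bigr|}{b} \right), \\
\log L(X, b) &= -\lvert \Omega \rvert \log(2b)
  - \frac{1}{b} \sum_{(i,j) \in \Omega}
    \bigl| D_{ij} - \lVert x_i - x_j \rVert \bigr|, \\
\hat{X} &= \arg\min_{X} \sum_{(i,j) \in \Omega}
    \bigl| D_{ij} - \lVert x_i - x_j \rVert \bigr|, \qquad
\hat{b} = \frac{1}{\lvert \Omega \rvert} \sum_{(i,j) \in \Omega}
    \bigl| D_{ij} - \lVert \hat{x}_i - \hat{x}_j \rVert \bigr|.
\end{align}
```

Under these assumptions, maximizing the likelihood is the same as minimizing the sum of absolute errors over measured pairs. Absolute errors grow linearly rather than quadratically, so a few grossly wrong measurements pull the layout less than they would under a least-squares objective.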
