🤖 AI Summary
The classical Johnson–Lindenstrauss (JL) lemma does not apply to non-Euclidean data, such as general dissimilarity matrices, because it relies inherently on Euclidean geometry.
Method: This work extends JL-type dimensionality reduction to pseudo-Euclidean spaces, introducing a framework based on generalized power distances and $(p,q)$-norms. It decomposes symmetric hollow dissimilarity matrices, parameterizes how far the data deviates from Euclidean geometry, and uses that deviation to control the approximation error, enabling fine-grained theoretical analysis with accuracy guarantees.
Contribution/Results: We prove that the method approximately preserves pairwise distances in pseudo-Euclidean spaces, with an explicit error bound that depends on the data's intrinsic deviation from Euclidean geometry. Experiments on synthetic and real-world non-Euclidean datasets demonstrate superior efficiency and robustness compared to state-of-the-art Euclidean embedding methods.
📝 Abstract
The Johnson–Lindenstrauss (JL) lemma is a cornerstone of dimensionality reduction in Euclidean space, but its applicability to non-Euclidean data has remained limited. This paper extends the JL lemma beyond Euclidean geometry to handle general dissimilarity matrices that are prevalent in real-world applications. We present two complementary approaches. First, we show that the JL transform can be applied to vectors in pseudo-Euclidean space with signature $(p,q)$, with theoretical guarantees that depend on the ratio of the $(p,q)$-norm to the Euclidean norm of the vectors, a quantity that measures the deviation from Euclidean geometry. Second, we prove that any symmetric hollow dissimilarity matrix can be represented as a matrix of generalized power distances, with an additional parameter representing the uncertainty level within the data. In this representation, applying the JL transform yields a multiplicative approximation with a controlled additive error term proportional to the deviation from Euclidean geometry. Our theoretical results provide fine-grained performance analysis based on the degree to which the input data deviates from Euclidean geometry, making practical and meaningful dimensionality reduction accessible to a wider class of data. We validate our approaches on both synthetic and real-world datasets, demonstrating the effectiveness of extending the JL lemma to non-Euclidean settings.