Local Diagnostics of Continuous Normalizing Flow for Out-of-Distribution Detection

📅 2026-05-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses out-of-distribution (OOD) detection with deep generative models, which can assign high likelihood to OOD samples (the "likelihood paradox"), for target observations embedded in a subspace of a high-dimensional data space. The authors propose the Lagrangian sub-flow (LSF) framework, which uses continuous normalizing flows to estimate the density of the relevant components of the representation while treating the remaining components as context. LSF derives several geometric diagnostic signals from the velocity field along the sub-flow trajectory, so OOD detection does not have to rely on likelihood values. On a real-world benchmark for zero-shot phoneme-level mispronunciation detection, metrics built on these signals outperform likelihood-based methods.

📝 Abstract

We address the problem of out-of-distribution (OOD) detection for target observations embedded in a subspace of the high-dimensional data space. Using continuous normalizing flows (CNFs), we propose a Lagrangian sub-flow (LSF) framework designed to isolate and estimate the density of the relevant components in the representation while using the remaining components as context. Through experimentation with models for speech synthesis, we show that CNFs, like other deep generative models (DGMs), are susceptible to the "likelihood paradox", where high likelihood is erroneously assigned to OOD samples. This is attributed to the inductive bias of DGMs, which prioritize low-level structural details over high-level semantic coherence. To mitigate this phenomenon, we propose a number of geometric diagnostic signals based on the velocity field over the sub-flow trajectory. Based on these signals, we design metrics for the challenging task of zero-shot phoneme-level mispronunciation detection. Finally, we demonstrate the superiority of these metrics over likelihood-based methods on a real-world mispronunciation detection benchmark.
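To make the contrast between likelihood and trajectory-based signals concrete, here is a minimal sketch in plain Python. It is not the paper's method: the abstract does not specify the diagnostic signals, so the path length and kinetic energy below are illustrative stand-ins, and the names `velocity`, `divergence`, and `trajectory_diagnostics`, the toy velocity field, and the Euler integrator are all assumptions. The only standard ingredient is the CNF change-of-variables relation, under which the log-density changes along a trajectory at the rate of minus the divergence of the velocity field.

```python
import math


def velocity(x, context, t):
    """Hypothetical toy velocity field: pulls the target components toward the context."""
    return [c - xi for xi, c in zip(x, context)]


def divergence(x, context, t):
    """Divergence of the toy field v = context - x, which is -dim everywhere."""
    return -float(len(x))


def trajectory_diagnostics(x0, context, n_steps=1000):
    """Integrate dx/dt = v(x, context, t) over t in [0, 1] with explicit Euler steps.

    Returns the accumulated divergence (the likelihood-side quantity) and two
    illustrative geometric signals computed from the velocity along the path.
    """
    dt = 1.0 / n_steps
    x = list(x0)
    path_length = 0.0     # integral of |v| dt
    kinetic_energy = 0.0  # integral of |v|^2 dt
    div_integral = 0.0    # integral of div(v) dt
    for k in range(n_steps):
        t = k * dt
        v = velocity(x, context, t)
        speed = math.sqrt(sum(vi * vi for vi in v))
        path_length += speed * dt
        kinetic_energy += speed * speed * dt
        div_integral += divergence(x, context, t) * dt
        x = [xi + vi * dt for xi, vi in zip(x, v)]
    # Standard-normal base log-density at the end point (an assumption of this toy).
    log_base = -0.5 * sum(xi * xi for xi in x) - 0.5 * len(x) * math.log(2.0 * math.pi)
    return {
        "end_point": x,
        "path_length": path_length,
        "kinetic_energy": kinetic_energy,
        "div_integral": div_integral,
        # log p_data(x0) = log p_base(x1) + integral of div(v) dt, when flowing data -> base
        "log_likelihood": log_base + div_integral,
    }


near = trajectory_diagnostics([0.1, 0.0], [0.0, 0.0])
far = trajectory_diagnostics([3.0, 4.0], [0.0, 0.0])
```

In this toy field the divergence is constant, so the divergence term of the log-likelihood is identical for `near` and `far`, while the path length and kinetic energy grow with the distance between the starting point and the context. That is only an illustration of why velocity-based signals can carry information the divergence term does not; a trained sub-flow would use a learned velocity field and a more accurate ODE solver.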

Problem

Research questions and friction points this paper is trying to address.

out-of-distribution detection

continuous normalizing flows

likelihood paradox

subspace modeling

mispronunciation detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Continuous Normalizing Flows

Out-of-Distribution Detection

Lagrangian Sub-Flow