Normalizing Flow to Augmented Posterior: Conditional Density Estimation with Interpretable Dimension Reduction for High Dimensional Data

📅 2025-07-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Conditional density estimation (CDE) for high-dimensional response variables—e.g., images—remains challenging, as existing methods are typically designed for low-dimensional responses and struggle to jointly achieve interpretability and effective dimensionality reduction. This paper proposes a novel CDE framework integrating normalizing flows with augmented posterior modeling. Its core innovation is the first incorporation of supervised dimensionality reduction into normalizing flows, introducing interpretable, low-dimensional latent variables that explicitly disentangle semantic variations relevant to predictors (e.g., identity, illumination) from irrelevant stochastic variations (e.g., noise). Conditional posteriors are modeled via linear or logistic regression, while irrelevant components are captured by a Gaussian mixture model, enabling structured interpretation in latent space. Evaluated on image analysis tasks, the method significantly outperforms unsupervised baselines, achieving both high-fidelity generation and clear semantic interpretability.

📝 Abstract
The conditional density characterizes the distribution of a response variable $y$ given other predictors $x$, and plays a key role in many statistical tasks, including classification and outlier detection. Although there has been abundant work on the problem of Conditional Density Estimation (CDE) for a low-dimensional response in the presence of a high-dimensional predictor, little work has been done for a high-dimensional response such as images. The promising performance of normalizing flow (NF) neural networks in unconditional density estimation serves as a motivating starting point. In this work, we extend NF neural networks to the setting where an external $x$ is present. Specifically, we use the NF to parameterize a one-to-one transform between a high-dimensional $y$ and a latent $z$ that comprises two components, $z = [z_P, z_N]$. The $z_P$ component is a low-dimensional subvector obtained from the posterior distribution of an elementary predictive model for $x$, such as logistic/linear regression. The $z_N$ component is a high-dimensional independent Gaussian vector, which explains the variations in $y$ not or less related to $x$. Unlike existing CDE methods, the proposed approach, coined Augmented Posterior CDE (AP-CDE), only requires a simple modification of the common normalizing flow framework, while significantly improving the interpretation of the latent component, since $z_P$ represents a supervised dimension reduction. In image analytics applications, AP-CDE shows good separation of $x$-related variations, due to factors such as lighting condition and subject identity, from the other random variations. Further, the experiments show that an unconditional NF neural network, based on an unsupervised model of $z$ such as a Gaussian mixture, fails to generate interpretable results.
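The abstract's construction, an invertible map between $y$ and a split latent $z=[z_P, z_N]$, where $z_P$ is tied to a simple predictive model for $x$ and $z_N$ is independent Gaussian, can be illustrated with a toy sketch. The affine map below is a hypothetical stand-in for a trained normalizing flow, and the linear-regression model for $z_P \mid x$ is an assumed simplification of the paper's augmented-posterior component; dimensions and names are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d_y, d_p, d_x = 5, 2, 3      # response dim, supervised latent dim, predictor dim
d_n = d_y - d_p              # dimension of the x-independent nuisance latent

# Invertible affine map y = A z + b, a stand-in for a trained NF (hypothetical)
A = rng.normal(size=(d_y, d_y)) + 3 * np.eye(d_y)
b = rng.normal(size=d_y)
A_inv = np.linalg.inv(A)

# Assumed supervised component: z_P | x ~ N(W x, I), a linear-regression model
W = rng.normal(size=(d_p, d_x))

def sample_y_given_x(x, n=1):
    """Draw y ~ p(y|x): sample z = [z_P, z_N] and push it through the flow."""
    z_p = W @ x + rng.normal(size=(n, d_p))   # x-related, low-dimensional
    z_n = rng.normal(size=(n, d_n))           # x-independent Gaussian noise
    z = np.concatenate([z_p, z_n], axis=1)
    return z @ A.T + b

def log_density_y_given_x(y, x):
    """Change of variables: log p(y|x) = log p_z(z|x) - log|det A|."""
    z = A_inv @ (y - b)
    z_p, z_n = z[:d_p], z[d_p:]
    log_pz = (-0.5 * np.sum((z_p - W @ x) ** 2)   # supervised part
              - 0.5 * np.sum(z_n ** 2)            # nuisance part
              - 0.5 * d_y * np.log(2 * np.pi))
    return log_pz - np.linalg.slogdet(A)[1]

x = rng.normal(size=d_x)
ys = sample_y_given_x(x, n=4)
print(ys.shape)                         # (4, 5)
print(log_density_y_given_x(ys[0], x))
```

In an actual NF the affine map would be replaced by a stack of invertible coupling layers with a tractable Jacobian; the point here is only the latent split, where conditioning on $x$ enters exclusively through the low-dimensional $z_P$ block.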
Problem

Research questions and friction points this paper is trying to address.

Estimating conditional densities for high-dimensional response variables
Improving the interpretability of dimension reduction in normalizing flows
Separating predictor-related variations from random noise in data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Normalizing Flow for high-dimensional response
Latent transform with supervised dimension reduction
Augmented Posterior CDE improves interpretability
Cheng Zeng
Department of Statistics, University of Florida, U.S.A.
George Michailidis
Professor of Statistics and Computer Science
Hitoshi Iyatomi
Professor, Hosei University, Japan
deep learning, computer vision, machine learning, medical engineering
Leo L Duan
Department of Statistics, University of Florida, U.S.A.