AI Summary
To address weak generalization and poor interpretability in image disentanglement, this paper proposes a modular Bayesian deep neural network that integrates hierarchical Bayesian modeling with deep learning. Methodologically: (i) it establishes, for the first time, a theoretical link between the loss function and the generalization error bound, thereby guiding the design of interpretable losses; (ii) it introduces optimization-driven variational inference and test-time adaptation to enhance robustness under out-of-distribution scenarios; and (iii) it supports two downstream tasks: noise removal and unsupervised anomaly detection. Experiments demonstrate significant improvements in both component interpretability and cross-distribution generalization on image denoising and anomaly detection benchmarks. The proposed framework provides a novel paradigm for interpretable deep learning that combines rigorous theoretical guarantees with practical engineering applicability.
Abstract
Image decomposition aims to analyze an image into elementary components, which is essential for numerous downstream tasks and also by nature provides certain interpretability to the analysis. Deep learning can be powerful for such tasks, but surprisingly its combination with a focus on interpretability and generalizability is rarely explored. In this work, we introduce a novel framework for interpretable deep image decomposition, combining hierarchical Bayesian modeling and deep learning to create an architecture-modularized and model-generalizable deep neural network (DNN). The proposed framework includes three steps: (1) hierarchical Bayesian modeling of image decomposition, (2) transforming the inference problem into optimization tasks, and (3) deep inference via a modularized Bayesian DNN. We further establish a theoretical connection between the loss function and the generalization error bound, which inspires a new test-time adaptation approach for out-of-distribution scenarios. We instantiate the framework on two downstream tasks, i.e., image denoising and unsupervised anomaly detection, and the results demonstrate improved generalizability as well as interpretability of our methods. The source code will be released upon the acceptance of this paper.
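To make the core idea concrete, here is a minimal toy sketch (not the paper's actual model or code) of the "inference as optimization" step: a 1-D signal is decomposed into a smooth component and a residual noise component by MAP inference under a Gaussian likelihood and a smoothness prior, solved per sample by gradient descent. Because the loss is unsupervised, the same optimization can be rerun on each test sample, mirroring the spirit of test-time adaptation. All names and hyperparameters (`decompose`, `lam`, `lr`, `steps`) are illustrative assumptions.

```python
import numpy as np

def decompose(x, lam=2.0, lr=0.05, steps=1000):
    """Toy MAP decomposition (illustrative only, not the paper's method).

    Minimizes  ||x - s||^2 + lam * ||diff(s)||^2  over the smooth
    component s by gradient descent, i.e. Bayesian inference recast as
    an optimization problem that can be solved per (test) sample.
    Returns the interpretable components (smooth signal, residual noise).
    """
    s = x.copy()
    for _ in range(steps):
        # Gradient of the data-fit term ||x - s||^2.
        grad = 2.0 * (s - x)
        # Gradient of the smoothness prior lam * ||diff(s)||^2.
        d = np.diff(s)
        grad[:-1] -= 2.0 * lam * d
        grad[1:] += 2.0 * lam * d
        s -= lr * grad
    return s, x - s  # components sum back to the input by construction

# "Test-time" usage: adapt to a single noisy observation.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
clean = np.sin(2.0 * np.pi * t)
noisy = clean + 0.3 * rng.standard_normal(t.size)
signal, noise = decompose(noisy)
```

The design choice to illustrate is that the decomposition objective doubles as an adaptation objective: no labels are needed at test time, only the modeled prior structure.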