Inverting Neural Networks: New Methods to Generate Neural Network Inputs from Prescribed Outputs

2026-03-20
AI Summary
This work investigates how to reconstruct input images from prescribed neural network outputs in order to uncover the features underlying model decisions. Two inversion methods are proposed: a forward method that applies a root-finding algorithm using the Jacobian with respect to the input image, and a backward method that inverts the network layer by layer while adding random vectors sampled from the null space of each linear layer. The methods are demonstrated on transformer architectures and on sequential networks built from linear layers. The generated images look random yet yield near-perfect classification scores, which exposes vulnerabilities in the underlying networks. According to the authors, the methods cover the space of inputs that solve the inverse problem more comprehensively than previous methods.
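The forward method described above can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny tanh network, its random weights, the damped Gauss-Newton update with a pseudo-inverse, and the names `forward`, `input_jacobian`, and `invert_forward` are all assumptions chosen for illustration. The target output is generated from a reference input so that a solution is known to exist.

```python
import numpy as np

# Hypothetical toy network: 8 inputs -> 6 tanh units -> 3 outputs.
rng = np.random.default_rng(0)
d_in, d_hid, d_out = 8, 6, 3
W1 = 0.5 * rng.standard_normal((d_hid, d_in)) / np.sqrt(d_in)
b1 = 0.1 * rng.standard_normal(d_hid)
W2 = rng.standard_normal((d_out, d_hid)) / np.sqrt(d_hid)
b2 = 0.1 * rng.standard_normal(d_out)

def forward(x):
    """Network output for input vector x."""
    return W2 @ np.tanh(W1 @ x + b1) + b2

def input_jacobian(x):
    """Jacobian of the output with respect to the input: W2 diag(1 - h^2) W1."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ ((1.0 - h**2)[:, None] * W1)

def invert_forward(y_target, x0, max_iter=100, tol=1e-10):
    """Solve forward(x) = y_target by damped Gauss-Newton root finding."""
    x = x0.copy()
    for _ in range(max_iter):
        r = forward(x) - y_target
        r_norm = np.linalg.norm(r)
        if r_norm < tol:
            break
        # Minimum-norm step for the underdetermined linearized system J dx = r.
        step = np.linalg.pinv(input_jacobian(x)) @ r
        # Backtracking: halve the step until the residual decreases.
        t = 1.0
        while t > 1e-8 and np.linalg.norm(forward(x - t * step) - y_target) >= r_norm:
            t *= 0.5
        x = x - t * step
    return x

x_ref = rng.standard_normal(d_in)   # reference input, used only to build a reachable target
y_target = forward(x_ref)
x0 = rng.standard_normal(d_in)      # random starting point
x_found = invert_forward(y_target, x0)
print("residual norm:", np.linalg.norm(forward(x_found) - y_target))
```

Because the input has more dimensions than the output, many inputs map to the same target, and different starting points `x0` generally lead to different solutions.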

Abstract
Neural network systems describe complex mappings that can be very difficult to understand. In this paper, we study the inverse problem of determining the input images that get mapped to specific neural network classes. Ultimately, we expect these images to contain recognizable features that are associated with their corresponding classes. We introduce two general methods for solving the inverse problem. In our forward pass method, we develop an inverse method based on a root-finding algorithm and the Jacobian with respect to the input image. In our backward pass method, we iteratively invert each layer, starting at the top. During the inversion process, we add random vectors sampled from the null space of each linear layer. We demonstrate our new methods on both transformer architectures and sequential networks based on linear layers. Unlike previous methods, our new methods are able to produce random-like input images that yield near-perfect classification scores in all cases, revealing vulnerabilities in the underlying networks. Hence, we conclude that the proposed methods provide more comprehensive coverage of the input image spaces that solve the inverse mapping problem.
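The backward pass idea can be sketched on a toy example. The following is an illustrative assumption, not the paper's code: a two-layer network with a leaky ReLU (chosen because it is invertible), random full-row-rank weights, and hypothetical helper names `null_space_basis`, `invert_linear`, and `invert_backward`. Each linear layer is inverted with the pseudo-inverse, and a random vector from that layer's null space is added, which changes the recovered input without changing the output.

```python
import numpy as np

rng = np.random.default_rng(1)
ALPHA = 0.1  # leaky ReLU slope (assumed; must be > 0 for invertibility)

# Hypothetical toy network: 8 inputs -> 6 leaky-ReLU units -> 3 outputs.
W1, b1 = rng.standard_normal((6, 8)), rng.standard_normal(6)
W2, b2 = rng.standard_normal((3, 6)), rng.standard_normal(3)

def leaky_relu(a):
    return np.where(a >= 0, a, ALPHA * a)

def leaky_relu_inv(h):
    return np.where(h >= 0, h, h / ALPHA)

def forward(x):
    return W2 @ leaky_relu(W1 @ x + b1) + b2

def null_space_basis(W, tol=1e-10):
    """Orthonormal basis (as columns) of the null space of W, via the SVD."""
    _, s, Vt = np.linalg.svd(W, full_matrices=True)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

def invert_linear(W, b, y, rng):
    """One solution of W x + b = y plus a random null-space component."""
    x_min = np.linalg.pinv(W) @ (y - b)   # minimum-norm solution
    N = null_space_basis(W)               # shape (n, n - rank)
    z = rng.standard_normal(N.shape[1])   # random null-space coordinates
    return x_min + N @ z

def invert_backward(y, rng):
    """Invert the network from the top layer down to the input."""
    h = invert_linear(W2, b2, y, rng)     # hidden activations
    a = leaky_relu_inv(h)                 # undo the activation
    return invert_linear(W1, b1, a, rng)  # input vector

y_target = np.array([5.0, -2.0, 0.5])
samples = [invert_backward(y_target, rng) for _ in range(3)]
for x in samples:
    print(np.linalg.norm(forward(x) - y_target))
```

Each call draws new null-space vectors, so the samples are distinct inputs that map to the same prescribed output. Layers that are not surjective, softmax outputs, and attention blocks would need additional handling that this sketch does not attempt.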
Problem

Research questions and friction points this paper is trying to address.

Neural Network Inversion
Inverse Problem
Input Reconstruction
Class-specific Inputs
Output-to-Input Mapping
Innovation

Methods, ideas, or system contributions that make the work stand out.

neural network inversion
input reconstruction
Jacobian-based optimization
null-space sampling
inverse mapping
Authors

Rebecca Pattichis
Department of Electrical Engineering, University of California, Los Angeles, Los Angeles CA, USA
Sebastian Janampa
Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA
Constantinos S. Pattichis
Prof. of Computer Science, University of Cyprus & HealthXR Group, CYENS Centre of Excellence, Cyprus
eHealth, mHealth, AI and machine learning in medicine, medical imaging, biosignal analysis
Marios S. Pattichis
Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA