🤖 AI Summary
This work addresses the severe resolution degradation in smartphone telephoto lenses when pixel sizes shrink below 0.5 µm, where optical diffraction and geometric aberrations become dominant, and conventional image signal processors (ISPs) fail to recover fine details because they lack explicit point spread function (PSF) modeling. The authors propose an end-to-end trained neural ISP that explicitly compensates for residual aberrations through both single-frame and multi-frame architectures, leveraging an optical simulation platform that jointly controls signal-to-noise ratio and diffraction spot size. Experiments demonstrate that at a 0.35 µm pixel pitch, the method achieves an MTF50 of 745 cycles/mm, a 2.5–3× resolution improvement over traditional ISPs, and reduces LPIPS to 0.151. In low-SNR multi-frame scenarios, performance approaches that of a bright single-frame baseline. This study is the first to systematically validate that neural ISPs can transform extremely small pixels into an imaging advantage, revealing that the fundamental bottleneck of conventional ISPs at small pixel scales stems from uncorrected PSF-induced blur.
📝 Abstract
Smartphone telephoto cameras are approaching a "telephoto physics wall": as pixel pitches shrink below 0.5 micron, the optics remain limited by geometric aberrations, leading to diminishing returns on resolution. Traditional Image Signal Processors (ISPs) cannot eliminate these aberrations, because they operate through local, stage-wise processing with no explicit model of the underlying point spread function (PSF). We demonstrate that a learning-based Neural ISP for image restoration, trained on the underlying degradations, inverts what stage-wise pipelines cannot, turning small-pixel designs into a net advantage.
We investigate this through a controlled simulation of a representative telephoto module, evaluating five configurations (0.35–0.75 micron pixel pitch). The aperture is scaled proportionally to keep per-pixel SNR and diffraction spot size fixed, thereby isolating geometric aberration and spatial sampling. While the traditional ISP improves only modestly with smaller pixels, the Neural ISP scales substantially: at 0.35 micron it reaches 745 cycles/mm MTF50 (vertical), a 2.5–3x resolution improvement over the traditional ISP, and LPIPS improves significantly from 0.244 to 0.151 while traditional results stay comparatively flat. In a low-SNR extension (15 dB per-frame bursts at 0.35 micron), a multi-frame Neural ISP recovers performance close to the bright-light single-frame baseline, whereas a multi-frame traditional ISP shows no meaningful improvement, indicating that traditional pipelines at small pixels are bottlenecked by uncorrected PSF blur rather than by noise. These results point to a design philosophy in which Neural ISPs enable high-resolution telephoto modules by correcting residual optical aberrations rather than requiring increasingly complex optics.
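The aperture-scaling argument above can be checked with a few lines of arithmetic. The sketch below is not the authors' simulation platform. It only illustrates, under textbook assumptions, why scaling the f-number in proportion to pixel pitch keeps both the diffraction spot size in pixels and the light collected per pixel constant. The 0.55 µm wavelength and the reference f-number of 3.0 are hypothetical values chosen for illustration, and all function names are made up for this example. Only the 0.35 and 0.75 micron pitches come from the text.

```python
# Assumed values for illustration only; neither appears in the source.
WAVELENGTH_UM = 0.55   # green light
REF_PITCH_UM = 0.75    # largest pitch in the study
REF_F_NUMBER = 3.0     # hypothetical f-number at the reference pitch


def scaled_f_number(pitch_um, ref_pitch_um=REF_PITCH_UM, ref_f_number=REF_F_NUMBER):
    """F-number scaled in proportion to pixel pitch (smaller pixels, wider aperture)."""
    return ref_f_number * pitch_um / ref_pitch_um


def airy_diameter_px(pitch_um, f_number, wavelength_um=WAVELENGTH_UM):
    """Airy disk diameter (first dark ring, 2.44 * lambda * N) in units of pixels."""
    return 2.44 * wavelength_um * f_number / pitch_um


def relative_light_per_pixel(pitch_um, f_number):
    """Light per pixel up to a constant factor: pixel area divided by N squared."""
    return (pitch_um / f_number) ** 2


def nyquist_cycles_per_mm(pitch_um):
    """Sensor Nyquist frequency, 1 / (2 * pitch), converted from um to mm."""
    return 1000.0 / (2.0 * pitch_um)


if __name__ == "__main__":
    for pitch in (0.75, 0.35):
        n = scaled_f_number(pitch)
        print(
            f"pitch={pitch} um  N={n:.2f}  "
            f"airy={airy_diameter_px(pitch, n):.2f} px  "
            f"light={relative_light_per_pixel(pitch, n):.4f}  "
            f"nyquist={nyquist_cycles_per_mm(pitch):.1f} cycles/mm"
        )
```

Under these assumptions the Airy diameter in pixels and the relative light per pixel are identical at both pitches, which is the sense in which the remaining variables are geometric aberration and spatial sampling. The same arithmetic puts the reported 745 cycles/mm MTF50 at roughly 52% of the 0.35 micron Nyquist frequency (about 1429 cycles/mm).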