AI Summary
This work addresses a limitation in how motion deblurring methods are evaluated on consumer-grade devices: aggregate metrics obscure performance variation across blur severities. To this end, the authors construct a difficulty-stratified deblurring benchmark of 7,400 image pairs synthesized from high-frame-rate videos captured on an iPhone 17 Pro. Using PSNR-guided adaptive temporal windows, the dataset is partitioned into easy, medium, and hard subsets, and each sample is annotated with optical flow magnitudes, spectral features, and ISP-related metadata. The benchmark introduces, for the first time, a difficulty-aware stratification scheme alongside deployment-critical metadata, and it reveals a 7-9 dB performance drop of state-of-the-art models from the easy to the hard subset. Targeted fine-tuning substantially narrows the domain gap between consumer and professional cameras, demonstrating the benchmark's effectiveness and practical utility.
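The summary does not spell out the windowing procedure, so the following is only a minimal sketch of one plausible reading: blur is synthesized by averaging consecutive high-frame-rate frames, and the window grows until the PSNR between the blurred result and the sharp center frame falls to a target value. The function names, the symmetric window, and the `target_psnr` and `max_half_width` parameters are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def psnr(a, b, peak=1.0):
    """PSNR in dB between two float images with values in [0, peak]."""
    mse = float(np.mean((a - b) ** 2))
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))


def synthesize_blur(frames, center, half_width):
    """Average the frames in a symmetric window around `center`.

    `frames` is an array of shape (T, H, W) or (T, H, W, C).
    The window is clipped at the start and end of the clip.
    """
    lo = max(0, center - half_width)
    hi = min(len(frames), center + half_width + 1)
    return frames[lo:hi].mean(axis=0)


def adaptive_window(frames, center, target_psnr, max_half_width):
    """Grow the window until blurred-vs-sharp PSNR is at or below the target.

    Hypothetical selection rule: a lower `target_psnr` demands stronger blur,
    so it tends to select a wider window (a harder sample).
    Returns (half_width, blurred_image).
    """
    sharp = frames[center]
    for half_width in range(1, max_half_width + 1):
        blurred = synthesize_blur(frames, center, half_width)
        if psnr(blurred, sharp) <= target_psnr:
            return half_width, blurred
    # Target not reached: fall back to the widest allowed window.
    return max_half_width, synthesize_blur(frames, center, max_half_width)
```

Under this reading, each difficulty tier would correspond to its own PSNR target, so that scenes with slow motion receive longer windows than scenes with fast motion in order to reach the same blur severity. The actual targets used for the three tiers are not given in the text above.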
Abstract
Motion blur restoration on consumer mobile devices is typically evaluated using aggregate metrics that obscure performance variation across blur difficulty, masking model behavior under real deployment conditions. This work introduces iPhoneBlur, a difficulty-stratified benchmark of 7,400 image pairs synthesized from high-framerate iPhone 17 Pro videos captured in diverse real-world scenarios. Samples are partitioned into Easy, Medium, and Hard categories through PSNR-guided adaptive temporal windowing, with the stratification validated by a monotonic 2.2x increase in optical flow magnitude across tiers. Each sample includes comprehensive metadata enabling investigation of ISP-aware and difficulty-adaptive restoration strategies. Spectral analysis confirms that the synthesized blur exhibits high-frequency suppression patterns consistent with authentic motion degradation. Evaluation of six architectures reveals a consistent 7-9 dB performance degradation from the Easy to the Hard subset, a substantial gap entirely hidden by aggregate reporting. The benchmark further exposes a domain gap between professional and consumer cameras, which targeted fine-tuning substantially recovers. By coupling difficulty stratification with deployment-critical metadata, iPhoneBlur enables systematic assessment of model reliability and failure modes for resource-constrained edge systems.
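To illustrate why aggregate reporting can hide the Easy-to-Hard gap, here is a small sketch that reports mean PSNR per tier alongside the pooled mean. The `stratified_report` helper and the record layout of `(tier, psnr_db)` pairs are assumptions for illustration and are not part of the benchmark's tooling.

```python
from collections import defaultdict
from statistics import mean


def stratified_report(records):
    """Return mean PSNR per tier plus the pooled mean under "aggregate".

    `records` is a list of (tier, psnr_db) pairs, one per test sample.
    """
    by_tier = defaultdict(list)
    for tier, value in records:
        by_tier[tier].append(value)
    report = {tier: mean(values) for tier, values in by_tier.items()}
    report["aggregate"] = mean(value for _, value in records)
    return report


if __name__ == "__main__":
    # Made-up placeholder scores, not results from the benchmark.
    toy = [("easy", 31.0), ("easy", 33.0), ("medium", 28.0), ("hard", 24.0)]
    for name, value in stratified_report(toy).items():
        print(f"{name}: {value:.2f} dB")
```

The pooled mean is a single number that depends on how many samples fall in each tier, so two models with the same aggregate score can differ widely on the Hard subset. Reporting the per-tier means next to it makes that difference visible.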