iPhoneBlur: A Difficulty-Stratified Benchmark for Consumer Device Motion Deblurring

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Motion deblurring on consumer devices is usually evaluated with aggregate metrics, which hide how performance varies with blur severity. To address this, the authors build iPhoneBlur, a difficulty-stratified benchmark of 7,400 image pairs synthesized from high-frame-rate videos captured on an iPhone 17 Pro. PSNR-guided adaptive temporal windows partition the samples into Easy, Medium, and Hard subsets; the partition is validated by optical flow magnitude, and spectral analysis is used to check that the synthesized blur resembles authentic motion blur. Each sample also carries ISP-related metadata. By pairing difficulty-aware stratification with deployment-critical metadata, the benchmark reveals a 7–9 dB drop from Easy to Hard for the evaluated models, and targeted fine-tuning substantially narrows the domain gap between consumer and professional cameras.

📝 Abstract

Motion blur restoration on consumer mobile devices is typically evaluated using aggregate metrics that obscure performance variation across blur difficulty, masking model behavior under real deployment conditions. This work introduces iPhoneBlur, a difficulty-stratified benchmark of 7,400 image pairs synthesized from high-frame-rate iPhone 17 Pro videos captured in diverse real-world scenarios. Samples are partitioned into Easy, Medium, and Hard categories through PSNR-guided adaptive temporal windowing, with the stratification validated by a monotonic 2.2x increase in optical flow magnitude across tiers. Each sample includes comprehensive metadata, enabling investigation of ISP-aware and difficulty-adaptive restoration strategies. Spectral analysis confirms that the synthesized blur exhibits high-frequency suppression patterns consistent with authentic motion degradation. Evaluation of six architectures reveals a consistent 7–9 dB performance degradation from the Easy to the Hard subset, a substantial gap entirely hidden by aggregate reporting. The benchmark further exposes a domain gap between professional and consumer cameras, which targeted fine-tuning substantially recovers. By coupling difficulty stratification with deployment-critical metadata, iPhoneBlur enables systematic assessment of model reliability and failure modes for resource-constrained edge systems.
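To make "PSNR-guided adaptive temporal windowing" concrete, the following is a minimal sketch of one plausible reading, not the authors' pipeline. It assumes blur is synthesized by averaging consecutive high-frame-rate frames around a sharp center frame, and that the window is widened until the PSNR between the sharp frame and the averaged frame lands in a tier's target range. The function names, the `psnr_range` bounds, and the early-exit rule are all hypothetical; the abstract gives no thresholds or window sizes.

```python
import numpy as np


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)


def synthesize_blur(frames, center, half_width):
    """Average the frames in [center - half_width, center + half_width].

    Assumption: temporal averaging of high-frame-rate frames stands in for
    a longer exposure. The window is clipped at the ends of the clip.
    """
    lo = max(0, center - half_width)
    hi = min(len(frames), center + half_width + 1)
    return frames[lo:hi].mean(axis=0)


def adaptive_window(frames, center, psnr_range, max_half_width):
    """Widen the window until PSNR(sharp, blurred) falls in psnr_range.

    psnr_range is a hypothetical (low, high) pair in dB for one tier.
    Returns (half_width, blurred, score), or None if no window qualifies.
    """
    low, high = psnr_range
    sharp = frames[center]
    for half_width in range(1, max_half_width + 1):
        blurred = synthesize_blur(frames, center, half_width)
        score = psnr(sharp, blurred)
        if low <= score < high:
            return half_width, blurred, score
        if score < low:
            # Heuristic early exit: wider windows typically lower PSNR
            # further, so the target range has probably been overshot.
            break
    return None
```

Under these assumptions, a clip with little motion needs a wide window to reach a given PSNR range while a fast-moving clip reaches it with a narrow one, which is what makes the window "adaptive". A clip that cannot reach a tier's range within `max_half_width` frames is simply skipped for that tier.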

Problem

Research questions and friction points this paper is trying to address.

motion deblurring

consumer devices

difficulty stratification

benchmark

performance evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

difficulty-stratified benchmark

motion deblurring

consumer mobile devices