Robust Shape from Focus via Multiscale Directional Dilated Laplacian and Recurrent Network

📅 2025-12-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing Shape-from-Focus (SFF) methods typically adopt a two-stage paradigm: first extracting a focus volume via complex encoders, then estimating depth via simple aggregation, which leads to artifacts and noise amplification. This work proposes an end-to-end lightweight framework that integrates handcrafted priors with recurrent modeling. Specifically, it introduces: (i) a novel multi-scale directional dilated Laplacian (DDL) operator to construct a robust focus-volume representation; and (ii) a GRU-driven iterative low-resolution depth refinement module coupled with a learnable convex upsampling mechanism. Evaluated on both synthetic and real-world datasets, the method achieves significant improvements over state-of-the-art approaches: higher depth accuracy, superior boundary preservation, strong generalization across diverse scenes, and efficient inference.
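The GRU-driven refinement can be pictured as a recurrent cell that repeatedly updates a hidden state from focus-volume features and emits a residual depth correction at low resolution. The sketch below is a hypothetical minimal version: it uses 1x1 (channel-mixing) convolutions, random weights, and random stand-in features, since the paper's actual kernel sizes, scales, and inputs are not given here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ConvGRU1x1:
    """Minimal GRU cell with 1x1 convolutions (channel mixing only).
    Hypothetical sketch of the recurrent update; the paper's module
    presumably uses spatial kernels at multiple scales."""
    def __init__(self, hdim, xdim, rng):
        k = hdim + xdim
        self.Wz = rng.normal(0, 0.1, (k, hdim))  # update-gate weights
        self.Wr = rng.normal(0, 0.1, (k, hdim))  # reset-gate weights
        self.Wq = rng.normal(0, 0.1, (k, hdim))  # candidate-state weights

    def step(self, h, x):
        hx = np.concatenate([h, x], axis=-1)          # (H, W, hdim+xdim)
        z = sigmoid(hx @ self.Wz)                     # update gate
        r = sigmoid(hx @ self.Wr)                     # reset gate
        q = np.tanh(np.concatenate([r * h, x], -1) @ self.Wq)
        return (1 - z) * h + z * q                    # gated state update

rng = np.random.default_rng(0)
cell = ConvGRU1x1(hdim=8, xdim=4, rng=rng)
head = rng.normal(0, 0.1, (8, 1))      # linear head: hidden state -> delta depth
h = np.zeros((16, 16, 8))
depth = np.zeros((16, 16))             # low-resolution depth estimate
for _ in range(4):                     # iterative refinement steps
    x = rng.normal(0, 1, (16, 16, 4))  # stand-in for focus-volume features
    h = cell.step(h, x)
    depth = depth + (h @ head)[..., 0]
```

Each iteration adds a small learned correction to the running depth estimate, so errors from the initial aggregation can be revised rather than committed in one step.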

📝 Abstract
Shape-from-Focus (SFF) is a passive depth estimation technique that infers scene depth by analyzing focus variations in a focal stack. Most recent deep learning-based SFF methods operate in two stages: first, they extract focus volumes (per-pixel representations of focus likelihood across the focal stack) using heavy feature encoders; then, they estimate depth via a simple one-step aggregation that often introduces artifacts and amplifies noise in the depth map. To address these issues, we propose a hybrid framework. Our method computes multi-scale focus volumes with handcrafted Directional Dilated Laplacian (DDL) kernels, which capture long-range and directional focus variations to form robust focus volumes. These focus volumes are then fed into a lightweight, multi-scale GRU-based depth extraction module that iteratively refines an initial depth estimate at a lower resolution for computational efficiency. Finally, a learned convex upsampling module within our recurrent network reconstructs high-resolution depth maps while preserving fine scene details and sharp boundaries. Extensive experiments on both synthetic and real-world datasets demonstrate that our approach outperforms state-of-the-art deep learning and traditional methods, achieving superior accuracy and generalization across diverse focal conditions.
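To make the handcrafted front end concrete, the sketch below computes a directional dilated Laplacian focus measure (absolute second derivatives along four directions, with taps placed `dilation` pixels apart) and then takes a per-pixel argmax over the stack. The kernel layout, the dilation set, and the argmax aggregation are illustrative assumptions, not the paper's exact operator; in the proposed framework the focus volume feeds the recurrent module rather than a plain argmax.

```python
import numpy as np

def directional_dilated_laplacian(img, dilation=1):
    """Sum of |second derivatives| along 4 directions with taps at +/-dilation.
    Hypothetical sketch of a DDL focus measure; the paper's kernels may differ."""
    h, w = img.shape
    r = dilation
    p = np.pad(img.astype(np.float64), r, mode="edge")
    c = p[r:r + h, r:r + w]                            # center taps
    resp = np.zeros((h, w))
    for dy, dx in [(0, 1), (1, 0), (1, 1), (1, -1)]:   # H, V, two diagonals
        fwd = p[r + dy * r: r + dy * r + h, r + dx * r: r + dx * r + w]
        bwd = p[r - dy * r: r - dy * r + h, r - dx * r: r - dx * r + w]
        resp += np.abs(fwd + bwd - 2.0 * c)            # directional Laplacian
    return resp

def sff_depth(stack, dilations=(1, 2, 4)):
    """Multi-scale focus volume over a focal stack; depth = per-pixel argmax."""
    volume = np.stack([
        sum(directional_dilated_laplacian(sl, d) for d in dilations)
        for sl in stack
    ])
    return volume.argmax(axis=0)
```

Larger dilations respond to focus variations over longer ranges, which is what lets the multi-scale measure stay robust on weakly textured regions where a single 3x3 Laplacian is noisy.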
Problem

Research questions and friction points this paper is trying to address.

Improves depth estimation accuracy in Shape-from-Focus methods
Reduces artifacts and noise in depth map reconstruction
Enhances generalization across diverse focal conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-scale directional dilated Laplacian kernels for focus volumes
Lightweight multi-scale GRU network for iterative depth refinement
Learned convex upsampling to reconstruct high-resolution depth maps
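The convex upsampling idea (popularized by RAFT) can be illustrated as follows: for each fine pixel, the network predicts non-negative weights over a 3x3 coarse-grid neighborhood, normalized to sum to one, and the fine depth is the corresponding convex combination of coarse depths. The sketch below applies only the combination step with weights supplied as an argument; in the paper's network those weights would come from a learned head followed by a softmax, which is not reproduced here.

```python
import numpy as np

def convex_upsample(depth, weights, factor=8):
    """Upsample coarse depth (h, w) to (h*factor, w*factor).
    weights: (h, w, factor, factor, 9), non-negative, summing to 1 over the
    last axis. Each fine pixel is a convex combination of its 3x3 coarse
    neighbourhood, so the output stays within the local coarse depth range."""
    h, w = depth.shape
    p = np.pad(depth, 1, mode="edge")
    # gather the 9 coarse neighbours for every coarse cell: (h, w, 9)
    neigh = np.stack(
        [p[1 + dy: 1 + dy + h, 1 + dx: 1 + dx + w]
         for dy in (-1, 0, 1) for dx in (-1, 0, 1)],
        axis=-1)
    fine = np.einsum("hwn,hwijn->hwij", neigh, weights)   # (h, w, f, f)
    # interleave the per-cell f x f patches into a full-resolution map
    return fine.transpose(0, 2, 1, 3).reshape(h * factor, w * factor)
```

Because every output value is a convex combination of nearby coarse depths, this upsampling cannot overshoot at depth discontinuities the way bilinear interpolation can, which is one reason it preserves sharp boundaries.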
Khurram Ashfaq
Future Convergence Engineering, School of Computer Science and Engineering, Korea University of Technology and Education, Cheonan, 31253, Republic of Korea.
Muhammad Tariq Mahmood
Associate Professor at Koreatech, Korea
Image Processing · Computer Vision