🤖 AI Summary
This work addresses the suboptimal performance of few-shot novel view synthesis methods that treat all input views with equal weight. To this end, we propose a camera-aware adaptive view weighting mechanism that dynamically evaluates the relevance of each source view to the target view based on geometric relationships—such as Euclidean distance and viewing-angle disparity—and cross-attention responses, and assigns weights accordingly. We introduce two complementary weighting schemes: a deterministic geometric weighting and a learnable attention-based weighting, both of which can be seamlessly integrated into existing novel view synthesis frameworks. Extensive experiments demonstrate that our approach significantly improves the accuracy and photorealism of synthesized images in few-shot settings, outperforming current state-of-the-art methods.
📝 Abstract
Novel view synthesis (NVS) has advanced with generative modeling, enabling photorealistic image generation. In few-shot NVS, where only a few input views are available, existing methods often assume all input views are equally important to the target, leading to suboptimal results. We address this limitation by introducing a camera-weighting mechanism that adjusts the importance of each source view based on its relevance to the target. We propose two approaches: a deterministic weighting scheme that leverages geometric properties such as Euclidean distance and angular differences, and a cross-attention-based scheme that learns view weights. Additionally, models can be further trained with our camera-weighting scheme to refine their understanding of view relevance and enhance synthesis quality. This mechanism is adaptable and can be integrated into various NVS algorithms, improving their ability to synthesize high-quality novel views. Our results demonstrate that adaptive view weighting enhances accuracy and realism, offering a promising direction for improving NVS.
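To make the deterministic geometric weighting concrete, here is a minimal sketch of one plausible instantiation: source views that are closer to the target camera and look in a more similar direction receive larger softmax-normalized weights. The function name, the linear combination of distance and angle, and the balance hyperparameters `alpha` and `beta` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def geometric_view_weights(src_positions, src_dirs, tgt_position, tgt_dir,
                           alpha=1.0, beta=1.0):
    """Illustrative deterministic view weighting (assumed form, not the
    paper's exact formula): smaller Euclidean distance and smaller
    viewing-angle difference to the target yield larger weights.

    alpha, beta are hypothetical hyperparameters balancing the two cues.
    Viewing directions are assumed to be unit vectors.
    """
    src_positions = np.asarray(src_positions, dtype=float)
    src_dirs = np.asarray(src_dirs, dtype=float)
    tgt_position = np.asarray(tgt_position, dtype=float)
    tgt_dir = np.asarray(tgt_dir, dtype=float)

    # Euclidean distance from each source camera to the target camera
    dists = np.linalg.norm(src_positions - tgt_position, axis=1)
    # Angular difference between each source and the target viewing direction
    cosines = np.clip(src_dirs @ tgt_dir, -1.0, 1.0)
    angles = np.arccos(cosines)

    # Lower cost -> higher score; softmax normalizes scores into weights
    scores = -(alpha * dists + beta * angles)
    exp_scores = np.exp(scores - scores.max())  # stable softmax
    return exp_scores / exp_scores.sum()
```

The learnable counterpart would replace the fixed `alpha`/`beta` cost with cross-attention scores between target and source view features, normalized the same way.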