AdjustAR: AI-Driven In-Situ Adjustment of Site-Specific Augmented Reality Content

📅 2025-08-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
In outdoor augmented reality (AR), static 3D content frequently misaligns with dynamic physical environments, degrading spatial registration and contextual understanding. To address this, we propose an in-situ correction system powered by multimodal large language models (MLLMs). Our method jointly analyzes the original authored view and real-time camera footage to perform visual-semantic reasoning, automatically detecting misalignments and generating geometrically and semantically consistent 3D scene updates. This work introduces the first application of MLLMs to runtime visual-semantic alignment in outdoor AR, enabling fully autonomous adaptation to environmental dynamics without manual intervention. Experimental evaluation demonstrates significant improvements in long-term AR content stability, spatial consistency across real-world scenes, and overall user experience.

📝 Abstract
Site-specific outdoor AR experiences are typically authored using static 3D models, but are deployed in physical environments that change over time. As a result, virtual content may become misaligned with its intended real-world referents, degrading user experience and compromising contextual interpretation. We present AdjustAR, a system that supports in-situ correction of AR content in dynamic environments using multimodal large language models (MLLMs). Given a composite image comprising the originally authored view and the current live user view from the same perspective, an MLLM detects contextual misalignments and proposes revised 2D placements for affected AR elements. These corrections are backprojected into 3D space to update the scene at runtime. By leveraging MLLMs for visual-semantic reasoning, this approach enables automated runtime corrections to maintain alignment with the authored intent as real-world target environments evolve.
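The abstract describes feeding the MLLM a composite image that pairs the originally authored view with the current live view from the same perspective. The paper does not specify the layout; as a minimal sketch, assuming a simple side-by-side composition with black padding to equalize heights:

```python
import numpy as np

def make_composite(authored: np.ndarray, live: np.ndarray) -> np.ndarray:
    """Place the authored view and the live view side by side so a single
    image can be passed to an MLLM for comparison. Inputs are H x W x 3
    arrays; the shorter one is padded with black rows to match heights.
    The side-by-side layout is an assumption, not the paper's spec."""
    h = max(authored.shape[0], live.shape[0])

    def pad(img: np.ndarray) -> np.ndarray:
        extra = h - img.shape[0]
        if extra:
            img = np.vstack([img, np.zeros((extra, img.shape[1], 3), dtype=img.dtype)])
        return img

    return np.hstack([pad(authored), pad(live)])
```

In practice the two views would also need to be rendered from a matched camera pose before compositing, which the system obtains by capturing the live view "from the same perspective" as the authored one.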
Problem

Research questions and friction points this paper is trying to address.

Detects misalignments between AR content and changing real-world environments
Proposes revised 2D placements for AR elements using MLLMs
Automatically updates 3D AR scenes to maintain authored intent
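The second step, having the MLLM propose revised 2D placements, implies some structured output that the runtime can act on. The paper does not publish its response schema; the JSON layout and field names below are hypothetical, but illustrate how such corrections could be parsed before backprojection:

```python
import json

# Hypothetical response schema (an assumption, not the paper's format):
# the MLLM lists each misaligned AR element by id together with a revised
# 2D anchor in normalized live-view image coordinates.
sample_response = """
{
  "misaligned": [
    {"id": "signpost_01", "revised_uv": [0.62, 0.41]},
    {"id": "plaque_03", "revised_uv": [0.18, 0.77]}
  ]
}
"""

def parse_corrections(text: str) -> dict:
    """Map element id -> revised normalized (u, v) placement."""
    data = json.loads(text)
    return {e["id"]: tuple(e["revised_uv"]) for e in data["misaligned"]}
```

Constraining the model to a schema like this keeps the runtime update step deterministic even though the reasoning step is open-ended.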
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses MLLMs for AR content alignment
Detects misalignments via composite images
Backprojects 2D corrections to 3D space
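The last bullet, backprojecting 2D corrections into 3D, is a standard pinhole-camera operation once a depth for the corrected pixel is known. A minimal sketch, assuming camera-space coordinates and a depth supplied by the AR platform's scene geometry (the paper does not state its depth source):

```python
import numpy as np

def backproject(uv: tuple, depth: float, K: np.ndarray) -> np.ndarray:
    """Lift a 2D pixel (u, v) with known depth (metres along the camera
    z-axis) into a 3D point in camera coordinates via the pinhole model:
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        z = depth
    K is the 3x3 intrinsic matrix. Converting camera coordinates to the
    AR world frame would additionally need the camera pose."""
    u, v = uv
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])
```

With this, each revised 2D placement proposed by the MLLM yields an updated 3D anchor for the affected AR element at runtime.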