Problem
Research questions and friction points this paper is trying to address.
Enhancing Vision-and-Language Navigation (VLN) agents with visual imaginations
Using text-to-image models for navigational cues
Improving navigation success rates with visual aids
Innovation
Methods, ideas, or system contributions that make the work stand out.
Text-to-image diffusion model for visual imaginations
Visual imaginations as an added input modality that provides landmark cues during navigation
Auxiliary loss to relate imaginations with instructions
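
The summary does not say what form the auxiliary loss takes, so the following is only an illustrative sketch of one plausible choice: a cosine-distance term that pulls each imagination embedding toward the embedding of the instruction phrase it was generated from, added to the navigation loss with a weight. The function names (`alignment_loss`, `total_loss`), the `aux_weight` parameter, and the use of cosine distance are all assumptions, not details taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def alignment_loss(imagination_embs, phrase_embs):
    """Hypothetical auxiliary loss: mean cosine distance between each
    imagination embedding and the embedding of its source phrase."""
    if len(imagination_embs) != len(phrase_embs):
        raise ValueError("need one phrase embedding per imagination")
    if not imagination_embs:
        return 0.0
    distances = [
        1.0 - cosine_similarity(img, txt)
        for img, txt in zip(imagination_embs, phrase_embs)
    ]
    return sum(distances) / len(distances)


def total_loss(navigation_loss, imagination_embs, phrase_embs, aux_weight=0.5):
    """Navigation loss plus the weighted auxiliary term (aux_weight is assumed)."""
    return navigation_loss + aux_weight * alignment_loss(imagination_embs, phrase_embs)
```

In this sketch, perfectly aligned embeddings contribute nothing to the total, while orthogonal ones add the full `aux_weight`. A real agent would compute the same quantity on batched tensors in its training framework.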