🤖 AI Summary
This study addresses the challenge of seamlessly integrating physical objects into spatial computing environments. We propose Augmented Object Intelligence (AOI), a novel paradigm that endows arbitrary physical entities with native perception, understanding, and responsive capabilities. Methodologically, we develop XR-Objects, an open-source system that combines real-time object segmentation and classification, multimodal large language models (MLLMs), and spatial computing techniques to turn physical items into interactive digital interfaces end to end. Unlike conventional AI assistants that rely on explicit user commands, AOI enables implicit, context-aware interaction. A user study demonstrates significant improvements in interaction naturalness and task completion efficiency. The framework's practicality and generalizability are validated across diverse application domains, including education, smart homes, and industrial settings.
📝 Abstract
Seamless integration of physical objects as interactive digital entities remains a challenge for spatial computing. This paper introduces Augmented Object Intelligence (AOI), a novel XR interaction paradigm designed to blur the lines between digital and physical by equipping real-world objects with the ability to interact as if they were digital, so that every object has the potential to serve as a portal to vast digital functionalities. Our approach combines object segmentation and classification with Multimodal Large Language Models (MLLMs) to facilitate these interactions. We implement the AOI concept as XR-Objects, an open-source prototype system that provides a platform for users to engage with their physical environment in rich and contextually relevant ways. The system enables analog objects not only to convey information but also to initiate digital actions, such as querying for details or executing tasks. Our contributions are threefold: (1) we define the AOI concept and detail its advantages over traditional AI assistants, (2) we describe the open-source design and implementation of the XR-Objects system, and (3) we demonstrate its versatility through a variety of use cases and a user study.