A Zero-Shot Multi-Agent Framework for Human-Building Interaction via Programmatic Reasoning

📅 2026-06-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses key challenges in applying large language models to the building domain (scarce training data, difficulty integrating domain-specific knowledge, and the tension between natural interaction and technical accuracy) by proposing a zero-shot hierarchical multi-agent framework. The framework decouples natural language understanding from building data analysis through semantic routing and programmatic reasoning, and employs a "Doorman" mechanism to decompose user queries into subtasks. Specialized coding agents then generate executable Python scripts for precise arithmetic, in place of standard retrieval-augmented generation (RAG) approaches. Validated on a dataset from more than 200 commercial buildings, the method delivers accurate, context-aware responses for diverse users, from tenants to building managers, without requiring any model fine-tuning.
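To make the idea of semantic routing concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy bag-of-words similarity as a stand-in for a real sentence-embedding model, and the route names, example utterances, and threshold are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical routes and example utterances; none of these come from the paper.
ROUTES = {
    "energy_analytics": [
        "how much electricity did the building use last month",
        "compare energy consumption between floors",
    ],
    "comfort": [
        "my office is too cold",
        "the temperature in the meeting room is too warm",
    ],
    "general_chat": ["hello", "thanks for your help"],
}


def embed(text):
    """Toy bag-of-words vector; a real router would use a sentence-embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(query, threshold=0.3):
    """Return the best-matching route name, or 'fallback' if nothing is similar enough."""
    q = embed(query)
    scores = {
        name: max(cosine(q, embed(utterance)) for utterance in utterances)
        for name, utterances in ROUTES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "fallback"
```

In this sketch, a query such as "how much electricity did floor 3 use" would be sent to the hypothetical `energy_analytics` route, while an unrelated query falls through to `fallback`. Each route would then be handled by a different agent.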
📝 Abstract
Large Language Models (LLMs) offer opportunities to enhance Human-Building Interaction (HBI) by enabling more direct interaction with complex building systems through intuitive interfaces. These systems are characterized by vast amounts of data across multiple formats, a lack of non-confidential and generalizable information, and the need for domain expertise to interpret them. Applying LLMs to domain-specific tasks like HBI presents additional challenges. Limited training data makes traditional fine-tuning approaches impractical. Meanwhile, the opacity of LLM training data requires careful integration of domain knowledge to ensure reliability. Additionally, different LLMs exhibit varying alignment characteristics, suggesting that achieving both natural interaction and technical accuracy requires a multi-agent approach. These challenges highlight the need for innovative approaches to adapt LLMs for specialized domains while maintaining accuracy and user engagement. In this paper, we develop a hierarchical multi-agent framework that uses semantic routing and programmatic reasoning to decouple natural language understanding from building analytics. Instead of standard retrieval-augmented generation (RAG) approaches, our system employs a "Doorman" mechanism for task decomposition and specialized coding agents that generate executable Python scripts for precise arithmetic. We validate this framework on a dataset from more than 200 commercial buildings. Results demonstrate its effectiveness in providing accurate and contextual responses for diverse stakeholders, from tenants to building managers, across various building system applications.
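The programmatic-reasoning step can be illustrated with a minimal sketch under stated assumptions. Here `generate_script` is a hypothetical stand-in for a coding agent (it returns a hard-coded script rather than calling an LLM), the meter readings are made up for illustration, and `run_script` is an assumed helper. The point is only that the arithmetic is done by executed Python code, not by the language model.

```python
# Made-up meter readings (kWh) for illustration only; not from the paper's dataset.
READINGS = {
    "building_a": [120.0, 135.5, 128.25],
    "building_b": [98.0, 102.5, 110.0],
}


def generate_script(subtask):
    """Hypothetical stand-in for a coding agent.

    A real agent would prompt an LLM with the subtask and a description of the
    available data; here the script is hard-coded so the example is deterministic.
    """
    return (
        "totals = {name: sum(values) for name, values in readings.items()}\n"
        "result = max(totals, key=totals.get)\n"
    )


def run_script(script, readings):
    """Execute a generated script and return the value it stores in `result`.

    Restricting builtins like this is NOT a real sandbox; a production system
    would need proper isolation before running model-generated code.
    """
    namespace = {
        "__builtins__": {"sum": sum, "max": max},
        "readings": readings,
    }
    exec(script, namespace)
    return namespace["result"]


script = generate_script("Which building used the most energy in total?")
answer = run_script(script, READINGS)
```

With the toy data above, the totals are computed exactly by the interpreter, and a conversational agent could then phrase `answer` for the user.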
Problem

Research questions and friction points this paper is trying to address.

Human-Building Interaction
Large Language Model
Zero-Shot Learning
Multi-Agent System
Domain Adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-Shot
Multi-Agent Framework
Programmatic Reasoning
Human-Building Interaction
Semantic Routing