Adaptive and Resource-efficient Agentic AI Systems for Mobile and Embedded Devices: A Survey

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This survey examines the fundamental tension between resource constraints (memory, energy, bandwidth, latency) and application requirements (long-term adaptability, real-time interaction) that arises when foundation model (FM)-based agents are deployed on mobile and embedded devices. It offers what the authors describe as the first systematic characterization of adaptive, resource-efficient agentic AI systems, grouping the enabling techniques into elastic inference, test-time adaptation, dynamic multimodal integration, and agentic AI applications. By mapping FM structures, cognition, and hardware resources onto one another, it builds a unified perspective on scalable, adaptive, and resource-efficient agentic AI. The survey also identifies open challenges in balancing accuracy-latency-communication trade-offs and sustaining robustness under distribution shifts, and it highlights future opportunities in algorithm-system co-design, cognitive adaptation, and collaborative edge deployment.

📝 Abstract
Foundation models have reshaped AI by unifying fragmented architectures into scalable backbones with multimodal reasoning and contextual adaptation. In parallel, the long-standing notion of AI agents, defined by the sensing-decision-action loop, is entering a new paradigm: with FMs as their cognitive core, agents transcend rule-based behaviors to achieve autonomy, generalization, and self-reflection. This dual shift is reinforced by real-world demands such as autonomous driving, robotics, virtual assistants, and GUI agents, as well as ecosystem advances in embedded hardware, edge computing, mobile deployment platforms, and communication protocols that together enable large-scale deployment. Yet this convergence collides with reality: while applications demand long-term adaptability and real-time interaction, mobile and edge deployments remain constrained by memory, energy, bandwidth, and latency. This creates a fundamental tension between the growing complexity of FMs and the limited resources of deployment environments. This survey provides the first systematic characterization of adaptive, resource-efficient agentic AI systems. We summarize enabling techniques into elastic inference, test-time adaptation, dynamic multimodal integration, and agentic AI applications, and identify open challenges in balancing accuracy-latency-communication trade-offs and sustaining robustness under distribution shifts. We further highlight future opportunities in algorithm-system co-design, cognitive adaptation, and collaborative edge deployment. By mapping FM structures, cognition, and hardware resources, this work establishes a unified perspective toward scalable, adaptive, and resource-efficient agentic AI. We believe this survey can help readers to understand the connections between enabling technologies while promoting further discussions on the fusion of agentic intelligence and intelligent agents.
Problem

Research questions and friction points this paper is trying to address.

Adapting large AI models to resource-limited mobile devices
Balancing computational complexity with energy and latency constraints
Enabling autonomous AI agents on embedded systems efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Elastic inference adapts to resource constraints
Test-time adaptation enables real-time learning
Dynamic multimodal integration enhances contextual reasoning
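
As a rough illustration of the elastic-inference idea listed above, the sketch below picks the most accurate model variant that still fits the current latency and memory budget. It is a minimal example under stated assumptions, not a method taken from the survey: the `Variant` type, the `select_variant` helper, and all profile numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Variant:
    """One deployable configuration of a model (hypothetical profile data)."""
    name: str
    accuracy: float    # offline validation accuracy in [0, 1]
    latency_ms: float  # profiled latency on the target device
    memory_mb: float   # peak memory footprint


def select_variant(
    variants: Sequence[Variant],
    latency_budget_ms: float,
    memory_budget_mb: float,
) -> Optional[Variant]:
    """Return the most accurate variant that fits both budgets, or None."""
    feasible = [
        v for v in variants
        if v.latency_ms <= latency_budget_ms and v.memory_mb <= memory_budget_mb
    ]
    if not feasible:
        return None  # nothing fits; the caller might offload or skip the request
    return max(feasible, key=lambda v: v.accuracy)


# Made-up profiles for three variants of the same backbone.
VARIANTS = [
    Variant("full", accuracy=0.90, latency_ms=120.0, memory_mb=900.0),
    Variant("pruned", accuracy=0.86, latency_ms=60.0, memory_mb=450.0),
    Variant("tiny", accuracy=0.78, latency_ms=20.0, memory_mb=150.0),
]

if __name__ == "__main__":
    choice = select_variant(VARIANTS, latency_budget_ms=100.0, memory_budget_mb=500.0)
    print(choice.name if choice else "no feasible variant")
```

Re-running the selection whenever the device reports a new budget (for example after a thermal throttle or a drop in free memory) is what makes the behavior elastic in this toy setting; real systems would typically adjust depth, width, or precision inside a single model rather than switch between fixed variants.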
Authors

Sicong Liu
Weiye Wu
Xiangrui Xu
Teng Li
Bowen Pang (Noah's Ark Lab, Huawei)
Bin Guo
Zhiwen Yu