🤖 AI Summary
Existing research on large language models (LLMs) as autonomous agents and tool users remains fragmented and limited in architecture design, multi-agent coordination, tool integration, cognitive mechanism modeling, and evaluation frameworks. Method: This survey systematically analyzes 2023–2025 publications from top-tier conferences and journals through structured literature analysis, and examines how prompt engineering and fine-tuning techniques shape LLM implementations of core cognitive capabilities: reasoning, planning, and memory. Contribution/Results: We identify three breakthrough directions (verifiable reasoning, self-improvement, and personalized customization) and distill ten concrete future research pathways. Further, we propose a unified evaluation framework covering 68 publicly available datasets, exposing critical gaps in current benchmarks regarding task generalization, dynamic adaptability, and causal attribution.
📝 Abstract
The pursuit of human-level artificial intelligence (AI) has significantly advanced the development of autonomous agents and Large Language Models (LLMs). LLMs are now widely used as decision-making agents for their ability to interpret instructions, manage sequential tasks, and adapt through feedback. This review examines recent developments in employing LLMs as autonomous agents and tool users and is organized around seven research questions. We consider only papers published between 2023 and 2025 in A*- and A-ranked conferences and Q1 journals. We present a structured analysis of LLM agents' architectural design principles, dividing their applications into single-agent and multi-agent systems, along with strategies for integrating external tools. In addition, we investigate the cognitive mechanisms of LLMs, including reasoning, planning, and memory, and the impact of prompting methods and fine-tuning procedures on agent performance. Furthermore, we evaluate current benchmarks and assessment protocols and provide an analysis of 68 publicly available datasets used to assess the performance of LLM-based agents across tasks. Through this review, we identify critical findings on the verifiable reasoning of LLMs, their capacity for self-improvement, and the personalization of LLM-based agents. Finally, we discuss ten future research directions to address these gaps.