🤖 AI Summary
The weak reasoning capabilities of large language models (LLMs) and vision-language models (VLMs) hinder their trustworthy deployment in high-stakes domains such as healthcare, finance, and law.
Method: We propose the first unified taxonomy for reasoning paradigms—systematically comparing chain-of-thought, tree-based, and reflective strategies—and analyze their fundamental trade-offs among generalizability, interpretability, and computational efficiency. Through a comprehensive survey of techniques—including chain-of-thought prompting, self-consistency, verifier-guided reasoning, process supervision, and VLM-augmented joint reasoning—we identify three core bottlenecks: long-horizon logical consistency, dynamic knowledge integration, and causal inference.
Contribution/Results: We introduce novel evaluation dimensions tailored to trustworthy AI and establish a principled foundation for enhancing deep reasoning and real-world robustness in multimodal models, offering both theoretical insights and actionable methodological pathways.
📝 Abstract
Large Language Models (LLMs) are highly proficient in language-based tasks, and this proficiency has positioned them at the forefront of the race toward Artificial General Intelligence (AGI). On closer inspection, however, Valmeekam et al. (2024), Zecevic et al. (2023), and Wu et al. (2024) highlight a significant gap between their language proficiency and their reasoning abilities. Reasoning in LLMs and Vision Language Models (VLMs) aims to bridge this gap by enabling these models to deliberate over and re-evaluate their actions and responses. Reasoning is an essential capability for complex problem-solving and a necessary step toward establishing trust in Artificial Intelligence (AI), which in turn would make AI suitable for deployment in sensitive domains such as healthcare, banking, law, defense, and security. Recently, with the advent of powerful reasoning models such as OpenAI o1 and DeepSeek R1, endowing models with reasoning has become a critical research topic for LLMs. In this paper, we provide a detailed overview and comparison of existing reasoning techniques, present a systematic survey of reasoning-imbued language models, examine current challenges, and report our findings.