MARBLE: A Multi-Agent Rule-Based LLM Reasoning Engine for Accident Severity Prediction

📅 2025-07-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traffic accident severity prediction faces three major challenges: missing data, high-dimensional feature interdependence, and extreme class imbalance—particularly the scarcity of high-severity instances. Existing single-model or black-box prompting approaches suffer from poor generalizability and limited interpretability. This paper proposes a modular multi-agent collaborative reasoning framework that integrates a rule-based engine with a large language model (LLM) consensus mechanism. Leveraging plug-and-play traditional model components and fine-grained modular prompt engineering, the framework ensures semantically traceable decision-making. Evaluated on real-world datasets from the UK and US, it achieves near 90% accuracy—substantially outperforming conventional models and state-of-the-art prompting methods such as Chain-of-Thought (CoT) and Tree-of-Thought (ToT). To our knowledge, this is the first approach for accident severity prediction that simultaneously delivers high accuracy, strong robustness, and rigorous interpretability, thereby redefining the performance frontier for this task.

📝 Abstract
Accident severity prediction plays a critical role in transportation safety systems but remains a persistently difficult task due to incomplete data, strong feature dependencies, and severe class imbalance, in which rare but high-severity cases are underrepresented and hard to detect. Existing methods often rely on monolithic models or black-box prompting, which struggle to scale in noisy, real-world settings and offer limited interpretability. To address these challenges, we propose MARBLE, a multi-agent rule-based LLM engine that decomposes the severity prediction task across a team of specialized reasoning agents, including an interchangeable ML-backed agent. Each agent focuses on a semantic subset of features (e.g., spatial, environmental, temporal), enabling scoped reasoning and modular prompting without the risk of prompt saturation. Predictions are coordinated through either rule-based or LLM-guided consensus mechanisms that account for class rarity and confidence dynamics. The system retains structured traces of agent-level reasoning and coordination outcomes, supporting in-depth interpretability and post-hoc performance diagnostics. Across both UK and US datasets, MARBLE consistently outperforms traditional machine learning classifiers and state-of-the-art (SOTA) prompt-based reasoning methods, including Chain-of-Thought (CoT), Least-to-Most (L2M), and Tree-of-Thought (ToT), achieving nearly 90% accuracy where others plateau below 48%. This performance redefines the practical ceiling for accident severity classification under real-world noise and extreme class imbalance. Our results position MARBLE as a generalizable and interpretable framework for reasoning under uncertainty in safety-critical applications.
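To make the decomposition concrete, the per-agent scoping and rule-based consensus described in the abstract could look roughly like the sketch below. The agent logic, feature names, severity labels, confidence values, and the rarity boost are all illustrative assumptions, not MARBLE's actual rules:

```python
from collections import Counter
from dataclasses import dataclass

# Illustrative sketch only: agents, thresholds, and the consensus rule
# below are assumptions for exposition, not the paper's implementation.

@dataclass
class AgentVote:
    agent: str
    prediction: str   # e.g., "slight" / "serious" / "fatal" (UK-style labels, assumed)
    confidence: float # in [0, 1]

def spatial_agent(record: dict) -> AgentVote:
    # Scoped reasoning over spatial features only (road type, junction, ...).
    pred = "serious" if record.get("road_type") == "single_carriageway" else "slight"
    return AgentVote("spatial", pred, 0.6)

def environmental_agent(record: dict) -> AgentVote:
    pred = "serious" if record.get("weather") in {"rain", "fog"} else "slight"
    return AgentVote("environmental", pred, 0.55)

def temporal_agent(record: dict) -> AgentVote:
    pred = "serious" if record.get("hour", 12) in range(22, 24) else "slight"
    return AgentVote("temporal", pred, 0.5)

def rule_based_consensus(votes: list[AgentVote], rare_boost: float = 1.5) -> str:
    # Confidence-weighted voting; the rare high-severity class gets a boost
    # so class imbalance does not drown out minority predictions.
    scores = Counter()
    for v in votes:
        weight = v.confidence * (rare_boost if v.prediction == "fatal" else 1.0)
        scores[v.prediction] += weight
    return scores.most_common(1)[0][0]

record = {"road_type": "single_carriageway", "weather": "rain", "hour": 23}
votes = [spatial_agent(record), environmental_agent(record), temporal_agent(record)]
print(rule_based_consensus(votes))  # → serious
```

Because each agent sees only its own feature subset, the per-agent prompt (or rule set) stays small, which is the "modular prompting without prompt saturation" property the paper claims.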
Problem

Research questions and friction points this paper is trying to address.

Predicting accident severity with incomplete, imbalanced data
Overcoming limitations of monolithic models and black-box methods
Enhancing interpretability in safety-critical reasoning systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent rule-based LLM engine
Modular prompting with specialized agents
Rule-based or LLM-guided consensus mechanisms
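The "interchangeable ML-backed agent" contribution above can be sketched as a thin wrapper that lets any fitted traditional classifier vote through the same interface as the LLM agents. This is a minimal, self-contained sketch; the class and method names are assumptions, not the paper's API:

```python
# Minimal sketch of the plug-and-play ML-backed agent idea; names are
# illustrative assumptions, not MARBLE's actual interface.

class MLBackedAgent:
    """Wraps any classifier exposing predict_proba/classes_ (scikit-learn
    style) so it can vote alongside LLM agents over its feature subset."""

    def __init__(self, model, feature_names):
        self.model = model                  # interchangeable traditional model
        self.feature_names = feature_names  # scoped semantic feature subset

    def vote(self, record: dict):
        x = [[record[f] for f in self.feature_names]]
        probs = self.model.predict_proba(x)[0]
        best = max(range(len(probs)), key=probs.__getitem__)
        return self.model.classes_[best], probs[best]  # (label, confidence)


class StubModel:
    """Stand-in for a fitted classifier, so the sketch runs without
    external dependencies."""
    classes_ = ["slight", "serious", "fatal"]

    def predict_proba(self, X):
        return [[0.7, 0.2, 0.1] for _ in X]


agent = MLBackedAgent(StubModel(), ["speed_limit"])
print(agent.vote({"speed_limit": 30}))  # → ('slight', 0.7)
```

Swapping `StubModel` for a real fitted classifier changes nothing downstream, which is what makes the traditional-model component plug-and-play.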
Kaleem Ullah Qasim
School of Computing and Artificial Intelligence, Southwest Jiaotong University
Reasoning in LLMs · Prompt Engineering · LLM Agents
Jiashu Zhang
School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan, China