🤖 AI Summary
This study investigates the capacity of large language models (LLMs) to predict violent conflict escalation and fatality counts solely from pretrained parametric knowledge, and systematically compares their performance against retrieval-augmented generation (RAG) systems integrating ACLED, GDELT, and news data—non-parametric approaches—for forecasting conflict trends (Escalate/Stable/De-escalate/Peace) and casualty figures across the Horn of Africa and the Middle East (2020–2024). It introduces, for the first time, a dual-modality evaluation framework that explicitly distinguishes and quantifies the predictive utility of parametric versus non-parametric knowledge. Results show that purely parametric prediction suffers from significant temporal lag and low accuracy; RAG improves trend classification accuracy by 27% and reduces fatality prediction error by 41%. The findings empirically validate that external, dynamically updated knowledge is indispensable for LLM-based early conflict warning, establishing both an evidence base and a methodological paradigm for AI-enhanced humanitarian decision-making.
📝 Abstract
Large Language Models (LLMs) have shown impressive performance across natural language tasks, but their ability to forecast violent conflict remains underexplored. We investigate whether LLMs possess meaningful parametric knowledge-encoded in their pretrained weights-to predict conflict escalation and fatalities without external data. This is critical for early warning systems, humanitarian planning, and policy-making. We compare this parametric knowledge with non-parametric capabilities, where LLMs access structured and unstructured context from conflict datasets (e.g., ACLED, GDELT) and recent news reports via Retrieval-Augmented Generation (RAG). Incorporating external information could enhance model performance by providing up-to-date context otherwise missing from pretrained weights. Our two-part evaluation framework spans 2020-2024 across conflict-prone regions in the Horn of Africa and the Middle East. In the parametric setting, LLMs predict conflict trends and fatalities relying only on pretrained knowledge. In the non-parametric setting, models receive summaries of recent conflict events, indicators, and geopolitical developments. We compare predicted conflict trend labels (e.g., Escalate, Stable Conflict, De-escalate, Peace) and fatalities against historical data. Our findings highlight the strengths and limitations of LLMs for conflict forecasting and the benefits of augmenting them with structured external knowledge.