🤖 AI Summary
Systematic analysis of narrative divergence and information warfare strategies across national contexts has been hindered by the lack of a comprehensive, temporally extensive news corpus on the Russia-Ukraine war that spans multiple countries and languages. To bridge this gap, the study introduces the DNIPRO corpus, comprising 246,000 English, Russian, and Chinese news articles from eleven media outlets across five countries (Russia, Ukraine, the U.S., the U.K., and China), covering February 2022 to August 2024. DNIPRO integrates sources from opposing geopolitical blocs and provides consistent metadata alongside multiple types of annotation checked through human evaluation. Use-case experiments in stance detection, sentiment analysis, topical framing, and contradiction analysis show that the corpus supports longitudinal, cross-national, and cross-lingual discourse comparison, making it a foundational resource for computational journalism and information warfare research.
📝 Abstract
We introduce DNIPRO, a novel longitudinal corpus of 246K news articles documenting the Russo-Ukrainian war from February 2022 to August 2024, spanning eleven media outlets across five nation states (Russia, Ukraine, the U.S., the U.K., and China) and three languages (English, Russian, and Mandarin Chinese). This multilingual resource features consistent and comprehensive metadata and multiple types of annotation, with rigorous human evaluations, for downstream tasks relevant to systematic transnational analyses of contentious wartime discourse. DNIPRO's distinctive value lies in its inclusion of competing geopolitical perspectives, making it uniquely suited for studying narrative divergence, media framing, and information warfare. To demonstrate its utility, we include use-case experiments that apply stance detection, sentiment analysis, topical framing, and contradiction analysis to major conflict events within the larger war. Our explorations reveal how outlets construct competing realities, with coverage exhibiting polarized interpretations that reflect geopolitical interests. Beyond supporting computational journalism research, DNIPRO provides a foundational resource for understanding how conflicting narratives emerge and evolve across global information ecosystems.