TathyaNyaya and FactLegalLlama: Advancing Factual Judgment Prediction and Explanation in the Indian Legal Context

📅 2025-04-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the lack of factual grounding, transparency, and interpretability in AI-based judicial prediction in the Indian legal context, this paper takes a fact-centered approach to legal AI modeling. It introduces TathyaNyaya, a large-scale, multi-level, structured dataset of judicial fact annotations from Indian case law, presented as the largest and most diverse benchmark for Fact-based Judgment Prediction and Explanation (FJPE) in India. Building on it, the authors propose FactLegalLlama, an instruction-tuned LLaMA-3-8B model that generates natural-language explanations and is paired with transformer models for binary judgment prediction. The approach is reported to achieve state-of-the-art performance in Indian legal NLP, both in prediction accuracy and in the relevance and coherence of its explanations. The work targets three core challenges: factual anchoring, cross-court generalization, and combining prediction with interpretability.

📝 Abstract

In the landscape of Fact-based Judgment Prediction and Explanation (FJPE), reliance on factual data is essential for developing robust and realistic AI-driven decision-making tools. This paper introduces TathyaNyaya, the largest annotated dataset for FJPE tailored to the Indian legal context, encompassing judgments from the Supreme Court of India and various High Courts. Derived from the Hindi terms "Tathya" (fact) and "Nyaya" (justice), the TathyaNyaya dataset is uniquely designed to focus on factual statements rather than complete legal texts, reflecting real-world judicial processes where factual data drives outcomes. Complementing this dataset, we present FactLegalLlama, an instruction-tuned variant of the LLaMa-3-8B Large Language Model (LLM), optimized for generating high-quality explanations in FJPE tasks. Fine-tuned on the factual data in TathyaNyaya, FactLegalLlama integrates predictive accuracy with coherent, contextually relevant explanations, addressing the critical need for transparency and interpretability in AI-assisted legal systems. Our methodology combines transformers for binary judgment prediction with FactLegalLlama for explanation generation, creating a robust framework for advancing FJPE in the Indian legal domain. TathyaNyaya not only surpasses existing datasets in scale and diversity but also establishes a benchmark for building explainable AI systems in legal analysis. The findings underscore the importance of factual precision and domain-specific tuning in enhancing predictive performance and interpretability, positioning TathyaNyaya and FactLegalLlama as foundational resources for AI-assisted legal decision-making.

Problem

Research questions and friction points this paper is trying to address.

Develops AI tools for factual judgment prediction in Indian law.

Creates the largest annotated dataset for fact-based judgment prediction and explanation in the Indian legal context.

Optimizes an LLM for transparent, explainable legal decision-making systems.

Innovation

Methods, ideas, or system contributions that make the work stand out.

TathyaNyaya: largest annotated FJPE dataset for the Indian legal context

FactLegalLlama: instruction-tuned LLaMa-3-8B for explanations

Combines transformer models for binary judgment prediction with an LLM for explanation generation
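
The two-stage design described in the abstract (a transformer for binary judgment prediction, followed by an LLM that explains the outcome) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the label names, the prompt template, and the names `run_fjpe`, `build_explanation_prompt`, `toy_classifier`, and `toy_generator` are all assumptions, and the two toy functions stand in for a fine-tuned classifier and for FactLegalLlama.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed label names; the source only says the prediction is binary.
LABELS = {0: "rejected", 1: "accepted"}


@dataclass
class FJPEResult:
    label: int
    explanation: str


def build_explanation_prompt(facts: str, label: int) -> str:
    """Build a hypothetical instruction-style prompt (not the paper's template)."""
    return (
        "### Instruction:\n"
        "Given the facts of the case and the predicted outcome, "
        "explain the decision.\n"
        f"### Facts:\n{facts.strip()}\n"
        f"### Predicted outcome: {LABELS[label]}\n"
        "### Explanation:\n"
    )


def run_fjpe(
    facts: str,
    classify: Callable[[str], int],
    generate: Callable[[str], str],
) -> FJPEResult:
    """Stage 1: predict a binary outcome from facts. Stage 2: explain it."""
    label = classify(facts)
    if label not in LABELS:
        raise ValueError(f"expected a binary label, got {label!r}")
    explanation = generate(build_explanation_prompt(facts, label))
    return FJPEResult(label=label, explanation=explanation.strip())


def toy_classifier(facts: str) -> int:
    """Stand-in for a fine-tuned transformer classifier (keyword rule only)."""
    return 1 if "wrongfully" in facts.lower() else 0


def toy_generator(prompt: str) -> str:
    """Stand-in for the explanation LLM; echoes the outcome from the prompt."""
    outcome = prompt.split("### Predicted outcome: ")[1].splitlines()[0]
    return f"Stub explanation for a case predicted as {outcome}."


if __name__ == "__main__":
    result = run_fjpe(
        "The appellant was wrongfully dismissed from service.",
        toy_classifier,
        toy_generator,
    )
    print(result.label, result.explanation)
```

Passing the classifier and generator in as plain callables keeps the sketch independent of any particular model library; in practice each would wrap a real model's inference call.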

🔎 Similar Papers

Legal Fact Prediction: The Missing Piece in Legal Judgment Prediction