Falkor-IRAC: Graph-Constrained Generation for Verified Legal Reasoning in Indian Judicial AI

📅 2026-05-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses three failure modes of large language models in Indian legal AI: hallucinated precedents, citations to outdated statutes, and unsupported reasoning chains, which the authors attribute to reliance on vector-based retrieval. The authors propose Falkor-IRAC, a graph-constrained generation framework grounded in an IRAC (Issue, Rule, Analysis, Conclusion) knowledge graph. Judgments from the Supreme Court and High Courts of India are ingested as IRAC node structures enriched with procedural state transitions, precedent relationships, and statutory references. A Verifier Agent, acting as a falsifiability oracle, traverses the graph and accepts a generated answer only if a valid supporting path exists. The system reports doctrinal conflicts explicitly instead of silently resolving them, and it is evaluated with graph-native metrics (citation grounding accuracy, path validity rate, hallucinated precedent rate, and conflict detection rate) that the authors argue suit legal reasoning better than BLEU and ROUGE. On a proof-of-concept corpus of 51 Supreme Court judgments, the Verifier Agent correctly validated citations on completed queries and rejected fabricated citations; comparison against vector-only RAG baselines is left for future work.
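The verification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the graph is a plain in-memory dictionary, not FalkorDB, and every node id, edge label, and function name (`find_supporting_path`, `verify_answer`) is a hypothetical chosen for illustration. It only shows the general idea of accepting an answer when each cited authority is reachable from the issue node.

```python
from collections import deque

# Hypothetical toy graph. Node ids and edge labels are illustrative
# assumptions, not the schema used in the paper.
EDGES = {
    "issue:example_issue": [("GOVERNED_BY", "rule:statute_X")],
    "rule:statute_X": [("APPLIED_IN", "case:A_v_State")],
    "case:A_v_State": [("FOLLOWS", "case:B_v_Union")],
    "case:B_v_Union": [],
}


def find_supporting_path(edges, issue, citation):
    """Breadth-first search for a path from an issue node to a cited node."""
    if issue not in edges or citation not in edges:
        return None  # unknown node, e.g. a fabricated citation
    queue = deque([[issue]])
    seen = {issue}
    while queue:
        path = queue.popleft()
        if path[-1] == citation:
            return path
        for _label, nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def verify_answer(edges, issue, cited):
    """Accept only if the answer cites something and every citation is reachable."""
    paths = {c: find_supporting_path(edges, issue, c) for c in cited}
    accepted = bool(cited) and all(p is not None for p in paths.values())
    return accepted, paths
```

In this sketch, a citation to `case:B_v_Union` is accepted because a path exists from the issue node, while a citation to a node absent from the graph is rejected. A production verifier would presumably also check edge types and whether a statute or precedent is still in force, which this toy version omits.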

📝 Abstract

Legal reasoning is not semantic similarity search. A court judgment encodes constrained symbolic reasoning: precedent propagation, procedural state transitions, and statute-bound inference. These are properties that vector-based retrieval-augmented generation (RAG) cannot faithfully represent. Hallucinated precedents, outdated statute citations, and unsupported reasoning chains remain persistent failure modes in LLM-based legal AI, with real consequences for access to justice in high-caseload jurisdictions such as India. This paper presents Falkor-IRAC, a graph-constrained generation framework for Indian legal AI that grounds generation in structured reasoning over an IRAC (Issue, Rule, Analysis, Conclusion) knowledge graph. Judgments from the Supreme Court and High Courts of India are ingested as IRAC node structures enriched with procedural state transitions, precedent relationships, and statutory references, stored in FalkorDB for low-latency agentic traversal. At inference time, LLM-generated answers are accepted only if a valid supporting path can be traced through the graph, a check performed by a falsifiability oracle called the Verifier Agent. The system also detects doctrinal conflicts as a first-class output rather than silently resolving them. Falkor-IRAC is evaluated using graph-native metrics: citation grounding accuracy, path validity rate, hallucinated precedent rate, and conflict detection rate. These metrics are argued to be more appropriate for legal reasoning evaluation than BLEU and ROUGE. On a proof-of-concept corpus of 51 Supreme Court judgments, the Verifier Agent correctly validated citations on completed queries and correctly rejected fabricated citations. Evaluation against vector-only RAG baselines is left for future work, as is GPU-accelerated inference to address current timeout rates on CPU hardware.
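The four graph-native metrics named in the abstract are not defined in this summary, so the following sketch uses assumed definitions as simple ratios. The `QueryResult` fields and the `graph_metrics` function are hypothetical names, and the paper may compute these quantities differently.

```python
from dataclasses import dataclass


@dataclass
class QueryResult:
    cited: list               # citations produced for one query
    in_graph: set             # cited items that exist in the knowledge graph
    path_valid: bool          # whether a verifier found a supporting path
    conflicts_present: int = 0   # doctrinal conflicts known to exist
    conflicts_detected: int = 0  # of those, how many the system flagged


def _ratio(num, den):
    """Return num/den, or None when the denominator is zero."""
    return num / den if den else None


def graph_metrics(results):
    """Aggregate assumed metric definitions over a list of query results."""
    total_cites = sum(len(r.cited) for r in results)
    grounded = sum(1 for r in results for c in r.cited if c in r.in_graph)
    valid_paths = sum(1 for r in results if r.path_valid)
    present = sum(r.conflicts_present for r in results)
    detected = sum(r.conflicts_detected for r in results)
    return {
        "citation_grounding_accuracy": _ratio(grounded, total_cites),
        "hallucinated_precedent_rate": _ratio(total_cites - grounded, total_cites),
        "path_validity_rate": _ratio(valid_paths, len(results)),
        "conflict_detection_rate": _ratio(detected, present),
    }
```

Unlike BLEU or ROUGE, none of these ratios depends on wording overlap with a reference answer; each depends only on whether the cited nodes and paths exist in the graph.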

Problem

Research questions and friction points this paper is trying to address.

legal reasoning

hallucination

precedent propagation

statute-bound inference

retrieval-augmented generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

graph-constrained generation

legal reasoning

IRAC knowledge graph