🤖 AI Summary
Problem: Structured generation of private legal documents in the Indian legal domain remains largely unaddressed, even though automated drafting could reduce manual effort in legal practice. Method: This paper introduces VidhikDastaavej, a novel, anonymized dataset of Indian private legal documents; NyayaShilp, a legal document generation model fine-tuned on Indian legal texts; and the Model-Agnostic Wrapper (MAW), a two-step framework that (1) generates structured section titles and then (2) iteratively produces section content, using retrieval-based mechanisms to maintain coherence and factual accuracy. The authors benchmark open-source large language models (LLMs), including instruction-tuned and domain-adapted versions, against proprietary models, and build a human-in-the-loop (HITL) interface in which users specify document types, refine section details, and generate structured drafts. Contribution/Results: Direct fine-tuning on small datasets does not always improve results, whereas the structured wrapper improves coherence, factual adherence, and overall document quality while mitigating hallucinations. Expert assessments confirm that the framework achieves high reliability in structured legal drafting.
📝 Abstract
Automating legal document drafting can significantly enhance efficiency, reduce manual effort, and streamline legal workflows. While prior research has explored tasks such as judgment prediction and case summarization, the structured generation of private legal documents in the Indian legal domain remains largely unaddressed. To bridge this gap, we introduce VidhikDastaavej, a novel, anonymized dataset of private legal documents, and develop NyayaShilp, a fine-tuned legal document generation model specifically adapted to Indian legal texts. We propose a Model-Agnostic Wrapper (MAW), a two-step framework that first generates structured section titles and then iteratively produces content while leveraging retrieval-based mechanisms to ensure coherence and factual accuracy. We benchmark multiple open-source LLMs, including instruction-tuned and domain-adapted versions, alongside proprietary models for comparison. Our findings indicate that while direct fine-tuning on small datasets does not always yield improvements, our structured wrapper significantly enhances coherence, factual adherence, and overall document quality while mitigating hallucinations. To ensure real-world applicability, we developed a Human-in-the-Loop (HITL) Document Generation System, an interactive user interface that enables users to specify document types, refine section details, and generate structured legal drafts. This tool allows legal professionals and researchers to generate, validate, and refine AI-generated legal documents efficiently. Extensive evaluations, including expert assessments, confirm that our framework achieves high reliability in structured legal drafting. This research establishes a scalable and adaptable foundation for AI-assisted legal drafting in India, offering an effective approach to structured legal document generation.
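The two-step flow described for the Model-Agnostic Wrapper can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`generate_titles`, `retrieve`, `draft_document`), the prompt wording, the word-overlap retriever, and the `stub_llm` stand-in are all assumptions made for illustration. The only idea taken from the abstract is the structure: generate section titles first, then fill each section in turn while passing retrieved passages and previously written sections to whichever model is plugged in.

```python
from typing import Callable, Dict, List

# Hypothetical interface: any function mapping a prompt to generated text.
# Keeping the model behind a plain callable is what makes the wrapper model-agnostic.
LLM = Callable[[str], str]


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Toy retriever (assumption): rank passages by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def generate_titles(llm: LLM, doc_type: str) -> List[str]:
    """Step 1: ask the model for section titles, one per line."""
    raw = llm(f"List section titles for a {doc_type}, one per line.")
    return [line.strip() for line in raw.splitlines() if line.strip()]


def draft_document(llm: LLM, doc_type: str, corpus: List[str]) -> Dict[str, str]:
    """Step 2: fill each section in order, conditioning on retrieved
    passages and on the sections written so far."""
    sections: Dict[str, str] = {}
    for title in generate_titles(llm, doc_type):
        context = retrieve(f"{doc_type} {title}", corpus)
        prompt = (
            f"Document type: {doc_type}\n"
            f"Section: {title}\n"
            "Reference passages:\n" + "\n".join(context) + "\n"
            "Sections so far:\n" + "\n".join(sections.values()) + "\n"
            "Write this section."
        )
        sections[title] = llm(prompt).strip()
    return sections


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs without any dependency."""
    if prompt.startswith("List section titles"):
        return "Parties\nTerm\nRent\n"
    title = prompt.split("Section: ", 1)[1].splitlines()[0]
    return f"[draft text for {title}]"


if __name__ == "__main__":
    sample_corpus = [
        "The monthly rent shall be paid in advance.",
        "The term of this lease is eleven months.",
        "This deed is made between the lessor and the lessee.",
    ]
    for heading, body in draft_document(stub_llm, "rent agreement", sample_corpus).items():
        print(f"{heading}: {body}")
```

Replacing `stub_llm` with a call to a real model leaves the rest of the sketch unchanged. A human-in-the-loop step, as in the system the abstract describes, could be added by letting a user edit the title list before step 2 runs.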