JUÁ - A Benchmark for Information Retrieval in Brazilian Legal Text Collections

📅 2026-04-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the absence of a standardized, reproducible evaluation benchmark for legal information retrieval in Brazilian Portuguese. To bridge this gap, we introduce JUÁ, the first public retrieval benchmark encompassing multiple types of Brazilian legal texts—including case law, legislation, regulations, and question-answer pairs—accompanied by standardized protocols, data splits, evaluation metrics, and a public leaderboard. Leveraging JUÁ, we systematically compare lexical retrieval, dense vector retrieval, and BM25-based re-ranking approaches, and further propose a domain-adapted Qwen embedding model fine-tuned on aligned supervised data. Experimental results demonstrate that JUÁ effectively discriminates among retrieval methods: domain adaptation yields significant gains on the JUÁ-Juris subset, while BM25 remains competitive in domains characterized by strong lexical and institutionalized phrasing.
📝 Abstract
Legal information retrieval in Portuguese remains difficult to evaluate systematically because available datasets differ widely in document type, query style, and relevance definition. We present JUÁ, a public benchmark for Brazilian legal retrieval designed to support more reproducible and comparable evaluation across heterogeneous legal collections. More broadly, JUÁ is intended not only as a benchmark, but as a continuous evaluation infrastructure for Brazilian legal IR, combining shared protocols, common ranking metrics, fixed splits when applicable, and a public leaderboard. The benchmark covers jurisprudence retrieval as well as broader legislative, regulatory, and question-driven legal search. We evaluate lexical, dense, and BM25-based reranking pipelines, including a domain-adapted Qwen embedding model fine-tuned on JUÁ-aligned supervision. Results show that the benchmark is sufficiently heterogeneous to distinguish retrieval paradigms and reveal substantial cross-dataset trade-offs. Domain adaptation yields its clearest gains on the supervision-aligned JUÁ-Juris subset, while BM25 remains highly competitive on other collections, especially in settings with strong lexical and institutional phrasing cues. Overall, JUÁ provides a practical evaluation framework for studying legal retrieval across multiple Brazilian legal domains under a common benchmark design.
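To make the lexical baseline and ranking metrics mentioned above concrete, here is a minimal self-contained sketch of BM25 scoring plus nDCG@k, the kind of pipeline a benchmark like this compares against dense retrievers. The toy corpus, query, and relevance labels below are hypothetical illustrations, not data from JUÁ.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # document frequency for each distinct query term
    df = {t: sum(1 for d in docs if t in d) for t in set(query_terms)}
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def ndcg_at_k(ranked_rels, k=10):
    """nDCG@k for relevance labels listed in ranked order."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ranked_rels[:k]))
    ideal = sorted(ranked_rels, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

# hypothetical toy corpus of Portuguese legal snippets
docs = [
    "habeas corpus prisao preventiva".split(),
    "imposto de renda pessoa fisica".split(),
    "prisao preventiva requisitos legais".split(),
]
query = "prisao preventiva".split()
scores = bm25_scores(query, docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
rels = [1, 0, 1]  # hypothetical graded relevance per document
print([rels[i] for i in ranking])           # relevant docs rank first
print(round(ndcg_at_k([rels[i] for i in ranking]), 3))
```

Swapping `bm25_scores` for cosine similarity over embedding vectors, while keeping `ndcg_at_k` fixed, is the shared-metric comparison the abstract describes.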
Problem

Research questions and friction points this paper is trying to address.

legal information retrieval
benchmark
evaluation
Portuguese legal texts
reproducibility
Innovation

Methods, ideas, or system contributions that make the work stand out.

legal information retrieval
benchmark
domain adaptation
dense retrieval
BM25 reranking
Jayr Pereira
Centro de Ciências e Tecnologia, Universidade Federal do Cariri (UFCA), Juazeiro do Norte, Ceará, Brazil.
Leandro Fernandes
Câmara dos Deputados, Brasília, Brazil.
Erick de Brito
Centro de Ciências e Tecnologia, Universidade Federal do Cariri (UFCA), Juazeiro do Norte, Ceará, Brazil.
Roberto Lotufo
Faculdade de Engenharia Elétrica e de Computação, Universidade Estadual de Campinas (UNICAMP), Campinas, São Paulo, Brazil; NeuralMind.ai, Campinas, São Paulo, Brazil.
Luiz Bonifacio
Unicamp, University of Waterloo
Large Language Models · Information Retrieval · Natural Language Processing