JUÁ - A Benchmark for Information Retrieval in Brazilian Legal Text Collections

2026-04-07 | Information Retrieval

Information Retrieval, Computation and Language
AI summary

The authors created JUÁ, a public benchmark to help test and compare how well different legal information search systems work for Brazilian Portuguese. JUÁ includes various types of legal documents and search tasks to provide a unified way to measure performance. They tested several search methods, including a specialized AI model trained on JUÁ data, and found that different methods perform better on different parts of the benchmark. This tool aims to standardize evaluations and support ongoing improvements in legal information retrieval in Brazil.

legal information retrieval, benchmark, Brazilian Portuguese, jurisprudence, lexical search, dense retrieval, BM25, embedding model, domain adaptation, evaluation metrics
Authors
Jayr Pereira, Leandro Fernandes, Erick de Brito, Roberto Lotufo, Luiz Bonifacio
Abstract
Legal information retrieval in Portuguese remains difficult to evaluate systematically because available datasets differ widely in document type, query style, and relevance definition. We present JUÁ, a public benchmark for Brazilian legal retrieval designed to support more reproducible and comparable evaluation across heterogeneous legal collections. More broadly, JUÁ is intended not only as a benchmark, but as a continuous evaluation infrastructure for Brazilian legal IR, combining shared protocols, common ranking metrics, fixed splits when applicable, and a public leaderboard. The benchmark covers jurisprudence retrieval as well as broader legislative, regulatory, and question-driven legal search. We evaluate lexical, dense, and BM25-based reranking pipelines, including a domain-adapted Qwen embedding model fine-tuned on JUÁ-aligned supervision. Results show that the benchmark is sufficiently heterogeneous to distinguish retrieval paradigms and reveal substantial cross-dataset trade-offs. Domain adaptation yields its clearest gains on the supervision-aligned JUÁ-Juris subset, while BM25 remains highly competitive on other collections, especially in settings with strong lexical and institutional phrasing cues. Overall, JUÁ provides a practical evaluation framework for studying legal retrieval across multiple Brazilian legal domains under a common benchmark design.
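To make the evaluation setup concrete, the following is a minimal sketch of a lexical baseline scored with ranking metrics. It is an illustration only, not the authors' pipeline. The abstract does not say which metrics or BM25 parameters JUÁ uses, so nDCG@10, reciprocal rank, and the defaults `k1=1.2`, `b=0.75` are assumptions. The toy documents, query, relevance labels, and the helper names `tokenize`, `bm25_scores`, `ndcg_at_k`, and `reciprocal_rank` are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would handle Portuguese
    # accents, punctuation, and stopwords (assumption: not needed here).
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one Okapi BM25 score per document, in input order."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def ndcg_at_k(ranked_ids, relevance, k=10):
    """nDCG@k with linear gain; `relevance` maps doc id -> graded label."""
    dcg = sum(
        relevance.get(doc_id, 0) / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
    )
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def reciprocal_rank(ranked_ids, relevance):
    """1 / rank of the first relevant document, or 0.0 if none is found."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if relevance.get(doc_id, 0) > 0:
            return 1.0 / rank
    return 0.0


# Hypothetical toy collection, invented for illustration (not JUÁ data).
corpus = {
    "d1": "recurso especial responsabilidade civil dano moral",
    "d2": "lei complementar regime tributario simples nacional",
    "d3": "habeas corpus prisao preventiva fundamentacao",
}
query = "dano moral responsabilidade civil"
qrels = {"d1": 1}

doc_ids = list(corpus)
scores = bm25_scores(query, [corpus[d] for d in doc_ids])
ranked = [d for d, _ in sorted(zip(doc_ids, scores), key=lambda p: p[1], reverse=True)]

print(ranked, ndcg_at_k(ranked, qrels), reciprocal_rank(ranked, qrels))
```

Under this sketch, a dense or reranking pipeline would replace only `bm25_scores`. The metric functions and relevance labels stay fixed, which is what makes results comparable across retrieval paradigms.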