HKJudge: A Legal Discourse-Annotated Corpus for Interpreting What Courts Find, How They Reason, and What They Rule

📅 2026-06-04

🤖 AI Summary

This study addresses the lack of expert-annotated corpora for Hong Kong court judgments, which has hindered computational modeling of legal discourse structure. We present HKJudge, the first sentence-level expert-annotated corpus of Hong Kong judgments, comprising approximately 290,000 sentences from criminal judgments across all five court levels. We introduce a two-tier annotation framework tailored to common law judgments: the upper tier assigns each sentence one of 26 rhetorical roles, while the lower tier marks spans for three sentencing-related legal elements (charge, imprisonment term, fine). Ten legal linguistics experts performed the annotation, reaching substantial inter-annotator agreement (Cohen’s κ = 0.8). Using this resource, we establish two benchmark tasks, rhetorical role classification and legal element extraction, and evaluate four BERT-based models, two open-source large language models under zero-shot and fine-tuned settings, and four commercial large language models. This work provides the first fine-grained characterization of the logical structure of Hong Kong judgments, offering a high-quality data foundation and evaluation benchmarks for legal judgment understanding and downstream applications.
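The κ value above is a chance-corrected agreement score. As a minimal sketch of how the two-annotator (Cohen's) form is computed, the function below compares two label sequences. The function name and the role labels in the usage note are hypothetical, and how agreement was aggregated across the ten annotators is not stated here.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same sentences.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    rate and p_e is the agreement expected by chance from each
    annotator's own label distribution.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of sentences with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over labels of the product of marginal rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if p_e == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)
```

For example, with the made-up sequences `["FACT", "FACT", "REASONING", "RULING"]` and `["FACT", "REASONING", "REASONING", "RULING"]`, the annotators agree on 3 of 4 sentences, chance agreement is 5/16, and the function returns 7/11 (about 0.64).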

📝 Abstract

Court judgments are central to legal practice and jurisprudence, yet discourse analysis of Hong Kong judgments has received limited attention, owing largely to the absence of expert-annotated corpora. We introduce the Hong Kong Judgment Discourse Dataset (HKJudge), the first sentence-level expert-annotated legal discourse corpus. HKJudge includes criminal judgments across all five levels of HK's court hierarchy, comprising $\sim$290k sentences and $\sim$6.5 million tokens, fully annotated by legal linguistics experts. We design a two-tier discourse schema that captures what facts a court finds, how it reasons, and what it rules. At the sentence level, each sentence is assigned one of 26 rhetorical roles. At the span level, sentences are further annotated with three sentencing elements (charge, imprisonment term, fine). Ten legal linguistics annotators produced the annotations with an inter-annotator agreement of $\kappa = 0.8$. We formulate two tasks on HKJudge, termed rhetorical role classification and legal element extraction, and provide the first benchmark evaluation of four BERT-based models, two open-source LLMs under zero-shot and fine-tuning settings, and four commercial LLMs on both tasks. Our work demonstrates the value of sentence-level discourse annotation for modeling the structure of HK judgments and provides a rich data foundation for future work on legal judgment prediction. The HKJudge dataset and code are available at https://github.com/xuanxixi/HKJudge.
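To make the two-tier schema concrete, the sketch below shows one way a single annotated sentence could be represented: one rhetorical role per sentence, plus zero or more character-offset spans for the three sentencing elements. This is an illustration only and not the released data format. The class names, field names, the `span_for` helper, the role label `SENTENCE_IMPOSED`, and the example sentence are all hypothetical; only the three element types come from the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# The three span-level sentencing elements named in the abstract.
ELEMENT_TYPES = {"charge", "imprisonment_term", "fine"}


@dataclass
class ElementSpan:
    """A span-level annotation, stored as character offsets [start, end)."""
    element_type: str
    start: int
    end: int

    def __post_init__(self):
        if self.element_type not in ELEMENT_TYPES:
            raise ValueError(f"unknown element type: {self.element_type}")
        if not 0 <= self.start < self.end:
            raise ValueError("span offsets must satisfy 0 <= start < end")


@dataclass
class AnnotatedSentence:
    """One sentence with its sentence-level role and span-level elements."""
    text: str
    rhetorical_role: str  # one of the 26 roles; the label set is not listed here
    elements: List[ElementSpan] = field(default_factory=list)

    def element_texts(self) -> List[Tuple[str, str]]:
        """Return (element_type, surface text) pairs for each span."""
        return [(e.element_type, self.text[e.start:e.end]) for e in self.elements]


def span_for(text: str, phrase: str, element_type: str) -> ElementSpan:
    """Hypothetical helper: locate the first occurrence of phrase in text."""
    start = text.index(phrase)  # raises ValueError if phrase is absent
    return ElementSpan(element_type, start, start + len(phrase))


# Invented example sentence and role label, for illustration only.
example_text = "The defendant is sentenced to 6 months' imprisonment."
example = AnnotatedSentence(
    text=example_text,
    rhetorical_role="SENTENCE_IMPOSED",
    elements=[span_for(example_text, "6 months' imprisonment", "imprisonment_term")],
)
```

Under this layout, rhetorical role classification maps `text` to `rhetorical_role`, and legal element extraction recovers the `elements` spans from `text`.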

Problem

Research questions and friction points this paper is trying to address.

legal discourse

court judgments

annotated corpus

rhetorical roles

legal element extraction

Innovation

Methods, ideas, or system contributions that make the work stand out.

legal discourse annotation

sentence-level corpus

rhetorical role classification