Kevin Tobia


2024

pdf bib
CuRIAM: Corpus Re Interpretation and Metalanguage in U.S. Supreme Court Opinions
Michael Kranzlein | Nathan Schneider | Kevin Tobia
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Most judicial decisions involve the interpretation of legal texts. As such, judicial opinions use language as the medium to comment on or draw attention to other language (for example, through definitions and hypotheticals about the meaning of a term from a statute). Language used this way is called metalanguage. Focusing on the U.S. Supreme Court, we view metalanguage as reflective of justices’ interpretive processes, bearing on current debates and theories about textualism in law and political science. As a step towards large-scale metalinguistic analysis with NLP, we identify 9 categories prominent in metalinguistic discussions, including key terms, definitions, and different kinds of sources. We annotate these concepts in a corpus of U.S. Supreme Court opinions. Our analysis of the corpus reveals high interannotator agreement, frequent use of quotes and sources, and several notable frequency differences between majority, concurring, and dissenting opinions. We observe fewer instances than expected of several legal interpretive categories. We discuss some of the challenges in developing the annotation schema and applying it and provide recommendations for how this corpus can be used for broader analyses.