Giovanni Sartor


2022

Detecting Arguments in CJEU Decisions on Fiscal State Aid
Giulia Grundler | Piera Santin | Andrea Galassi | Federico Galli | Francesco Godano | Francesca Lagioia | Elena Palmieri | Federico Ruggeri | Giovanni Sartor | Paolo Torroni
Proceedings of the 9th Workshop on Argument Mining

The successful application of argument mining in the legal domain can dramatically impact many disciplines related to law. For this purpose, we present Demosthenes, a novel corpus for argument mining in legal documents, composed of 40 decisions of the Court of Justice of the European Union on matters of fiscal state aid. The annotation specifies three hierarchical levels of information: the argumentative elements, their types, and their argument schemes. In our experimental evaluation, we address four different classification tasks, combining advanced language models and traditional classifiers.

Combining WordNet and Word Embeddings in Data Augmentation for Legal Texts
Sezen Perçin | Andrea Galassi | Francesca Lagioia | Federico Ruggeri | Piera Santin | Giovanni Sartor | Paolo Torroni
Proceedings of the Natural Legal Language Processing Workshop 2022

Creating balanced labeled textual corpora for complex tasks, like legal analysis, is a challenging and expensive process that often requires the collaboration of domain experts. To address this problem, we propose a data augmentation method based on the combination of GloVe word embeddings and the WordNet ontology. We present an example of application in the legal domain, specifically on decisions of the Court of Justice of the European Union. Our evaluation with human experts confirms that our method is more robust than the alternatives.

2021

A Corpus for Multilingual Analysis of Online Terms of Service
Kasper Drawzeski | Andrea Galassi | Agnieszka Jablonowska | Francesca Lagioia | Marco Lippi | Hans Wolfgang Micklitz | Giovanni Sartor | Giacomo Tagiuri | Paolo Torroni
Proceedings of the Natural Legal Language Processing Workshop 2021

We present the first annotated corpus for multilingual analysis of potentially unfair clauses in online Terms of Service. The data set comprises a total of 100 contracts, obtained from 25 documents annotated in four different languages: English, German, Italian, and Polish. For each contract, potentially unfair clauses for the consumer are annotated, for nine different unfairness categories. We show how a simple yet efficient annotation projection technique based on sentence embeddings could be used to automatically transfer annotations across languages.
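
The annotation projection idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each sentence already has a multilingual embedding (here replaced by hand-written toy vectors), that a source sentence is matched to its single most similar target sentence by cosine similarity, and that a match is accepted only above a threshold. The function names, the `threshold` parameter, and the labels `label_a` and `label_b` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def project_annotations(src_vecs, src_labels, tgt_vecs, threshold=0.8):
    """Copy labels from annotated source sentences to target-language sentences.

    Assumption (not from the paper): each labeled source sentence is aligned
    to its nearest target sentence, and the labels are transferred only if
    the similarity reaches `threshold`.
    """
    if not tgt_vecs:
        return []
    projected = [set() for _ in tgt_vecs]
    for vec, labels in zip(src_vecs, src_labels):
        if not labels:
            continue  # nothing to transfer for unlabeled sentences
        scores = [cosine(vec, t) for t in tgt_vecs]
        best = max(range(len(tgt_vecs)), key=scores.__getitem__)
        if scores[best] >= threshold:
            projected[best] |= set(labels)
    return projected


# Toy data: three source sentences (two labeled) and three target sentences.
# Real use would obtain these vectors from a multilingual sentence encoder.
src_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
src_labels = [["label_a"], [], ["label_b"]]
tgt_vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.98], [0.1, 0.95, 0.05]]

projected = project_annotations(src_vecs, src_labels, tgt_vecs)
```

In this toy run, the first and third source sentences carry labels, so only the target sentences closest to them receive annotations; the threshold guards against transferring a label when no target sentence is a close match.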