Adam Pollins


2024

Expanding Russian PropBank: Challenges and Insights for Developing New SRL Resources
Skatje Myers | Roman Khamov | Adam Pollins | Rebekah Tozier | Olga Babko-Malaya | Martha Palmer
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024

Semantic role labeling (SRL) resources, such as Proposition Bank (PropBank), provide useful input to downstream applications. In this paper we present challenges we encountered and insights we gained while expanding the previously developed Russian PropBank. This new effort involved annotation and adjudication of all predicates within a subset of the prior work in order to provide a test corpus for future applications. We discuss a number of new issues that arose while developing our PropBank for Russian, as well as our solutions. Framing issues include identifying which morphological processes warrant new frames, differentiating between modal verbs and predicate verbs, and maintaining accurate representations of a given language’s semantics. Annotation issues include disagreements derived from variability in Universal Dependencies parses and semantic ambiguity within the text. Finally, we demonstrate how Russian sentence structures reveal inherent limitations in PropBank’s ability to capture semantic data. These discussions should prove useful to anyone developing a PropBank or similar SRL resources for a new language.

2023

How Good Is the Model in Model-in-the-loop Event Coreference Resolution Annotation?
Shafiuddin Rehan Ahmed | Abhijnan Nath | Michael Regan | Adam Pollins | Nikhil Krishnaswamy | James H. Martin
Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII)

Annotating cross-document event coreference links is a time-consuming and cognitively demanding task that can compromise annotation quality and efficiency. To address this, we propose a model-in-the-loop annotation approach for event coreference resolution, where a machine learning model suggests only likely coreferring event pairs. We evaluate the effectiveness of this approach by first simulating the annotation process and then comparing the results of various underlying models and datasets using a novel annotator-centric Recall-Annotation effort trade-off metric. Finally, we present a method for obtaining 97% recall while substantially reducing the workload required by a fully manual annotation process.