Findings of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering

Adam Wiemerslage, Arya D. McCarthy, Alexander Erdmann, Garrett Nicolai, Manex Agirrezabal, Miikka Silfverberg, Mans Hulden, Katharina Kann


Abstract
We describe the second SIGMORPHON shared task on unsupervised morphology: the goal of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering is to cluster word types from a raw text corpus into paradigms. To this end, we release corpora for 5 development and 9 test languages, as well as gold partial paradigms for evaluation. We receive 14 submissions from 4 teams that follow different strategies, and the best performing system is based on adaptor grammars. Results vary significantly across languages. However, all systems are outperformed by a supervised lemmatizer, implying that there is still room for improvement.
Anthology ID:
2021.sigmorphon-1.8
Volume:
Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
Month:
August
Year:
2021
Address:
Online
Editors:
Garrett Nicolai, Kyle Gorman, Ryan Cotterell
Venue:
SIGMORPHON
SIG:
SIGMORPHON
Publisher:
Association for Computational Linguistics
Pages:
72–81
URL:
https://aclanthology.org/2021.sigmorphon-1.8
DOI:
10.18653/v1/2021.sigmorphon-1.8
Cite (ACL):
Adam Wiemerslage, Arya D. McCarthy, Alexander Erdmann, Garrett Nicolai, Manex Agirrezabal, Miikka Silfverberg, Mans Hulden, and Katharina Kann. 2021. Findings of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering. In Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 72–81, Online. Association for Computational Linguistics.
Cite (Informal):
Findings of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering (Wiemerslage et al., SIGMORPHON 2021)
PDF:
https://aclanthology.org/2021.sigmorphon-1.8.pdf