FAIRification of LeiLanD

Eric Sanders, Sara Petrollino, Gilles R. Scheifer, Henk van den Heuvel, Christopher Handy


Abstract
LeiLanD (Leiden Language Data) is a searchable catalogue initiated by the Leiden University Centre for Linguistics (LUCL) with the support of CLARIAH. The catalogue contains metadata about language datasets collected at LUCL and other institutes of Leiden University. This paper describes a project to FAIRify the datasets, increasing their findability and accessibility through the standardised metadata format CMDI, so as to obtain a rich metadata description for all resources and to make them findable through CLARIN’s Virtual Language Observatory. The paper describes the creation of the catalogue and the steps that led from unstructured metadata to CMDI standards. This FAIRification of LeiLanD has enhanced the findability and accessibility of an incredibly diverse collection of language datasets.
Anthology ID:
2024.lrec-main.623
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
7101–7106
URL:
https://aclanthology.org/2024.lrec-main.623
Cite (ACL):
Eric Sanders, Sara Petrollino, Gilles R. Scheifer, Henk van den Heuvel, and Christopher Handy. 2024. FAIRification of LeiLanD. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 7101–7106, Torino, Italia. ELRA and ICCL.
Cite (Informal):
FAIRification of LeiLanD (Sanders et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.623.pdf