Mapping Work Task Descriptions from German Job Ads on the O*NET Work Activities Ontology

Ann-Sophie Gnehm, Simon Clematide


Abstract
This work addresses the challenge of extracting job tasks from German job postings and mapping them to the fine-grained work activities classification in the O*NET labor market ontology. By utilizing ontological data with a Multiple Negatives Ranking loss and integrating a modest volume of labeled job advertisement data into the training process, our top configuration achieved a precision of 70% for the best mapping on the test set, a substantial improvement over the 33% baseline delivered by a general-domain SBERT. In our experiments, the following factors proved most effective for improving SBERT models: First, the incorporation of subspan markup, both during training and inference, supports accurate classification by streamlining varied job ad task formats with structured, uniform ontological work activities. Second, the inclusion of additional occupational information from O*NET in training supported learning by contextualizing hierarchical ontological relationships. Third, the most significant performance improvement was achieved by updating SBERT models with labeled job ad data specifically addressing challenging cases encountered during pre-finetuning, effectively bridging the semantic gap between O*NET and job ad data.
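The Multiple Negatives Ranking objective mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cosine`, `mnr_loss`, `best_mapping`), the toy two-dimensional embeddings, and the `scale` value are assumptions chosen for illustration. Each job ad task embedding is paired with the embedding of its matching work activity, and every other work activity in the batch serves as a negative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over a batch of (anchor, positive) pairs.

    positives[i] is the correct match for anchors[i]; every other
    positives[j] is treated as an in-batch negative. `scale` is an
    assumed temperature-like factor applied to the similarities.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def best_mapping(query, activity_embeddings):
    """Index of the work activity most similar to the query embedding."""
    return max(
        range(len(activity_embeddings)),
        key=lambda k: cosine(query, activity_embeddings[k]),
    )


# Toy embeddings (hypothetical): two job ad tasks and two work activities.
task_embeddings = [[1.0, 0.1], [0.1, 1.0]]
activity_embeddings = [[0.9, 0.2], [0.2, 0.9]]

aligned = mnr_loss(task_embeddings, activity_embeddings)
shuffled = mnr_loss(task_embeddings, activity_embeddings[::-1])
```

With correctly paired embeddings the loss is close to zero, while swapping the pairs makes it large. At inference time, `best_mapping` picks the single most similar work activity, which corresponds to the "best mapping" setting that the reported precision refers to.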
Anthology ID:
2024.lrec-main.963
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
11049–11059
URL:
https://aclanthology.org/2024.lrec-main.963
Cite (ACL):
Ann-Sophie Gnehm and Simon Clematide. 2024. Mapping Work Task Descriptions from German Job Ads on the O*NET Work Activities Ontology. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 11049–11059, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Mapping Work Task Descriptions from German Job Ads on the O*NET Work Activities Ontology (Gnehm & Clematide, LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.963.pdf