Marija Stanojevic


2024

pdf bib
Speech and Language Biomarkers of Neurodegenerative Conditions: Developing Cross-Linguistically Valid Tools for Automatic Analysis
Iris E. Nowenstein | Marija Stanojevic | Gunnar Örnólfsson | María Kristín Jónsdóttir | Bill Simpson | Jennifer Sorinas Nerin | Bryndís Bergþórsdóttir | Kristín Hannesdóttir | Jekaterina Novikova | Jelena Curcic
Proceedings of the Fifth Workshop on Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric/developmental impairments @LREC-COLING 2024

In the last decade, a rapidly growing body of studies has shown promising results for the automatic detection and extraction of speech and language features as biomarkers of neurodegenerative conditions such as Alzheimer’s disease. This has sparked great optimism and the development of various digital health tools, but also warnings regarding the predominance of English in the field and calls for linguistically diverse research as well as global, equitable access to novel clinical instruments. To automatically extract clinically relevant features from transcripts in low-resource languages, two approaches are possible: 1) utilizing a limited range of language-specific tools or 2) translating text to English and then extracting the features. We evaluate these approaches for part-of-speech (POS) rates in transcripts of recorded picture descriptions from a cross-sectional study of Icelandic speakers at different stages of Alzheimer’s disease and healthy controls. While the translation method merits further exploration, only a subset of the POS categories show a promising correspondence to the direct extraction from the Icelandic transcripts in our results, indicating that the translation method has to be linguistically validated at the individual POS category level.

2023

pdf bib
Factors Affecting the Performance of Automated Speaker Verification in Alzheimer’s Disease Clinical Trials
Malikeh Ehghaghi | Marija Stanojevic | Ali Akram | Jekaterina Novikova
Proceedings of the 5th Clinical Natural Language Processing Workshop

Detecting duplicate patient participation in clinical trials is a major challenge because repeated patients can undermine the credibility and accuracy of the trial’s findings and result in significant health and financial risks. Developing accurate automated speaker verification (ASV) models is crucial to verify the identity of enrolled individuals and remove duplicates, but the size and quality of data influence ASV performance. However, there has been limited investigation into the factors that can affect ASV capabilities in clinical environments. In this paper, we bridge the gap by conducting analysis of how participant demographic characteristics, audio quality criteria, and severity level of Alzheimer’s disease (AD) impact the performance of ASV utilizing a dataset of speech recordings from 659 participants with varying levels of AD, obtained through multiple speech tasks. Our results indicate that ASV performance: 1) is slightly better on male speakers than on female speakers; 2) degrades for individuals who are above 70 years old; 3) is comparatively better for non-native English speakers than for native English speakers; 4) is negatively affected by clinician interference, noisy background, and unclear participant speech; 5) tends to decrease with an increase in the severity level of AD. Our study finds that voice biometrics raise fairness concerns as certain subgroups exhibit different ASV performances owing to their inherent voice characteristics. Moreover, the performance of ASV is influenced by the quality of speech recordings, which underscores the importance of improving the data collection settings in clinical trials.