Yutian Zhao


2024

pdf bib
MKeCL: Medical Knowledge-Enhanced Contrastive Learning for Few-shot Disease Diagnosis
Yutian Zhao | Huimin Wang | Xian Wu | Yefeng Zheng
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Artificial intelligence (AI)-aided disease prediction has gained extensive research interest due to its capability to support clinical decision-making. Existing works mainly formulate disease prediction as a multi-label classification problem and use historical Electronic Medical Records (EMR) to train supervised models. However, in real-world clinics, such purely data-driven approaches pose two main challenges: 1) long tail problem: there are excessive EMRs for common diseases and insufficient EMRs for rare diseases, thus training over an imbalanced data set could result in a biased model that ignores rare diseases in diagnosis; 2) easily misdiagnosed diseases: some diseases can be easily distinguished while others sharing analogous conditions are much more difficult. General classification models without emphasizing easily misdiagnosed diseases may generate incorrect predictions. To tackle these two problems, we propose a Medical Knowledge-Enhanced Contrastive Learning (MKeCL) approach to disease diagnosis in this paper. MKeCL incorporates medical knowledge graphs and medical licensing exams in modeling in order to compensate for the insufficient information on rare diseases; To handle hard-to-diagnose diseases, MKeCL introduces a contrastive learning strategy to separate diseases that are easily misdiagnosed. Moreover, we establish a new benchmark, named Jarvis-D, which contains clinical EMRs collected from various hospitals. Experiments on real clinical EMRs show that the proposed MKeCL outperforms existing disease prediction approaches, especially in the setting of few-shot and zero-shot scenarios.