OntoMedRec: Logically-pretrained model-agnostic ontology encoders for medication recommendation


World Wide Web (2024) 27:28
https://doi.org/10.1007/s11280-024-01268-1

Weicong Tan1 · Weiqing Wang1 · Xin Zhou1 · Wray Buntine2 · Gordon Bingham3 · Hongzhi Yin4

Received: 8 February 2024 / Revised: 15 March 2024 / Accepted: 4 April 2024 / Published online: 23 April 2024
© The Author(s) 2024

Abstract

Recommending medications with electronic health records (EHRs) is a challenging task for data-driven clinical decision support systems. Most existing models learn representations for medical concepts from EHRs and make recommendations with the learnt representations. However, most medications appear in EHR datasets only a limited number of times (the frequency distribution of medications follows a power law distribution), resulting in insufficient learning of their representations. Medical ontologies are hierarchical classification systems for medical terms, in which similar terms fall into the same class at a certain level. In this paper, we propose OntoMedRec, the logically-pretrained and model-agnostic medical Ontology Encoders for Medication Recommendation, which address the data sparsity problem with medical ontologies. We conduct comprehensive experiments on real-world EHR datasets to evaluate the effectiveness of OntoMedRec by integrating it into various existing downstream medication recommendation models. The results show that integrating OntoMedRec improves the performance of various models both on the entire EHR datasets and on the admissions with few-shot medications. We provide the GitHub repository for the source code (https://github.com/WaicongTam/OntoMedRec).

Keywords Medication recommendation · Logic tensor networks · Medical ontology

1 Introduction

The mass application of electronic health records (EHRs) has made data-driven clinical decision-support systems possible [1].
Deep learning models designed to assist clinical practitioners in a range of tasks have emerged, with notable categories encompassing patient risk prediction, re-admission forecasting, the generation of EHR representations, and medication recommendations for prescribers. To assist medical practitioners in prescribing medications, recommending sets of medications for them accurately and efficiently has become a challenging yet crucial task. Therefore, numerous data-driven medication recommendation models have been developed, exemplified by notable solutions such as 4SDrug [2], EDGE [3], and SafeDrug [4]. These models aim to predict the most suitable medication regimen based on a patient’s diagnoses, medical procedures, and/or prior prescription history, as demonstrated by systems like COGNet [5] and SARMR [6].

This article belongs to the Topical Collection: Special Issue on Advancing recommendation systems with foundation models. Guest Editors: Kai Zheng, Renhe Jiang, and Ryosuke Shibasaki.
✉ Weiqing Wang. Extended author information available on the last page of the article.

Figure 1 Frequency distribution of diagnoses and medications in the MIMIC-III dataset. The last bin holds the cropped diagnoses/medications with a frequency higher than 200/40000

Existing medication recommendation models fall into two categories: instance-based models and longitudinal models. Instance-based models (e.g., LEAP [7] and 4SDrug [2]) recommend sets of drugs based on patients’ diagnoses in the current admission, whereas longitudinal models (e.g., MICRON [8], SafeDrug [4] and COGNet [5]) utilise patients’ previous admissions. For both instance-based and longitudinal medication recommendation models, we identify one challenge that has not been sufficiently addressed: the data sparsity issue (challenge 1).
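To make the data sparsity issue concrete, the following minimal Python sketch counts how often each medication code occurs across admissions, lists the few-shot codes, and pools high frequencies into a final bin in the manner described in the caption of Figure 1. The toy admissions, the function names and the thresholds are all assumptions introduced for illustration; they are not taken from MIMIC-III or from the OntoMedRec source code.

```python
from collections import Counter

# Hypothetical toy admissions: each admission is a set of ATC-style medication codes.
# These records are illustrative only and are not drawn from MIMIC-III.
admissions = [
    {"N02BE01", "A02BC01"},
    {"N02BE01", "R05DA04"},
    {"N02BE01", "A02BC01", "R05DB02"},
    {"N02BE01"},
    {"A02BC01", "C07AB02"},
]


def medication_frequencies(admissions):
    """Count in how many admissions each medication code appears."""
    counts = Counter()
    for meds in admissions:
        counts.update(meds)
    return counts


def few_shot_medications(counts, threshold):
    """Return, in sorted order, the codes that appear at most `threshold` times."""
    return sorted(code for code, n in counts.items() if n <= threshold)


def clipped_histogram(counts, cap):
    """Map each frequency to the number of codes with that frequency.

    Every frequency above `cap` is pooled into one final bin keyed `cap + 1`.
    """
    hist = Counter(min(n, cap + 1) for n in counts.values())
    return dict(sorted(hist.items()))


if __name__ == "__main__":
    counts = medication_frequencies(admissions)
    print(few_shot_medications(counts, threshold=1))
    print(clipped_histogram(counts, cap=2))
```

In this toy data three of the five codes occur only once, so a model trained on these admissions alone would see very little evidence for most medications, which is the situation the external ontology knowledge is meant to mitigate.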
Similar to the user-interaction sparsity challenge in other recommender system models [9, 10], medication recommendation models suffer from data sparsity issues deriving from the frequency distribution of medical concepts. As demonstrated in Figure 1, the majority of diagnoses and medications appear only a limited number of times in the entire MIMIC-III dataset, and their occurrence follows a power law distribution. This inevitably leads to insufficient learning of the indication relationships between diagnoses and medications (i.e., for what medical conditions a medication was designed) in instance-based models and of their respective embeddings in longitudinal models. As shown in many other recommendation tasks (e.g., [11] and [12]), utilising external knowledge bases can alleviate the cold-start effect. One notable category of knowledge base for medication recommendation models is medical ontologies. Therefore, to alleviate the data sparsity issue (challenge 1), similar to [13, 14], we leverage external structured knowledge (i.e., medical ontologies), as it provides prior knowledge for the medical terms in EHRs.

In EHRs, diagnoses, procedures and medications are encoded in standardised hierarchical classification systems called medical ontologies. Each medical term is a node of the ontology, and the relation between a node and its parent is “is-a” (e.g., benproperine is a cough suppressant). Figure 2 shows part of the ATC ontology, which is an ontology of medications. In this ontology, similar medications fall under the same parent node, yet there are definitive differences that distinguish them (i.e., the difference between siblings). For example, as demonstrated in Figure 2, medications in “Other cough suppressant in ATC” (R05DB) and “Opium alkaloids and derivatives, cough suppressants” (R05DA) fall into the same category “Cough suppressants, excl. combinations with expectorants” (R05D). However, they are intrinsically different, since codeine cough suppressants (i.e., R05DA) and non-codeine cough suppressants (i.e., R05DB) have different clinical characteristics (e.g., physical dependency and drug-drug interaction). Benproperine and cloperastine have the same therapeutic classification (i.e., they are both non-codeine cough suppressants), yet they are two different chemicals. As we can see from this example and some existing studies in recommender models (e.g., [15] and [16]), effectively modelling the parental, ancestral and sibling relationships (similarities and differences) is beneficial to the medication recommendation task.

Figure 2 An excerpt of the ATC ontology. Some nodes are omitted

Even though some works exploit the modelling of medical ontologies in the medication recommendation task, these existing works cannot effectively model ontology relationships to benefit the task (challenge 2). Notable models integrating ontology information in medication recommendation include G-BERT [13] and KnowAugNet [14]. G-BERT uses a Graph Attention Network (GAT) [17] encoder trained end-to-end along with the medication recommendation module. KnowAugNet pretrains ontology encoders w (...truncated)