MSDAFL: molecular substructure-based dual attention feature learning framework for predicting drug–drug interactions


Bioinformatics, 2024, 40(10), btae596
https://doi.org/10.1093/bioinformatics/btae596
Advance Access Publication Date: 9 October 2024
Original Paper

Systems biology

Chao Hou1, Guihua Duan2, Cheng Yan1,*

1 School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan 410208, China
2 School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China

*Corresponding author. School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan 410208, China. E-mail:

Associate Editor: Jianlin Cheng

Abstract

Motivation: Drug–drug interactions (DDIs) can cause unexpected adverse drug reactions, affecting treatment efficacy and patient safety. The need for computational methods to predict DDIs has been growing due to the necessity of identifying potential risks associated with drug combinations in advance. Although several deep learning methods have been recently proposed to predict DDIs, many overlook feature learning based on interactions between the substructures of drug pairs.

Results: In this work, we introduce a molecular Substructure-based Dual Attention Feature Learning framework (MSDAFL), designed to fully utilize the information between substructures of drug pairs to enhance the performance of DDI prediction. We employ a self-attention module to obtain a set number of self-attention vectors, which are associated with various substructural patterns of the drug molecule itself, while also extracting interaction vectors representing inter-substructure interactions between drugs through an interactive attention module. Subsequently, an interaction module based on cosine similarity is used to further capture the interactive characteristics between the self-attention vectors of drug pairs. We also perform normalization after the interaction feature extraction to mitigate overfitting.
After applying three-fold cross-validation, the MSDAFL model achieved average precision scores of 0.9707, 0.9991, and 0.9987, and area under the receiver operating characteristic curve scores of 0.9874, 0.9934, and 0.9974 on three datasets, respectively. In addition, the experiment results of fivefold cross-validation and cross-datum study also indicate that MSDAFL performs well in predicting DDIs.

Availability and implementation: Data and source codes are available at https://github.com/27167199/MSDAFL.

1 Introduction

Drug–drug interactions (DDIs) can cause unexpected adverse drug reactions, affecting treatment efficacy and patient safety (Vilar et al. 2014). DDIs refer to interactions that occur between two or more drug administration processes, including changes in drug properties and the occurrence of toxic side effects (Sun et al. 2016). Therefore, research on DDI prediction is of great practical importance. However, traditional biological or pharmacological methods are costly, time-consuming, and labor-intensive (Shao and Zhang 2013).

Machine learning offers a fresh avenue for accurately predicting DDIs (Mei and Zhang 2021). Methods based on feature similarity posit that drugs sharing similar attributes often exhibit comparable reaction patterns, relying largely on drug properties such as fingerprinting (Vilar et al. 2013), chemical structures (Takeda et al. 2017), pharmacological phenotypes (Li et al. 2015), and RNA profiles (Li et al. 2022). Enhancements in model efficacy are achieved by integrating various features. For instance, the DDI-IS-SL model forecasts DDIs through a blend of integrated similarity measures and semi-supervised learning techniques (Yan et al. 2020). Despite their advancements, these feature similarity-based methods often overlook the structural details of drugs, and their feature selection heavily depends on specialized knowledge and experience.
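
The premise behind the similarity-based methods described above can be illustrated with a minimal sketch. It assumes drugs are represented as binary fingerprints (here, sets of "on" bit indices) and compares them with the Tanimoto (Jaccard) coefficient, a common choice for such fingerprints. The drug names, fingerprints, and helper functions are hypothetical and are not taken from MSDAFL or from any of the cited methods.

```python
from typing import Dict, Set


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto (Jaccard) coefficient of two fingerprints given as sets of 'on' bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def most_similar(query: str, fingerprints: Dict[str, Set[int]]) -> str:
    """Return the name of the drug (other than `query`) most similar to `query`."""
    others = [name for name in fingerprints if name != query]
    return max(others, key=lambda name: tanimoto(fingerprints[query], fingerprints[name]))


# Hypothetical toy fingerprints; the bit indices carry no chemical meaning.
toy_fingerprints = {
    "drug_A": {1, 4, 7, 9},
    "drug_B": {1, 4, 7, 12},
    "drug_C": {2, 3, 15},
}

nearest = most_similar("drug_A", toy_fingerprints)
print(nearest, tanimoto(toy_fingerprints["drug_A"], toy_fingerprints[nearest]))
```

A similarity-based predictor would then carry the known interactions of the nearest drug over to the query drug. The sketch also shows the limitation noted above: the result depends entirely on which handcrafted features are encoded in the fingerprint.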
Graph neural networks (GNNs) have been widely used to analyze the chemical structures of drugs and forecast DDIs. Contemporary GNN methodologies are divided into two main types. The first type embeds features directly from the molecular graphs of drugs, a straightforward way to encapsulate graph-based data (Gilmer et al. 2017). In this method, atoms within the molecular graph are treated as nodes, with chemical bonds serving as the connecting edges. This setup allows the molecular graph to be embedded by learning the features of individual atoms and the interactions conveyed through the chemical bonds. For instance, SSI-DDI deconstructs the DDI prediction task between two drugs to pinpoint pairwise interactions among their respective substructures (Nyamabo et al. 2021). DSN-DDI is a dual-view drug representation learning network specifically engineered to concurrently learn drug substructures from individual drugs and drug pairs (Li et al. 2023). The second type leverages existing drug interaction networks, where drugs are nodes and their interactions are edges, treating DDI prediction as a link prediction task within these networks.

Received: 25 June 2024; Revised: 24 August 2024; Editorial Decision: 29 September 2024; Accepted: 7 October 2024
© The Author(s) 2024. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

• We designed a new Molecular Substructure-based Dual Attention Feature Learning framework for predicting DDIs (MSDAFL). This framework leverages both self-attention and interactive attention mechanisms to effectively extract and process interaction information between drug substructures, enhancing the accuracy of DDI predictions.
• To uncover the hidden features of interactions between drug substructures, we computed the cosine similarity matrix. This approach has shown that these similarity vectors significantly contribute to the accuracy of predicting DDIs.
• Additionally, to reduce overfitting during model training, we adopted a normalization strategy. This not only retains the essential interaction features but also improves the predictability and reliability of DDI outcomes.

2 Materials and methods

2.1 Dataset

To evaluate the scalability and robustness of MSDAFL, we test our model on three public datasets, which vary in scale and density and are widely used in previous studies. The scale of a dataset is determined by the number of drugs it includes. Following previous studies, we treat the observed DDIs as positive samples and randomly sample non-existing DDIs to generate the negative samples. We perform stratified splitting to divide all the drug pairs into a training set, a validation set, and a testing set in a ratio of 6:2:2 (three-fold cross-validation) and 8 (...truncated)