QSAR based predictive modeling for anti-malarial molecules.

Bioinformation, Nov 2019



Open access, www.bioinformation.net, Hypothesis, Volume 13(5)

Deepak R. Bharti & Andrew M. Lynn*
School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi-67
*Corresponding author: Andrew M. Lynn

Received March 17, 2017; Accepted April 21, 2017; Published May 31, 2017

Abstract:
Malaria is a predominant infectious disease, with a global footprint, but especially severe in developing countries in the African subcontinent. In recent years, drug-resistant malaria has become an alarming factor, and hence the requirement for new and improved drugs is more crucial than ever before. One of the promising locations for antimalarial drug targets is the apicoplast, as this organelle does not occur in humans. The apicoplast is associated with many unique and essential pathways in many Apicomplexan pathogens, including Plasmodium. Machine learning methods are now widely available through open source programs. In the present work, we describe a standard protocol to develop molecular descriptor based predictive models (QSAR models), which can be further utilized for the screening of large chemical libraries. This protocol is used to build models using training data sourced from apicoplast-specific bioassays. Multiple model building methods are used, including Generalized Linear Models (GLM), Random Forest (RF), the C5.0 implementation of a decision tree, Support Vector Machines (SVM), K-Nearest Neighbour and Naive Bayes. Methods to evaluate the accuracy of each model building method are included in the protocol. For the given dataset, C5.0, SVM and RF perform better than the other methods, with comparable accuracy over the test data.

Keywords: Malaria, apicoplast, predictive model building, R statistical package

Background:
Malaria is endemic in many tropical and subtropical regions, causing high mortality and morbidity.
In the last 10-15 years, owing to the efforts of a global malaria eradication campaign, a significant fall has been observed in malaria infection cases. However, by the end of 2015, 212 million new cases of malaria and 429 thousand deaths had been reported across the globe. The majority of deaths were recorded in Africa (~92%) and the South-East Asia Region (~6%) [1]. Artemisinin derivatives have been regarded as the most effective drugs against malaria since the mid-1990s. In 2005, the WHO recommended that artemisinin-combination therapies (ACTs) be the first-line treatment for P. falciparum malaria worldwide [2]. Artemisinin-derived molecules have a broad spectrum of activity (more than 120 targets) against many biologically important pathways of Plasmodium [3]. Despite their effectiveness, drug-resistant malaria has emerged in many Asian and African countries in recent years [4]–[7]. This scenario threatens the worldwide efforts for complete eradication of malaria, and hence it is imperative to identify more drug targets as well as potent drugs to control the disease before current therapeutic agents lose their clinical relevance.

ISSN 0973-2063 (online) 0973-8894 (print); Bioinformation 13(5): 154-159 (2017)

Studies reveal that one of the most promising targets is the apicoplast, owing to its involvement in many essential biological pathways unique to Plasmodium [8]. The apicoplast is a non-photosynthetic vestigial plastid, bounded by four membrane layers, which occurs in almost all apicomplexan parasites. It has a 35 kb circular DNA quite similar to a cyanobacterial genome, which encodes approximately 55-60 genes of unknown functionality. However, its presence is crucial for the cell [9], and various genetic and pharmacological studies confirm its essential role in cell survival.
Genome analysis of the apicoplast indicates its role in the biosynthesis of many important products, including type II fatty acids, heme, iron-sulphur clusters and isoprenoid precursors [10]. The pathways related to these products are essentially similar to those of bacteria, owing to the organelle's endosymbiotic origin, and entirely different from the pathways of the host organism. Many antimalarial drugs have been proposed that target cellular machinery (proteins/DNA) essential for cell survival, ranging from replication, transcription and translation (in the parasite as well as the apicoplast) to fatty acid biosynthesis and heme, iron-sulphur cluster and isoprenoid synthesis (exclusive to the apicoplast). Earlier, targeting the products of the apicoplast, e.g. the FASII pathway, gained popularity, but several genetic and pharmacological studies show evidence of off-target activity of the inhibitors [11]. Some successful attempts at targeting the isoprenoid pathway [12] and heme biosynthesis [13], [14] have already been reported. Besides these anabolic pathway-based drug targets, efforts have been made to obstruct cellular processes of the apicoplast such as replication [15], transcription [16] and translation [17], as these processes are known to be quite similar to those of bacteria. Hence, antibacterial drugs are also considered potential drugs against the malaria parasite. Recent reviews have listed various targets and related drugs [18]–[20]. Target proteins, pathways and drug candidates are summarized in Table 1.

In the present study we focus on predictive model building using bioassay data for compounds causing delayed death in malaria parasites. Delayed death is an interesting phenomenon in which treated parasites survive, infect and multiply, but their progeny are unable to infect the host.
With advances in high-throughput bioassay techniques and computational resources, managing structural information along with bioactivity readings has become a well-established practice. This information can be used to screen large chemical libraries virtually, which reduces the cost and time of identifying potential drug-like molecules for further screening stages. One approach to applying this information is predictive model building. In recent years, numerous successful implementations of machine learning (ML) techniques have been published for virtual screening of biologically active compounds [21]–[24]. In the present study, we employed various state-of-the-art machine learning techniques to build classification models using publicly available antimalarial bioassay data with known inhibitory effect against apicoplast formation. To build a robust predictive model, we define best practices for data cleaning, preprocessing, feature selection and model building, which are described in this manuscript. A schematic overview of the model building workflow can be seen in Figure 1, and is described in detail in the next section. The met (...truncated)
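
The workflow named above (cleaning, preprocessing, feature selection, model building and evaluation on test data) can be illustrated with a minimal sketch. It is not the authors' protocol: the paper works in the R statistical package, whereas this sketch uses plain Python, and the data, the split rule, the helper names (non_constant_columns, min_max_scale, knn_predict) and the choice of k = 3 are all assumptions made for illustration. Of the methods listed in the abstract, only a k-nearest-neighbour classifier is shown, and dropping constant descriptors stands in for a fuller feature-selection step.

```python
import math
import random

def non_constant_columns(rows):
    """Indices of descriptor columns that take more than one value."""
    return [j for j in range(len(rows[0])) if len({r[j] for r in rows}) > 1]

def min_max_scale(train, test):
    """Rescale every column using the training-set minimum and range only.

    Assumes constant columns were already removed, so no range is zero.
    """
    cols = range(len(train[0]))
    lo = [min(r[j] for r in train) for j in cols]
    span = [max(r[j] for r in train) - lo[j] for j in cols]

    def scale(rows):
        return [[(r[j] - lo[j]) / span[j] for j in cols] for r in rows]

    return scale(train), scale(test)

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training rows nearest to query (Euclidean)."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    votes = [label for _, label in ranked[:k]]
    return max(sorted(set(votes)), key=votes.count)

# Synthetic toy data (NOT from the paper): two informative descriptors plus
# one constant descriptor; label 1 = active, label 0 = inactive.
rng = random.Random(0)
data = []
for _ in range(20):
    data.append(([rng.uniform(0.8, 1.0), rng.uniform(0.8, 1.0), 1.0], 1))
    data.append(([rng.uniform(0.0, 0.2), rng.uniform(0.0, 0.2), 1.0], 0))

# Hold out every fifth compound as test data; the rest is training data.
train = [d for i, d in enumerate(data) if i % 5 != 0]
test = [d for i, d in enumerate(data) if i % 5 == 0]

# Feature selection is decided on the training set and applied to both sets.
kept = non_constant_columns([x for x, _ in train])
train_x = [[x[j] for j in kept] for x, _ in train]
test_x = [[x[j] for j in kept] for x, _ in test]
train_y = [y for _, y in train]
test_y = [y for _, y in test]
train_x, test_x = min_max_scale(train_x, test_x)

# Evaluate on the held-out compounds only.
predicted = [knn_predict(train_x, train_y, q) for q in test_x]
accuracy = sum(p == y for p, y in zip(predicted, test_y)) / len(test_y)
print(f"descriptors kept: {kept}, test accuracy: {accuracy:.2f}")
```

The two toy classes are deliberately well separated, so the held-out accuracy here says nothing about real bioassay data. The point of the sketch is the ordering: descriptor filtering and scaling parameters are derived from the training set alone and then applied unchanged to the test set, so the test accuracy is not influenced by the test compounds themselves.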


This is a preview of a remote PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5498782/pdf/
Article home page: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5498782

D. Bharti, A. Lynn. QSAR based predictive modeling for anti-malarial molecules. Bioinformation, pp. 154, Volume 13, Issue 5, DOI: 10.6026/97320630013154