HIVprotI: an integrated web based platform for prediction and design of HIV proteins inhibitors
Qureshi et al. J Cheminform (2018) 10:12
https://doi.org/10.1186/s13321-018-0266-y
Open Access
RESEARCH ARTICLE
Abid Qureshi, Akanksha Rajput, Gazaldeep Kaur and Manoj Kumar*
Abstract
A number of anti-retroviral drugs are being used for treating Human Immunodeficiency Virus (HIV) infection. Due to
the emergence of drug-resistant strains, there is a constant quest to discover more effective anti-HIV compounds. In this
endeavor, computational tools have proven useful in accelerating drug discovery. Although methods have been published
to design a class of compounds against a specific HIV protein, an integrated web server for the same is lacking.
Therefore, we have developed support vector machine based regression models using experimentally validated data
from the ChEMBL repository. Quantitative structure activity relationship based features were selected for predicting the inhibition activity of a compound against the HIV proteins protease (PR), reverse transcriptase (RT) and integrase (IN).
The models achieved maximum Pearson correlation coefficients of 0.78, 0.76, 0.74 and 0.76, 0.68, 0.72 during tenfold
cross-validation on the IC50 and percent inhibition datasets of PR, RT and IN, respectively. These models performed equally
well on the independent datasets. Chemical space mapping, applicability domain analyses and other statistical tests
further support the robustness of the predictive models. In addition, we have identified a number of chemical descriptors
that are important in predicting the compound inhibition potential. The HIVprotI platform (http://bioinfo.imtech.res.in/manojk/hivproti)
would be useful in virtual screening of inhibitors as well as in designing new molecules against the
important HIV proteins for therapeutics development.
Keywords: HIV, Reverse transcriptase, Protease, Integrase, Inhibitors, QSAR, Algorithm, Web server
Background
Human Immunodeficiency Virus (HIV) is one of the
reasons for human death and suffering worldwide. It
causes Acquired Immunodeficiency Syndrome (AIDS) in
which gradual breakdown of the immune system allows
critical opportunistic diseases to flourish [1]. As per the
UNAIDS report, around 78 million people have become
infected with HIV and 35 million people have died of
AIDS-related illnesses since the start of the epidemic. In
2015 alone there were about 36.9 million people living
with HIV, of whom 1.1 million died (http://www.unaids.org/en/resources/campaigns/HowAIDSchangedeverything/factsheet).
Due to the high genetic variability and
mutation rate of HIV, vaccines are not available to curb
the HIV infection [2].
*Correspondence:
Bioinformatics Centre, Institute of Microbial Technology, Council
of Scientific and Industrial Research, Sector 39A, Chandigarh 160036,
India
Researchers have focused considerably on HIV therapy, and many compounds have been tested against this
pathogen [3, 4]. However, only a few antiretroviral drugs have
been able to slow the disease progression. These drugs
block the function of proteins implicated in certain
stages of the HIV life-cycle [5]. Several HIV enzymes
are needed for the development of the retrovirus, including reverse transcriptase (RT), protease (PR) and integrase (IN) [6]. RT creates complementary DNA from an
RNA template, which can then integrate into the host genome,
and its inhibitors are widely used as antiretroviral drugs
[7]. For example, the first anti-HIV drug zidovudine or
azidothymidine (a nucleoside analog) was approved by
the Food and Drug Administration (FDA) in 1987. It
inhibits HIV reverse transcriptase, hence thwarting viral
replication [8]. PR cleaves the newly synthesized polyproteins at the relevant positions to form the mature protein apparatus and is a major drug target for the treatment
of HIV [9]. In 1995, saquinavir (Invirase) became the first
approved protease inhibitor. It blocks the enzyme’s active
site, thus restricting the processing of HIV polyproteins
[10, 11]. The IN enzyme enables the virus to integrate its
genetic material into the DNA of the host cell for a long-term infection. Compounds that inhibit the IN enzyme
have demonstrated potent anti-HIV activity [12]. For
example, raltegravir (Isentress), the first integrase inhibitor, was approved by the FDA in 2007 [13]. Presently, about 30
antiretroviral drugs are prescribed for the clinical treatment of AIDS [14]. An improved knowledge of the structure and function of viral proteins has led antiviral drug
developers to design better antivirals to treat HIV infections [15].
© The Author(s) 2018. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium,
provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license,
and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/)
applies to the data made available in this article, unless otherwise stated.
To conserve capital and time for finding novel drugs,
scientists have extensively used different computational
approaches to scan virtual compound libraries prior to the
wet lab experiments [16]. The preferred target region
should be free of off-target effects and conserved across many
strains of a virus for broad activity. Once the target is chosen, candidate antivirals can be selected by predicting
potential inhibitors using bioinformatics approaches [17,
18]. Amongst the diverse methods, quantitative structure
activity relationship (QSAR) is regularly used [19–22].
In QSAR, relationships between chemical descriptors
and activity are used to predict the properties of
other compounds [23]. The chemical descriptors present
the structural information of a compound as numerical
values [24]. Virtual screening employing QSAR is a valuable bioinformatics approach which helps to identify and
devise new antiviral drugs [25].
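The QSAR idea described above (numerical descriptors in, predicted activity out, scored by a correlation coefficient under cross-validation) can be illustrated with a minimal Python sketch. It is not the HIVprotI implementation: the data are synthetic, a single made-up descriptor is used, and ordinary least squares stands in for the support vector machine regressor so that the example needs only the standard library. The function names (`pearson`, `fit_line`, `cross_validated_pcc`) and the choice to pool out-of-fold predictions before computing the correlation are assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (a stand-in for an SVM regressor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def cross_validated_pcc(xs, ys, k=10, seed=0):
    """k-fold cross-validation: predict each compound from a model trained
    on the other folds, then correlate pooled predictions with the truth."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    preds = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = a * xs[i] + b
    return pearson(ys, preds)


# Synthetic toy data (hypothetical): one descriptor value per compound and
# an "activity" that depends on it linearly plus Gaussian noise.
rng = random.Random(42)
descriptor = [rng.uniform(0.0, 5.0) for _ in range(100)]
activity = [1.5 * x + 2.0 + rng.gauss(0.0, 1.0) for x in descriptor]

pcc = cross_validated_pcc(descriptor, activity, k=10)
print(f"Tenfold cross-validated Pearson correlation: {pcc:.2f}")
```

In a real QSAR setting each compound would be described by many descriptors rather than one, and the regressor would be replaced by whichever learning method is being evaluated. The cross-validation loop and the correlation score stay the same.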
Several attempts have been made to predict specific types of compounds against different HIV proteins
(discussed later). Nevertheless, to date there is no web
server or software that can estimate, by regression, the
IC50/percentage inhibition activity of diverse types of
compounds against different HIV proteins. To address this requirement, we created HIVprotI, a web based
algorithm for prediction and design of protein specific
anti-HIV compounds. In this approach, we employed
experimentally validated inhibitors against RT, PR, IN
(with IC50/percentage inhibition) from ChEMBL [26].
We calculated molecular descriptors and performed feature selection to pick the best performing descriptors,
which were employed to build support vector machine
(SVM) based QSAR models for the prediction of inhibit (...truncated)