A novel chemogenomics analysis of G protein-coupled receptors (GPCRs) and their ligands: a potential strategy for receptor de-orphanization
Eelke van der Horst, Julio E Peironcely, Adriaan P IJzerman, Margot W Beukers, Jonathan R Lane, Herman WT van Vlijmen, Michael TM Emmerich, Yasushi Okuno, Andreas Bender
Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden University, Einsteinweg 55, 2333CC, The Netherlands
Background: G protein-coupled receptors (GPCRs) represent a family of well-characterized drug targets with significant therapeutic value. Phylogenetic classifications may help to understand the characteristics of individual GPCRs and their subtypes. Previous phylogenetic classifications were all based on the sequences of receptors, adding only minor information about the ligand binding properties of the receptors. In this work, we compare a sequence-based classification of receptors to a ligand-based classification of the same group of receptors, and evaluate the potential to use sequence relatedness as a predictor for ligand interactions, thus aiding the quest for ligands of orphan receptors.
Results: We present a classification of GPCRs that is purely based on their ligands, complementing sequence-based phylogenetic classifications of these receptors. Targets were hierarchically classified into phylogenetic trees, for both sequence space and ligand (substructure) space. The overall organization of the sequence-based tree and substructure-based tree was similar; in particular, the adenosine receptors cluster together, as do most peptide receptor subtypes (e.g. opioid, somatostatin) and adrenoceptor subtypes. In ligand space, the prostanoid and cannabinoid receptors are more distant from the other targets, whereas the tachykinin receptors, the oxytocin receptor, and serotonin receptors are closer to the other targets, which is indicative of ligand promiscuity. In 93% of the receptors studied, de-orphanization of a simulated orphan receptor using the ligands of related receptors performed better than random (AUC > 0.5) and for 35% of receptors de-orphanization performance was good (AUC > 0.7).
Conclusions: We constructed a phylogenetic classification of GPCRs that is solely based on the ligands of these receptors. The similarities and differences with traditional sequence-based classifications were investigated: our ligand-based classification uncovers relationships among GPCRs that are not apparent from the sequence-based classification. This will shed light on potential cross-reactivity of GPCR ligands and will aid the design of new ligands with the desired activity profiles. In addition, we linked the ligand-based classification with a ligand-focused sequence-based classification described in the literature and demonstrated the potential of this method for de-orphanization of GPCRs.
Background
G protein-coupled receptors (GPCRs) comprise a large
family of cell surface receptors, more than 800 in humans [1],
each consisting of seven transmembrane (TM) helices.
These receptors are activated by a variety of external
stimuli, including light, ions, small molecules, lipids, and
proteins; moreover, the majority of therapeutic drugs act
on GPCRs [2]. Because of the limited number of target
crystal structures [3-6], GPCR drug design relies largely
on ligand-based approaches [7] such as property-based
methods [8], pharmacophore models [9], and
substructure methods [10]. These methods do not require any
knowledge about the target protein; however, combining
them with target information often increases their
potential. The resulting so-called 'chemogenomics' approaches
thus involve both ligand-based and target-based aspects
[11]. They do not focus on a single group of ligands and
one individual target, but rather on groups of ligands
against groups of targets. The central idea is that similar
targets have similar ligands [12,13]. Therefore,
relationships between targets from the sequence side can be
exploited to search for novel receptor ligands on the
chemical structure side.
Traditionally, the GPCR superfamily has been classified
based on sequence homology of the receptors.
Kolakowski grouped all seven transmembrane (7-TM)
proteins into classes A to F for receptors proven to bind
G proteins and class O for the other 7-TM proteins [14].
Class A receptors resemble rhodopsin and form the
largest cluster. Later, Fredriksson et al. proposed a more
elaborate classification for known and predicted human
GPCRs [1]. Surgand et al. presented a sequence-based
phylogenetic classification of GPCRs viewed from a
ligand perspective [15]. By selecting residues pointing
inwards into the generic binding pocket of GPCRs, the
authors assembled a set of 30 residues most likely to be
accessible for ligand binding. Based on these residues,
phylogenetic clustering was performed. Although only a
subset of residues was used, the classification was similar
to classifications based on the full sequence. Applications
of a grouping such as that proposed by Surgand et al.
include ligand design for related receptors, as well as
de-orphanization of GPCRs [15]. However, the study by
Surgand et al. is somewhat limited by the scarcity of
structural protein data: the identification of binding site
residues was based solely on the structure of bovine
rhodopsin. It could not yet take into account recent advances
that yielded three pharmacologically relevant X-ray
crystal structures, namely those of the human β2 and turkey
β1 adrenoceptors, as well as of the human adenosine A2A
receptor [3,5,6,16]. Building further on Surgand's work,
Gloriam et al. proposed an extended set of
ligand-accessible residues, derived from visual inspection of the newly
available X-ray GPCR crystal structures, from supporting
mutagenesis data and from the evaluation of previously
established residue sets [17]. The resulting set of 44
residues was then applied to cluster class A GPCRs into a
phylogenetic tree, which reflected similarities in the binding
sites of the receptors.
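The residue-subset idea described above can be illustrated with a minimal sketch. It assumes the receptor sequences are already aligned, takes a list of alignment columns thought to line the binding pocket, and computes a simple mismatch fraction between receptors over those columns only. The sequences, column indices, and function names below are hypothetical toy values; they are not the 30-residue or 44-residue sets from [15] or [17], and the mismatch fraction is a stand-in for whatever distance the cited studies actually used.

```python
from itertools import combinations


def pocket_string(aligned_seq, positions):
    """Concatenate the residues found at the given 0-based alignment columns."""
    return "".join(aligned_seq[i] for i in positions)


def pocket_distance(a, b):
    """Fraction of pocket positions at which two receptors differ."""
    if len(a) != len(b):
        raise ValueError("pocket strings must have equal length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def distance_matrix(aligned, positions):
    """Pairwise pocket distances, keyed by alphabetically sorted name pairs."""
    pockets = {name: pocket_string(seq, positions) for name, seq in aligned.items()}
    dist = {}
    for r1, r2 in combinations(sorted(pockets), 2):
        dist[(r1, r2)] = pocket_distance(pockets[r1], pockets[r2])
    return dist


# Hypothetical toy alignment (not real receptor sequences).
aligned = {
    "receptor_A": "MKTAYIAKQR",
    "receptor_B": "MKTAYLAKQR",
    "receptor_C": "MRSGFIVKNR",
}
# Hypothetical pocket-lining columns (assumption for illustration only).
positions = [1, 3, 5, 6, 8]

dist = distance_matrix(aligned, positions)
for pair, d in dist.items():
    print(pair, d)
```

A distance table of this kind is the usual input to a hierarchical clustering step that produces a tree; the choice of linkage method is a separate decision not covered by this sketch.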
Complementary to these sequence-based
classifications are the ligand-based classifications of GPCRs.
Approaches that use ligand similarity measures for target
classification have been previously described [18,19].
Keiser et al. related targets by pair-wise comparison of
their ligands [20]. From a set of 65,000 ligands, a network
was constructed connecting almost all 246 targets
through sequential linkage. From this, previously
unknown antagonism of methadone on the muscarinic
M3 receptor and of emetine on the α2-adrenoceptor was
identified.
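As a rough illustration of relating targets through pair-wise comparison of their ligands, the sketch below represents each ligand as a set of feature labels, scores ligand pairs with the Tanimoto coefficient, and averages the scores over all cross-target ligand pairs. This is a deliberately simplified stand-in: it is not the statistical model used in [20], and the feature labels, target names, and the choice of a plain mean are assumptions made for illustration.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def target_similarity(ligands_1, ligands_2):
    """Mean Tanimoto score over all cross-target ligand pairs."""
    scores = [tanimoto(x, y) for x in ligands_1 for y in ligands_2]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical ligand sets; each ligand is a set of made-up feature labels.
target_X = [{"ring", "amine", "ether"}, {"ring", "amine"}]
target_Y = [{"ring", "amine", "halogen"}]
target_Z = [{"acid", "chain"}]

sim_xy = target_similarity(target_X, target_Y)
sim_xz = target_similarity(target_X, target_Z)
print("X vs Y:", round(sim_xy, 3))
print("X vs Z:", round(sim_xz, 3))
```

In this toy data, targets X and Y share ligand features and score well above X and Z, which share none. A real analysis would use chemically meaningful descriptors and would need to correct for ligand set size, which a plain mean does not do.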
While sequence-based similarity relies on comparison
of the residues at certain positions in the sequence, there
is no unambiguously defined method to measure
ligand-based similarity. One way of defining ligand similarity is
to consider the overlap of substructures in the molecules.
Frequent substructure mining is a method for finding the
most common substructures in a set of molecules
[21-23]. It evaluates all possible substructures, not only
discrete fragments that are present in the molecules; it is
therefore an exhaustive appro (...truncated)