A novel chemogenomics analysis of G protein-coupled receptors (GPCRs) and their ligands: a potential strategy for receptor de-orphanization

Article PDF: http://www.biomedcentral.com/content/pdf/1471-2105-11-316.pdf

Eelke van der Horst (0), Julio E Peironcely (0), Adriaan P IJzerman (0), Margot W Beukers (0), Jonathan R Lane (0), Herman WT van Vlijmen (0), Michael TM Emmerich, Yasushi Okuno, Andreas Bender (0)

(0) Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden University, Einsteinweg 55, 2333CC, The Netherlands

Background: G protein-coupled receptors (GPCRs) represent a family of well-characterized drug targets with significant therapeutic value. Phylogenetic classifications may help to understand the characteristics of individual GPCRs and their subtypes. Previous phylogenetic classifications were all based on the sequences of receptors and added only minor information about the ligand-binding properties of the receptors. In this work, we compare a sequence-based classification of receptors to a ligand-based classification of the same group of receptors, and evaluate the potential of sequence relatedness as a predictor of ligand interactions, thus aiding the quest for ligands of orphan receptors.

Results: We present a classification of GPCRs that is based purely on their ligands, complementing sequence-based phylogenetic classifications of these receptors. Targets were hierarchically classified into phylogenetic trees, for both sequence space and ligand (substructure) space. The overall organization of the sequence-based tree and the substructure-based tree was similar; in particular, the adenosine receptors cluster together, as do most peptide receptor subtypes (e.g. opioid, somatostatin) and adrenoceptor subtypes. In ligand space, the prostanoid and cannabinoid receptors are more distant from the other targets, whereas the tachykinin receptors, the oxytocin receptor, and serotonin receptors are closer to the other targets, which is indicative of ligand promiscuity.
In 93% of the receptors studied, de-orphanization of a simulated orphan receptor using the ligands of related receptors performed better than random (AUC > 0.5), and for 35% of receptors de-orphanization performance was good (AUC > 0.7).

Conclusions: We constructed a phylogenetic classification of GPCRs that is based solely on the ligands of these receptors. The similarities and differences with traditional sequence-based classifications were investigated: our ligand-based classification uncovers relationships among GPCRs that are not apparent from the sequence-based classification. This will shed light on potential cross-reactivity of GPCR ligands and will aid the design of new ligands with the desired activity profiles. In addition, we linked the ligand-based classification with a ligand-focused sequence-based classification described in the literature and proved the potential of this method for de-orphanization of GPCRs.

Background

G protein-coupled receptors (GPCRs) comprise a large family of cell surface receptors, with more than 800 members in humans [1], that consist of seven transmembrane (TM) helices. These receptors are activated by a variety of external stimuli, including light, ions, small molecules, lipids, and proteins; moreover, the majority of therapeutic drugs act on GPCRs [2]. Because of the limited number of target crystal structures [3-6], GPCR drug design relies largely on ligand-based approaches [7] such as property-based methods [8], pharmacophore models [9], and substructure methods [10]. These methods do not require any knowledge about the target protein; however, combining them with target information often increases their potential. The resulting so-called 'chemogenomics' approaches thus involve both ligand-based and target-based aspects [11]. They do not focus on a single group of ligands and one individual target, but rather on groups of ligands against groups of targets. The central idea is that similar targets have similar ligands [12,13].
Therefore, relationships between targets from the sequence side can be exploited to search for novel receptor ligands on the chemical structure side.

Traditionally, the GPCR superfamily has been classified based on sequence homology of the receptors. Kolakowski grouped all seven transmembrane (7-TM) proteins into classes A to F for receptors proven to bind G proteins and class O for the other 7-TM proteins [14]. Class A receptors resemble rhodopsin and form the largest cluster. Later, Fredriksson et al. proposed a more elaborate classification for known and predicted human GPCRs [1].

Surgand et al. presented a sequence-based phylogenetic classification of GPCRs viewed from a ligand perspective [15]. By selecting residues pointing inwards into the generic binding pocket of GPCRs, the authors assembled a set of 30 residues most likely to be accessible for ligand binding. Phylogenetic clustering was then performed on these residues. Although only a subset of residues was used, the classification was similar to classifications based on the full sequence. Applications of a grouping such as that proposed by Surgand et al. include ligand design for related receptors as well as de-orphanization of GPCRs [15]. However, the study by Surgand et al. was somewhat limited by the scarcity of structural protein data: the identification of binding-site residues was based solely on the structure of bovine rhodopsin. It could not yet take into account recent advances that yielded three pharmacologically relevant X-ray crystal structures, namely those of the human β2 and turkey β1 adrenoceptors, as well as of the human adenosine A2A receptor [3,5,6,16]. Building further on Surgand's work, Gloriam et al. proposed an extended set of ligand-accessible residues, derived from visual inspection of the newly available GPCR X-ray crystal structures, from supporting mutagenesis data, and from the evaluation of previously established residue sets [17].
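To illustrate how receptors can be compared on a subset of ligand-accessible residues rather than on the full sequence, the following minimal Python sketch computes pairwise distances from selected alignment columns only. The receptor names, the toy aligned sequences, the column list, and the simple mismatch-fraction distance are all hypothetical placeholders chosen for illustration; they are not the residue sets or the distance measure used in the cited studies.

```python
from itertools import combinations

# Hypothetical, pre-aligned toy sequences of equal length (not real GPCR data).
ALIGNED = {
    "receptor_A": "MKTAYIAKQR",
    "receptor_B": "MKTGYIAKQR",
    "receptor_C": "MRSAYLVKHR",
}

# Hypothetical alignment columns assumed to line the binding pocket.
POCKET_POSITIONS = [1, 3, 5, 6, 8]


def pocket_residues(sequence, positions):
    """Return only the residues found at the selected alignment columns."""
    return [sequence[i] for i in positions]


def mismatch_distance(seq_a, seq_b, positions):
    """Fraction of selected columns at which two aligned sequences differ."""
    a = pocket_residues(seq_a, positions)
    b = pocket_residues(seq_b, positions)
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(positions)


def distance_matrix(aligned, positions):
    """Pairwise pocket distances, keyed by sorted receptor-name pairs."""
    return {
        (p, q): mismatch_distance(aligned[p], aligned[q], positions)
        for p, q in combinations(sorted(aligned), 2)
    }


if __name__ == "__main__":
    for pair, dist in distance_matrix(ALIGNED, POCKET_POSITIONS).items():
        print(pair, round(dist, 2))
```

A distance matrix of this kind could then be passed to any hierarchical clustering routine to obtain a tree; a real analysis would more likely use a substitution-matrix-based distance than a plain mismatch count.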
The resulting set of 44 residues from Gloriam et al. was then applied to cluster class A GPCRs into a phylogenetic tree, which reflected similarities in the binding sites of the receptors.

Complementary to these sequence-based classifications are the ligand-based classifications of GPCRs. Approaches that use ligand similarity measures for target classification have been described previously [18,19]. Keiser et al. related targets by pair-wise comparison of their ligands [20]. From a set of 65k ligands, a network was constructed connecting almost all 246 targets through sequential linkage. From this, previously unknown antagonism of methadone on the muscarinic M3 receptor and of emetine on the α2-adrenoceptor was identified.

While sequence-based similarity relies on comparison of the residues at certain positions in the sequence, there is no unambiguously defined method to measure ligand-based similarity. One way of defining ligand similarity is to consider the overlap of substructures in the molecules. Frequent substructure mining is a method for finding the most common substructures in a set of molecules [21-23]. It evaluates all possible substructures, not only discrete fragments that are present in the molecules; it is therefore an exhaustive appro (...truncated)
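As a rough illustration of ligand similarity defined by substructure overlap, the sketch below represents each ligand as a set of substructure labels and scores a pair of ligands with the Tanimoto (Jaccard) coefficient. Both the hand-written substructure labels and the choice of the Tanimoto coefficient are assumptions made for this example; the text above does not specify how overlap is quantified, and in practice the substructures would come from a mining tool rather than being listed by hand.

```python
def tanimoto(substructures_a, substructures_b):
    """Tanimoto (Jaccard) coefficient between two sets of substructure labels."""
    a, b = set(substructures_a), set(substructures_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets share nothing.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical ligands described by hand-written substructure labels.
LIGANDS = {
    "ligand_1": {"benzene", "amide", "piperidine"},
    "ligand_2": {"benzene", "amide", "furan"},
    "ligand_3": {"carboxylic_acid", "cyclopentane"},
}


if __name__ == "__main__":
    names = sorted(LIGANDS)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = tanimoto(LIGANDS[first], LIGANDS[second])
            print(first, second, round(score, 2))
```

A score of 1 means the two ligands share every substructure in the chosen vocabulary and 0 means they share none; how well such a score tracks pharmacological similarity depends heavily on which substructures are enumerated.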