In silico analysis of methyltransferase domains involved in biosynthesis of secondary metabolites (pdf)

Article PDF: http://www.biomedcentral.com/content/pdf/1471-2105-9-454.pdf

BMC Bioinformatics

Mohd Zeeshan Ansari, Jyoti Sharma, Rajesh S Gokhale, Debasisa Mohanty

Address: National Institute of Immunology, Aruna Asaf Ali Marg, New Delhi-110067, India

Background: Secondary metabolites biosynthesized by the polyketide synthase (PKS) and nonribosomal peptide synthetase (NRPS) families of enzymes constitute several classes of therapeutically important natural products, such as erythromycin, rapamycin and cyclosporine. In view of their relevance for natural-product-based drug discovery, identification of novel secondary metabolites by genome mining has been an area of active research. A number of tailoring enzymes catalyze a variety of chemical modifications to the polyketide or nonribosomal peptide backbone of these secondary metabolites to enhance their structural diversity. Therefore, development of powerful bioinformatics methods for identifying these tailoring enzymes and assigning their substrate specificity is crucial for deciphering novel secondary metabolites by genome mining.

Results: In this work, we have carried out a comprehensive bioinformatics analysis of methyltransferase (MT) domains present in multifunctional type I PKS and NRPS proteins encoded by PKS/NRPS gene clusters with known secondary metabolite products. Based on the results of this analysis, we have developed a novel knowledge-based computational approach for detecting MT domains present in PKS and NRPS megasynthases, delineating their correct boundaries and classifying them as N-MT, C-MT or O-MT using profile HMMs. Analysis of proteins in the nr database of NCBI using these class-specific profiles has revealed several interesting examples, namely C-MT domains in NRPS modules, N-MT domains with significant homology to C-MT proteins, and the presence of NRPS/PKS MTs in association with other catalytic domains.
Our analysis of the chemical structures of the secondary metabolites and their sites of methylation suggested that a possible evolutionary basis for the presence of a novel class of N-MT domains with significant homology to C-MT proteins could be the close resemblance of the chemical structures of the acceptor substrates, as in the case of pyochelin and yersiniabactin. These two classes of MTs recognize similar acceptor substrates, but transfer methyl groups to N and C positions on these substrates.

Conclusion: We have developed a novel knowledge-based computational approach for identifying MT domains present in type I PKS and NRPS multifunctional enzymes and predicting their site of methylation. Analysis of the nr database using this approach has revealed the presence of several novel MT domains. Our analysis has also given interesting insight into the evolutionary basis of the novel substrate specificities of these MT proteins.

Background

Nonribosomal peptide synthetases (NRPSs), polyketide synthases (PKSs) and fatty acid synthases (FASs) employ a common biosynthetic strategy to synthesize their metabolic products by stepwise condensation of simple amino or carboxylic acid monomers. The core catalytic domains involved in the biosynthesis of the polyketide/nonribosomal peptide/fatty acid backbone moieties are ketosynthase (KS), acyltransferase (AT), dehydratase (DH), enoylreductase (ER), ketoreductase (KR), acyl carrier protein (ACP), condensation (C), adenylation (A) and thiolation (T) [1,2]. Apart from these core catalytic domains, a number of auxiliary functional domains, often called tailoring domains, introduce a variety of chemical modifications to the backbone moieties of these secondary metabolites to further increase their structural diversity. Bioinformatics analysis of various catalytic domains present in NRPS and PKS proteins has been an area of active research in recent years [3-8].
These studies [3-8] have not only led to the development of novel computational methods for in silico identification of secondary metabolites by genome mining [9-16], but have also guided rational reprogramming of secondary metabolite biosynthetic pathways to generate designed "natural products" [12,17-20]. However, all these studies, including our earlier work, have concentrated on core catalytic domains, and no detailed bioinformatics analyses have been carried out for important tailoring enzymes such as methyltransferases.

Methyltransferase (MT) domains present in NRPS and PKS clusters constitute a major class of tailoring domains/enzymes involved in biosynthesis of secondary metabolites. They catalyze the transfer of a methyl group from S-adenosylmethionine (SAM or AdoMet) to carbon, nitrogen or oxygen atoms at various positions on the backbones of polyketides, nonribosomal peptides and fatty acids, and have therefore been classified as C-MT, N-MT and O-MT, respectively, depending upon their site of methylation. These enzymatic domains in general have a bidomain structure, where the first subdomain contains the binding site for the methyl group donor, while the second subdomain harbors the binding site for the acceptor substrate [21,22].

The presence of MT domains in multifunctional NRPS and PKS proteins is generally inferred from the chemical structure of the secondary metabolite products. There are only a few in vitro studies on enzymatic characterization of NRPS/PKS MT domains [23-27]. A recent study on MT domains from type II PKS biosynthetic pathways has revealed an interesting correlation between regioselectivity of methylation and MT sequence [24]. However, no such analysis has been carried out for MT domains present in type I PKS or NRPS proteins. In contrast to type II PKS MTs, which are stand-alone proteins, MT domains in type I PKS and NRPS proteins are present along with other catalytic domains on a single polypeptide chain.
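The naming convention described above, in which an MT is labelled by the atom that accepts the methyl group, can be sketched as a small lookup. This is only an illustration of the nomenclature: the names `MTClass` and `classify_by_acceptor` are hypothetical and do not come from the paper, and the sketch says nothing about how the class is predicted from sequence.

```python
from enum import Enum


class MTClass(Enum):
    """Methyltransferase classes, named by the atom that accepts the methyl group."""
    C_MT = "C-MT"
    N_MT = "N-MT"
    O_MT = "O-MT"


# Acceptor atom (element symbol) -> MT class, following the convention in the text.
# The table and function names are illustrative assumptions, not the authors' code.
_ACCEPTOR_TO_CLASS = {
    "C": MTClass.C_MT,
    "N": MTClass.N_MT,
    "O": MTClass.O_MT,
}


def classify_by_acceptor(atom: str) -> MTClass:
    """Return the MT class for a methyl-acceptor atom given as an element symbol."""
    key = atom.strip().upper()
    if key not in _ACCEPTOR_TO_CLASS:
        raise ValueError(f"unsupported acceptor atom: {atom!r}")
    return _ACCEPTOR_TO_CLASS[key]


print(classify_by_acceptor("N").value)  # prints N-MT
```

An enum keeps the three labels fixed, so a misspelled class name fails early rather than propagating through later annotation steps.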
Because MT domains in type I PKS and NRPS proteins are embedded in multidomain polypeptides, it has been difficult to decipher their correct length and domain boundaries. Various studies have suggested that the N-MT domain is typically 450 amino acids long, while C-MT and O-MT domains are generally 300 amino acids long. A set of three conserved sequence motifs has been identified in most MTs [28-30]. Mutational studies of N-MTs of peptide synthetases have shown that these three motifs are essential for catalysis [31]. Knowledge of these MT sequence motifs and the expected spacing between them is often used to discern the presence of MT domains in multifunctional NRPS and PKS proteins. However, because of the high degree of sequence divergence, delineating the correct boundaries of these domains is often difficult.

In our earlier study, we attempted to identify MT domains in various NRPS/PKS gene clusters based on pairwise alignment with the MT domain from the actinomycin cluster [32]. However, this domain identification protocol failed to detect 23 out of 32 MT domains (roughly 72%). The 23 unidentified MT domains included the three groups of MTs (C-, O- and N-MTs), for which proper te (...truncated)