Genome mining of unusual biosynthetic gene clusters

The Journal of Antibiotics (2026) 79:139–140
https://doi.org/10.1038/s41429-026-00895-2

SPECIAL FEATURE: EDITORIAL

Yi Tang¹

Received: 11 January 2026 / Accepted: 18 January 2026 / Published online: 27 February 2026
© The Author(s), under exclusive licence to the Japan Antibiotics Research Association 2026

In the post-genomics and post-transcriptomics era, natural product discovery has undergone a transformation driven by an unprecedented ability to identify, analyze and characterize biosynthetic pathways. With decreasing costs of genome sequencing and DNA synthesis, together with advances in synthetic biology tools, millions of biosynthetic gene clusters (BGCs) from bacteria, fungi, plants and even animals are now accessible for mining and engineering. A challenge for the field is how to prioritize such a vast database of BGCs to identify new natural product structures and biological activities, as well as new biosynthetic logic and enzymatic machinery. In the last two decades, different bioinformatic, genetic and chemical-biological strategies have been developed to help scientists in the pursuit of new compounds and enzymes [1]. BGCs of interest can now be readily assembled even in the absence of the native producing organisms, and can be introduced into chassis organisms for heterologous expression and compound identification. These studies have established canonical biosynthetic logic, especially for the largest classes of natural products such as polyketides (PK), nonribosomal peptides (NRP), ribosomally synthesized and post-translationally modified peptides (RiPPs), terpenes and alkaloids. The wealth of biosynthetic knowledge has enabled the prediction of natural product structures and aided in silico dereplication of BGCs prior to experimental work.
In turn, scientists and natural product hunters are increasingly looking for “unusual” BGCs that do not follow canonical logic. Such a strategy increases the likelihood of new natural product discovery. This special issue, “Unusual Biosynthetic Gene Clusters”, includes a collection of review and research articles showcasing this approach.

* Yi Tang
¹ Department of Chemistry and Biochemistry, Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, CA, USA

In the biosynthesis of PK and NRP, Nature uses assembly-line enzymes to polymerize building blocks and modify the nascent products. These enzymes, including polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs), have been well characterized. However, deviations from canonical PKS and NRPS functions can be identified using in silico tools, including domain organization prediction, sequence similarity analysis and active site structural analysis. For example, Asai and coworkers searched for atypical fungal PKSs [2]. The authors focused on BGCs that encode dual PKSs that can function collaboratively. Structural prediction of the thioesterase (TE) domain was performed to reveal potential new reactivities. These efforts led to the identification of one BGC that, when heterologously expressed, afforded a dimeric alkylresorcinol. Dimerization is facilitated by the unusual TE, and also requires one of the PKSs to have tandem acyl carrier protein domains.

Watanabe and coworkers performed heterologous expression of a unique fungal NRPS that is present in only seven strains of the pathogenic fungus Aspergillus lentulus [3]. A new quinazoline, lentoquinazoline, was isolated and characterized. This 6-6-6 tricyclic alkaloid is derived from anthranilate, L-leucine and L-asparagine. The authors noted that the central L-leucine in the alkaloid is not epimerized, which is atypical for this family of compounds.
This was consistent with NRPS sequence analysis, in which the epimerization domain was predicted to be inactive due to active site mutations.

In the paper by Wakimoto and coworkers, the authors focused on the biosynthesis of the cyclic peptide momomycin by an NRPS assembly line [4]. The terminal TE domain was shown to prefer a cyclic secondary amine, the N-terminal hydroxyproline, as the nucleophile. Structural prediction and mutagenesis were performed to establish a potential binding mode of this unusual nucleophile in NRPS cyclization.

Zhang and Zhang reported the discovery of novel polyketide–peptide hybrid compounds, aquimarinols A–D, from the marine bacterium Aquimarina muelleri [5]. These compounds are fatty acids amidated with a threoninol moiety. The reduction of the threonine residue is catalyzed by a C-terminal reductase domain. Notably, the fatty acid portion contains a β-formamide group that is introduced by the combined actions of a glutamine amidotransferase and a formyltransferase. The unique features of the compounds are reflected in the corresponding BGC.

An emerging area of interest is to identify biosynthetic pathways that do not encode widely studied core enzymes (PKSs, NRPSs, etc.). Instead, alternative scaffold-building enzymes are employed to generate natural product skeletons. Such pathways are intriguing because of (1) the difficulty in identifying the key biosynthetic enzyme, and (2) the inability to predict product structures from BGCs. As a result, most current databases and structure prediction tools do not account for these BGCs. We have referred to such BGCs/pathways as “Unknown” (biosynthetic logic) “unknowns” (natural products) [6]. Mao and coworkers summarized recent advances in this area, and included pathways that use PLP-dependent enzymes, tRNA-dependent cyclodipeptide synthases, N-N bond-forming enzymes, Pictet–Spenglerases, etc., as scaffold-building enzymes [7].
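
As a rough illustration of why such pathways escape core-enzyme-based searches, the sketch below splits a list of annotated BGCs into those that contain a canonical core enzyme and those that do not. It is a toy example under stated assumptions, not the method of any cited work: the `Gene` and `BGC` records, the annotation labels, the `CANONICAL_CORES` set and the two example clusters are all hypothetical, and real annotation tools use their own vocabularies.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical labels for canonical core enzymes (assumption for illustration only).
CANONICAL_CORES = {"PKS", "NRPS"}


@dataclass
class Gene:
    name: str
    annotation: str  # free-text functional label, assumed to be already assigned


@dataclass
class BGC:
    cluster_id: str
    genes: List[Gene] = field(default_factory=list)


def canonical_cores(bgc: BGC) -> List[str]:
    """Return the sorted canonical core-enzyme labels found in a cluster."""
    return sorted({g.annotation for g in bgc.genes if g.annotation in CANONICAL_CORES})


def split_by_core(bgcs: List[BGC]) -> Tuple[List[str], List[str]]:
    """Split cluster IDs into (has a canonical core, lacks a canonical core)."""
    with_core, without_core = [], []
    for bgc in bgcs:
        target = with_core if canonical_cores(bgc) else without_core
        target.append(bgc.cluster_id)
    return with_core, without_core


# Made-up clusters: one with a PKS, one built around alternative scaffold-building enzymes.
example_bgcs = [
    BGC("cluster_A", [Gene("a1", "PKS"), Gene("a2", "methyltransferase")]),
    BGC("cluster_B", [Gene("b1", "PLP-dependent enzyme"),
                      Gene("b2", "cyclodipeptide synthase")]),
]

with_core, without_core = split_by_core(example_bgcs)
print("canonical core present:", with_core)
print("no canonical core:", without_core)
```

A cluster that lands in the second list is only a candidate: the absence of a recognized core label says nothing about which gene actually builds the scaffold, which is the difficulty noted in point (1) above.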
The interesting biosynthetic logic of one such pathway, that of lincosamides, is reviewed in detail by Mori and Yang [8]. Lincosamides such as lincomycin A are potent ribosome inhibitors and feature a thiooctose core connected to either proline or alkylproline, and an unusual S-alkyl substitution. The review highlights three enzymes that are key to building these features: an S-glycosyltransferase, a carrier protein-dependent condensation enzyme and a PLP-dependent β-lyase. The structural bases of catalysis by these enzymes are also reviewed.

Takahashi and coworkers provided structure-function insights into biosynthetic enzymes that use aminoacylated tRNA as substrates to introduce amino acids into complex natural products [9]. The authors introduced the concept of AI-driven “forecasting biosynthesis”, in which the function of a cryptic aminoacyl-tRNA synthetase can be predicted. These enzymes use aminoacyl-tRNA either as an electrophile, with tRNA as the leaving group, or as a nucleophile through its α-amine. The structures of several examples from different classes of natural products were predicted with AlphaFold 3.

Ushimaru and Yu reviewed how sequence similarity network (SSN) analysis can be used (...truncated)