Genome mining of unusual biosynthetic gene clusters
The Journal of Antibiotics (2026) 79:139–140
https://doi.org/10.1038/s41429-026-00895-2
SPECIAL FEATURE: EDITORIAL
Yi Tang
Received: 11 January 2026 / Accepted: 18 January 2026 / Published online: 27 February 2026
© The Author(s), under exclusive licence to the Japan Antibiotics Research Association 2026
In the post-genomics and post-transcriptomics era, natural
product discovery has undergone a transformation that is
driven by an unprecedented ability to identify, analyze and
characterize biosynthetic pathways. With decreasing costs
of genome sequencing and DNA synthesis, together with
advances in synthetic biology tools, millions of biosynthetic
gene clusters (BGCs) from bacteria, fungi, plants and even
animals are now accessible for mining and engineering. A
challenge for the field is how to prioritize such a vast
database of BGCs to identify new natural product structures
and biological activities, as well as new biosynthetic logic
and enzymatic machinery.
In the last two decades, different bioinformatic, genetic and
chemical-biological strategies have been developed to help
scientists in the pursuit of new compounds and enzymes [1].
BGCs of interest can now be readily assembled even in the
absence of the native producing organisms, and can be
introduced into chassis organisms for heterologous expression and compound identification. These studies have
established canonical biosynthetic logic, especially for the
largest classes of natural products such as polyketides (PK),
nonribosomal peptides (NRP), ribosomally synthesized and
post-translationally modified peptides (RiPPs), terpenes and
alkaloids. The wealth of biosynthetic knowledge has
enabled the prediction of natural product structures and
aided in silico dereplication of BGCs prior to experimental
work. In turn, scientists and natural product hunters are
increasingly looking for “unusual” BGCs that do not follow
canonical logic. Such a strategy increases the likelihood of
new natural product discovery. In this special issue of
“Unusual Biosynthetic Gene Clusters”, a collection of
review and research articles showcasing this approach is
included.
* Yi Tang
Department of Chemistry and Biochemistry, Department of
Chemical and Biomolecular Engineering, University of California,
Los Angeles, CA, USA
In the biosynthesis of PK and NRP, assembly-line
enzymes are used by Nature to polymerize building blocks
and modify the nascent products. These enzymes, including
polyketide synthases (PKSs) and nonribosomal peptide
synthetases (NRPSs), have been well characterized. However, deviation from canonical PKS and NRPS functions
can be identified using in silico tools, including domain
organization predictions, sequence similarity analysis,
active site structural analysis, etc. For example, in the paper
by Asai and coworkers, the authors searched for atypical
fungal PKSs [2]. The authors focused on BGCs that encode
dual PKSs that can function collaboratively. Structural
prediction of the thioesterase (TE) domain was performed to
reveal potential new reactivities. These efforts led to the identification of one BGC that, when heterologously expressed,
afforded a dimeric alkylresorcinol. Dimerization is facilitated by the unusual TE, and also requires one of the PKSs
to have tandem acyl carrier protein domains.
Watanabe and coworkers performed heterologous
expression of a unique fungal NRPS that is present only in
seven different strains of the pathogenic fungus Aspergillus
lentulus [3]. A new quinazoline, lentoquinazoline, was isolated and characterized. This 6-6-6 tricyclic alkaloid is
derived from anthranilate, L-leucine and L-asparagine. The
authors noted that the central L-leucine in the alkaloid is not
epimerized, which is atypical for this family of compounds.
This was consistent with NRPS sequence analysis in which
the epimerization domain was predicted to be inactive due
to active site mutations.
In the paper by Wakimoto and coworkers, the authors
focused on the biosynthesis of the cyclic peptide momomycin by an NRPS assembly line [4]. The terminal TE
domain was shown to prefer a cyclic secondary amine, the
N-terminal hydroxyproline, as the nucleophile.
Structural prediction and mutagenesis were performed
to establish a potential binding mode of this unusual
nucleophile in NRPS cyclization.
Zhang and Zhang reported the discovery of novel
polyketide–peptide hybrid compounds, aquimarinols A–D,
from the marine bacterium Aquimarina muelleri [5]. These
compounds are fatty acids amidated with a threoniol moiety. The reduction of the threonine residue is catalyzed by a
C-terminal reductase domain. Notably, the fatty acid portion
contains a β-formamide group that is introduced by the
combined actions of a glutamine amidotransferase and a
formyltransferase. The unique features of the compounds
are reflected in the corresponding BGC.
An emerging area of interest is to identify biosynthetic
pathways that do not encode widely studied core enzymes
(PKSs, NRPSs, etc.). Instead, alternative scaffold-building
enzymes are employed to generate natural product skeletons. Such pathways are intriguing because of (1) the difficulty in identifying the key biosynthetic enzyme, and (2)
the inability to predict product structures from BGCs. As a
result, most current databases and structure prediction tools
do not account for these BGCs. We have referred to such
BGCs/pathways as “Unknown” (biosynthetic logic) “unknowns” (natural products) [6]. Mao and coworkers
summarized recent advances in this area, and included
pathways that use PLP-dependent enzymes, tRNA-dependent cyclodipeptide synthases, N-N bond-forming
enzymes, Pictet-Spenglerases, etc., as scaffold-building
enzymes [7].
The interesting biosynthetic logic of one such pathway,
that of lincosamides, is reviewed in detail by Mori and
Yang [8]. Lincosamides such as lincomycin A are potent
ribosome inhibitors and feature a thiooctose core connected
to either proline or alkylproline, and an unusual S-alkyl
substitution. The review highlights three enzymes that are
key to building these features: an S-glycosyltransferase, a carrier protein-dependent condensation enzyme,
and a PLP-dependent β-lyase. The structural bases of catalysis by
these enzymes are also reviewed.
Takahashi and coworkers provided structure-function
insights into biosynthetic enzymes that use aminoacylated
tRNA as substrates to introduce amino acids into complex
natural products [9]. The authors introduced the concept of
AI-driven “forecasting biosynthesis”, in which the function
of a cryptic aminoacyl-tRNA-synthetase can be predicted.
These enzymes use aminoacyl-tRNA either as an electrophile, with tRNA as the leaving group, or as a
nucleophile through the α-amine. The structures of several examples from different classes of natural products were predicted with
AlphaFold 3.
Ushimaru and Yu reviewed how sequence similarity
network (SSN) analysis can be used (...truncated)