Deciphering complex breakage-fusion-bridge genome rearrangements with Ambigram
Article
https://doi.org/10.1038/s41467-023-41259-w
Received: 13 October 2022
Accepted: 28 August 2023
Chaohui Li 1,2, Lingxi Chen 1,2, Guangze Pan 1, Wenqian Zhang 1 & Shuai Cheng Li 1
Breakage-fusion-bridge (BFB) is a complex rearrangement that leads to tumor
malignancy. Existing models for detecting BFBs rely on the ideal BFB
hypothesis, ruling out the possibility of BFBs entangled with other structural
variations, that is, complex BFBs. We propose an algorithm, Ambigram, to
identify complex BFBs and reconstruct the rearranged structure of the local
genome during the cancer subclone evolution process. Ambigram handles
data from short, linked, long, and single-cell sequences, and optical mapping
technologies. Compared with state-of-the-art methods, Ambigram successfully deciphers gold- or silver-standard
complex BFBs in multiple cancers. Ambigram dissects the intratumor heterogeneity of complex BFB events with single-cell
reads from melanoma and gastric cancer. Furthermore, applying Ambigram to
liver and cervical cancer data suggests that the BFB mechanism may mediate
oncovirus integrations. BFB also exists in noncancer genomics. Investigating
the complete human genome reference with Ambigram suggests that the BFB
mechanism may be involved in two genome reorganizations of Homo sapiens
during evolution. Moreover, Ambigram discovers signals of recurrent
foldback inversions and complex BFBs in whole-genome data from the 1000
Genomes Project and congenital heart diseases, respectively.
Breakage-fusion-bridge (BFB) is a mechanism that leads to complex
genome rearrangements in multiple cancers1–13. The rearrangement of
BFB is mediated by the recursive cycles of BFB14–16 (Fig. 1a). A BFB cycle
begins with the fold-back inversion (FBI) of two sister chromatids due
to the lack of telomeres during DNA replication, resulting in two centromeres in the fused bridge. When the two centromeres are pulled
to opposite poles in anaphase, the two sister chromatids are split
by a double-strand break on the bridge between the two centromeres,
not necessarily at the previous fusion site. Since each daughter
cell contains chromatids without telomeres, another BFB cycle may
start again. Repetition of BFB cycles contributes to a surge in stair-like
copy number (CN) amplifications and FBIs15–18. The above depicts a
perfect BFB event that is solely driven by reverse complementary FBI,
where the genomic positions of two breakpoints of a reverse
complementary FBI are the same. However, some BFB events involve
imperfect FBI whose breakpoint positions are different, resulting in the
loss of DNA segments near the breakpoints4. Furthermore, studies
reported that structural variations (SVs) such as deletions, duplications,
insertions, and translocations can occur during BFB cycles outside the FBI breakpoints13,19–21. In this study, we refer to any BFB rearrangement beyond perfect BFB as a complex BFB rearrangement.
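The cycle described above can be illustrated with a toy model. The sketch below is not Ambigram's algorithm; it assumes a chromosome arm is a list of signed segment labels ordered from the centromere outward, with the telomere missing at the end of the list. The function names and the `lost` parameter (a simple stand-in for imperfect FBI, where the sister copy lacks the segments nearest the fusion point) are hypothetical and introduced only for illustration.

```python
from collections import Counter


def fuse(arm, lost=0):
    """Fold-back fusion: join the arm to the reverse complement of its sister.

    lost=0 models a perfect (reverse complementary) FBI. lost>0 is an assumed
    model of imperfect FBI: the sister copy lacks the `lost` segments nearest
    the fusion point.
    """
    if not 0 <= lost < len(arm):
        raise ValueError("lost must be in [0, len(arm))")
    sister = arm[:len(arm) - lost]
    return arm + [-s for s in reversed(sister)]


def bfb_cycle(arm, break_index, lost=0):
    """One BFB cycle: fuse, then break the bridge and keep the
    centromere-proximal part (the first `break_index` segments)."""
    bridge = fuse(arm, lost)
    if not 1 <= break_index < len(bridge):
        raise ValueError("break must fall inside the bridge")
    return bridge[:break_index]


def copy_numbers(arm):
    """Copy number of each segment, ignoring orientation."""
    return dict(sorted(Counter(abs(s) for s in arm).items()))


def foldback_junctions(arm):
    """Count adjacent pairs (s, -s), i.e. perfect fold-back junctions."""
    return sum(1 for a, b in zip(arm, arm[1:]) if a == -b)


# Two perfect BFB cycles on a four-segment arm.
arm = [1, 2, 3, 4]
arm = bfb_cycle(arm, 6)          # [1, 2, 3, 4, -4, -3]
arm = bfb_cycle(arm, 9)          # [1, 2, 3, 4, -4, -3, 3, 4, -4]
print(copy_numbers(arm))         # {1: 1, 2: 1, 3: 3, 4: 4}
print(foldback_junctions(arm))   # 3
```

In this toy run, two cycles already produce a stair-like CN profile (1, 1, 3, 4) and three fold-back junctions, which correspond to the two signatures the text attributes to repeated BFB cycles.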
Because the BFB process produces anaphase bridges and dicentric
chromosomes, investigators first detected it using classical cytogenetic
techniques10. However, these BFB cytogenetic signatures are not directly discernible from high-throughput DNA
sequencing reads. Most studies inferred the consistency of a specific
observation with BFB events from sequencing reads by two distinct
hallmarks3–9,11,12,16: (i) oscillating CN with exponential or stair-like gains;
1Department of Computer Science, City University of Hong Kong, Hong Kong, China. 2These authors contributed equally: Chaohui Li and Lingxi Chen.
Nature Communications | (2023)14:5528
[Figure 1. Panel labels: a Mechanism of BFB (telomere loss, fusion, bridge, breakage, FBI; accumulation of BFB recursive cycles). b Hallmarks of BFB (stair-like CN gains; fold-back inversions). c Workflow of Ambigram (locate reference path; find BFB candidate; construct directed acyclic graph (DAG); enumerate BFB mono-chains (m) and loops; refine CN of BFB patterns and loops (ILP); resolve the local genomic map (LGM) of BFB). d Metrics of Ambigram (complex BFB with DEL/DUP/INS/TRX; short/linked/long reads and optical mapping; single-cell reads; onco-virus integration). e Ambigram resolves complex BFB events on pan-cancer and complex disease (melanoma, lung cancer, breast cancer, pancreatic cancer, gastric cancer, liver cancer, cervical cancer, heart disease).]
Fig. 1 | Schematic overview. a The mechanism of BFB. b The hallmarks of BFB.
c Workflow of Ambigram. d Metrics of Ambigram. e Summary of Ambigram
applications. All anatomy images with free licenses are provided by Freepik. BFB
breakage-fusion-bridge, FBI fold-back inversion, CN copy number, ILP integer linear
programming, DEL deletion, DUP duplication, INS insertion, TRX translocation.
and (ii) enrichment of FBIs with a head-to-head (FBI-hh) or tail-to-tail
(FBI-tt) fold-back direction (Fig. 1b). However, a pattern consistent
with the two hallmarks does not imply that BFB produced the pattern.
Leveraging the “palindrome” or “ambigram” nature of the BFB-induced local genomic map, i.e., the rearranged structure of the local
genome, researchers have used well-established algorithms to
mathematically expand CNs or FBIs into possible BFB paths for array
comparative genomic hybridization (aCGH) or paired-end sequencing
(PE) data (Kinsella et al.22, BFBFinder23,24, and Greenman et al.25,26,
Table 1). However, these mathematical models have limitations in
interpreting real-world data for the following reasons. (i) These models
focus solely on perfect BFB, ruling out the deletions,
duplications, insertions, or translocations found in complex BFB
rearrangements13,19–21. (ii) These models lag behind recent advances in
linked-read sequencing27,28, long-read sequencing29,30, and optical
mapping alignment31. Linked-read, long-read, and optical mapping data
provide DNA fragment linkage spanning more than 100 kb, which could make the detection of long complex BFB local genomic
maps faster and more accurate. (iii) Single-cell sequencing32–35 could facilitate investigation of the
intratumor heterogeneity of complex BFB at single-cell resolution, but
existing methods are incompatible with it. Recently, the community has
developed computational pipelines to detect and resolve complex
somatic genome rearrangements, including complex BFB
(AmpliconArchitect + AmpliconClassifier19, AmpliconReconstructor20,
and LINX21, Table 1). However, these pipelines support only short
sequencing reads or optical mapping data.
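The idea of expanding an observed CN profile into candidate perfect BFB paths, mentioned earlier in this paragraph, can be sketched with a brute-force search. This is not the method of Kinsella et al., BFBFinder, or Greenman et al.; it is a minimal enumeration under the assumptions that an arm is a list of signed segments ordered from the centromere outward and that every cycle is a perfect fold-back followed by a single break. All function names are hypothetical.

```python
from collections import Counter


def perfect_bfb_paths(arm, cycles):
    """Set of paths reachable from `arm` in at most `cycles` perfect BFB cycles."""
    paths = {tuple(arm)}
    if cycles > 0:
        # Perfect fold-back: the arm joined to its own reverse complement.
        bridge = list(arm) + [-s for s in reversed(arm)]
        # Try every break position on the bridge; keep the centromere side.
        for cut in range(1, len(bridge)):
            paths |= perfect_bfb_paths(bridge[:cut], cycles - 1)
    return paths


def paths_matching_cn(arm, target_cn, cycles):
    """Candidate paths whose per-segment copy numbers equal `target_cn`."""
    # Drop zero entries so absent segments compare equal to CN 0.
    target = {seg: n for seg, n in target_cn.items() if n > 0}
    return sorted(
        p for p in perfect_bfb_paths(arm, cycles)
        if dict(Counter(abs(s) for s in p)) == target
    )


reference = [1, 2, 3, 4]

# A stair-like profile that perfect BFB can explain within two cycles.
print(paths_matching_cn(reference, {1: 1, 2: 1, 3: 3, 4: 4}, cycles=2))
# [(1, 2, 3, 4, -4, -3, 3, 4, -4)]

# A profile that perfect BFB alone cannot explain within two cycles.
print(paths_matching_cn(reference, {1: 1, 2: 3, 3: 1, 4: 1}, cycles=2))
# []
```

The second query returns no path because, in this model, any arm extending beyond segment 4 must contain it at least twice. A profile like this is the kind of case where perfect-BFB models fall short and additional SVs would need to be considered.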
In this work, to overcome the limitations above, (...truncated)