Highly comparable metabarcoding results from MGI-Tech and Illumina sequencing platforms
Sten Anslan1,2, Vladimir Mikryukov1,2, Kęstutis Armolaitis3,
Jelena Ankuda3, Dagnija Lazdina4, Kristaps Makovskis4,
Lars Vesterdal5, Inger Kappel Schmidt5 and Leho Tedersoo1,2
1 Institute of Ecology and Earth Sciences, University of Tartu, Tartu, Tartumaa, Estonia
2 Mycology and Microbiology Center, University of Tartu, Tartu, Tartumaa, Estonia
3 Department of Ecology, Institute of Forestry of Lithuanian Research Centre for Agriculture and Forestry (LAMMC), Kaunas, Lithuania
4 Latvian State Forest Research Institute SILAVA, Riga, Latvia
5 Department of Geosciences and Natural Resource Management, University of Copenhagen, Copenhagen, Denmark
Submitted 2 July 2021
Accepted 14 September 2021
Published 30 September 2021
Corresponding author: Sten Anslan
Academic editor: Vladimir Uversky
ABSTRACT
With the developments in DNA nanoball sequencing technologies and the
emergence of new platforms, there is an increasing interest in their performance in
comparison with the widely used sequencing-by-synthesis methods. Here, we test the
consistency of metabarcoding results from DNBSEQ-G400RS (DNA nanoball
sequencing platform by MGI-Tech) and NovaSeq 6000 (sequencing-by-synthesis
platform by Illumina) platforms using technical replicates of DNA libraries that
consist of COI gene amplicons from 120 soil DNA samples. By subjecting raw
sequencing data from both platforms to uniform bioinformatics processing, we
found that the proportion of high-quality reads passing through the filtering steps
was similar in both datasets. Per-sample operational taxonomic unit (OTU) and
amplicon sequence variant (ASV) richness patterns were highly correlated, but
sequencing data from DNBSEQ-G400RS harbored a higher number of OTUs. This
may be related to the lower dominance of the most common OTUs in the DNBSEQ data set
(thus revealing higher richness by detecting rare taxa) and/or to a lower effective read
quality leading to the generation of spurious OTUs. However, there was no statistical
difference in the ASV and post-clustered ASV richness between platforms, suggesting
that the additional denoising step in the ASV workflow had effectively removed the
‘noisy’ reads. Both OTU-based and ASV-based composition were strongly correlated
between the sequencing platforms, with essentially interchangeable results.
Therefore, we conclude that DNBSEQ-G400RS and NovaSeq 6000 are equally
efficient high-throughput sequencing platforms for metabarcoding studies, with the
main benefit of the former being its lower sequencing cost.
DOI 10.7717/peerj.12254
Copyright 2021 Anslan et al.
Distributed under Creative Commons CC-BY 4.0
Subjects Bioinformatics, Entomology, Genomics, Molecular Biology, Zoology
Keywords Metabarcoding, COI, Illumina, NovaSeq, DNBSEQ, MGI-Tech
How to cite this article Anslan S, Mikryukov V, Armolaitis K, Ankuda J, Lazdina D, Makovskis K, Vesterdal L, Schmidt IK, Tedersoo L.
2021. Highly comparable metabarcoding results from MGI-Tech and Illumina sequencing platforms. PeerJ 9:e12254
DOI 10.7717/peerj.12254
INTRODUCTION
Metabarcoding, the identification of organisms via DNA marker genes from
environmental samples or a mixture of heterospecific specimens (Taberlet et al., 2018), is a
powerful tool in biodiversity analysis (Kelly et al., 2018; Pont et al., 2021; Valentin et al.,
2019; Watts et al., 2019). This approach has been efficiently used to characterize the
community composition of microbial and animal taxa from various types of
environmental samples such as soil (Bahram et al., 2018; Nilsson et al., 2019), water
(Djurhuus et al., 2018; Liu et al., 2020), sediments (Kang et al., 2021; Wurzbacher et al.,
2017), dust (de Groot et al., 2021; Rocchi et al., 2017) and feces (Ando et al., 2020; Anslan
et al., 2021). In animals, metabarcoding has also been widely used to identify
host-associated microbiomes, determine the structure of entire holobionts and dietary
differences in various species (Alberdi et al., 2019; Kueneman et al., 2019). The information
acquired through DNA marker gene sequencing has greatly boosted our knowledge about
the ecology and distribution patterns of various aquatic and terrestrial animal groups such
as nematodes, arthropods and annelids (Arribas et al., 2016; Beng & Corlett, 2020;
Compson et al., 2020; Deiner et al., 2017; Zawierucha et al., 2021).
Since the mid-2000s, the metabarcoding technique has greatly benefited from
technological advances in library preparation, primer and sample-specific index design,
novel sequencing platforms as well as from optimized bioinformatics workflows and
accumulating reference data (Taberlet et al., 2018; Nilsson et al., 2019). Short-read,
second-generation high-throughput sequencing (HTS) technologies are currently the most
widely used means for metabarcoding due to a relatively low cost per sample, high
sequencing depth and accuracy. Sequencing instruments produced by Illumina, Inc. (e.g.,
MiSeq and NovaSeq) using sequencing-by-synthesis technology dominate the
market as they offer viable solutions for both ultra-high sequencing depth and paired-end
sequencing of short- and mid-sized amplicons (up to 500–600 bases; Kumar, Cowley &
Davis, 2019). By utilizing recent advances in DNA nanoball sequencing technology
(Drmanac et al., 2010; Li et al., 2019), MGI-Tech, Inc. has produced several DNBSEQ
(MGISEQ) platforms with throughput and quality profiles similar to those of
Illumina sequencing (Jeon et al., 2021; Kumar, Cowley & Davis, 2019). The results from
Illumina and MGI-Tech sequencing platforms are highly comparable and may be used
interchangeably for RNA sequencing and whole genome sequencing (Jeon et al., 2019; Kim
et al., 2021; Korostin et al., 2020). However, the error rate of DNBSEQ technology
(MGISEQ-2000 instrument) was marginally higher than that of Illumina (HiSeq 2500 instrument) when
using 2 × 150 paired-end sequencing mode on both platforms (quality scores >30: 95.03%
and 97.18% for MGISEQ-2000 and HiSeq 2500, respectively; Korostin et al., 2020).
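For context, a Phred quality score Q corresponds to an estimated base-calling error probability of 10^(-Q/10), so a score of 30 means roughly one expected error per 1,000 bases. The following minimal Python sketch shows how a "quality scores >30" proportion of this kind could be computed from FASTQ quality strings. It assumes the common Phred+33 encoding; the function names are illustrative and are not taken from Korostin et al. (2020) or from the workflow used in this study.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to the estimated base-calling error probability."""
    return 10 ** (-q / 10)


def decode_fastq_quality(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def fraction_above(qualities, threshold=30):
    """Return the share of base calls whose Phred score is strictly above the threshold."""
    if not qualities:
        raise ValueError("no quality scores supplied")
    return sum(q > threshold for q in qualities) / len(qualities)


if __name__ == "__main__":
    # Hypothetical four-base quality string: 'I' = Q40, '5' = Q20, '?' = Q30, 'F' = Q37.
    scores = decode_fastq_quality("I5?F")
    print(scores)
    print(fraction_above(scores))
    print(phred_to_error_prob(30))
```

Because the threshold is strict, a base with a score of exactly 30 is not counted; studies reporting "Q30" statistics may instead use an inclusive threshold, so the convention should be checked before comparing values.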
The results of these early genome sequencing-oriented studies suggest that MGI-Tech
platforms may also be used efficiently in metabarcoding studies. In early 2021, sequencing
costs per read for MGI-Tech DNBSEQ-T7 were about 50% lower than for the Illumina NovaSeq
platform in the highest-throughput analyses (Tedersoo et al., 2021).
So far, only a single metabarcoding study has compared these sequencing
platforms (DNBSEQ-G400 and Illumina MiSeq) for recovering 16S rRNA gene and
ITS amplicons of bacterial and fungal mock communities (Sun et al., 2021). For the
ITS2 amplicon, Sun et al. (2021) reported small but significant differences between
DNBSEQ-G400 and MiSeq platforms, but this difference can be attributed t (...truncated)