An internet-based bioinformatics toolkit for plant biosecurity diagnosis and surveillance of viruses and viroids
Barrero et al. BMC Bioinformatics (2017) 18:26
DOI 10.1186/s12859-016-1428-4
METHODOLOGY ARTICLE
Open Access
Roberto A. Barrero1*†, Kathryn R. Napier1,2†, James Cunnington3, Lia Liefting4, Sandi Keenan5, Rebekah A. Frampton5,
Tamas Szabo1, Simon Bulman5, Adam Hunter1, Lisa Ward4, Mark Whattam3 and Matthew I. Bellgard1*
Abstract
Background: Detecting and preventing the entry of exotic viruses and viroids at the border is critical for protecting
plant industries and trade worldwide. Existing post entry quarantine screening protocols rely on time-consuming
biological indicators and/or molecular assays that require knowledge of infecting viral pathogens. Plants have
developed the ability to recognise and respond to viral infections through Dicer-like enzymes that cleave viral
sequences into specific small RNA products. Many studies have reported the use of a broad range of small RNAs
encompassing the product sizes of several Dicer enzymes involved in distinct biological pathways. Here we
optimise the assembly of viral sequences by using specific small RNA subsets.
Results: We sequenced the small RNA fractions of 21 plants held at quarantine glasshouse facilities in Australia and
New Zealand. Benchmarking of several de novo assembler tools showed that SPAdes with a k-mer of 19 produced the
best assembly outcomes. We also found that de novo assembly using 21–25 nt small RNAs can result in chimeric
assemblies of viral sequences and plant host sequences. Such non-specific assemblies can be resolved by using
21–22 nt or 24 nt small RNA subsets. Among the 21 selected samples, we identified contigs with sequence
similarity to 18 viruses and 3 viroids in 13 samples. Most of the viruses were assembled using only 21–22 nt long
virus-derived siRNAs (viRNAs), except for one Citrus endogenous pararetrovirus that was more efficiently assembled
using 24 nt long viRNAs. All three viroids found in this study were fully assembled using either 21–22 nt or 24 nt
viRNAs. Optimised analysis workflows were customised within the Yabi web-based analytical environment. We
present a fully automated viral surveillance and diagnosis web-based bioinformatics toolkit that provides a flexible,
user-friendly, robust and scalable interface for the discovery and diagnosis of viral pathogens.
Conclusions: We have implemented an automated viral surveillance and diagnosis (VSD) bioinformatics toolkit that
produces improved virus and viroid sequence assemblies. The VSD toolkit provides several optimised and
reusable workflows applicable to distinct viral pathogens. We envisage that this resource will facilitate the
surveillance and diagnosis of viral pathogens in plants, insects and invertebrates.
Keywords: Bioinformatics, Plant biosecurity, Next generation sequencing, Plant viruses and viroids, Quarantine,
viRNAs, Virus diagnosis, Yabi, Small RNA-Seq, Workflows
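The size-selected subsets described in the Results (21–22 nt, 24 nt and the broader 21–25 nt range) can be illustrated with a minimal sketch. This is not the VSD toolkit's implementation: the function names, the class labels and the plain 4-line FASTQ parsing are assumptions made for illustration only, and adapter trimming is assumed to have been done already.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality)

# Size ranges taken from the abstract; the labels are hypothetical.
SIZE_CLASSES: Dict[str, Tuple[int, int]] = {
    "21-22nt": (21, 22),
    "24nt": (24, 24),
    "21-25nt": (21, 25),
}


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from well-formed, 4-line-per-read FASTQ text."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def split_by_size(
    records: Iterable[Record],
    classes: Dict[str, Tuple[int, int]] = SIZE_CLASSES,
) -> Dict[str, List[Record]]:
    """Assign each read to every size class whose inclusive range contains it."""
    subsets: Dict[str, List[Record]] = {name: [] for name in classes}
    for rec in records:
        length = len(rec[1])
        for name, (lo, hi) in classes.items():
            if lo <= length <= hi:
                subsets[name].append(rec)
    return subsets
```

Each subset could then be written back to FASTQ and passed to a de novo assembler separately, so that assemblies from the 21–22 nt and 24 nt classes can be compared with the assembly from the full 21–25 nt range.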
* Correspondence: ;
†Equal contributors
1Centre for Comparative Genomics, Murdoch University, Murdoch, WA 6150, Australia
Full list of author information is available at the end of the article
© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Background
Increases in global trade and movement are placing
significant pressure on post entry quarantine systems,
with an increase in the frequency of incursions of pathogens causing the emergence of diseases and pests that
are both difficult and costly to eradicate and control [1].
The challenge of maximising the benefits of global trade
whilst minimising the negative impacts of biosecurity
threats is one faced by most nations [2]. Historically, the
geographical isolation of Australia and New Zealand,
coupled with stringent quarantine screening measures,
has provided protection from the introduction of exotic
pests and pathogens that have the potential to harm
human health, agriculture, the environment and the
economy.
Plant biosecurity is defined as “a set of measures
designed to protect crops from emergency plant pests at
national, regional and individual farm level” [1, 3]. The
diagnosis of viral pathogens is a crucial component of
plant biosecurity surveillance, required to prevent the
potential introduction of exotic plant viruses and viroids.
Existing ‘specific’ serological and molecular detection
methods such as enzyme-linked immunosorbent assay
(ELISA), polymerase chain reaction (PCR), or nucleic
acid spot hybridization, while highly sensitive, are
limited by their ability to detect only known plant
viruses/viroids. These methods lack the capacity to
detect unknown, poorly characterised or highly variable
viral pathogens [4, 5]. Furthermore, the host range of
many viral pathogens is not defined and known exotic
viruses/viroids could be missed if these infect new plant
species for which standard screening assays are not
applied. If pathogens are not initially detected via these
methods, more ‘investigational’ techniques may be
applied, such as electron microscopy, host plant inoculation, or PCR using degenerate primers [5]. The time and
effort taken to screen imported plants using these existing methods has a direct economic impact, with plants
that are currently imported into Australia and New
Zealand spending up to two years in quarantine
(https://bicon.agriculture.gov.au/BiconWeb4.0).
Recent studies have demonstrated both the detection
of viral pathogens and the identification of novel viruses
by the deep sequencing of small RNAs (small RNA-Seq)
of plant species [4–7]. RNA silencing is a natural antiviral defence system present in plants, insects and invertebrates that recognises dsRNA viral genomes and/or
viral intermediate sequences, and cleaves them into small
interfering RNAs (siRNAs) of 21–24 nt in length [8].
These virus-derived siRNAs (viRNAs) accumulate in the
small RNA fraction of host plants, making it possible to
identify viruses through a next generation sequencing
(NGS) approach, even at extremely low viral titres and
in symptomless infections [9, 10]. Small RNA NGS
screening of viral pathogens is more cost- and time-effective compared with current detection methods. The
bottleneck for the uptake of NGS technology for routine
surveillance and diagnosis of viral sequences is the lack
of an automated bioinformatics pipeline that enables
users to evaluate, scrutinize and modify a (...truncated)