Mathematical modeling of 16S ribosomal DNA amplification reveals optimal conditions for the interrogation of complex microbial communities with phylogenetic microarrays
Oleg Paliy (1), Brent D. Foy (0)
(0) Department of Physics, Wright State University, Dayton, OH 45435, USA
(1) Department of Biochemistry and Molecular Biology
Associate Editor: Jonathan Wren
Motivation: Many current studies of complex microbial communities rely on the isolation of community genomic DNA, amplification of 16S ribosomal RNA genes (rDNA) and subsequent examination of community structure through interrogation of the amplified 16S rDNA pool by high-throughput sequencing, phylogenetic microarrays or quantitative PCR.

Results: Here we describe the development of a mathematical model that simulates multitemplate amplification of a 16S ribosomal DNA sample and the subsequent detection of the amplified 16S rDNA species by phylogenetic microarray. Using parameters estimated from experimental results obtained in the analysis of intestinal microbial communities with the Microbiota Array, we show that both species detection and the accuracy of species abundance estimates depended heavily on the number of PCR cycles used to amplify 16S rDNA. Both measures initially improved with each additional PCR cycle and reached an optimum between 15 and 20 cycles of amplification. The use of more than 20 cycles of PCR amplification and/or more than 50 ng of starting genomic DNA template was, however, detrimental to both the fraction of detected community members and the accuracy of abundance estimates. Overall, the outcomes of the model simulations matched the available experimental data well. Our simulations also showed that species detection and the accuracy of abundance measurements correlated positively with a higher sample-wide PCR amplification rate, a lower template-to-template PCR bias and a lower number of species in the interrogated community. The developed model can be easily modified to simulate other multitemplate DNA mixtures as well as other microarray designs and PCR amplification protocols.

Contact:

Supplementary information: Supplementary data are available at Bioinformatics online.
1 INTRODUCTION
Owing to the development and refinement of novel DNA and
RNA interrogation technologies, a surge of recent studies has
explored the population structure and function
of various complex microbial communities (Brodie et al., 2007;
Gao et al., 2007; Huber et al., 2007). The ability to isolate and
subsequently examine total community DNA and RNA without any
need to culture individual microbial species and cells allows analysis
of systems that would otherwise be difficult to profile and examine,
including the microbiota of the human intestine and other epithelial surfaces
and the microbes of soils and ocean waters (Eckburg et al., 2005;
Gao et al., 2007; Huber et al., 2007; Kent and Triplett, 2002).
The gene coding for the small ribosomal subunit RNA molecule (16S
rRNA in prokaryotes and 18S rRNA in eukaryotes) has been used
in the vast majority of such studies due to its ubiquitous presence
in all organisms and because of the conservation of its nucleotide
sequence (Cannone et al., 2002). In a typical experimental design to
profile microbial community structure, total genomic DNA (gDNA)
isolated from a sample of interest is subjected to rounds of 16S
rRNA gene (rDNA)-specific amplification by polymerase chain
reaction (PCR) using two universal primers complementary to the
beginning and the end of the prokaryotic 16S rRNA molecule (Frank
et al., 2008). The amplified DNA is then interrogated by a detection
method of choice such as DNA sequencing or microarray analysis.
Because the 16S rRNA gene constitutes on average only 0.25% of the
total genomic DNA (see below), selective 16S rDNA amplification
is crucial to increase the sensitivity of detection (Paliy et al., 2009)
and to obtain good measures of bacterial presence and relative
abundance in the community samples. However, the optimal use
of such PCR amplification in relation to microarray and DNA
sequencing detection has not yet been fully explored.
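
As a rough illustration of why selective amplification matters, the short Python sketch below takes the 0.25% figure quoted above and the 50 ng gDNA amount mentioned in the abstract, and computes how much 16S rDNA would be present before and after an idealized exponential amplification. The function names and the per-cycle efficiency of 0.9 are assumptions chosen for illustration only; they are not taken from the model described in this paper, and the sketch ignores reagent depletion, the plateau phase and template-to-template bias.

```python
# Back-of-the-envelope sketch, NOT the model developed in this paper.
# Grounded inputs: 16S rDNA is ~0.25% of gDNA; 50 ng of starting gDNA.
# Assumed inputs: per-cycle efficiency (0.9) and unlimited exponential growth.


def rdna_mass_ng(gdna_ng, rdna_fraction=0.0025):
    """Mass of 16S rDNA (ng) contained in a gDNA sample before amplification."""
    return gdna_ng * rdna_fraction


def amplified_mass_ng(start_ng, cycles, efficiency=1.0):
    """Idealized PCR: every cycle multiplies the template by (1 + efficiency)."""
    return start_ng * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    start = rdna_mass_ng(50.0)  # 16S rDNA in 50 ng of gDNA
    for cycles in (0, 10, 15, 20):
        mass = amplified_mass_ng(start, cycles, efficiency=0.9)  # assumed efficiency
        print(f"{cycles:2d} cycles: {mass:.3g} ng of 16S rDNA (idealized)")
```

Under these assumptions the unamplified sample contains only a fraction of a nanogram of 16S rDNA, which is why even a modest number of cycles changes the detectable amount substantially. In practice the growth is not unlimited, so the printed values at higher cycle numbers should be read as upper bounds rather than predictions.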
A number of published studies have, however, examined
the thermodynamic behavior of DNA molecules during DNA
amplification and the biases that can arise over many
rounds of PCR amplification (Kanagawa, 2003; Kurata et al., 2004;
Polz and Cavanaugh, 1998). Because most microbial communities
consist of a large number of different microbial species with varied
16S rRNA gene sequences, any PCR amplification of community
DNA is multitemplate. PCR amplification of such gDNA has been
shown to introduce a deviation of the post-amplification fractions
from the initial ratios of DNA molecules (termed PCR bias) due to
unequal amplification of different DNA molecules during PCR (Polz
and Cavanaugh, 1998). Several mechanisms of this effect have been
described that include (i) unequal denaturation of templates based
on GC content of DNA sequences; (ii) higher binding efficiency of
GC-rich variants of degenerate amplification primers to the template
at the same annealing temperature; and (iii) competitive re-annealin (...truncated)