qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data
Genome Biology
2007, Volume 8, Issue 2, Article R19
Jan Hellemans 0
Geert Mortier 0
Anne De Paepe 0
Frank Speleman 0
Jo Vandesompele 0
0 Address: Center for Medical Genetics, Ghent University Hospital, De Pintelaan, B-9000 Ghent, Belgium
Although quantitative PCR (qPCR) is becoming the method of choice for expression profiling of selected genes, accurate and straightforward processing of the raw measurements remains a major hurdle. Here we outline advanced and universally applicable models for relative quantification and inter-run calibration with proper error propagation along the entire calculation track. These models and algorithms are implemented in qBase, a free program for the management and automated analysis of qPCR data.
Background
Since its introduction more than 10 years ago [1], quantitative
PCR (qPCR) has become the standard method for
quantification of nucleic acid sequences. Its ease of use and high
sensitivity, specificity and accuracy have resulted in a rapidly
expanding number of applications with increasing
throughput of samples to be analyzed. The software programs
provided along with the various qPCR instruments allow for
straightforward extraction of quantification cycle values from
the recorded fluorescence measurements, and at best,
interpolation of unknown quantities using a standard curve of
serially diluted known quantities. However, these programs
usually do not provide an adequate solution for the
processing of these raw data (coming from one or multiple runs) into
meaningful results, such as normalized and calibrated
relative quantities. Furthermore, the currently available tools all
have one or more of the following intrinsic limitations:
restriction to a single instrument, cumbersome data import, a limit on
the number of samples and genes that can be processed, a fixed
number of replicates, normalization using only one reference
gene, lack of data quality controls (for example, replicate
variability, negative controls, reference gene expression
stability), inability to calibrate multiple runs, limited result
visualization options, lack of an experimental archive, and
closed software architecture.
To address the shortcomings of the available software tools
and quantification strategies, we modified the classic
delta-delta-Ct method to take multiple reference genes and gene
specific amplification efficiencies into account, as well as the
errors on all measured parameters along the entire
calculation track. On top of that, we developed an inter-run
calibration algorithm to correct for (often underestimated)
run-to-run differences.
Our advanced models and algorithms are implemented in
qBase, a flexible and open source program for qPCR data
management and analysis. Four basic principles were
followed during development of the program: the use of
correct models and formulas for quantification and error
propagation, inclusion of data quality control where required,
automation of the workflow as much as possible while
retaining flexibility, and user friendliness of operation. Our
quantification framework and software fit well with current
thinking that places emphasis on getting every step of a
real-time PCR assay right (such as RNA quality assessment,
appropriate reverse transcription, selection of a proper
normalization strategy, and so on [2]), especially if small
differences between samples need to be reliably demonstrated. In
this entire workflow, data analysis is an important last step.
Results and discussion
Determination of the error on estimated amplification efficiencies
qBase employs a proven, advanced and universally applicable
relative quantification model. An important underlying
assumption is that PCR efficiency is assay dependent and
sample independent. While this may not be true in every
experimental situation, there is currently no consensus on
how sample specific PCR efficiencies should be calculated and
used for robust quantification. Most evaluation studies
attribute a lack of precision to these sample specific efficiency
estimation methods. Hence, the gold standard is still the use
of a PCR efficiency estimated by a serial dilution series
(preferably of pooled cDNA samples, to mimic as much as possible
the actual samples to be measured), at least if one aims at
accurate and precise quantification. Sample specific PCR
efficiency estimation has its usefulness, but currently only for
outlier detection [3-5].
Calculation of relative quantities from quantification cycle
values requires knowledge of the amplification efficiency of
the PCR. As stated above, amplicon specific amplification
efficiencies are preferably determined using linear regression
(formulas 1 and 5 in Materials and methods) of a serial
dilution series with known quantities (either relative or absolute).
However, the error on the estimated amplification efficiency
is almost never determined, nor taken into account. This
error can be calculated using linear regression as well
(formulas 2 to 4 and 6), and should subsequently be propagated
during conversion of the quantification cycle values to the
relative quantities. The formula for the error on the slope
also shows how more accurate amplification efficiency
estimates can be achieved, that is, by expanding the range of
the dilution series and including more measurement points.
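As an illustration of the reasoning above, the following minimal Python sketch estimates the slope of a dilution series, its standard error, and the resulting efficiency base E. It is not the qBase implementation: it assumes ordinary least squares of Cq against log10(quantity), the conversion E = 10^(-1/slope), and a first-order (delta method) propagation of the slope error. The function name `efficiency_with_error` and the example data are hypothetical.

```python
import math


def efficiency_with_error(log10_quantities, cq_values):
    """Estimate the efficiency base E and its standard error.

    Assumption: Cq = intercept + slope * log10(quantity), fitted by
    ordinary least squares on a serial dilution series.
    """
    n = len(log10_quantities)
    if n != len(cq_values) or n < 3:
        raise ValueError("need at least three paired dilution points")

    mean_x = sum(log10_quantities) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, cq_values))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Standard error of the slope from the residual variance (n - 2 d.f.).
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(log10_quantities, cq_values))
    se_slope = math.sqrt(residual_ss / (n - 2) / sxx)

    # E = 10^(-1/slope); first-order propagation of the slope error.
    efficiency = 10 ** (-1 / slope)
    se_efficiency = efficiency * math.log(10) * se_slope / slope ** 2
    return slope, se_slope, efficiency, se_efficiency


# Hypothetical data: same measurement noise, narrow versus wide dilution range.
noise = [0.1, -0.1, -0.1, 0.1]
narrow = [0, 1, 2, 3]
wide = [0, 2, 4, 6]
cq_narrow = [20 - 3.3 * x + e for x, e in zip(narrow, noise)]
cq_wide = [20 - 3.3 * x + e for x, e in zip(wide, noise)]

fit_narrow = efficiency_with_error(narrow, cq_narrow)
fit_wide = efficiency_with_error(wide, cq_wide)
print("narrow range:", fit_narrow)
print("wide range:  ", fit_wide)
```

Because the sum of squared deviations of the log quantities appears in the denominator of the slope error, the wider series yields the smaller error for the same noise, which is the point made in the text.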
Calculation of normalized relative quantities and error minimization
Methods for the conversion of quantification cycle values (Cq;
see Materials and methods for terminology) into normalized
relative quantities (NRQs) were first reported in 2001. The
simplest model described by Livak and Schmittgen [6]
assumes 100% PCR efficiency (reflected by a value of 2 for the
base E of the exponential function) and uses a single reference
gene for normalization:
\[ \mathrm{NRQ} = 2^{\Delta\Delta Ct} \]
Pfaffl [7] modified the above model by adjusting for
differences in PCR efficiency between the gene of interest (goi) and
a reference gene (ref):
\[ \mathrm{NRQ} = \frac{E_{goi}^{\Delta Ct_{goi}}}{E_{ref}^{\Delta Ct_{ref}}} \]
This model constituted an improvement over the classic
delta-delta-Ct method, but cannot deal with multiple (f)
reference genes, which are required for reliable measurement of
subtle expression differences [8]. Therefore, we further
extended this model to take into account multiple stably
expressed reference genes for improved normalization.
Although not yet published, this advanced and generalized
model of relative quantification has been applied previously
in our nucleic acid quantification studies [8-12].
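\[ \mathrm{NRQ} = \frac{E_{goi}^{\Delta Ct_{goi}}}{\sqrt[f]{\prod_{o=1}^{f} E_{ref_o}^{\Delta Ct_{ref_o}}}} \]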
The calculation of relative quantities, normalization and
corresponding error propagation is detailed in formulas 7-16.
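The generalized model described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the qBase code: it assumes that each delta value is the quantification cycle of the calibrator minus that of the sample, that the normalization factor is the geometric mean of the reference gene relative quantities, and it omits error propagation. The function name `nrq` and the numbers are hypothetical.

```python
import math


def nrq(e_goi, delta_cq_goi, references):
    """Normalized relative quantity with one or more reference genes.

    references: list of (efficiency base E, delta-Cq) pairs, one per
    reference gene. Assumption: delta-Cq = Cq(calibrator) - Cq(sample).
    """
    if not references:
        raise ValueError("at least one reference gene is required")

    # Relative quantity of the gene of interest: E ** delta-Cq.
    rq_goi = e_goi ** delta_cq_goi

    # Relative quantity of each reference gene.
    rq_refs = [e ** d for e, d in references]

    # Normalization factor: geometric mean of the reference quantities,
    # computed in log space for numerical stability.
    norm_factor = math.exp(sum(math.log(rq) for rq in rq_refs) / len(rq_refs))
    return rq_goi / norm_factor


# One reference gene at 100% efficiency reduces to the delta-delta form.
single_ref = nrq(2.0, 3.0, [(2.0, 1.0)])

# Gene-specific efficiencies and two reference genes.
two_refs = nrq(1.95, 3.0, [(2.0, 1.0), (1.9, 0.5)])
print(single_ref, two_refs)
```

With a single reference gene the function reduces to the efficiency-corrected model of Pfaffl, and with all efficiency bases set to 2 it reduces to the delta-delta form.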
The basic principle of the delta-Cq quantification model is
that a difference (delta) in quantification cycle value between
two samples (often a true unknown and calibrator or
reference sample) is transformed into relative quantities using the
exponential function (...truncated)