More power to you: Simple power calculations for treatment effects with one degree of freedom
JAMIE I. D. CAMPBELL
VALERIE A. THOMPSON
This work was funded by the Natural Sciences and Engineering Research Council of Canada. We thank Raymond Gunter, Lorin Elias, Deborah Saucier, and Eyvind Ohm for their feedback and suggestions on the [...] Department of Psychology, University of Saskatchewan, 9 Campus Drive, Saskatoon, SK, S7N 5A5 Canada.
University of Saskatchewan, Saskatoon, Saskatchewan, Canada
Although numerous computer programs for statistical power analysis are available, power is an underused aspect of experimental analysis, perhaps because of the perceived difficulty of performing the necessary calculations or because existing computer software can be expensive or complicated to learn. For single-degree-of-freedom tests, however, it is possible to calculate power in a straightforward manner, using the t distribution. Because these calculations are based on t, they use easily understood and readily available quantities. These calculations can be performed with a desk calculator; we also present a simple-to-use program called MorePower that will perform the necessary calculations. The straightforward nature of the calculations potentially will enable more researchers to consider issues of power when planning and reporting their experiments.
There are many sophisticated computer programs
available for power analysis, but these may be expensive,
complicated to learn, or limited in their application (Thomas &
Krebs, 1997). Here, we present simple formulas for
calculating power and related statistics for any analysis of
variance (ANOVA) test with one degree of freedom. We
explain how to do the calculations by hand, as well as by
using a simple-to-use freeware program for Microsoft
Windows, called MorePower
(http://duke.usask.ca/~campbelj/work/MorePowerx.html). Our formulas and program
have several distinctive features. First, because they are
based on the familiar and relatively simple t test, the
formulas and program are easy to use and understand;
nonetheless, they apply to any single-degree-of-freedom test
(main effect or interaction) from within-subjects,
between-subjects, or mixed designs of arbitrary complexity
(assuming equal $n$s for between-subjects factors). Second, effect
size is estimated (or provided by the user) in the original
units of measurement, rather than in less intuitive
variance-based ratios or small-medium-large approximations [the
MorePower program also calculates $\eta^2$, however; i.e.,
$\mathrm{MST}/(\mathrm{MST} + \mathrm{MSE} \times df)$]. Third, to specify effect size, one may
enter either MST or the difference in original units; to
specify error variance, one may enter either MSE or the
variance of the difference. Thus, given the F-ratio and
MSE provided in most research reports, one can easily
compute observed power and effect size for any ANOVA
effect with one degree of freedom. This makes it a trivial
task to assess the power associated with published null
effects or to use previous results to estimate the required
effect size or sample size for planned experiments.
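
As a small illustration of the $\eta^2$ expression quoted above, the following Python sketch evaluates MST/(MST + MSE × df). It assumes that df in that expression is the error degrees of freedom and that the effect has one numerator degree of freedom; the function names are hypothetical and are not taken from MorePower.

```python
def eta_squared(mst, mse, df_error):
    """eta^2 = MST / (MST + MSE * df), with df assumed to be the error df."""
    return mst / (mst + mse * df_error)


def eta_squared_from_f(f_ratio, df_error):
    """Same quantity from a reported F ratio, since MST = F * MSE when df = 1."""
    return f_ratio / (f_ratio + df_error)


# Hypothetical values: MST = 40, MSE = 10, error df = 20, so F = 4.
print(eta_squared(40.0, 10.0, 20))
print(eta_squared_from_f(4.0, 20))
```

Both calls return the same value, because dividing the numerator and denominator of the first expression by MSE yields F/(F + df).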
Calculating Power for the
Between-Subjects t Test
Our power analysis extends the approach developed by
Cochran and Cox (1957) for the between-subjects t test
(also, see Hays, 1994, for an analogous derivation for z).
Ordinarily, one rejects the null hypothesis when the
observed difference between means ($d_c$) is large enough
that $t$ exceeds the value set a priori to represent the
Type I error rate ($t_\alpha$):

$$t_\alpha = \frac{d_c}{\sqrt{s_d^2 / n_t}}. \tag{1}$$

In Equation 1, $s_d^2$ = variance of the difference between
means and $n_t$ = number of observations per treatment
(equal $n$s assumed). Power is the probability of drawing
a sample in which $d_c$ satisfies the above requirement
from a sampling distribution whose mean ($d_\beta$) is greater
than 0; power, therefore, is the probability that $t > t_\beta$:

$$t_\beta = \frac{d_c - d_\beta}{\sqrt{s_d^2 / n_t}}. \tag{2}$$

For our analysis, we combined Equations 1 and 2 and then
solved for power (probability $t > t_\beta$), effect size ($d_\beta$), and
sample size ($n_t$).

Power:

$$t_\beta = t_\alpha - \frac{d_\beta}{\sqrt{s_d^2 / n_t}}. \tag{3}$$

Effect Size:

$$d_\beta = (t_\alpha - t_\beta)\sqrt{s_d^2 / n_t}. \tag{4}$$

Sample Size:

$$n_t = \frac{s_d^2\,(t_\alpha - t_\beta)^2}{d_\beta^2}. \tag{5}$$
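
The power, effect-size, and sample-size calculations described above can be sketched in Python using only the standard library. This is a minimal sketch under several assumptions that the text does not state: the test is two-tailed (so $t_\alpha$ is the upper $\alpha/2$ critical value), power is read from the central t distribution with the error degrees of freedom, and the sample-size search takes df = 2($n_t$ − 1), as in a two-group between-subjects t test. All function names are hypothetical, and the t distribution is evaluated numerically (Simpson's rule plus bisection) rather than with a statistics library.

```python
import math


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=400):
    """P(t <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p, df):
    """Value q with P(t <= q) = p, found by bisection on [0, 100]."""
    if p < 0.5:
        return -t_quantile(1 - p, df)
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def power(d_beta, s2_d, n_t, df, alpha=0.05):
    """Probability that t > t_beta, with t_beta = t_alpha - d_beta / SE."""
    se = math.sqrt(s2_d / n_t)
    t_alpha = t_quantile(1 - alpha / 2, df)  # two-tailed: an assumption
    t_beta = t_alpha - d_beta / se
    return 1 - t_cdf(t_beta, df)


def effect_size(target_power, s2_d, n_t, df, alpha=0.05):
    """Difference d_beta detectable with the requested power."""
    se = math.sqrt(s2_d / n_t)
    t_alpha = t_quantile(1 - alpha / 2, df)
    t_beta = t_quantile(1 - target_power, df)  # P(t > t_beta) = power
    return (t_alpha - t_beta) * se


def sample_size(target_power, d_beta, s2_d, alpha=0.05, max_n=10000):
    """Smallest n_t reaching the requested power (two-group design assumed)."""
    for n_t in range(2, max_n + 1):
        df = 2 * (n_t - 1)  # assumption: two independent groups
        if power(d_beta, s2_d, n_t, df, alpha) >= target_power:
            return n_t
    raise ValueError("target power not reached within max_n")


# Hypothetical inputs: difference of 10 units, s_d^2 = 100, 10 per treatment.
print(power(10.0, 100.0, 10, 18))
print(effect_size(0.80, 100.0, 10, 18))
print(sample_size(0.80, 10.0, 100.0))
```

The sample-size function searches over $n_t$ instead of solving for it directly, because $t_\alpha$ and $t_\beta$ both depend on the degrees of freedom and therefore on $n_t$ itself.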
Extension to multifactor designs. We extend this
analysis to multifactor ANOVA designs, including
between-subjects, within-subjects, and mixed designs. This
extension is possible because $s_d^2$ can be derived from MSE (see
below) and because $F$ and $t^2$ are equivalent for any $2^k$
effect in a multifactor design. That is, any $F$ test with one
degree of freedom in the numerator corresponds to a
pairwise comparison (e.g., a 2 × 2 interaction corresponds
to a pairwise comparison of the difference of
differences), even when the $2^k$ effect is embedded in a design
that includes factors with more than two levels.
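
The equivalence of F and the squared t statistic can be checked numerically for the simplest case, two independent groups. The sketch below uses made-up scores and hypothetical function names; it computes the pooled-variance t statistic and the one-way ANOVA F ratio from the same data.

```python
import math
from statistics import mean


def within_ss(scores):
    """Sum of squared deviations of the scores from their own mean."""
    m = mean(scores)
    return sum((v - m) ** 2 for v in scores)


def pooled_t(x, y):
    """Between-subjects t statistic with a pooled error variance."""
    mse = (within_ss(x) + within_ss(y)) / (len(x) + len(y) - 2)
    return (mean(x) - mean(y)) / math.sqrt(mse * (1 / len(x) + 1 / len(y)))


def one_way_f(x, y):
    """F = MST / MSE for two groups (one numerator degree of freedom)."""
    grand = mean(x + y)
    mst = len(x) * (mean(x) - grand) ** 2 + len(y) * (mean(y) - grand) ** 2
    mse = (within_ss(x) + within_ss(y)) / (len(x) + len(y) - 2)
    return mst / mse


# Made-up scores for two groups of five observations each.
group_a = [12, 15, 11, 14, 13]
group_b = [9, 10, 8, 12, 11]
print(pooled_t(group_a, group_b) ** 2, one_way_f(group_a, group_b))
```

For these scores both printed values equal 9, up to floating-point rounding.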
Quantities Needed for Calculations
Power, sample size, and effect size can be calculated by
hand using Equations 3, 4, and 5, or by using the
MorePower program, which is described in the next section. In
either case, the researcher needs to have an estimate of
error variance ($s_d^2$ or MSE) and a value for $\alpha$, as well as
two of the remaining three variables: sample size ($n_t$),
effect size ($d_\beta$), or power (one can calculate the third, given
values for the other two).
Computing $s_d^2$. One can estimate $s_d^2$ either from an
estimate of individual cell variances or from MSE
(MorePower will calculate $s_d^2$, given a value of MSE). Equation 6
calculates $s_d^2$ from individual cell variances. For
between-subjects designs, the intercell correlations are all zero;
thus, the value of the second half of Equation 6 is reduced
to zero:

$$s_d^2 = s_1^2 + s_2^2 - 2 r_{12} s_1 s_2. \tag{6}$$

Equation 7 computes $s_d^2$ from MSE:
In this equation, B is the number of two-level
between-subjects factors in the relevant test, W is the number of
two-level within-subjects factors in the relevant test, and
y is the total number of within-subjects cells in the
experiment. This equation can be used to calculate $s_d^2$ for any
comparison, including those having both between- and
within-subjects factors. Note that when the effect of
interest involves only between-subjects or only
within-subjects factors, Equation 7 can be simplified to
Equations 7a and 7b, respectively:
$$s_d^2 = \mathrm{MSE} \times 2^B \tag{7a}$$
sd2 = MSE
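
The cell-variance route (Equation 6) is simple enough to sketch in a few lines of Python. The function names and numerical values below are hypothetical. The second helper covers only the two-group between-subjects case and assumes that both cell variances are estimated by the same MSE, so that Equation 6 with a zero correlation gives 2 × MSE.

```python
def variance_of_difference(s1, s2, r12=0.0):
    """Equation 6: s_d^2 = s1^2 + s2^2 - 2 * r12 * s1 * s2.

    s1 and s2 are the cell standard deviations; r12 is the correlation
    between cells (zero for between-subjects comparisons).
    """
    return s1 ** 2 + s2 ** 2 - 2 * r12 * s1 * s2


def two_group_variance_from_mse(mse):
    """Two independent groups whose variances are both estimated by MSE."""
    return 2 * mse


# Hypothetical cell standard deviations of 3 and 4.
print(variance_of_difference(3.0, 4.0))           # between-subjects, r12 = 0
print(variance_of_difference(3.0, 4.0, r12=0.5))  # within-subjects example
print(two_group_variance_from_mse(10.0))
```

A positive correlation between cells shrinks $s_d^2$, which is why the same raw difference is generally easier to detect in a within-subjects comparison.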
Calculating $d_\beta$ (effect size). The quantity $d_\beta$ refers to
the size of the difference one is interested in and is equal
to the absolute difference between treatment conditions.
For pairwise comparisons and main effects, therefore, $d_\beta$
equals the absolute value of the difference between the
means for the two levels of the relevant factor, averaged
over the levels of the other factors. For 2 × 2 interactions,
$d_\beta$ is the absolute value of the difference of differences,
$(M_{11} - M_{12}) - (M_{21} - M_{22})$ (again, averaged over the
levels of other factors), and for 2 × 2 × 2 interactions, it is
the absolute value of the difference between the
difference of differences $[(M_{111} - M_{112}) - (M_{121} - M_{122})] -
[(M_{211} - M_{212}) - (M_{221} - M_{222})]$ and so on. For example,
if $M_{11}$, $M_{12}$, $M_{21}$, and $M_{22}$ were equal to 50, 30, 15, and
5, then $d_\beta$ for the 2 × 2 interaction would equal (50 −
30) − (15 − 5), or 10.
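
The difference-of-differences rule generalizes to any number of two-level factors: each cell mean enters with a sign that flips once for every factor at its second level. The sketch below assumes the cell means are supplied as a dictionary keyed by tuples of levels (1 or 2); the function name and this data layout are hypothetical, and averaging over factors outside the effect is left to the caller.

```python
def interaction_effect_size(cell_means):
    """Absolute difference of differences for a full 2^k set of cell means.

    cell_means maps a tuple of factor levels (each 1 or 2) to a mean,
    e.g. {(1, 1): 50, (1, 2): 30, (2, 1): 15, (2, 2): 5}.
    """
    total = 0.0
    for levels, cell_mean in cell_means.items():
        # The sign is negative when an odd number of factors is at level 2.
        twos = sum(1 for level in levels if level == 2)
        total += -cell_mean if twos % 2 else cell_mean
    return abs(total)


# The 2 x 2 example from the text: (50 - 30) - (15 - 5) = 10.
means = {(1, 1): 50, (1, 2): 30, (2, 1): 15, (2, 2): 5}
print(interaction_effect_size(means))  # prints 10.0
```

With a single factor the same function reduces to the absolute difference between two means, which matches the pairwise-comparison case.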
The observed effect size for a specific main or
interaction effect can also be calculated using MS (...truncated)