JMASM20: Exact Permutation Critical Values for the Kruskal-Wallis One-Way ANOVA
Journal of Modern Applied Statistical Methods
November
Justice I. Odiase and Sunday M. Ogbonmwan
Department of Mathematics
University of Benin, Nigeria
The exhaustive enumeration of all the permutations of the observations in an experiment is the only possible way of truly constructing exact tests of significance. The permutation paradigm requires no distributional assumptions and works well with values that are normally, almost normally, or non-normally distributed. The Kruskal-Wallis test does not require the assumptions that the samples are from normal populations and that the samples have the same standard deviation. In this article, the exact permutation distribution of the Kruskal-Wallis test statistic is generated empirically by actually obtaining all the distinct permutations of an experiment. Tables of exact critical values for the Kruskal-Wallis one-way ANOVA are produced.
Introduction
Variation is inherent in nature and errors are made occasionally when inferences are drawn from experiments. The risk in decision making cannot be totally eliminated, but it can be controlled if correct statistical procedures are employed. The unconditional permutation approach is a statistical procedure that ensures that the probability of a type I error is exactly α and that the resulting distribution of the test statistic is exact (Agresti, 1992; Good, 2000; Pesarin, 2001). Scheffe (1943) demonstrated that for a general class of problems, the permutation approach is the only possible method of constructing exact tests of significance. It is asymptotically as powerful as the best parametric test (Hoeffding, 1952). In this article, consideration is given to the exhaustive permutation of the ranks of the observations in a single-factor multisample experiment to arrive at the exact distribution of the Kruskal-Wallis (KW) test statistic.

J. I. Odiase is a Lecturer in the Department of Mathematics. His areas of research are statistical computing and nonparametric statistics. S. M. Ogbonmwan is an Associate Professor of Statistics, Department of Mathematics, University of Benin, Nigeria. His areas of research are statistical computing and nonparametric statistics.

The method of obtaining an exact test of significance originated with Fisher (1935). The essential feature is that all the distinct arrangements of the observations are considered, with the proviso that all permutations are equally likely under the null hypothesis. An exact test at the level of significance α is constructed by choosing a proportion α of the permutations as the critical region.
Statisticians have considered for some decades the possibility of generating exact critical values for the common test statistics in use today. This has resulted in the development of several approaches, such as the exact conditional permutation approach (Fisher, 1935; Agresti, 1992), Monte Carlo approaches such as the bootstrap (Efron, 1979; Efron & Tibshirani, 1993), the Bayesian approach (Casella & Robert, 2004), and the likelihood approach (Owen, 1988; Barndorff-Nielsen & Hall, 1988).
The works of Siegel and Castellan (1989), Conover (1999), Headrick (2003), and Bagui & Bagui (2004) are contributions to the quest for exact critical values, but the distributions are obtained from either simulation or asymptotic approximations of the distribution of the KW test statistic. For small samples, $n_i \le 5$, i = 1(1)p, in a p-sample experiment, the null distribution of the KW statistic is not known and a chi-square approximation will not be a good approximation (see Bagui & Bagui, 2004). The consideration given in this article produces the exact distribution of the KW test statistic for small samples.
Distribution-free analysis of variance
The single-factor ANOVA model for comparing p populations or treatment means assumes that for i = 1, 2, …, p, a random sample of size n is drawn from a normal population with mean $\mu_i$ and variance $\sigma^2$. The normality assumption is required for the validity of the F test, while the validity of the Kruskal-Wallis test for testing equality of the $\mu_i$'s (Kruskal & Wallis, 1952) depends only on the random errors, that is, the amounts by which observed values deviate from their means $\mu_i$, having the same continuous distribution.
Consider a multisample experiment with

$$X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})^T, \quad i = 1(1)p,$$

and

$$X_N = (X_1, X_2, \ldots, X_p),$$

where $N = \sum_{i=1}^{p} n_i$ is the total number of observations in the data set. Suppose that one ranks all the N observations from 1 (smallest $X_{ij}$) to N (largest $X_{ij}$). The permutation test procedure presented in this article computes an empirical estimate of the cumulative distribution of the test statistic T under the null hypothesis. Let the layout of the ranks of the observations $X_{ij}$ be as follows:

$$R_i = (r_{i1}, r_{i2}, \ldots, r_{in_i})^T, \quad i = 1(1)p,$$

and

$$R_N = (R_1, R_2, \ldots, R_p), \quad N = \sum_{i=1}^{p} n_i.$$
Under the null hypothesis, $R_N$ is composed of N independent and identically distributed random variables and hence conditioned on the observed data set. An exhaustive permutation of the ranks yields

$$M = \frac{N!}{\prod_{i=1}^{p} (n_i)!}$$

permutations of the N ranks of the variates of p subsets of size $n_i$, i = 1(1)p, which are equally likely, each having the conditional probability $M^{-1}$.
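The count M can be checked numerically. The following is a minimal Python sketch; the function name `count_distinct_permutations` is hypothetical and is not part of the authors' Fortran implementation.

```python
from math import factorial, prod

def count_distinct_permutations(sizes):
    """Return M = N! / prod(n_i!) for sample sizes n_1, ..., n_p."""
    n_total = sum(sizes)
    # Integer division is exact because M is a multinomial coefficient
    return factorial(n_total) // prod(factorial(n) for n in sizes)

print(count_distinct_permutations([3, 2, 2]))     # 210
print(count_distinct_permutations([6, 6, 6, 6]))  # 2308743493056
```

The second call reproduces the count quoted in this article for a four-sample experiment with six variates per sample.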
When $H_0: \mu_1 = \mu_2 = \cdots = \mu_p$ is true, the N observations are assumed to have come from the same distribution, in which case all possible assignments of the ranks 1, 2, …, N to the p samples are equally likely and the ranks will be intermingled in these samples. Let $R_{ij}$ denote the rank of the jth observation in the ith treatment, $X_{ij}$. Let $R_{i\cdot}$ and $\bar{R}_{i\cdot}$ denote respectively the total and mean of the ranks in the ith treatment. The KW test statistic is a measure of the extent to which the $\bar{R}_{i\cdot}$'s deviate from their common expected value $(N + 1)/2$, and $H_0$ is rejected if the computed value of the statistic indicates too great a discrepancy between observed and expected rank averages. The KW test statistic is

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{p} \frac{R_{i\cdot}^{2}}{n_i} - 3(N+1).$$
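As an illustration of the formula for H, here is a small Python sketch. The function name `kruskal_wallis_h` is hypothetical, and the input is assumed to be the ranks already assigned to each treatment, with no ties.

```python
def kruskal_wallis_h(rank_groups):
    """Compute H from a list of samples, each given as a list of ranks."""
    n_total = sum(len(g) for g in rank_groups)
    # Sum over treatments of (rank total)^2 / n_i
    weighted = sum(sum(g) ** 2 / len(g) for g in rank_groups)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Ranks 1..7 split as sample sizes 3, 2, 2
h = kruskal_wallis_h([[1, 2, 3], [4, 5], [6, 7]])
print(round(h, 4))  # 5.3571
```

For this layout the rank totals are 6, 9 and 13, so H = (12/56)(12 + 40.5 + 84.5) − 24 = 75/14.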
If $H_0$ is rejected when $H \ge c$, then c should be chosen so that the test has level α. That is, c should be the upper-tail critical value of the distribution of H when $H_0$ is true. Under $H_0$, each possible assignment can be enumerated, the value of H determined for each one, and the null distribution obtained by counting the number of times each value of H occurs. When $H_0$ is true, the large-sample approximation is applied if p = 3, $n_i \ge 6$, i = 1(1)3, or p > 3, $n_i \ge 5$, i = 1(1)p (Devore, 1982; Rohatgi, 1984): H has approximately a chi-squared distribution with p – 1 degrees of freedom. An approximate level α test is given by: reject $H_0$ if $H \ge \chi^2_{\alpha,\, p-1}$.
Methodology
The process of obtaining the permutations starts by choosing the test statistic T and the acceptable significance level α. Let $\pi_1, \pi_2, \ldots, \pi_M$ be the set of all distinct permutations of the ranks of the data set in the experiment. The permutation test procedure is as follows:
1. Rank the observations of the experiment as required by the KW test.
2. Compute the observed value of the KW test statistic ($H_1 = t_0$).
3. Obtain a distinct permutation $\pi_i$ of the ranks in Step 1.
4. Compute the KW test statistic $H_i$ for permutation $\pi_i$ in Step 3, that is, $H_i = H(\pi_i)$.
5. Repeat Steps 3 and 4 for i = 2, 3, …, M.
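6. Construct an empirical cumulative distribution for H and obtain the upper-tail probability of the observed value,
$$p_0 = P(H \ge t_0) = \frac{1}{M} \sum_{i=1}^{M} \psi(H_i - t_0),$$
where
$$\psi(H_i - t_0) = \begin{cases} 1, & \text{if } H_i \ge t_0, \\ 0, & \text{if } H_i < t_0. \end{cases}$$
7. Under the empirical distribution, if $p_0 \le \alpha$, reject the null hypothesis.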
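The steps above can be sketched in Python for small samples without ties. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `kw_h`, `allocations` and `kw_permutation_test` and the example data are hypothetical, the enumeration uses `itertools.combinations` instead of the algorithms described in this article, and p0 is taken as the proportion of permutations whose statistic is at least t0, in line with rejecting for large H.

```python
from itertools import combinations

def kw_h(groups):
    """KW statistic for a list of samples given as lists of ranks."""
    n = sum(len(g) for g in groups)
    s = sum(sum(g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * s - 3 * (n + 1)

def allocations(ranks, sizes):
    """Yield every distinct split of `ranks` into groups of the given sizes."""
    if len(sizes) == 1:
        yield [list(ranks)]
        return
    for first in combinations(ranks, sizes[0]):
        rest = [r for r in ranks if r not in first]
        for tail in allocations(rest, sizes[1:]):
            yield [list(first)] + tail

def kw_permutation_test(samples, alpha=0.05):
    # Step 1: rank the pooled observations (assumes no ties)
    pooled = sorted(x for s in samples for x in s)
    rank_of = {x: i + 1 for i, x in enumerate(pooled)}
    observed = [[rank_of[x] for x in s] for s in samples]
    sizes = [len(s) for s in samples]
    # Step 2: observed statistic t0
    t0 = kw_h(observed)
    # Steps 3 to 5: statistic for every distinct permutation of the ranks
    all_ranks = list(range(1, len(pooled) + 1))
    h_values = [kw_h(g) for g in allocations(all_ranks, sizes)]
    # Step 6: proportion of permutations at least as extreme as t0
    p0 = sum(h >= t0 - 1e-12 for h in h_values) / len(h_values)
    # Step 7: decision at level alpha
    return t0, p0, p0 <= alpha

# Hypothetical data with sample sizes 3, 2, 2
t0, p0, reject = kw_permutation_test([[1.1, 2.3, 2.9], [3.5, 4.0], [5.2, 6.8]])
print(round(t0, 4), round(p0, 4), reject)
```

The small tolerance in Step 6 guards against floating-point rounding when comparing equal values of H; exact rational arithmetic would avoid it at some cost in speed.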
The complexity in a permutation test lies in obtaining all the distinct permutations of the observations in a given experiment. For example, a four-sample experiment with six variates in each sample requires 2,308,743,493,056 permutations. The frequency distribution is constructed for all the distinct occurrences of the test statistic, from which the probability distribution of the test statistic is computed.
The number of permutations of the ranks of a two-sample experiment is

$$\sum_{i=0}^{n} \binom{n_1}{i} \binom{n_2}{i}, \quad n = \min(n_1, n_2);$$

see Odiase & Ogbonmwan (2005) for details.
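By Vandermonde's identity, this sum equals the binomial coefficient $\binom{n_1+n_2}{n_1}$, the number of ways to choose which ranks go to the first sample. A short Python check follows; the function name `two_sample_count` is hypothetical.

```python
from math import comb

def two_sample_count(n1, n2):
    """Sum over i of C(n1, i) * C(n2, i), for i = 0..min(n1, n2)."""
    return sum(comb(n1, i) * comb(n2, i) for i in range(min(n1, n2) + 1))

# The sum agrees with C(n1 + n2, n1) for a few small sample sizes
for n1, n2 in [(2, 2), (3, 2), (5, 4)]:
    assert two_sample_count(n1, n2) == comb(n1 + n2, n1)

print(two_sample_count(2, 2))  # 6
```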
After obtaining the permutations of the ranks of a two-sample experiment, the number of ways to permute the ranks of any $n_3$ of the combined ranks $(n_1 + n_2 + n_3)$ of the variates of the three-sample experiment yields

$$\binom{n_1 + n_2 + n_3}{n_3} \sum_{i=0}^{n} \binom{n_1}{i} \binom{n_2}{i} = \binom{\sum_{k=1}^{3} n_k}{n_3} \sum_{i=0}^{n} \binom{n_1}{i} \binom{n_2}{i}.$$
A complete enumeration of the distinct permutations of the ranks of a four-sample experiment yields

$$\binom{\sum_{k=1}^{4} n_k}{n_4} \binom{\sum_{k=1}^{3} n_k}{n_3} \sum_{i=0}^{n} \binom{n_1}{i} \binom{n_2}{i} = \prod_{j=3}^{4} \binom{\sum_{k=1}^{j} n_k}{n_j} \sum_{i=0}^{n} \binom{n_1}{i} \binom{n_2}{i}.$$
Continuing in this manner, for p ≥ 3 treatments, the distinct permutations of the ranks of the variates are enumerated through

$$\prod_{j=3}^{p} \binom{\sum_{k=1}^{j} n_k}{n_j} \sum_{i=0}^{n} \binom{n_1}{i} \binom{n_2}{i} = \prod_{j=1}^{p} \binom{\sum_{k=1}^{j} n_k}{n_j}.$$
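The product of binomial coefficients agrees with the count $M = N!/\prod_{i=1}^{p} (n_i)!$ given earlier. Writing $S_j = \sum_{k=1}^{j} n_k$, so that $S_0 = 0$, $S_p = N$ and $S_j - n_j = S_{j-1}$ (the symbol $S_j$ is introduced here only for brevity), the product telescopes:

```latex
\prod_{j=1}^{p} \binom{S_j}{n_j}
  = \prod_{j=1}^{p} \frac{S_j!}{n_j!\, S_{j-1}!}
  = \frac{S_p!}{S_0!\, \prod_{j=1}^{p} n_j!}
  = \frac{N!}{\prod_{j=1}^{p} n_j!}
  = M .
```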
For the balanced case, $n_1 = n_2 = \cdots = n_p = n$, the number of distinct permutations of the ranks of the variates is $\prod_{j=1}^{p} \binom{jn}{n}$. As an illustration, let

$$R_i = (r_{i1}, r_{i2}, \ldots, r_{in_i})^T, \quad i = 1(1)p,$$

and

$$R_N = (R_1, R_2, \ldots, R_p).$$
Consider a three-sample experiment with observations $x_{ij}$, $n_1 = 3$, $n_2 = n_3 = 2$, that is,

$$\begin{pmatrix} x_{11} & x_{21} & x_{31} \\ x_{12} & x_{22} & x_{32} \\ x_{13} & & \end{pmatrix}.$$

Assuming there are no ties, the configuration of the ranks of the experiment can be taken as

$$\begin{pmatrix} r_{11} & r_{21} & r_{31} \\ r_{12} & r_{22} & r_{32} \\ r_{13} & & \end{pmatrix}.$$

An exhaustive permutation of this experiment yields 210 distinct permutations of the ranks.
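First obtain the 6 permutations of the ranks of the 4 variates of the last two treatments, that is,

$$\begin{pmatrix} r_{21} & r_{31} \\ r_{22} & r_{32} \end{pmatrix},\ 
\begin{pmatrix} r_{22} & r_{21} \\ r_{31} & r_{32} \end{pmatrix},\ 
\begin{pmatrix} r_{22} & r_{21} \\ r_{32} & r_{31} \end{pmatrix},\ 
\begin{pmatrix} r_{21} & r_{22} \\ r_{31} & r_{32} \end{pmatrix},\ 
\begin{pmatrix} r_{21} & r_{22} \\ r_{32} & r_{31} \end{pmatrix},\ 
\begin{pmatrix} r_{31} & r_{21} \\ r_{32} & r_{22} \end{pmatrix}.$$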
There are 35 ways to permute any 3 ranks of the combined 7 ranks of the variates of the experiment. Each of the 35 ways will combine with the 6 permutations of the remaining 4 ranks of the variates making up the last two treatments in any configuration of the experiment, that is,

$$\binom{7}{3} \sum_{i=0}^{2} \binom{2}{i} \binom{2}{i} = 35 \times 6 = 210.$$
Consider the set of all these 210 permutations. For each one of them, compute the test statistic of interest, and hence calculate the probability of the different values of the test statistic based on the number of times each occurs. When ties occur in the data set, the tied observations are usually assigned the mean of the ranks they would have been assigned if they were distinct. Ties do not pose any problem to the permutation test presented in this article.
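The enumeration for this example (sample sizes 3, 2, 2, ranks 1 to 7, no ties) is small enough to carry out directly. The Python sketch below is illustrative only: the name `kw_h_exact` is hypothetical, and exact fractions are an assumption made here so that equal values of the statistic are counted together without floating-point rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def kw_h_exact(groups):
    """KW statistic as an exact fraction for samples given as ranks."""
    n = sum(len(g) for g in groups)
    s = sum(Fraction(sum(g) ** 2, len(g)) for g in groups)
    return Fraction(12, n * (n + 1)) * s - 3 * (n + 1)

ranks = range(1, 8)
counts = Counter()
for first in combinations(ranks, 3):        # 35 choices for the first sample
    rest = [r for r in ranks if r not in first]
    for second in combinations(rest, 2):    # 6 splits of the remaining 4 ranks
        third = [r for r in rest if r not in second]
        counts[kw_h_exact([first, second, third])] += 1

total = sum(counts.values())                # 35 * 6 = 210
# Exact null distribution: each value of H with its probability
null_dist = {h: Fraction(c, total) for h, c in sorted(counts.items())}
for h, prob in null_dist.items():
    print(float(h), prob)
```

Each key of `null_dist` is an attainable value of H and each value is its exact probability under the null hypothesis.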
Assuming no ties, the experiment just presented will have ranks {1, 2, 3, 4, 5, 6, 7} represented as

$$\begin{pmatrix} 1 & 4 & 6 \\ 2 & 5 & 7 \\ 3 & & \end{pmatrix}.$$

Permutation algorithms
Considering the associated complexity in a complete enumeration of the distinct permutations necessary for the compilation of the distribution of the KW test statistic, computer algorithms for an exhaustive enumeration are now presented.
The first step in developing a permutation algorithm is to formulate an initial configuration of the ranks of the variates of an experiment by taking the trivial configuration given below:

$$\begin{pmatrix}
1 & n_1 + 1 & n_1 + n_2 + 1 & \cdots & \sum_{i=1}^{p-1} n_i + 1 \\
2 & \vdots & \vdots & & \vdots \\
\vdots & \vdots & \vdots & & \vdots \\
n_1 & n_1 + n_2 & n_1 + n_2 + n_3 & \cdots & \sum_{i=1}^{p} n_i
\end{pmatrix}$$
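A minimal Python sketch of the trivial configuration, in which each sample receives the next block of consecutive ranks. The function name `initial_configuration` is hypothetical; the logic follows the "Generate ranks" step of Algorithm 2, where each rank is an offset plus a within-sample index.

```python
def initial_configuration(sizes):
    """Return the trivial rank layout: sample i gets the next n_i consecutive ranks."""
    layout, offset = [], 0
    for n in sizes:
        layout.append(list(range(offset + 1, offset + n + 1)))
        offset += n  # running total of the sizes of the earlier samples
    return layout

print(initial_configuration([3, 2, 2]))  # [[1, 2, 3], [4, 5], [6, 7]]
```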
Algorithm (PERMUTATION) of Odiase & Ogbonmwan (2005) can handle the permutation of the ranks of the variates in a two-sample experiment. Algorithm 1 in this article generates the distinct permutations of the ranks of the variates of a three-sample experiment and relies on the permutation of the ranks of the variates in a two-sample experiment. Algorithm 2 calls Algorithm 1 and then generates the distinct permutations of the ranks of the variates of a four-sample experiment. Algorithms 1 and 2 can be extended to take care of the sample sizes under consideration.
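The nesting described here, where the algorithm for p samples calls the one for p − 1 samples, can be sketched recursively in Python. This is an assumption-laden illustration rather than a transcription of Algorithms 1 and 2: the generator name `distinct_permutations` is hypothetical, and `itertools.combinations` stands in for the two-sample Algorithm (PERMUTATION).

```python
from itertools import combinations

def distinct_permutations(ranks, sizes):
    """Yield each distinct allocation of `ranks` to samples of the given sizes."""
    if len(sizes) == 2:
        # Base case: a two-sample experiment
        for first in combinations(ranks, sizes[0]):
            second = tuple(r for r in ranks if r not in first)
            yield (first, second)
        return
    # Choose the ranks of the last sample, then recurse on the remaining samples
    for last in combinations(ranks, sizes[-1]):
        remaining = tuple(r for r in ranks if r not in last)
        for head in distinct_permutations(remaining, sizes[:-1]):
            yield head + (last,)

sizes = (2, 2, 1, 1)  # a small four-sample layout chosen for illustration
perms = list(distinct_permutations(tuple(range(1, 7)), sizes))
print(len(perms))  # 180
```

The count 180 equals 6!/(2! 2! 1! 1!), as the enumeration formula requires. The sketch needs at least two samples.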
Results
Critical values for the KW test statistic
The algorithms were implemented in
Intel Visual Fortran. Figures 1 – 10 show the
small sample distribution of the KW test
statistic for different sample sizes for 3 and 4
samples. The resulting tables of exact critical
values as obtained from the exact permutation
distribution of the KW test statistic are
presented in Tables 1 and 2.
Conclusion
Figures 1 and 2 reveal that the chi-squared distribution, which is the large-sample approximation of the distribution of the KW test statistic, poorly approximates the exact distribution of the KW test statistic for very small sample sizes. As sample sizes increase, the shape of the chi-squared distribution begins to emerge, as seen in Figures 3 – 10.
The critical values for a test statistic are usually determined by cutting off the most extreme 100α% of the theoretical frequency distribution of the test statistic, where α is the level of significance (see Siegel and Castellan, 1989). The critical values of the KW test statistic contained in Tables 1 and 2 are obtained from the enumeration of all the distinct permutations of the ranks of the variates in an experiment. These critical values are exact and therefore ensure that the probability of a type I error in decisions arising from the use of the KW test is exactly α.
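Cutting off the most extreme 100α% of an exact distribution can be sketched as follows in Python. The names `splits` and `exact_critical_value` are hypothetical, and the convention used (the smallest attainable value c with P(H ≥ c) ≤ α) is an assumption about how the cut-off is read from a discrete distribution.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def splits(ranks, sizes):
    """Yield every distinct allocation of ranks to samples of the given sizes."""
    if not sizes:
        yield ()
        return
    for head in combinations(ranks, sizes[0]):
        rest = tuple(r for r in ranks if r not in head)
        for tail in splits(rest, sizes[1:]):
            yield (head,) + tail

def exact_critical_value(sizes, alpha):
    """Smallest attainable c with P(H >= c) <= alpha, and that tail probability."""
    n = sum(sizes)
    counts = Counter()
    for groups in splits(tuple(range(1, n + 1)), tuple(sizes)):
        s = sum(Fraction(sum(g) ** 2, len(g)) for g in groups)
        counts[Fraction(12, n * (n + 1)) * s - 3 * (n + 1)] += 1
    total = sum(counts.values())
    tail, best = 0, None
    for h in sorted(counts, reverse=True):  # walk down from the largest H
        tail += counts[h]
        if Fraction(tail, total) > alpha:
            break
        best = (h, Fraction(tail, total))
    return best  # None if even the largest H exceeds alpha

c, p = exact_critical_value((3, 2, 2), Fraction(5, 100))
print(float(c), float(p))
```

For sample sizes 3, 2, 2 and α = 0.05 this gives c = 33/7 with exact tail probability 10/210, so the attained level is below the nominal 0.05 because the distribution is discrete.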
Restore original ranks
43: for II0 ← 1, P do
44:     for JJ0 ← 1, K(II0) do
45:         Z1(JJ0, II0) ← Z(JJ0, II0)
46:     end for
47: end for

Algorithm 2 (4 samples)
Generate ranks
1: KK ← 0
2: for I ← 1, P do
3:     KK ← KK + K(I − 1)
4:     for J ← 1, K(I) do
5:         Z(J, I) ← KK + J
6:         Z1(J, I) ← Z(J, I)
7:         Y(J, I) ← Z(J, I)
8:         Y1(J, I) ← Y(J, I)
9:         X(J, I) ← Z(J, I)
10:        X1(J, I) ← X(J, I)
11:    end for
12: end for
13: call Algorithm 1
14: for R2 ← 1, P do
15:     for R3 ← 1, K(R2) do
16:         Y(R3, R2) ← Z1(R3, R2)
17:         Y1(R3, R2) ← Z1(R3, R2)
18:     end for
19: end for
Adjust Algorithm 1 as follows and insert here:
20: Change all the loop variables
21: Change the variable names TEMPA, TEMPA1, TEMPA2, TT, TT1, …
22: Replace Steps 10, 22, … with [Variable name ← P – 2, P]
23: Replace all [P – 2] with [P – 3]
24: Replace [Y1] with [Z1]
25: Replace [Obtain a distinct permutation of ranks in the last two samples] with [Call Algorithm 1]
26: Construct the empirical distribution of H
27: Sort values of H in ascending order of magnitude
28: Construct the CDF for H
Sample Size: 5,5,2,1; 5,5,2,2; 5,5,3,1; 5,5,3,2; 5,5,3,3; 5,5,4,1; 5,5,4,2; 5,5,4,3; 5,5,4,4; 5,5,5,1; 5,5,5,2; 5,5,5,3
References
Agresti, A. (1992). A survey of exact inference for contingency tables. Statistical Science, 7, 131-177.
Bagui, S., & Bagui, S. (2004). An algorithm and code for computing exact critical values for the Kruskal-Wallis nonparametric one-way ANOVA. Journal of Modern Applied Statistical Methods, 3, 498-503.
Barndorff-Nielsen, O. E., & Hall, P. (1988). On the level-error after Bartlett adjustment of the likelihood ratio statistic. Biometrika, 75, 374-378.
Conover, W. J. (1999). Practical nonparametric statistics. New York: Wiley.
Devore, J. L. (1982). Probability and statistics for engineering and the sciences. California: Brooks/Cole Publishing Company.
Efron, B. (1979). Bootstrap methods: Another look at the jackknife. The Annals of Statistics, 7, 1-26.
Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman and Hall.
Fisher, R. A. (1935). The design of experiments. Edinburgh: Oliver and Boyd.
Good, P. (2000). Permutation tests: A practical guide to resampling methods for testing hypotheses (2nd ed.). New York: Springer-Verlag.
Headrick, T. C. (2003). An algorithm for generating exact critical values for the Kruskal-Wallis one-way ANOVA. Journal of Modern Applied Statistical Methods, 2, 268-271.
Hoeffding, W. (1952). Large sample power of tests based on permutations of observations. The Annals of Mathematical Statistics, 23, 169-192.
Kruskal, W. H., & Wallis, W. A. (1952). Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47, 583-634.
Odiase, J. I., & Ogbonmwan, S. M. (2005). An algorithm for generating unconditional exact permutation distribution for a two-sample experiment. Journal of Modern Applied Statistical Methods, 4, 319-332.
Owen, A. B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75, 237-249.
Pesarin, F. (2001). Multivariate permutation tests. New York: Wiley.
Rohatgi, V. K. (1984). Statistical inference. New York: John Wiley & Sons.
Scheffe, H. (1943). Statistical inference in the nonparametric case. The Annals of Mathematical Statistics, 14, 305-332.
Siegel, S., & Castellan, N. J. (1989). Nonparametric statistics for the behavioural sciences (3rd ed.). New York: McGraw-Hill.