An Algorithm And Code For Computing Exact Critical Values For The Kruskal-Wallis Nonparametric One-Way ANOVA

Journal of Modern Applied Statistical Methods, Dec 2004

In this article, an algorithm and code to compute exact critical values (or percentiles) for the Kruskal-Wallis test on k independent treatment populations with equal or unequal sample sizes using Visual Basic (VB.NET) are provided. The program can calculate critical values for any k, sample sizes (n_i), and significance level (α). An exact critical value table for k = 4 is also developed. The table will be useful to practitioners since it is not available in standard nonparametric statistics texts. The program can also be used to compute any other critical values.

Sikha Bagui and Subhash Bagui
The University of West Florida, Pensacola, USA

Sikha Bagui is an Assistant Professor in the Department of Computer Science. Her areas of research are database and database design, data mining, pattern recognition, and statistical computing. Subhash Bagui is a Professor in the Department of Mathematics and Statistics. His areas of research are statistical classification and pattern recognition, bio-statistics, construction of designs, tolerance regions, statistical computing and reliability.

Introduction

Headrick (2003) presented an algorithm for generating exact critical values for the Kruskal-Wallis (K-W) one-way ANOVA using Fortran 77. In this article we present Visual Basic (VB.NET) code for generating exact critical values for K-W tests. VB.NET is more user friendly and more accessible than Fortran 77, so the proposed VB.NET program can serve as a simpler alternative where Fortran 77 is not available.

When one or more treatment populations violate the normality assumption or the homogeneity of treatment population variances, it is customary to use the Kruskal-Wallis (1952) rank-based nonparametric test as an alternative to the conventional F test for one-way analysis of variance (ANOVA) for k independent treatment populations. To find the critical values of K-W tests, one needs the null distribution of the K-W statistic. In one-way ANOVA, the null hypothesis is that the effects of all treatment populations are the same. It is therefore reasonable to calculate the critical values from the null distribution of the K-W statistic derived under the assumption that all observations in the samples of sizes n_1, n_2, ..., n_k come from the same population, where n_i is the sample size of the i-th treatment population. The K-W statistic depends on the rank sums of the treatment populations, which are obtained from the combined ranks of the N = n_1 + n_2 + ... + n_k observations.

It is known that the large-sample null distribution of the K-W statistic is approximately chi-square (χ²) with (k − 1) degrees of freedom (d.f.). Conover (1999) suggested that whenever k ≥ 4 and n_i > 5 for each treatment population, a chi-square critical value with (k − 1) d.f. (χ²_{α;k−1}) be used to test the null hypothesis. But for small samples, say n_i ≤ 5, the null distribution of the K-W statistic is not known and the chi-square approximation is not adequate. Common nonparametric textbooks such as Conover (1999), Gibbons (1992), and Siegel and Castellan (1989) provide exact critical values for the K-W test for k = 3 and n_i ≤ 5 observations per treatment population. Major statistical software such as MINITAB and SPSS provides only the asymptotic P-value of the K-W statistic.

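For reference, the large-sample rule described above can be written out explicitly. The decision rule restates the text; the numerical values are standard upper-tail chi-square critical values with 3 d.f. (the k = 4 case), rounded to three decimals.

```latex
\text{Reject } H_0 \text{ at level } \alpha \text{ if } H > \chi^2_{\alpha;\,k-1}.
\qquad
k = 4:\quad
\chi^2_{0.10;\,3} \approx 6.251,\quad
\chi^2_{0.05;\,3} \approx 7.815,\quad
\chi^2_{0.025;\,3} \approx 9.348,\quad
\chi^2_{0.01;\,3} \approx 11.345
```
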
In view of all this, we provide a VB.NET program as an alternative to Fortran 77 to compute exact critical points for K-W tests, and also report a table of exact critical values for the K-W test for k = 4 treatment populations and n_i ≤ 5 observations per treatment population. Even though the number of ways N ranks can be divided into groups of sizes n_1, n_2, ..., n_k grows rapidly, our VB.NET program works well for reasonable values of k and n_i.

Methodology

To calculate the K-W statistic, we first generate N uniform pseudo-random numbers from the interval (0, 1). We assume that the probability of a tie is zero. The random variates are then ranked to form a permutation of the numbers from 1 to N. The program sequentially divides the permutation of ranks into k classes according to the user-specified sample sizes n_1, n_2, ..., n_k. The program then calculates the rank sum of each treatment population, R_j, and computes the value of the K-W statistic

$$H = \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1).$$

This process is replicated a sufficient number of times until the null distribution of H is modeled adequately. The program then selects the critical value associated with a percentile of 0.90, 0.95, 0.975, or 0.99 (equivalently, an alpha level of 0.10, 0.05, 0.025, or 0.01). In some cases the same returned value may correspond to two different alpha levels, since H is discrete and each returned value holds for a range of P-values.

For example, given α = 0.05, k = 3, and n_i = 5, our VB.NET program returns a critical value of 5.659997 with 100,000 replications, which is the same as the value reported by Headrick (2003) for these settings. Also, with an adequate number of runs, our VB.NET program yields the same values reported by Conover (1999) in Table A8, which covers k = 3, n_i ≤ 5, and α = 0.10, 0.05, and 0.01. Table 1 provides critical values of the K-W statistic for k = 4, n_i ≤ 5, and α = 0.10, 0.05, 0.025, and 0.01.

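The simulation steps in the Methodology can be sketched as a small console program. This is a minimal sketch under stated assumptions, not the authors' Appendix listing: the names `KruskalWallisSketch`, `ComputeH`, and `SimulateCriticalValue`, the fixed seed, and the rule used to pick the percentile from the sorted replicates are all illustrative choices, and the output is written to the console rather than to a file.

```vbnet
Imports System
Imports System.Linq

Module KruskalWallisSketch

    ' Computes H = 12 / (N(N+1)) * sum_j (R_j^2 / n_j) - 3(N+1),
    ' where consecutive blocks of the rank array form the k groups.
    Function ComputeH(ranks As Integer(), sizes As Integer()) As Double
        Dim total As Integer = ranks.Length
        Dim weighted As Double = 0.0
        Dim pos As Integer = 0
        For Each groupSize As Integer In sizes
            Dim rankSum As Double = 0.0
            For i As Integer = pos To pos + groupSize - 1
                rankSum += ranks(i)
            Next
            weighted += rankSum * rankSum / groupSize
            pos += groupSize
        Next
        Return 12.0 / (total * (total + 1.0)) * weighted - 3.0 * (total + 1)
    End Function

    ' Simulates the null distribution of H and returns the requested percentile.
    ' The seed parameter is an assumption added here for reproducibility.
    Function SimulateCriticalValue(sizes As Integer(), percentile As Double,
                                   replications As Integer, seed As Integer) As Double
        Dim total As Integer = sizes.Sum()
        Dim rng As New Random(seed)
        Dim hValues(replications - 1) As Double
        Dim draws(total - 1) As Double
        Dim ranks(total - 1) As Integer

        For r As Integer = 0 To replications - 1
            For i As Integer = 0 To total - 1
                draws(i) = rng.NextDouble()
                ranks(i) = i + 1
            Next
            ' Sorting the draws while carrying the labels 1..N along gives a
            ' uniformly random permutation of 1..N (same distribution as the ranks).
            Array.Sort(draws, ranks)
            hValues(r) = ComputeH(ranks, sizes)
        Next

        ' Assumed percentile rule: smallest simulated H with at least
        ' the requested fraction of replicates at or below it.
        Array.Sort(hValues)
        Dim index As Integer = CInt(Math.Ceiling(percentile * replications)) - 1
        index = Math.Max(0, Math.Min(replications - 1, index))
        Return hValues(index)
    End Function

    Sub Main()
        Dim sizes As Integer() = New Integer() {5, 5, 5}
        Dim critical As Double = SimulateCriticalValue(sizes, 0.95, 100000, 12345)
        Console.WriteLine("Estimated 0.95 percentile of H: " & critical.ToString("F4"))
    End Sub

End Module
```

Because the estimate comes from random replicates, the printed value can change with the seed and the number of replications; rerunning with increasing replication counts is one way to check that it has stabilized.
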
At the bottom of Table 1, the asymptotic chi-square critical values of H from a chi-square critical value table are also provided. The notation K_α denotes the 100α% percentile of the K-W statistic, which is equivalent to the (1 − α)-level critical value of the K-W statistic. This table will be very useful to practitioners because it is not available in standard nonparametric textbooks. The critical values in Table 1 were generated using 1 million replications in each case.

Conclusion

For large N, the program needs a large number of replications to model the null distribution of the K-W statistic H adequately. The number of replications should therefore be increased in steps, such as 10,000, 50,000, 100,000, 500,000, and 1,000,000, stopping once two consecutive values are almost the same. If there are k independent treatment populations, then at least N!/(n_1! n_2! ... n_k!) replications are necessary for a near fit of H. For a good fit of H, many more replications than N!/(n_1! n_2! ... n_k!) are needed. The VB.NET code is given in the Appendix. The VB.NET program is very user friendly: it allows the user to provide the number of replications, the total number of observations, the percentile fraction, and the separate class sizes, from which the program returns a critical value.

References

Conover, W. J. (1999). Practical nonparametric statistics (3rd ed.). New York: Wiley.
Gibbons, J. D. (1992). Nonparametric statistical inference (3rd ed.). New York: Marcel Dekker.
Headrick, T. C. (2003). An algorithm for generating exact critical values for the Kruskal-Wallis one-way ANOVA. Journal of Modern Applied Statistical Methods, 2, 268-271.
Kruskal, W. H., & Wallis, W. A. (1952). Use of ranks in one-criterion analysis of variance. Journal of the American Statistical Association, 47, 583-621.
Minitab (2000). Minitab for Windows, release 13.3. Minitab Inc., State College, PA.
Siegel, S., & Castellan, N. J. (1989). Nonparametric statistics for the behavioral sciences (2nd ed.). New York: McGraw-Hill.
SPSS (2002). SPSS for Windows, version 11.0. SPSS, Inc., Chicago, IL.

Appendix

    Public Class Form1
        Inherits System.Windows.Forms.Form
        'Kruskal-Wallis One-Way Anova
        Dim a, m, n, i, j, k, v, x, y, z, sumy, row, count As Integer
        Dim output, prompt1, prompt2, prompt_value, group_value As String
        Dim output2 As String
        Dim sums, sumz, H, percentile As Single
        Dim file1 As System.IO.StreamWriter
        [...]
            For i = 1 To array1.GetUpperBound(0)
                array2(i) = array1(i)
        [...]
            output = array4(v)
            output2 = percentile & " = " & output
            file1.WriteLine(output2)
            file1.Close()
        End Sub
        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        [...]

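The replication guideline in the Conclusion refers to the number of distinct ways N ranks can be split into groups of sizes n_1, ..., n_k, that is, N!/(n_1! n_2! ... n_k!). The following sketch shows one way to compute that count for given class sizes. It is illustrative only: the module and function names are hypothetical, and `System.Numerics.BigInteger` is used here (an assumption, not something the article mentions) so that the count does not overflow; on the .NET Framework this needs a reference to the System.Numerics assembly.

```vbnet
Imports System
Imports System.Numerics

Module ArrangementCount

    ' Returns N! / (n_1! n_2! ... n_k!), built up as a product of binomial
    ' coefficients so that every intermediate division is exact.
    Function CountArrangements(sizes As Integer()) As BigInteger
        Dim result As BigInteger = BigInteger.One
        Dim placed As Integer = 0
        For Each groupSize As Integer In sizes
            For i As Integer = 1 To groupSize
                placed += 1
                ' After this step the factor contributed by the current
                ' group equals C(placed, i), which is always an integer.
                result = BigInteger.Divide(
                    BigInteger.Multiply(result, New BigInteger(placed)),
                    New BigInteger(i))
            Next
        Next
        Return result
    End Function

    Sub Main()
        ' k = 3, n_i = 5: 15! / (5!)^3
        Console.WriteLine(CountArrangements(New Integer() {5, 5, 5}).ToString())     ' 756756
        ' k = 4, n_i = 5: 20! / (5!)^4
        Console.WriteLine(CountArrangements(New Integer() {5, 5, 5, 5}).ToString())  ' 11732745024
    End Sub

End Module
```

The two calls in `Main` correspond to the largest cases of the k = 3 and k = 4 tables discussed in the text and give a sense of how quickly the count grows with k.
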


Full text (PDF): http://digitalcommons.wayne.edu/cgi/viewcontent.cgi?article=1594&context=jmasm

Sikha Bagui, Subhash Bagui. An Algorithm And Code For Computing Exact Critical Values For The Kruskal-Wallis Nonparametric One-Way ANOVA. Journal of Modern Applied Statistical Methods, 2004.