Detection of interactions between a dichotomous moderator and a continuous predictor in moderated multiple regression with heterogeneous error variance

Behavior Research Methods, Feb 2009

Moderated multiple regression (MMR) has been widely used to investigate the interaction or moderating effects of a categorical moderator across a variety of subdisciplines in the behavioral and social sciences. In view of the frequent violation of the homogeneity of error variance assumption in MMR applications, the weighted least squares (WLS) approach has been proposed as one of the alternatives to the ordinary least squares method for the detection of the interaction effect between a dichotomous moderator and a continuous predictor. Although the existing result is informative in assuring the statistical accuracy and computational ease of the WLS-based method, no explicit algebraic formulation and underlying distributional details are available. This article aims to delineate the fundamental properties of the WLS test in connection with the well-known Welch procedure for regression slope homogeneity under error variance heterogeneity. With elaborately systematic derivation and analytic assessment, it is shown that the notion of WLS is implicitly embedded in the Welch approach. More importantly, extensive simulation study is conducted to demonstrate the conditions in which the Welch test will substantially outperform the WLS method; they may yield different conclusions. Welch’s solution to the Behrens-Fisher problem is so entrenched that the use of its direct extension within the linear regression framework can arguably be recommended. In order to facilitate the application of Welch’s procedure, the SAS and R computing algorithms are presented. The study contributes to the understanding of methodological variants for detecting the effect of a dichotomous moderator in the context of moderated multiple regression. Supplemental materials for this article may be downloaded from brm.psychonomic-journals.org/content/supplemental.

Gwowen Shieh, National Chiao Tung University, Hsinchu, Taiwan
Researchers are often interested in determining whether the direction or strength of the relation between a predictor variable and a response variable varies with the value of a third, or moderator, variable. The existence of moderating effects implies that the predictor has a fundamentally distinct impact on the response across levels of the moderator. Such differential prediction arises in diverse research settings, such as gender studies. Essentially, moderated relationships can be conceptualized and analyzed in terms of interaction effects between the predictor and moderator variables. It is widely recognized that moderated multiple regression (MMR) has become the major technique for testing hypotheses about moderating effects of categorical variables in psychology, management, education, and related disciplines; see Aguinis (2004) for general and illuminating expositions. When the null hypothesis of no moderating effect is rejected, it indicates that the predictor-response relationship is stronger for one moderator-based group than for another. Neglect of an interaction effect or failure to detect a moderating effect generally leads to prediction bias in favor of subjects in some groups and against members of the other groups. The procedure for detecting the effects of categorical moderator variables is methodologically identical to that for testing the equality of regression slope coefficients in two or more regression lines. Accordingly, the test can be conducted with the ordinary least squares (OLS) partial F test in traditional MMR analysis. However, numerous studies have shown that MMR may often yield erroneous conclusions; in particular, many theory-based hypotheses of moderated phenomena are frequently not supported. In response to these failures to detect soundly hypothesized moderating effects, several researchers have investigated the accuracy of MMR for evaluating moderating effects under various conditions.
For example, Aguinis and Stone-Romero (1997) and Stone-Romero, Alliger, and Aguinis (1994) provided thorough treatments of the methodological artifacts and statistical implications associated with the effects of dichotomous moderators. More specifically, considerable attention has been devoted to raising awareness of the often violated assumption of homogeneous error variance when assessing moderating effects of categorical variables; see Aguinis, Petersen, and Pierce (1999) and Aguinis and Pierce (1998) for comprehensive descriptions and excellent reviews. It should be noted that the 12-year review of Aguinis et al. showed that the homogeneity assumption was violated in approximately 40% to 60% of the MMR tests reported in three prestigious journals with rigorous methodological standards: Academy of Management Journal, Journal of Applied Psychology, and Personnel Psychology. Hence, they suggested that the violation is at least as common for tests reported in other journals in organizational science. Naturally, the accuracy of the OLS-based F test depends on the strong assumption of homogeneous within-group error variance. Several empirical studies have been conducted to ascertain the effects of heterogeneous error variance on the performance of the F test. For detailed discussions, see Alexander and DeShon (1994), DeShon and Alexander (1994, 1996), Dretzke, Levin, and Serlin (1982), and Overton (2001). These Monte Carlo simulations concluded that the Type I error rate and power of the regular MMR F test may be substantially affected when group sample sizes are equal, and severely distorted when group sample sizes are unequal. Thus, researchers may commit a Type I error or a Type II error, depending on the specific sample characteristics and postulated model formulations. Consequently, a study may discover a spurious interaction effect (Type I error) or mistakenly dismiss an important moderator variable (Type II error).
In either case, the result impedes theoretical development and scientific advancement of moderation research. Clearly, the regular F test is not a proper procedure because it does not account for heterogeneity of within-group error variance, and continual efforts have explored alternative methods for testing hypotheses about categorical moderators. The complexity of heterogeneous error variance has prompted numerous investigations, which offer various approximations and computing algorithms for solving the problem. Since the problem of heterogeneity of error variance in testing for equality of regression slopes is statistically equivalent to the problem of variance heterogeneity in ANOVA, the available tests for examining mean equality under variance heterogeneity in ANOVA can be applied to the detection of categorical moderating effects in MMR. Notably, the methods of Alexander and Govern (1994), James (1951), and Welch (1951) for testing the equality of $K$ ($\geq 2$) independent means under heterogeneity of variance have been adapted to the tests for the equality of $K$ independent regression slopes under heterogeneous error variance in Alexander and DeShon (1994), DeShon and Alexander (1994, 1996), and Dretzke et al. (1982). The exact formulations and test procedures of the three approximations $A$, $J$, and $F^*$ developed from Alexander and Govern (1994), James (1951), and Welch (1951), respectively, can be found in DeShon and Alexander (1996). On the basis of the numerical examinations for the two-group situation in the abovementioned studies, the performance of the three methods $A$, $J$, and $F^*$ was essentially equivalent for $K = 2$, and choosing the best approximation is difficult.
Nonetheless, the $A$ approach was shown to possess a number of desirable characteristics and was recommended by DeShon and Alexander (1996) as the general procedure in lieu of regular MMR F statistics for testing categorical moderator hypotheses when the assumption of homogeneity of within-group error variance is not tenable. Although programs for computing $A$, $J$, and $F^*$ are available in Aguinis et al. (1999) and DeShon and Alexander (1996), it was pointed out in Overton (2001) that these test procedures do not support follow-up analyses. Accordingly, Overton proposed a weighted least squares (WLS) approach for the $K = 2$ groups that maintains MMR within the familiar multiple regression framework. Moreover, it was demonstrated in Overton that the WLS F test is not only accurate, but can also be easily executed using the standard procedures of the SAS statistical package. Therefore, one advantage of the WLS-based method over the existing approaches is that the corresponding follow-up analysis of moderating effects can be readily performed using the embedded features of SAS or other popular software systems. Consequently, there appears to be a lack of consensus in the literature on which method is most appropriate for detecting the effects of a dichotomous moderator variable in MMR under heterogeneous error variance. Although the notion of the WLS procedure and its corresponding computing aspect are thoroughly presented in Overton (2001), no explicit analytical form of the test statistic was available. Even though Overton noted that the WLS-based MMR and the OLS-based MMR yield identical regression coefficient estimators but differ in the standard errors of the coefficient estimators, no further detailed expressions were given. It is of both methodological importance and practical interest to obtain the exact formulations of the coefficient estimators, associated estimated variance, and resulting test statistic in the context of the WLS principle.
To our knowledge, no research to date has examined the theoretical issue of WLS-based procedures in greater detail. On the other hand, the hypothesis testing procedure of F-type statistics for assessing the moderating effects of a dichotomous variable is nondirectional in nature. Depending on the purpose of the study, a particular one-sided test might be preferable. Hence, it is more flexible and informative to conduct the test with a t statistic, since it can be used for one-sided alternatives, whereas a partial F test cannot. Furthermore, a confidence interval may be more useful for interpreting the magnitude of the moderating effect. As in many applications, a t statistic can be naturally adopted to construct a confidence interval. However, this is apparently not the case for an F test. Since the detection of slope differences between two regression lines or interactions between a dichotomous moderator and a continuous predictor represents the vast majority of MMR research, this article attempts to derive the analytical results for the WLS analysis and the related criteria in a unified and relatively transparent way. The general formulations of the aforementioned techniques for comparing the equality of regression slopes of two and more groups are appealing and yet may not be necessary and advantageous for the current focus on the problem of the two-group situation. As discussed earlier, the nonsquared form of a partial F statistic or a partial t expression proves to be more versatile than the squared form of the F test in this particular MMR application. The examination of the established results helps strengthen the importance of the problem and reveals the closer functional relation between the existing vital approaches. It will be explicitly shown later that the WLS-based methods are closely related to the well-known Welch procedures. 
In the process, we also hope to account for some important findings and ambiguous issues that may have been overlooked in the literature. The rest of the article is organized as follows. The next section describes the fundamental theory and analytical results for the inference of interactions between a dichotomous moderator and a continuous predictor in the context of MMR with heterogeneous error variance. Then the emphasis is placed on the underlying similarities and differences between the Welch and WLS methods. Extensive numerical investigations are conducted to exemplify the critical and subtle discrepancy between the two approaches. In order to enhance the application of the prevalent Welch procedure, SAS and R programs are provided to ease the inferential analyses of hypothesis testing imposed by the technique.

Relationship Between Welch and WLS Procedures

Consider the following two simple linear regression models of the form

$$Y_{1i} = \beta_{10} + \beta_{11} X_{1i} + \varepsilon_{1i} \quad \text{and} \quad Y_{2j} = \beta_{20} + \beta_{21} X_{2j} + \varepsilon_{2j}, \tag{1}$$

where $\varepsilon_{1i}$ and $\varepsilon_{2j}$ are iid $N(0, \sigma_1^2)$ and $N(0, \sigma_2^2)$ random variables, respectively, $i = 1, \ldots, N_1$, and $j = 1, \ldots, N_2$. In view of moderated multiple regression with the focus on the two-group parallelism problem, it is often more illuminating to combine the two models in Equation 1 as the following multiple regression model with a dichotomous moderator variable:

$$Y_i = \beta_{20} + \beta_{21} X_i + \delta_0 Z_i + \delta_1 X_i Z_i + \varepsilon_i, \quad i = 1, \ldots, N, \tag{2}$$

where $N = N_1 + N_2$, $Z_i = 1$ for Group 1 and $Z_i = 0$ for Group 2, $\delta_0 = \beta_{10} - \beta_{20}$, and $\delta_1 = \beta_{11} - \beta_{21}$. Methodologically, the existence and magnitude of regression coefficient $\delta_1 = \beta_{11} - \beta_{21}$, representing the influence of the moderating effect, is the major concern for the analysis of the moderated multiple regression model in Equation 2. In the present section, we discuss the statistical tests for the homogeneity of slopes of two simple regression models by summarizing the fundamental results from different disciplines; this not only underscores the importance of the problem, but also provides a comprehensive review of various solutions for moderation analysis. Also, more importantly, detailed analytical examinations are conducted to demonstrate the relationship and discrepancy of existing prominent methods. Conceivably, the well-supported recommendations offered in the presentation may be useful for empirical research.

Subgroup OLS

The fundamental statistical results are well known for the two simple regression models with normal error assumption given in Equation 1; for example, see Rencher (2000). Suppose that $\hat{\beta}_{10}$, $\hat{\beta}_{11}$, $\hat{\beta}_{20}$, and $\hat{\beta}_{21}$ are the subgroup OLS estimators of $\beta_{10}$, $\beta_{11}$, $\beta_{20}$, and $\beta_{21}$, respectively. Although it is not necessary to apply matrix algebra to derive these estimators for the two simple linear models, the matrix formulations presented in Appendix A will later be shown to be useful for demonstrating the relationship between WLS and OLS estimators. We are especially concerned with the statistical properties associated with the two estimators of slope coefficients $\hat{\beta}_{11}$ and $\hat{\beta}_{21}$, specifically, $\hat{\beta}_{11} \sim N(\beta_{11}, \sigma_1^2/SSX_1)$ and $\hat{\beta}_{21} \sim N(\beta_{21}, \sigma_2^2/SSX_2)$, where $SSX_1 = \sum_{i=1}^{N_1} (X_{1i} - \bar{X}_1)^2$ and $SSX_2 = \sum_{j=1}^{N_2} (X_{2j} - \bar{X}_2)^2$; $\bar{X}_1$ and $\bar{X}_2$ are the respective sample means of the $X_{1i}$ and $X_{2j}$ observations. For inferential purposes, $\hat{\sigma}_1^2 = SSE_1/(N_1 - 2)$ and $\hat{\sigma}_2^2 = SSE_2/(N_2 - 2)$ are the usual unbiased estimators of $\sigma_1^2$ and $\sigma_2^2$. Note that $\hat{\beta}_{11} - \hat{\beta}_{21} \sim N(\beta_{11} - \beta_{21}, \sigma_1^2/SSX_1 + \sigma_2^2/SSX_2)$. Moreover, the error sums of squares satisfy $SSE_1 \sim \sigma_1^2 \chi^2(N_1 - 2)$ and $SSE_2 \sim \sigma_2^2 \chi^2(N_2 - 2)$, where $\chi^2(N_1 - 2)$ and $\chi^2(N_2 - 2)$ are chi-square distributions with $N_1 - 2$ and $N_2 - 2$ degrees of freedom, respectively.

Welch Procedures

As a straightforward extension of Welch's well-known approximate degrees of freedom or approximate t solution to the Behrens-Fisher problem of comparing the difference between two means, Welch (1938, p. 356) also described the methods for comparing the difference in two regression slope coefficients within the simple regression framework of Equation 1. For the purpose of testing the hypothesis $H_0$: $\beta_{11} = \beta_{21}$, one of the methods proposed by Welch (1938) is to consider the approximate distribution $V \mathrel{\dot\sim} t(\nu)$ when $\beta_{11} = \beta_{21}$, where

$$V = \frac{\hat{\beta}_{11} - \hat{\beta}_{21}}{(\hat{\sigma}_1^2/SSX_1 + \hat{\sigma}_2^2/SSX_2)^{1/2}} \tag{3}$$

and $t(\nu)$ is a t distribution with degrees of freedom $\nu = \nu(\sigma_1^2/SSX_1, \sigma_2^2/SSX_2)$ with

$$\nu = \frac{(\sigma_1^2/SSX_1 + \sigma_2^2/SSX_2)^2}{(\sigma_1^2/SSX_1)^2/(N_1 - 2) + (\sigma_2^2/SSX_2)^2/(N_2 - 2)}.$$

Consequently, the unknown parameters $\sigma_1^2$ and $\sigma_2^2$ in $\nu$ are replaced by $\hat{\sigma}_1^2$ and $\hat{\sigma}_2^2$ for the practical purpose of hypothesis testing. Hence, it leads to the modified approximation with random number of degrees of freedom:

$$V \mathrel{\dot\sim} t(\hat{\nu}), \quad \hat{\nu} = \nu(\hat{\sigma}_1^2/SSX_1, \hat{\sigma}_2^2/SSX_2). \tag{4}$$

The null hypothesis is rejected at the significance level $\alpha$ if

$$|V| > t_{\hat{\nu}, \alpha/2}, \tag{5}$$

where $t_{\hat{\nu}, \alpha/2}$ is the $100(1 - \alpha/2)$ percentile of the t distribution $t(\hat{\nu})$ with degrees of freedom $\hat{\nu}$. Several studies have shown that Welch's approximate degrees of freedom approach offers a reasonably accurate solution to the Behrens-Fisher problem; for example, see Best and Rayner (1987), Davenport and Webster (1975), Nel, van der Merwe, and Moser (1990), Scariano and Davenport (1986), and Scheffé (1970). Because the prescribed Welch's $V$ for comparing regression slope coefficients is a natural adaptation of Welch's original approximate t test for equality of two normal means, the $V$ test (Equation 5) should possess the same advantage of accurate control of the magnitudes of size and power. Nonetheless, it can be demonstrated that the general Welch-Aspin $F^*$ test for comparing regression slope equality presented in DeShon and Alexander (1996) reduces to the $F^*$ method in Dretzke et al. (1982) under the two-group circumstance. Moreover, the $F^*$ of Dretzke et al. is actually the square of the $V$ given in Equation 3, $F^* = V^2$. Consequently, $F^*$ is referred to the F distribution with degrees of freedom 1 and $\hat{\nu}$, according to the notation used here. Notably, the accurate performance of the Welch procedure for comparing the slopes of two regression lines has been demonstrated in DeShon and Alexander (1996), Dretzke et al., and Overton (2001). Also, a related $Z$ procedure proposed in Welch (1938) is presented in Appendix B for the sake of completeness and convenient reference.
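As an informal illustration of the Welch-type slope comparison described above, the following Python sketch computes the statistic $V$ (estimated slope difference divided by the square root of the summed estimated slope variances) and the estimated approximate degrees of freedom from two groups of raw data. It is not the SAS or R program of the article; the function names `slope_summary` and `welch_slope_test` and the toy data are assumptions made only for this example.

```python
import math

def slope_summary(x, y):
    """Return (slope, SSX, SSE) for a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / ssx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, ssx, sse

def welch_slope_test(x1, y1, x2, y2):
    """Welch-type statistic V and estimated df for H0: equal slopes in two groups."""
    n1, n2 = len(x1), len(x2)
    b1, ssx1, sse1 = slope_summary(x1, y1)
    b2, ssx2, sse2 = slope_summary(x2, y2)
    a = (sse1 / (n1 - 2)) / ssx1  # estimated variance of the group 1 slope
    b = (sse2 / (n2 - 2)) / ssx2  # estimated variance of the group 2 slope
    v = (b1 - b2) / math.sqrt(a + b)
    nu_hat = (a + b) ** 2 / (a ** 2 / (n1 - 2) + b ** 2 / (n2 - 2))
    return v, nu_hat

# Toy data (hypothetical, for illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y1 = [1.1, 2.3, 2.8, 4.2, 4.9, 6.3]
x2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y2 = [0.5, 3.1, 1.9, 5.6, 3.8, 7.9, 5.2, 9.5]

v, nu_hat = welch_slope_test(x1, y1, x2, y2)
print(f"V = {v:.4f}, estimated df = {nu_hat:.2f}")
```

The statistic would then be compared with the upper $\alpha/2$ percentile of a t distribution with `nu_hat` degrees of freedom, which any statistical package can supply.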
WLS

Under the notion of heterogeneous error variance $\sigma_1^2 \neq \sigma_2^2$, Overton (2001) considered a WLS analysis of the problem using the multiple regression model in Equation 2. In general, the properties of WLS estimators differ from those of OLS estimators; the WLS approach is employed to correct for heteroscedasticity, typically when the error variance changes as a function of the covariates. For example, see Kutner, Nachtsheim, Neter, and Li (2005, sections 11.1 and 18.4). Moreover, WLS is a special case of generalized least squares, in which the error terms not only may have different variances but may also be correlated in pairs. However, the situation of heterogeneity of group error variances considered here requires only a single weight for each group. As reported in Overton (2001, p. 222), the WLS-based and OLS-based analyses of the moderated multiple regression model (Equation 2) yield identical coefficient estimators for $\beta_{20}$, $\beta_{21}$, $\delta_0$, and $\delta_1$, but differ in their respective variance estimators. However, Overton did not provide the specific analytic formulations for the WLS estimator $\hat{\delta}_{W1}$ and the corresponding estimated variance $\hat{V}(\hat{\delta}_{W1})$. More importantly, the exact formulation of the test procedure was not given; instead, only numerical results were presented in Overton. Although empirical investigations are useful in assessing the properties of the competing methods, it is of pedagogical interest to see how different the WLS estimator is from other procedures that have been used in many applications. As expected, the WLS test statistic $T_W$ for the test of $H_0$: $\delta_1 = 0$ can be expressed as

$$T_W = \frac{\hat{\delta}_{W1}}{\{\hat{V}(\hat{\delta}_{W1})\}^{1/2}}.$$

Even though it appears to be correctly specified, the general form does not provide much information about the theoretical implications of $T_W$. It follows from the matrix representation and manipulation, in Equation C7 of Appendix C, that the WLS estimator of $\delta_1$ is $\hat{\delta}_{W1} = \hat{\beta}_{11} - \hat{\beta}_{21}$, regardless of the selection of relative weights. However, the estimated variance $\hat{V}(\hat{\delta}_{W1})$ varies with the designated weights.
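The invariance of the WLS interaction estimate to the choice of group weights can be checked numerically. The sketch below fits the four-coefficient moderated model (intercept, predictor, group indicator, and their product) by weighted least squares with one weight per group, using a small Gaussian-elimination solver so that only the Python standard library is needed. The helper names (`solve`, `wls_mmr`, `ols_slope`), the weights, and the data are hypothetical choices for this illustration and are not taken from Overton (2001) or from the article's appendices.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def wls_mmr(x, y, z, w):
    """WLS fit of y on (1, x, z, x*z) with observation weights w; returns 4 coefficients."""
    rows = [[1.0, xi, zi, xi * zi] for xi, zi in zip(x, z)]
    xtwx = [[sum(wi * r[j] * r[k] for wi, r in zip(w, rows)) for k in range(4)]
            for j in range(4)]
    xtwy = [sum(wi * r[j] * yi for wi, r, yi in zip(w, rows, y)) for j in range(4)]
    return solve(xtwx, xtwy)

def ols_slope(x, y):
    """Slope of the simple linear regression of y on x."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    return (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))

# Hypothetical data: group 1 has Z = 1, group 2 has Z = 0.
x1, y1 = [1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1]
x2, y2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
x, y = x1 + x2, y1 + y2
z = [1.0] * len(x1) + [0.0] * len(x2)

def group_weights(w1, w2):
    return [w1] * len(x1) + [w2] * len(x2)

fit_equal = wls_mmr(x, y, z, group_weights(1.0, 1.0))    # equal weights, i.e. OLS
fit_unequal = wls_mmr(x, y, z, group_weights(5.0, 0.2))  # arbitrary unequal weights
slope_diff = ols_slope(x1, y1) - ols_slope(x2, y2)

print(fit_equal[3], fit_unequal[3], slope_diff)
```

All three printed values agree up to rounding error: with a single weight per group, the weighted fit decouples into the two subgroup regressions, so only the estimated variance of the interaction coefficient depends on the weights.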
In particular, the WLS procedure of Overton (2001) employs a particular set of group weights. With these weights, it follows from Equation C14 of Appendix C that the WLS test statistic for $H_0$: $\delta_1 = 0$ is

$$T = \frac{\hat{\beta}_{11} - \hat{\beta}_{21}}{\hat{V}^{1/2}}, \tag{6}$$

where $\hat{V}$ is defined in Equation C13. Also, $T$ has the approximate t distribution

$$T \mathrel{\dot\sim} t(df_w), \tag{7}$$

where $df_w = N - 4$. The null hypothesis is rejected at the significance level $\alpha$ if $|T| > t_{df_w, \alpha/2}$. Accordingly, the WLS procedure of Overton (2001) is the square of $T$, or $F$ ($= T^2$) in our notation. An alternative modified WLS (MWLS) method is also examined in Overton (2001). Equations C9, C10, and C11 of Appendix C provide details about the derivation of the MWLS $T_m$ method. It is important to emphasize that WLS estimators of $\delta_1 = \beta_{11} - \beta_{21}$ are identical for any proper selection of weights; however, the estimated variances are generally different, as are the variance estimates $\hat{V}_m$ and $\hat{V}$ in Equations C10 and C13 of the two WLS-based test statistics $T_m$ and $T$, respectively. The only exception occurs in the special circumstance of balanced group sizes $N_1 = N_2$, in which $\hat{V}_m = \hat{V}$ and thus $T_m = T$, with the same reference t distribution $t(df_w)$. Although this distinguishing property between the two WLS criteria was not mentioned in Overton, the phenomenon was already visible in the simulation results of Type I error rate and power reported for the two tests in Tables 1 and 4 of Overton, respectively, for the balanced group size $N_1 = N_2 = 50$. However, Overton concluded that the MWLS $F_m = T_m^2$ test does not perform as well as the WLS test $F = T^2$, according to his extensive Monte Carlo investigations across a wide range of model configurations. In contrast to the WLS analysis, the OLS regression analysis of the model in Equation 2 is straightforward. As shown in Equation C15 of Appendix C, the OLS test statistic for $H_0$: $\delta_1 = 0$ is

$$T_O = \frac{\hat{\beta}_{11} - \hat{\beta}_{21}}{\{\hat{\sigma}_O^2 (1/SSX_1 + 1/SSX_2)\}^{1/2}},$$

where $\hat{\sigma}_O^2 = SSE_O/(N - 4)$ and $SSE_O = SSE_1 + SSE_2$. The null hypothesis is rejected at the significance level $\alpha$ if $|T_O| > t_{N-4, \alpha/2}$.
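To make the contrast between the pooled OLS statistic and a separate-variance statistic concrete, the following sketch evaluates both from subgroup summary statistics. The pooled form divides the slope difference by the square root of the pooled error variance times the sum of reciprocal predictor sums of squares, whereas the Welch-type form uses each group's own error variance estimate. The function name `pooled_and_welch` and all numeric inputs are hypothetical values chosen for illustration.

```python
import math

def pooled_and_welch(b1, b2, ssx1, ssx2, sse1, sse2, n1, n2):
    """Return (T_O, V): pooled-variance and separate-variance slope-difference statistics."""
    diff = b1 - b2
    s2_pooled = (sse1 + sse2) / (n1 + n2 - 4)       # pooled error variance estimate
    t_ols = diff / math.sqrt(s2_pooled * (1.0 / ssx1 + 1.0 / ssx2))
    s2_1 = sse1 / (n1 - 2)                          # group 1 error variance estimate
    s2_2 = sse2 / (n2 - 2)                          # group 2 error variance estimate
    v = diff / math.sqrt(s2_1 / ssx1 + s2_2 / ssx2)
    return t_ols, v

# Case 1 (hypothetical): both groups have estimated error variance 1, so the two
# denominators coincide and the statistics are equal.
t_equal, v_equal = pooled_and_welch(1.5, 0.5, 19.0, 59.0, 18.0, 58.0, 20, 60)

# Case 2 (hypothetical): the smaller group has the larger error variance. Pooling
# understates the variance of the slope difference, so |T_O| exceeds |V|.
t_het, v_het = pooled_and_welch(1.5, 0.5, 11.0, 59.0, 160.0, 58.0, 12, 60)

print(t_equal, v_equal)
print(t_het, v_het)
```

The second case mirrors the kind of configuration (unequal group sizes with the larger variance in the smaller group) under which the pooled test is known to reject too often.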
It is well known that the pattern of test $T_O$ parallels exactly what is known about the usual two-sample pooled-variance t test under homogeneous group error variances. As evidenced in the simulation studies of Overton (2001), the test $T_O$ is outperformed by the WLS-based tests $T_m$ and $T$ and Welch's $V$ test under the condition of heterogeneous group error variance.

Distinction Between Welch and WLS Procedures

It is notable from the presentations in the two preceding subsections that Welch's procedures and the WLS methods are developed from different perspectives. It is not particularly surprising that two distinct principles lead to substantially different formulations and properties for the developed methods. However, there are important similarities and differences between the methodological formulations of Welch's $V$ and the WLS-based $T_m$, and between Welch's $Z$ and the WLS-based $T$. The following discussion derives the relevant results to show the closer relation of these renowned tests. First, it follows from the resulting expressions (Equations 3 and C11) that, in fact, the two statistics of Welch's $V$ and the WLS-based $T_m$ are identical. Therefore, it is worthwhile to note that Welch's $V$ statistic implicitly accommodates the notion of WLS. However, the respective approximate t distributions of the two tests are different. Actually, it can be shown that this phenomenon also exists in the framework of the Behrens-Fisher problem for comparing two normal means under the heterogeneous variance assumption; specifically, the corresponding degrees of freedom for the reference t distributions of $V$ and $T_m$ in Equations 4 and C11 are $\hat{\nu}$ and $df_w = N - 4$, respectively. Note that the degrees of freedom $\hat{\nu}$ of $V$ is bounded between the minimum of $(N_1 - 2, N_2 - 2)$ and $N - 4$. Therefore, the associated critical values $t_{\hat{\nu}, \alpha/2}$ and $t_{df_w, \alpha/2}$ are ordered as $t_{df_w, \alpha/2} \leq t_{\hat{\nu}, \alpha/2}$. See Ghosh (1973) for the monotonicity properties of the family of t distributions.
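The stated bounds on the approximate degrees of freedom can be verified directly. The sketch below evaluates the Welch-Satterthwaite-type expression, the squared sum of the two slope-variance terms divided by the sum of their squares over $N_1 - 2$ and $N_2 - 2$, across a grid of variance ratios for the three sample size pairs with harmonic mean 20, and checks that it never leaves the interval from $\min(N_1 - 2, N_2 - 2)$ to $N - 4$. The function name `welch_df` and the grid of ratios are assumptions for this illustration.

```python
def welch_df(a, b, n1, n2):
    """Approximate df given slope-variance terms a and b for groups of size n1 and n2."""
    return (a + b) ** 2 / (a ** 2 / (n1 - 2) + b ** 2 / (n2 - 2))

designs = [(20, 20), (15, 30), (12, 60)]            # sample sizes with harmonic mean 20
ratios = [2.0 ** k for k in range(-10, 11)]         # a/b from 1/1024 to 1024

results = []
for n1, n2 in designs:
    lower = min(n1 - 2, n2 - 2)
    upper = n1 + n2 - 4
    values = [welch_df(r, 1.0, n1, n2) for r in ratios]
    results.append((n1, n2, lower, upper, min(values), max(values)))
    print(f"N1={n1}, N2={n2}: df ranges over [{min(values):.2f}, {max(values):.2f}], "
          f"bounds [{lower}, {upper}]")
```

The upper bound $N - 4$ is attained when the two variance terms are proportional to their degrees of freedom, which is why the Welch critical value can equal, but never fall below, the critical value based on $df_w = N - 4$.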
Hence, the observed significance level or p value of Welch's $V$ test is always greater than or equal to that of the WLS $T_m$ test; in other words, the WLS test is more liberal than the Welch approach, in the sense that the WLS test tends to reject the null hypothesis more often than the Welch method does. Correspondingly, the achieved significance level and power of the Welch method never exceed those of the WLS procedure. Interestingly, these characteristics of the Welch and WLS approaches were not addressed in Overton (2001), yet the simulation studies in Tables 1-5 of Overton exemplified these features between the two approaches under the notations of $F^*$ and MWLS for $V$ and $T_m$, respectively. Additionally, the prescribed Welch's $Z$ test of Equation B2 and the WLS-based $T$ procedure in Equation 7 are entangled in formulation and approximate distribution. In view of the distinctive form of the estimated variance $\hat{V}$ given in Equation C13, the statistic $T$ of Equations 6 and C14 is related to the $Z$ statistic in Equation B1 through $dT = Z$, where $d = \{(N - 8)/(N - 4)\}^{1/2}$. Hence, the approximate distribution $T \mathrel{\dot\sim} t(df_w)$ defined in Equation 7 can be rewritten as

$$Z \mathrel{\dot\sim} d \cdot t(df_w).$$

In contrast, Welch's approximation for $Z$ is $Z \mathrel{\dot\sim} c \cdot t(f^*)$, as presented in Equation B2. Consequently, Welch's $Z$ approach and the specific transformation $dT$ of the WLS-based $T$ method belong to a family of (approximate) distributions, each of which is approximated by $k \cdot t(f)$, where, for $Z$, $k = c$ and $f = f^*$, and for $dT$, $k = d$ and $f = df_w$. Note that the approximate $c \cdot t(f^*)$ distribution of Welch's $Z$ is optimized in terms of the scalar multiplier $c$ and degrees of freedom $f^*$. Therefore, the approximate distribution $d \cdot t(df_w)$ for the transformed statistic $dT$, or $Z$, does not possess the optimality property and is less adequate under the moment-matching criterion of Welch (1938). Later, the differences between the WLS $T$ and Welch's $V$ methods will be further examined and reinforced in the simulation study.
On the basis of the present results, the WLS-based $T$ and Welch's $V$ methods emerge as the prominent and representative test procedures of the two distinct WLS (error variance heterogeneity neutralization) and Welch (approximate degrees of freedom) methodologies. To further clarify the similarities and differences between the competing $T$ and $V$ methods, simulation investigations are performed to examine their numerical performance. For ease of exposition, the two test procedures will be referred to in the remainder of this article as the WLS and Welch approaches, respectively. It was concluded in Overton (2001) that the WLS method does not control the Type I error rate as well as the Welch procedure and other formulas. On the other hand, the WLS test is virtually identical to the Welch procedure in its ability to detect a true interaction effect. However, the conditions in which the Welch and WLS procedures differ most were not identified. Hence, we performed an extensive replication of Overton's simulation study to reevaluate the empirical Type I error rate for the detection of interaction effects. Throughout the numerical study, the nominal Type I error rate is set as .05. The estimates of the true Type I error rate associated with given sample size and model configurations are computed through Monte Carlo simulation of 10,000 independent data sets. For each replicate, $N_1$ and $N_2$ values of predictors $X_1$ and $X_2$ are generated from the designated independent normal distributions $N(0, \sigma_{X1}^2)$ and $N(0, \sigma_{X2}^2)$, respectively. These values in turn determine the respective mean responses for generating $N_1$ and $N_2$ values of normal outcomes $Y_1$ and $Y_2$ for the two underlying regression models with error variances $\sigma_1^2$ and $\sigma_2^2$, as defined in Equation 1. Then the two test statistics $T$ and $V$ are computed.
Accordingly, the simulated Type I error rate is the proportion of the 10,000 replicates whose values of $|T|$ and $|V|$ exceed the critical values $t_{df_w, .025}$ and $t_{\hat{\nu}, .025}$ for the WLS and Welch procedures, respectively. As in Overton (2001, p. 223), the three harmonic group sample size means are 20, 50, and 100, and the ratio of sample sizes varies from 1:1 to 1:2 to 1:5. The resulting sample size $(N_1, N_2)$ combinations are (20, 20), (15, 30), and (12, 60) for the harmonic mean of 20; (50, 50), (38, 75), and (30, 150) for the harmonic mean of 50; and (100, 100), (75, 150), and (60, 300) for the harmonic mean of 100. For each of the three harmonic means of 20, 50, and 100, a total of 88 model settings are summarized in three tables, according to the combined configurations of sample size allocation ($N_1$ and $N_2$), predictor standard deviation (SD) ratio ($\sigma_{X1}/\sigma_{X2}$), and error SD ratio ($\sigma_1/\sigma_2$). Overall, the numerical results are summarized in a total of nine tables. Although the results fit in with the general conclusions of Overton, some important and distinctive situations could still be of practical interest, in the sense that the two contending procedures have the more obvious potential of yielding different conclusions. Space limitations preclude reporting results for all situations. To exemplify the critical and subtle discrepancy between the two approaches, only the tables in which the harmonic mean sample size is 20 are presented. Tables 1-3 contain the simulated Type I error rates for sample sizes $(N_1, N_2)$ = (20, 20), (15, 30), and (12, 60), respectively. The full set of simulation results is available upon request. It was noted in Overton (2001) that the Type I error rates of the WLS test ranged from .044 to .060, and 91% of the 264 condition error rates were in the narrow .045 to .055 range.
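The Monte Carlo design described above can be sketched in a few lines of standard-library Python. The example below estimates the Type I error rate of the Welch-type slope test for one hypothetical configuration, $(N_1, N_2) = (12, 60)$ with equal predictor SDs and an error SD ratio of 4/1. Several details are assumptions made for this sketch rather than settings taken from the article: the common slope of 1 and intercept of 0, the seed, the use of 2,000 rather than 10,000 replicates, and the p value computed by Simpson integration of the t density in place of a library routine.

```python
import math
import random

def t_two_sided_p(t_value, df, steps=400):
    """Two-sided p value of a t statistic via Simpson integration of the t density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    upper = abs(t_value)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    central = total * h / 3  # approximates P(0 < t < |t_value|)
    return max(0.0, 1.0 - 2.0 * central)

def slope_stats(x, y):
    """Return (slope, SSX, SSE) for a simple linear regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ssx
    sse = sum((yi - y_bar - slope * (xi - x_bar)) ** 2 for xi, yi in zip(x, y))
    return slope, ssx, sse

def welch_rejects(n1, n2, sd_x1, sd_x2, sd_e1, sd_e2, rng, alpha=0.05):
    """Generate one data set under H0 and report whether the Welch-type test rejects."""
    x1 = [rng.gauss(0.0, sd_x1) for _ in range(n1)]
    x2 = [rng.gauss(0.0, sd_x2) for _ in range(n2)]
    # H0 is true: both groups share slope 1 and intercept 0 (illustrative values).
    y1 = [xi + rng.gauss(0.0, sd_e1) for xi in x1]
    y2 = [xi + rng.gauss(0.0, sd_e2) for xi in x2]
    b1, ssx1, sse1 = slope_stats(x1, y1)
    b2, ssx2, sse2 = slope_stats(x2, y2)
    a = sse1 / (n1 - 2) / ssx1
    b = sse2 / (n2 - 2) / ssx2
    v = (b1 - b2) / math.sqrt(a + b)
    nu_hat = (a + b) ** 2 / (a ** 2 / (n1 - 2) + b ** 2 / (n2 - 2))
    return t_two_sided_p(v, nu_hat) < alpha

def type1_rate(n1, n2, sd_x1, sd_x2, sd_e1, sd_e2, reps=2000, seed=2009):
    """Proportion of replicates in which the true null hypothesis is rejected."""
    rng = random.Random(seed)
    hits = sum(welch_rejects(n1, n2, sd_x1, sd_x2, sd_e1, sd_e2, rng)
               for _ in range(reps))
    return hits / reps

# Harmonic mean of the two group sizes: 2 * 12 * 60 / (12 + 60) = 20.
rate = type1_rate(12, 60, 1.0, 1.0, 4.0, 1.0)
print(f"estimated Type I error rate: {rate:.4f}")
```

With 2,000 replicates the Monte Carlo standard error near a rate of .05 is roughly .005, so this sketch is too coarse to reproduce the four-decimal entries of the tables; it only illustrates the mechanics of the simulation.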
According to our simulations, the simulated Type I error rates of WLS had a similar range of .0444 to .0606, where the value .0606 is associated with Condition 31 in Table 2. Nonetheless, only 85.23% (225 cases) of the 264 conditions were in the interval of .045 to .055. In particular, there are 8, 13, and 7 cases in Tables 1-3, respectively, with a large deviation outside the prescribed range. Hence, the corresponding percentage of the simulated Type I error rates that is within the range of .045 to .055 in Tables 1-3 is as low as 68.18% (60 out of the 88 conditions). Moreover, for the 28 cases outside the range of .045 to .055, only a single value, .0445 (Condition 15 in Table 3), fell below .045. It appears that the WLS method tends to give less precise and positively biased Type I error rates for comparatively small sample sizes. On the contrary, the Welch procedure yielded a range of .0447 to .0555 with 98.49% (260/264) between .045 and .055. Incidentally, it is noteworthy from Tables 1-3 that Condition 35 in Table 2 is the only case of the Welch test to incur a large deviation (.0051) not within the bound of .005.

Table 1
Simulated Type I Error Rates of WLS and Welch Procedures When Sample Sizes N1 = 20 and N2 = 20 (α = .05)

                                                 Difference   Difference
Condition   σX1/σX2   σ1/σ2   WLS     Welch     WLS - .05    Welch - .05
 1          1/1       1/1     .0532   .0526      .0032        .0026
 2          1/1       1/2     .0528   .0513      .0028        .0013
 3          1/1       1/4     .0523   .0465      .0023       -.0035
 4          1/1       1/8     .0552   .0485      .0052       -.0015
 5          1/2       1/1     .0491   .0466     -.0009       -.0034
 6          1/2       1/2     .0500   .0491      .0000       -.0009
 7          1/2       1/4     .0513   .0482      .0013       -.0018
 8          1/2       1/8     .0542   .0494      .0042       -.0006
 9          1/2       2/1     .0561   .0504      .0061        .0004
10          1/2       4/1     .0570   .0489      .0070       -.0011
11          1/2       8/1     .0565   .0492      .0065       -.0008
12          1/4       1/1     .0574   .0511      .0074        .0011
13          1/4       1/2     .0529   .0504      .0029        .0004
14          1/4       1/4     .0531   .0529      .0031        .0029
15          1/4       1/8     .0505   .0482      .0005       -.0018
16          1/4       2/1     .0580   .0502      .0080        .0002
17          1/4       4/1     .0591   .0519      .0091        .0019
18          1/4       8/1     .0589   .0495      .0089       -.0005
The remarkably accurate results of the Welch procedure presented here are consistent with the findings in Overton (2001, p. 224). Hence, it is clear that the Welch procedure has the important advantage over the WLS test of accurate control of the Type I error rate, especially for small samples. Moreover, since the considered distributions in Equations 7 and 4 for the WLS and Welch tests are approximations, care needs to be taken in interpreting the implications of their results for the simulated power reported by Overton. It is important to note that the magnitude of the empirical power of both procedures depends mostly on the accuracy of their respective approximate critical values. Therefore, it should be taken into consideration that the simulated power of the WLS test may be attained at the cost of an inflated or unstable Type I error rate in the same conditions. From the practical standpoint of providing a generally useful and versatile solution that is not confined to particular settings, the failure to give an accurate Type I error rate is one obvious limitation of the WLS test. Although not all study designs are planned with moderate or small sample sizes, it is understandable that some intrinsically novel or specialized research can accommodate larger numbers of participants only with difficulty. In this respect, it is essential for researchers to have a reliable procedure for detecting moderating effects over all sample size situations one might encounter in applied work.

Table 3 (WLS and Welch columns)
Simulated Type I Error Rates of WLS and Welch Procedures When Sample Sizes $N_1$ = 12 and $N_2$ = 60 ($\alpha$ = .05)

Condition   WLS     Welch
 1          .0528   .0529
 2          .0486   .0512
 3          .0468   .0486
 4          .0505   .0498
 5          .0564   .0519
 6          .0542   .0504
 7          .0566   .0504
 8          .0517   .0480
 9          .0514   .0515
10          .0474   .0518
11          .0488   .0505
12          .0543   .0488
13          .0532   .0482
14          .0561   .0500
15          .0445   .0484
16          .0526   .0548
17          .0506   .0496
18          .0493   .0466
19          .0510   .0512
20          .0523   .0483
21          .0566   .0512
22          .0589   .0540
23          .0548   .0511
24          .0483   .0492
25          .0452   .0471
26          .0524   .0481
27          .0544   .0488
28          .0585   .0528
29          .0459   .0487
30          .0515   .0508
31          .0508   .0489
32          .0550   .0523
33          .0453   .0493
34          .0531   .0534
35          .0549   .0511
The soundness of Welch's approach and its primitive form has received critical acclaim from numerous researchers. The newly recognized WLS trait adds to the remarkable versatility of Welch's methodology. More importantly, the overall adequate performance in Type I error rates for detecting interaction effects fortifies the distinct advantage of the Welch procedure over the WLS method. As suggested by a referee, we also investigated the behavior of the Welch procedure under imperfect conditions. Accordingly, Monte Carlo simulations were performed for the MMR analysis with two types of nonnormal errors, namely, gamma and uniform distributions. Since this issue is not the primary focus of the present article, the details are not given here. However, the Welch procedure appears acceptably robust against mild departures from the normal error assumption. Finally, the computational aspect of assessing moderating effects in the context of an MMR example is described in the next section.

In addition to the statistical performance in Type I error rate and power, Overton (2001) emphasized the practical importance of the computational requirements and program availability of the competing methods. It was shown in Overton that the WLS-based moderated multiple regression analysis can be readily conducted with standard SAS procedures. For the practical purpose of expediting the application of the suggested Welch method, the corresponding SAS (SAS Institute, 2008) and R (R Development Core Team, 2008) computing algorithms for performing the Welch procedure were developed. In the process, we also show numerical evidence below that the procedures for testing moderating effects can have an important impact on the results, conclusions, and, ultimately, the theory that involves moderated relationships.
To facilitate the following illustration in the context of MMR analysis, suppose that the aim of the numerical study is to determine the changes, as a function of gender group membership ($Z$), in the relationship between employees' job performance ($Y$) and preemployment test score ($X$). As Aguinis and Pierce (1998) pointed out, one of the most typical situations in organizational studies is when the group with the larger sample size is associated with the larger error variance. To exemplify this circumstance of practical importance, the two groups of values ($Y$, $X$) listed in Table 4 represent random samples generated from the underlying model configurations with $(\sigma_{X1}, \sigma_{X2})$ = (1, 4), $(\sigma_1, \sigma_2)$ = (1, 8), and $(\beta_{11}, \beta_{21})$ = (0.15, 1.4223) for group sample sizes $(N_1, N_2)$ = (12, 60). Accordingly, the value of the Welch test statistic defined in Equation 3 is $V$ = 2.0740 with degrees of freedom 24.7708 and p value .0486, whereas the WLS method yields $T$ = 1.9786, $df_w$ = 68, and p value .0519 for this data set. Although the Welch and WLS test statistics differ only slightly, their different distributional approximations lead to opposite conclusions at the .05 significance level. Moreover, the OLS test results in $T_O$ = 0.3812, with p value .7043. Thus, the extraordinarily large p value of OLS gives an apparently disparate and implausible outcome, compared with the Welch procedure. This is not a particularly surprising result, however, in view of the existing numerical findings that the ordinary F test is excessively conservative when the group with the larger sample size is associated with the larger error variance. In general, the established notion of the Welch procedure in adjusting the degrees of freedom for the approximate distribution has resulted in wide acceptance in the literature.
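The Welch computation reported in this example can be sketched in a few lines of Python. The sketch mirrors the quantities computed in the Welch step of the SAS program in Appendix D (the sum of squared slope standard errors, the group 1 share of that sum, and the approximate degrees of freedom). The function names `slope_fit`, `t_two_sided_p`, and `welch_slope_test` are illustrative, and the p value is obtained by numerical integration of the t density as a stand-in for the SAS `PROBT` function, which is an assumption of this sketch rather than the article's method.

```python
import math


def slope_fit(x, y):
    """Simple-regression slope, its squared standard error, and residual df."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    return slope, (sse / df) / sxx, df


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p value of a t statistic (Simpson's rule; steps must be even)."""
    t = abs(t)
    if t == 0.0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def dens(u):
        # Density of the t distribution with (possibly fractional) df.
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = t / steps
    total = dens(0.0) + dens(t)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * dens(k * h)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def welch_slope_test(x1, y1, x2, y2):
    """Welch-type test of equal slopes: returns (statistic, df, p value)."""
    b1, v1, df1 = slope_fit(x1, y1)
    b2, v2, df2 = slope_fit(x2, y2)
    var = v1 + v2                      # sum of squared slope standard errors
    stat = (b1 - b2) / math.sqrt(var)
    share = v1 / var                   # group 1 share of the variance
    df = 1.0 / (share ** 2 / df1 + (1 - share) ** 2 / df2)
    return stat, df, t_two_sided_p(stat, df)


if __name__ == "__main__":
    # Tiny made-up data set, used only to exercise the functions.
    x = [0, 1, 2, 3]
    print(welch_slope_test(x, [0, 1, 1, 2], x, [1, 0, 2, 1]))
```

Applied to the 12 and 60 observations listed in Appendix D, a computation of this kind should agree with the reported statistic, degrees of freedom, and p value up to rounding and the sign convention of the slope difference.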
The SAS and R programs for the implementation of the Welch procedure presented in Appendixes D and E are available to interested researchers upon request. The statements that users need to change are marked in the programs, and only a slight modification is required to accommodate their own data specifications.

Conclusions

Several previous investigations have shown that the standard MMR F test is not robust to heterogeneity of error variance when evaluating regression slope differences across groups. In addition to the existing attempts, the WLS procedure of Overton (2001) provides an attractive alternative for mitigating the impact of violating the assumption of homogeneous error variance on tests of the interaction between a dichotomous moderator variable and a continuous predictor variable. For pedagogic reasons, one must have a thorough understanding of the fundamental details of the methodology, and of how the technique improves upon existing approaches for MMR analysis, before the theoretical idea can be considered appropriate for sound application. This article elucidates the similarities and differences between the Welch and WLS methods through rigorous analytical presentations and numerical assessments. In particular, it shows that the Welch statistics have exactly the same or similar expressions as the WLS-based MMR statistics. Therefore, from the methodological point of view, Welch's procedure employs the same tactic as the WLS method for tackling the problem of error variance heterogeneity; in other words, the prevailing Welch approach implicitly possesses the same appealing property that distinguishes the WLS method from other available techniques. However, the resulting testing procedures differ in their adjustments of the degrees of freedom for the respective distributional approximations.
Furthermore, the primitive Welch approaches for comparing the means of two or more groups under heterogeneity of variance have been widely discussed in standard texts of statistical methods in psychology and business (see, e.g., Berenson, Krehbiel, & Levine, 2006; Howell, 2007). Although there is some concern about the application of the Welch procedure for four or more groups (DeShon & Alexander, 1996), this research has been confined to the case of two groups. With the underlying WLS characteristic, accurate performance, and computational ease, it is prudent to recommend the extended Welch's approximate t procedure for detecting the moderating effects of dichotomous moderators in MMR.

G.S. thanks the two anonymous referees for their valuable comments on earlier drafts of the article. This research was partially supported by the National Science Council. Correspondence concerning this article should be addressed to G. Shieh, Department of Management Science, National Chiao Tung University, Hsinchu, Taiwan 30050 (e-mail: gwshieh@mail.nctu.edu.tw).

Aguinis, H. (2004). Regression analysis for categorical moderators. New York: Guilford.
Aguinis, H., Petersen, S. A., & Pierce, C. A. (1999). Appraisal of the homogeneity of error variance assumption and alternatives for multiple regression for estimating moderated effects of categorical variables. Organizational Research Methods, 2, 315-339.
Aguinis, H., & Pierce, C. A. (1998). Heterogeneity of error variance and the assessment of moderating effects of categorical variables: A conceptual review. Organizational Research Methods, 1, 296-314.
Aguinis, H., & Stone-Romero, E. F. (1997). Methodological artifacts in moderated multiple regression and their effects on statistical power. Journal of Applied Psychology, 82, 192-206.
SUPPLEMENTAL MATERIALS

Descriptions of the analytical procedures used and SAS and R code to implement them may be downloaded as supplemental materials for this article from brm.psychonomic-journals.org/content/supplemental.

APPENDIX A
The Subgroup OLS Analysis

Let $Y_1$ and $Y_2$ be the $N_1 \times 1$ and $N_2 \times 1$ column vectors of the $Y_{1i}$s and $Y_{2j}$s. Denote $X_{D1} = [1_{N_1}, X_1]$ and $X_{D2} = [1_{N_2}, X_2]$, where $1_{N_1}$ and $1_{N_2}$ are $N_1 \times 1$ and $N_2 \times 1$ column vectors of 1s, and $X_1$ and $X_2$ are the $N_1 \times 1$ and $N_2 \times 1$ column vectors of the $X_{1i}$s and $X_{2j}$s. Then, the matrix formulations of the two models in Equation 1 are $Y_1 = X_{D1}\beta_1 + \varepsilon_1$ and $Y_2 = X_{D2}\beta_2 + \varepsilon_2$, respectively, where $\beta_1 = [\beta_{10}, \beta_{11}]^T$ and $\beta_2 = [\beta_{20}, \beta_{21}]^T$. Consequently, the separate OLS estimators are

$$\hat{\beta}_i = (X_{Di}^T X_{Di})^{-1} X_{Di}^T Y_i, \quad i = 1, 2. \tag{A1}$$

Also, the variance and covariance matrices are

$$\mathrm{Var}(\hat{\beta}_i) = \sigma_i^2 (X_{Di}^T X_{Di})^{-1}, \quad i = 1, 2.$$

APPENDIX B
Welch's Z Procedure

One of the alternative methods suggested in Welch (1938, p. 360) for the Behrens–Fisher problem is the Z statistic (see Fenstad, 1983; Best & Rayner, 1987; and Paul, Best, & Rayner, 1992, for further details). Although it was not explicitly stated, the notion of the Z test for use in the Behrens–Fisher situation can be easily generalized for the comparison of regression coefficients as well. Following Welch's (1938) derivation, and that of Paul et al. (1992), the extended Z statistic has the form

[Equation B1]

It follows from Welch (1938) that, under the null hypothesis $H_0$: $\beta_{11} = \beta_{21}$, the distribution of Z can be approximated by a scalar multiple of a t distribution,

[Equation B2]

and $t(f^*)$ is a t distribution with degrees of freedom

[display equation]

Although the Welch statistics $V$ in Equation 3 and $Z$ just described in Equation B1 are equally efficient in terms of asymptotic relative efficiency, it is important to note that the finite sample comparisons of Best and Rayner (1987) and Paul et al. (1992) recommended the statistic $V$ over the statistic $Z$ for testing the hypothesis of equality of two normal means.
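In the case of comparison of regression coefficients, it is conceivable that the test procedure $V$ in Equation 4 should possess the same advantage over the $Z$ method in Equation B2.

APPENDIX C
The WLS Procedures

Consider the matrix formulation of the model in Equation 2,

$$Y = X_D \beta + \varepsilon, \tag{C1}$$

where $\beta = [\beta_{20}, \beta_{21}, \delta_0, \delta_1]^T$ is the $4 \times 1$ column vector of regression coefficients; $Y$ and $\varepsilon$ are the respective column vectors of $Y_k$ and $\varepsilon_k$, $k = 1, \ldots, N$; and $X_D = [1_N, X, Z, ZX]$ is the $N \times 4$ design matrix with $1_N$, $X$, $Z$, and $ZX$, which are the column vectors of all 1s, $X_k$s, $Z_k$s, and $Z_k X_k$s, respectively. Under the variance heterogeneity assumption, the variance–covariance matrix of $\varepsilon$ is $\mathrm{Cov}(\varepsilon) = \mathrm{Diag}\{\sigma_2^2 I_{N_2}, \sigma_1^2 I_{N_1}\}$, an $N \times N$ diagonal matrix, where $I_{N_1}$ and $I_{N_2}$ are the identity matrices of dimensions $N_1$ and $N_2$, respectively. The WLS approach modifies the ordinary least squares method by applying an appropriate weight matrix to the model in Equation C1 as follows:

$$Y^* = X_D^* \beta + \varepsilon^*,$$

where $Y^* = WY$, $X_D^* = W X_D$, $\varepsilon^* = W\varepsilon$, and the weight matrix $W = \mathrm{Diag}\{W_2, W_1\}$ is the $N \times N$ diagonal matrix with the first $N_2$ and the last $N_1$ diagonal elements equal to $w_2$ and $w_1$, where $W_2 = w_2 I_{N_2}$ and $W_1 = w_1 I_{N_1}$. Let $\mathrm{Var}(\varepsilon_k^*) = \sigma_w^2$; then $\sigma_w^2 = w_2^2 \sigma_2^2$ for $k = 1, \ldots, N_2$ and $\sigma_w^2 = w_1^2 \sigma_1^2$ for $k = N_2 + 1, \ldots, N_2 + N_1$. The subsequent WLS analysis follows that of the regular OLS linear regression. Hence, the WLS estimator $\hat{\beta}_W = [\hat{\beta}_{W20}, \hat{\beta}_{W21}, \hat{\delta}_{W0}, \hat{\delta}_{W1}]^T$ of $\beta$ can be readily obtained as

$$\hat{\beta}_W = (X_D^{*T} X_D^*)^{-1} X_D^{*T} Y^* = (X_D^T W^2 X_D)^{-1} X_D^T W^2 Y. \tag{C2}$$

The variance–covariance matrix of the estimator $\hat{\beta}_W$ is $\sigma_w^2 (X_D^{*T} X_D^*)^{-1}$, and the corresponding natural estimator is $\hat{V}_W = \hat{\sigma}_w^2 (X_D^T W^2 X_D)^{-1}$, where

$$\hat{\sigma}_w^2 = SSE_W/(N - 4) \tag{C3}$$

and the error sum of squares is $SSE_W = (Y - X_D \hat{\beta}_W)^T W^2 (Y - X_D \hat{\beta}_W)$. It follows that the hypothesis $H_0$: $\delta_1 = 0$, or $H_0$: $\beta_{11} = \beta_{21}$, is therefore tested with the partial t statistic

$$T_W = \frac{\hat{\delta}_{W1}}{\{\hat{V}(\hat{\delta}_{W1})\}^{1/2}}, \tag{C4}$$

where $\hat{\delta}_{W1}$ is the WLS estimator of $\delta_1$ and $\hat{V}(\hat{\delta}_{W1})$ is the estimated variance of $\hat{\delta}_{W1}$ and is the (4, 4) element of $\hat{V}_W$. The null hypothesis is rejected at the significance level $\alpha$ if $|T_W| > t_{df_w, \alpha/2}$, where $df_w = N - 4$.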
Although the prescribed results are sufficient for the purpose of numerical computation for the selected weight matrix, it is not obvious from the general expressions exactly what the particular estimator $\hat{\delta}_{W1}$ and the associated variance estimator $\hat{V}(\hat{\delta}_{W1})$ turn out to be. The following detailed presentation and illustration define the analytic results for the WLS approach and provide the connections of WLS with other related procedures. Note that the overall response $Y$ and the design matrix $X_D$ can be expressed as

$$Y = \begin{bmatrix} Y_2 \\ Y_1 \end{bmatrix} \quad \text{and} \quad X_D = \begin{bmatrix} X_{D2} & 0_{N_2 \times 2} \\ X_{D1} & X_{D1} \end{bmatrix},$$

where $0_{N_2 \times 2}$ is an $N_2 \times 2$ matrix of 0s. Hence, the WLS estimator of Equation C2 can be written as

[display equation]

It follows from the block diagonal property of $W$ and the unique constant multiple of identity for the two weight matrices $W_1$ and $W_2$ that

$$\hat{\beta}_W = \begin{bmatrix} \hat{\beta}_2 \\ \hat{\beta}_1 - \hat{\beta}_2 \end{bmatrix}, \tag{C5}$$

where $\hat{\beta}_1$ and $\hat{\beta}_2$ are the separate OLS estimators of $\beta_1$ and $\beta_2$ given in Equation A1. Moreover, the estimated variance and covariance matrix is

[Equation C6]

Let $\hat{\beta}_W = [\hat{\beta}_{W20}, \hat{\beta}_{W21}, \hat{\delta}_{W0}, \hat{\delta}_{W1}]^T$ with $\hat{\delta}_{W0} = \hat{\beta}_{W10} - \hat{\beta}_{W20}$ and $\hat{\delta}_{W1} = \hat{\beta}_{W11} - \hat{\beta}_{W21}$. It is clear from Equation C5 that the WLS estimators of $(\beta_{10}, \beta_{11})$ and $(\beta_{20}, \beta_{21})$ are identical to the separate OLS results of Equation A1 described in Appendix A. More importantly, the WLS estimator $\hat{\delta}_{W1}$ of $\delta_1$ coincides with the difference of the two OLS estimators regardless of the relative weights:

$$\hat{\delta}_{W1} = \hat{\beta}_{11} - \hat{\beta}_{21}. \tag{C7}$$

However, it follows from Equation C6 that the estimated variance of $\hat{\delta}_{W1}$ is

[Equation C8]

and the explicit formulation of $\hat{V}(\hat{\delta}_{W1})$ ultimately depends on the designated weights $w_1$ and $w_2$. To neutralize the variance heterogeneity, a natural choice of weights $(w_1, w_2)$ is the square root of the inverse of the respective unbiased variance estimator, $(m_1, m_2)$, where

$$m_i^2 = \frac{1}{\hat{\sigma}_i^2}, \quad i = 1 \text{ and } 2, \tag{C9}$$

with $\hat{\sigma}_i^2 = SSE_i/(N_i - 2)$. Thus, it can be shown with the weights defined in Equation C9 that $SSE_W$ and $\hat{\sigma}_w^2$ in Equation C3 are greatly simplified as $SSE_W = m_1^2 SSE_1 + m_2^2 SSE_2 = N - 4$ and $\hat{\sigma}_w^2 = 1$, respectively.
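The step from the block structure of the design matrix to the estimator and its variance can be sketched as follows. This is a derivation under the stacking of group 2 above group 1 used in this appendix, not a transcription of the article's display equations; the symbols $P$, $Q$, and $S_{Xi}$ are shorthand introduced here.

```latex
% Shorthand assumed for this sketch (not the article's notation):
%   P = w_2^2 X_{D2}^T X_{D2},  Q = w_1^2 X_{D1}^T X_{D1},
%   S_{Xi} = \sum_j (X_{ij} - \bar{X}_i)^2  (within-group sum of squares of X).
\begin{align*}
X_D &= \begin{bmatrix} X_{D2} & 0 \\ X_{D1} & X_{D1} \end{bmatrix},
\qquad W^2 = \mathrm{Diag}\{ w_2^2 I_{N_2},\, w_1^2 I_{N_1} \}, \\
X_D^T W^2 X_D &= \begin{bmatrix} P + Q & Q \\ Q & Q \end{bmatrix},
\qquad
(X_D^T W^2 X_D)^{-1} = \begin{bmatrix} P^{-1} & -P^{-1} \\ -P^{-1} & P^{-1} + Q^{-1} \end{bmatrix}, \\
X_D^T W^2 Y &= \begin{bmatrix} w_2^2 X_{D2}^T Y_2 + w_1^2 X_{D1}^T Y_1 \\ w_1^2 X_{D1}^T Y_1 \end{bmatrix}
\;\Longrightarrow\;
\hat{\beta}_W = \begin{bmatrix} \hat{\beta}_2 \\ \hat{\beta}_1 - \hat{\beta}_2 \end{bmatrix}, \\
\hat{V}(\hat{\delta}_{W1}) &= \hat{\sigma}_w^2 \left[ P^{-1} + Q^{-1} \right]_{22}
= \hat{\sigma}_w^2 \left( \frac{1}{w_1^2 S_{X1}} + \frac{1}{w_2^2 S_{X2}} \right).
\end{align*}
```

The last equality uses the fact that the (2, 2) element of $(X_{Di}^T X_{Di})^{-1}$ is $1/S_{Xi}$ for a design matrix consisting of an intercept column and one predictor. Because the weights are constant within each group, they cancel out of the coefficient estimates and enter only through the variance.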
Moreover, the resulting estimated variance $\hat{V}_m$ of $\hat{\delta}_{W1}$ obtained by Equation C8 is

[Equation C10]

Then, more informatively, it can be readily seen from Equations C7 and C10 that the test $T_W$ for the inference of the coefficient parameter $\delta_1$ given in Equation C4 can be expressed as

[Equation C11]

Consequently, the WLS-based partial F test statistic (MWLS) described in Overton (2001), denoted in our notation by $F_m$ here, can be expressed as $F_m = T_m^2$, which in turn follows approximately an F distribution with degrees of freedom 1 and $df_w$ under the null hypothesis $H_0$: $\delta_1 = 0$. Alternatively, the relative weights based on an unbiased estimator for the reciprocal of the respective variance can be considered:

[Equation C12]

In this situation, the obtained estimator of $\delta_1$ remains $\hat{\delta}_{W1} = \hat{\beta}_{11} - \hat{\beta}_{21}$, as given above in Equation C7. Again, it is equivalent to the difference of the two OLS estimators. Furthermore, it can be shown with the relative weights in Equation C12 that $SSE_W = w_1^2 SSE_1 + w_2^2 SSE_2 = N - 8$, $\hat{\sigma}_w^2 = (N - 8)/(N - 4)$, and the estimated variance of $\hat{\delta}_{W1}$ is

[display equation]

Hence, the corresponding test statistic for $H_0$: $\delta_1 = 0$ is denoted by

[display equation]

In contrast, the particular results for OLS regression analysis of the models in Equation 2 or C1 can be viewed as a special case of WLS analysis by setting $w_1 = w_2 = 1$. Hence, it can be immediately obtained from Equations C5 and C6 that the OLS estimator is $\hat{\beta}_O = \hat{\beta}_W$, with the estimated variance and covariance matrix

[display equation]

where $\hat{\sigma}_O^2 = SSE_O/(N - 4)$ and $SSE_O = SSE_1 + SSE_2$. Certainly, $\hat{V}_O$ and $\hat{V}_W$ are markedly different.
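The two weight choices discussed above can be put side by side. What follows is a sketch rather than the article's own display equations: $S_{Xi}$ and $c$ are shorthand introduced here, $\hat{\sigma}_i^2 = SSE_i/(N_i - 2)$ is assumed, and the form of the second set of weights is inferred from the statement `WLSW=(DF-2)/(DF*MSE)` in the SAS program of Appendix D, where `DF` is the residual degrees of freedom $N_i - 2$.

```latex
% Shorthand (ours): S_{Xi} = \sum_j (X_{ij} - \bar{X}_i)^2,
%                   \hat{\sigma}_i^2 = SSE_i / (N_i - 2).
\begin{align*}
w_i^2 = \frac{1}{\hat{\sigma}_i^2}: \quad
& SSE_W = N - 4, \quad \hat{\sigma}_w^2 = 1, \quad
\hat{V}(\hat{\delta}_{W1}) = \frac{\hat{\sigma}_1^2}{S_{X1}} + \frac{\hat{\sigma}_2^2}{S_{X2}}, \\[4pt]
w_i^2 = \frac{N_i - 4}{(N_i - 2)\,\hat{\sigma}_i^2}: \quad
& SSE_W = N - 8, \quad \hat{\sigma}_w^2 = \frac{N - 8}{N - 4}, \quad
\hat{V}(\hat{\delta}_{W1}) = \frac{N - 8}{N - 4}
\left( \frac{N_1 - 2}{N_1 - 4}\,\frac{\hat{\sigma}_1^2}{S_{X1}}
     + \frac{N_2 - 2}{N_2 - 4}\,\frac{\hat{\sigma}_2^2}{S_{X2}} \right).
\end{align*}
% Ratio of the second variance estimate to the first, with c the group 1 share:
\[
c = \frac{\hat{\sigma}_1^2 / S_{X1}}{\hat{\sigma}_1^2 / S_{X1} + \hat{\sigma}_2^2 / S_{X2}},
\qquad
\frac{\hat{V}_{\text{second}}}{\hat{V}_{\text{first}}}
= \frac{N - 8}{N - 4} \left( \frac{N_1 - 2}{N_1 - 4}\, c + \frac{N_2 - 2}{N_2 - 4}\,(1 - c) \right).
\]
```

Under the first choice the statistic has the same value as the Welch statistic computed in the SAS program, and only the reference degrees of freedom differ ($N - 4$ versus the Welch approximation). For $(N_1, N_2)$ = (12, 60) the three factors in the ratio are 64/68 ≈ 0.941, 10/8 = 1.25, and 58/56 ≈ 1.036. If the reported $V$ = 2.0740 and $T$ = 1.9786 correspond to the first and second choices, respectively (an assumption of this check), then $(2.0740/1.9786)^2 \approx 1.099$ implies $c \approx 0.61$, and the degrees-of-freedom formula in the SAS program, $1/(c^2/10 + (1 - c)^2/58)$, gives roughly 24.8, in line with the reported 24.7708.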
Specifically, the OLS-based test for $H_0$: $\delta_1 = 0$ is of the form

[display equation]

APPENDIX D
The SAS Program for OLS, WLS, and Welch Procedures

    DATA BRM;INPUT Y X @@;
    *REQUIRED USER SPECIFICATIONS PORTION;
    *SPECIFY THE NUMBER OF CASES FOR THE DEFINITION OF MODERATOR Z;
    IF _N_ < 13 THEN Z=1;ELSE Z=0;XZ=X*Z;
    *SPECIFY THE DATA IN TERMS OF PAIRED-VALUES OF Y AND X SEQUENTIALLY;
    DATALINES;
    0.8427 -0.6411 0.7367 -0.4710 1.3724 -0.0829 1.0208 -0.9724
    1.0173 1.9320 -0.2700 -0.5577 -1.1647 0.5300 -1.5785 -0.7436
    -0.5066 -0.0115 -0.3525 0.5926 0.5504 0.1367 1.7815 2.0942
    20.6221 5.8494 0.0773 -4.6409 7.2835 4.0269 -4.8243 2.3101
    -8.1438 -5.7033 8.1696 1.0393 0.3258 3.3170 -11.5477 -5.2614
    -3.7603 -0.3760 11.9974 7.8449 -10.0975 -5.0683 -1.3754 -1.4049
    -9.6727 -4.3949 -18.7952 -3.3753 4.8187 -3.8163 17.8028 5.2447
    1.5166 -1.7987 -18.1552 0.2492 16.7311 -0.5163 -6.6831 -0.6478
    4.1600 -4.0378 -20.5484 -4.2237 -3.4620 -5.1611 5.8139 11.1631
    -2.2820 7.1399 -5.3675 1.6372 3.1738 4.3391 2.8675 -3.2008
    -0.0571 -2.4078 -5.4507 1.3559 -5.1712 4.8134 17.2321 2.4378
    -2.5326 0.3754 -11.2807 -3.6007 -0.4768 2.7956 -9.5239 -5.8884
    -5.5169 -2.3669 11.5640 3.7946 4.0643 1.2667 12.5105 1.5345
    11.6301 3.3982 -1.6980 -2.0588 -9.9924 -0.7673 4.4596 7.6824
    10.2650 3.2056 -7.4692 0.3997 -4.5522 4.5721 -3.7881 -3.0503
    10.4150 8.9247 5.1856 4.9535 -0.6495 -4.8053 -4.6085 3.9554
    1.7367 -0.4456 6.5870 2.2837 -5.1916 -0.4220 -5.2506 -0.1328
    10.1324 0.4992 -5.8459 0.7123 1.3268 3.2873 -3.5859 -3.8313
    ;
    *END OF REQUIRED USER SPECIFICATIONS;
    PROC SORT;BY Z;
    PROC REG DATA=BRM NOPRINT TABLEOUT OUTEST=W;MODEL Y=X/MSE;BY Z;
    DATA W1(KEEP=Z DF MSE BETAH);SET W;IF _TYPE_='PARMS';
      RENAME _EDF_=DF _MSE_=MSE X=BETAH;
    DATA W2(KEEP=Z STDERR);SET W;IF _TYPE_='STDERR';RENAME X=STDERR;
    DATA W3;MERGE W1 W2;WLSW=(DF-2)/(DF*MSE);
    *OLS;
    PROC REG DATA=BRM;MODEL Y= X Z XZ;
    *WLS;
    DATA WLS;MERGE BRM W3;BY Z;
    PROC REG DATA=WLS;MODEL Y=X Z XZ;WEIGHT WLSW;
    *WELCH;
    DATA WELCH(KEEP=DF1 DF2 STDERR1 STDERR2 BETAH1 BETAH2 WELCH_T
      WELCH_DF WELCH_PVALUE);
    SET W3;IF Z=1;DF1=DF;STDERR1=STDERR;BETAH1=BETAH;
    SET W3;IF Z=0;DF2=DF;STDERR2=STDERR;BETAH2=BETAH;
    WELCHVAR=STDERR1**2+STDERR2**2;APS1=(STDERR1**2)/WELCHVAR;
    WELCH_T=(BETAH1-BETAH2)/SQRT(WELCHVAR);
    WELCH_DF=1/((APS1**2)/DF1+((1-APS1)**2)/DF2);
    WELCH_PVALUE=2*(1-PROBT(ABS(WELCH_T),WELCH_DF));
    PROC PRINT;

APPENDIX E
The R Program for OLS, WLS, and Welch Procedures

    welch=function () {
    #REQUIRED USER SPECIFICATIONS PORTION
    #SPECIFY THE FIRST GROUP OF DATA IN TERMS OF PAIRED-VALUES OF Y AND X SEQUENTIALLY
    yx1=c(0.8427, -0.6411, 0.7367, -0.4710, 1.3724, -0.0829, 1.0208, -0.9724,
    1.0173, 1.9320, -0.2700, -0.5577, -1.1647, 0.5300, -1.5785, -0.7436,
    -0.5066, -0.0115, -0.3525, 0.5926, 0.5504, 0.1367, 1.7815, 2.0942)
    #SPECIFY THE SECOND GROUP OF DATA IN TERMS OF PAIRED-VALUES OF Y AND X SEQUENTIALLY
    yx2=c(20.6221, 5.8494, 0.0773, -4.6409, 7.2835, 4.0269, -4.8243, 2.3101,
    -8.1438, -5.7033, 8.1696, 1.0393, 0.3258, 3.3170, -11.5477, -5.2614,
    -3.7603, -0.3760, 11.9974, 7.8449, -10.0975, -5.0683, -1.3754, -1.4049,
    -9.6727, -4.3949, -18.7952, -3.3753, 4.8187, -3.8163, 17.8028, 5.2447,
    1.5166, -1.7987, -18.1552, 0.2492, 16.7311, -0.5163, -6.6831, -0.6478,
    4.1600, -4.0378, -20.5484, -4.2237, -3.4620, -5.1611, 5.8139, 11.1631,
    -2.2820, 7.1399, -5.3675, 1.6372, 3.1738, 4.3391, 2.8675, -3.2008,
    -0.0571, -2.4078, -5.4507, 1.3559, -5.1712, 4.8134, 17.2321, 2.4378,
    -2.5326, 0.3754, -11.2807, -3.6007, -0.4768, 2.7956, -9.5239, -5.8884,
    -5.5169, -2.3669, 11.5640, 3.7946, 4.0643, 1.2667, 12.5105, 1.5345,
    11.6301, 3.3982, -1.6980, -2.0588, -9.9924, -0.7673, 4.4596, 7.6824,
    10.2650, 3.2056, -7.4692, 0.3997, -4.5522, 4.5721, -3.7881, -3.0503,
    10.4150, 8.9247, 5.1856, 4.9535, -0.6495, -4.8053, -4.6085, 3.9554,
    1.7367, -0.4456, 6.5870, 2.2837, -5.1916, -0.4220, -5.2506, -0.1328,
    10.1324, 0.4992, -5.8459, 0.7123, 1.3268, 3.2873, -3.5859, -3.8313)
    #END OF REQUIRED USER SPECIFICATION



Gwowen Shieh. Detection of interactions between a dichotomous moderator and a continuous predictor in moderated multiple regression with heterogeneous error variance, Behavior Research Methods, 2009, 61-74, DOI: 10.3758/BRM.41.1.61