Application of SGT Family Distributions in Quasi Maximum Likelihood Estimation

Undergraduate Economic Review, Oct 2013

In the classical normal linear regression model, ordinary least squares (OLS) estimators are consistent and achieve the Cramér-Rao lower bound among unbiased estimators. This paper examines the impact of several other error distributions on the properties of OLS estimators. Several types of example data commonly available to students and researchers in economics are used to illustrate the impact of non-normality, because in application the assumption of normality may not hold in empirical testing. Using maximum likelihood, I demonstrate that flexible probability density functions better model the residual distribution of different types of data, which suggests improvements in estimation accuracy. I find that this improvement in fit applies to almost all data types, with the scale of these likelihood improvements contingent upon characteristics specific to individual data sets. I conclude that consideration of these distributions is essential for truly rigorous analysis, and that parsimony applies when differences between estimators are not significant.

https://digitalcommons.iwu.edu/cgi/viewcontent.cgi?article=1258&context=uer


Samuel Dodini, Brigham Young University, USA

Cover Page Footnote: A special thanks to Dr. James McDonald for helpful comments and MATLAB programming direction. This article is available in Undergraduate Economic Review: http://digitalcommons.iwu.edu/uer/vol10/iss1/5

Standard ordinary least squares (OLS) estimators in a linear regression framework minimize the sum of squared errors. These estimators are the Best Linear Unbiased Estimators (BLUE) if the Gauss-Markov assumptions hold, and they have the minimum variance among all unbiased estimators if the errors are normally distributed. In practice, many of these assumptions are violated: heteroskedasticity is common in cross-sectional data sets, as is autocorrelation in time series data. While there are several methods of addressing violations of the Gauss-Markov assumptions, such as generalized least squares, there are fewer rules of thumb for addressing non-normality in the residuals, which reduces the efficiency of OLS estimators. This can be especially important in areas of public policy in which billions of dollars depend on the choice of estimator.
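The OLS estimator described above can be sketched in a few lines. This is an illustrative example with simulated data, not the paper's MATLAB code or its Wooldridge data sets; the closed form beta_hat = (X'X)^{-1} X'y is the minimizer of the sum of squared errors.

```python
import numpy as np

# Illustrative sketch: the OLS estimator minimizes the sum of squared
# errors and has the closed form (X'X)^{-1} X'y. Data are simulated.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)      # normally distributed errors

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)           # (X'X)^{-1} X'y
residuals = y - X @ beta_hat

# First-order condition of least squares: the residuals are orthogonal
# to every column of X.
print(beta_hat)
print(X.T @ residuals)
```

Under the Gauss-Markov assumptions these estimates are BLUE; the sections that follow ask what happens to efficiency when the errors are not normal.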
In essence, I ask the question, "What if there is a better estimator?" I compare the efficiency of OLS estimators to maximum likelihood estimators assuming the following error distributions:

1) Student's t Distribution (t)
2) Generalized Error Distribution (GED)
3) Inverse Hyperbolic Sine (IHS)
4) Generalized t (GT)
5) Skewed Generalized t (SGT)

These estimators are often called quasi maximum likelihood or partially adaptive estimators, as the regression parameters are estimated along with those of the approximating error distribution. These distributions can be related using the SGT tree relationship in Appendix A.

Data

To demonstrate the differences between these several error distributions and the comparative accuracy of their outcomes under quasi maximum likelihood estimation, I examine six separate data sets from the Wooldridge data set collection. These were chosen for their variety, reliable formatting, and workability, and they provide a diverse framework of possibilities for real-world data examination. Summary statistics are provided in Appendix B. Each data set is homoskedastic with no autocorrelation, which isolates the error distribution as the varying factor. I first perform an OLS regression for my dependent variable (the first variable in each data set regressed on all the others) as an initial point of reference.

Poorly Matching Residuals

Below are the reported OLS residual graphs for each such regression. These consist of a smoothed histogram of the OLS residuals with an overlaid fitted normal distribution for reference. Notice the discrepancy between these assumed errors and the actual data residuals.

[Figure: OLS residual histograms with fitted normal overlays; panels include Beauty, CEO Salary, and Crime Rates.]

The normal distribution does not approximate the actual residuals well due to its rigidity in skewness and kurtosis, of which kurtosis seems to be the more egregious of the two. In these snapshots, there does not appear to be a specific pattern to the kurtosis issues across the six data sets.
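The partially adaptive idea above can be sketched for one member of the SGT tree: estimate the regression coefficients jointly with the scale and degrees of freedom of a Student's t error distribution by maximizing the log-likelihood. This is a minimal SciPy-based sketch on simulated heavy-tailed data, not the author's MATLAB implementation or the paper's data sets; all variable names are illustrative.

```python
import numpy as np
from scipy import optimize, stats

# Simulated regression with heavy-tailed (Student's t, df=3) errors.
rng = np.random.default_rng(1)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, 1.5])
y = X @ beta_true + stats.t.rvs(df=3, scale=0.5, size=n, random_state=rng)

def neg_loglik(params):
    # params = (beta_0, beta_1, log_scale, log_df); logs keep scale, df > 0.
    beta, log_s, log_df = params[:2], params[2], params[3]
    resid = y - X @ beta
    return -stats.t.logpdf(resid, df=np.exp(log_df), scale=np.exp(log_s)).sum()

# OLS estimates serve as starting values for the quasi-MLE.
ols = np.linalg.solve(X.T @ X, X.T @ y)
start = np.concatenate([ols, [np.log(y.std()), np.log(5.0)]])
fit = optimize.minimize(neg_loglik, start, method="Nelder-Mead",
                        options={"maxiter": 5000})
beta_t = fit.x[:2]
print("OLS:", ols, " t-QMLE:", beta_t)
```

With thick-tailed errors, the t-based quasi-MLE downweights outlying residuals that OLS treats quadratically, which is the source of the efficiency gains the paper documents; the GED, IHS, GT, and SGT fits differ only in the log-density used inside neg_loglik.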
The Current Literature

The essential theme of standard OLS regression theory suggests that, by the Central Limit Theorem, errors should be asymptotically normal, which may not hold in some specifications. Efromovich (2005) offers a theoretical justification for the common fallback of treating residuals as proxies for the underlying regression errors. However, increased computing power and advances in econometrics merit delving further into the true errors. Perhaps one of the first papers to examine non-normality of errors in linear regressions was Zeckhauser and Thompson (1970), which examined maximum likelihood estimates using the three-parameter power distribution popularized by Box and Tiao (1964). They argue that the "supposition [of normality] is often unwarranted and... significant gains in likelihood may be achieved when the regression technique allows for the more general class of error distributions" (Zeckhauser and Thompson, 1970). They attribute the inapplicability of the Central Limit Theorem to small sample sizes, non-normally distributed independent variables, and the presence of nonrandom effects of human behavior. They also argue that variance loses its explanatory power as a measure of efficiency when the underlying errors diverge from normality, which is particularly important when facing error distributions with thicker tails. All these m (...truncated)



Samuel Dodini. Application of SGT Family Distributions in Quasi Maximum Likelihood Estimation. Undergraduate Economic Review, 2013, Volume 10, Issue 1.