Comparative politics and the synthetic control method revisited: a note on Abadie et al. (2015)


Klößner et al. Swiss Journal of Economics and Statistics (2018) 154:11
DOI 10.1186/s41937-017-0004-9

ORIGINAL ARTICLE (Open Access)

Stefan Klößner¹*, Ashok Kaul²,³,⁴, Gregor Pfeifer⁵ and Manuel Schieler²,⁴

Abstract

Recently, Abadie et al. (Am J Polit Sci 59:495–510, 2015) have expanded synthetic control methods by the so-called cross-validation technique. We find that their results are not being reproduced when alternative software packages are used or when the variables’ ordering within the dataset is changed. We show that this failure stems from the cross-validation technique relying on non-uniquely defined predictor weights. While the amount of the resulting ambiguity is negligible for the main application of Abadie et al. (Am J Polit Sci 59:495–510, 2015), we find it to be substantial for several of their robustness analyses. Applying well-defined, standard synthetic control methods reveals that the authors’ results are particularly driven by a specific control country, the USA.

Keywords: Synthetic control methods, Cross-validation

JEL Classification: C23, C52

Background

As a tool for policy evaluation, Abadie and Gardeazabal (2003) have introduced so-called synthetic control methods (SCM). For estimating the development of the treated unit in absence of the treatment, the basic idea of SCM is to find suitable donor weights which describe how the treated unit is synthesized by a weighted mix of unaffected control units. In this context, “suitable” means that treated and synthetic unit should resemble each other as closely as possible prior to the treatment, both with respect to the outcome of interest and with respect to so-called economic predictors. The latter are variables of predictive power for explaining the outcome.
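As a rough illustration of the donor-weight step described above, the following minimal Python sketch searches for non-negative weights that sum to one and make a weighted mix of control units resemble the treated unit on a set of pre-treatment predictors. The function name `donor_weights`, the choice of a Frank-Wolfe solver, and all numbers are assumptions made for illustration; they are not taken from Abadie and Gardeazabal (2003) or from any SCM software package.

```python
def donor_weights(x1, X0, v, iters=2000):
    """Hypothetical helper: minimise
        sum_k v[k] * (x1[k] - sum_j X0[k][j] * w[j]) ** 2
    over donor weights w that are non-negative and sum to one,
    using Frank-Wolfe steps with exact line search."""
    n_pred, n_donors = len(X0), len(X0[0])
    w = [1.0 / n_donors] * n_donors  # start from equal donor weights
    for _ in range(iters):
        # Gap between the current synthetic unit and the treated unit, per predictor.
        r = [sum(X0[k][j] * w[j] for j in range(n_donors)) - x1[k]
             for k in range(n_pred)]
        # Gradient of the weighted squared error with respect to each donor weight.
        g = [2.0 * sum(v[k] * X0[k][j] * r[k] for k in range(n_pred))
             for j in range(n_donors)]
        j_best = min(range(n_donors), key=lambda j: g[j])
        # Change in predictor values when moving all weight towards donor j_best.
        q = [X0[k][j_best] - (r[k] + x1[k]) for k in range(n_pred)]
        qq = sum(v[k] * q[k] * q[k] for k in range(n_pred))
        if qq == 0.0:
            break
        # Exact minimiser along the segment, clipped to stay inside the simplex.
        step = -sum(v[k] * r[k] * q[k] for k in range(n_pred)) / qq
        step = max(0.0, min(1.0, step))
        if step == 0.0:
            break  # no descent towards the best vertex: current w is optimal
        w = [(1.0 - step) * wj for wj in w]
        w[j_best] += step
    return w


# Toy data (made up): rows are predictors, columns are donor units.
X0 = [[1.0, 0.0],
      [0.0, 1.0]]
x1 = [0.25, 0.75]  # treated unit's predictor values
v = [0.5, 0.5]     # predictor weights, here simply equal
print(donor_weights(x1, X0, v))
```

In this toy case the treated unit is exactly 0.25 times the first donor plus 0.75 times the second, so the sketch should recover weights close to 0.25 and 0.75. With real data an exact match is usually not attainable, and the predictor weights `v` then influence which donors receive weight.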
The data-driven SCM approach searches for optimal predictor weights in order to grant more importance to economic predictors with better predictive power. Properties of the SCM estimator, like (asymptotic) unbiasedness, have been derived by Abadie et al. (2010), while Gardeazabal and Vega-Bayo (2017) find that the SCM estimator performs well compared to alternative panel approaches.

Over the last few years, many studies have applied SCM across several fields, e.g., Acemoglu et al. (2016) (political connections), Cavallo et al. (2013) (natural disasters), Gobillon and Magnac (2016) (enterprise zones), or Kleven et al. (2013) (taxation of athletes). Recently, the SCM approach has been expanded by Abadie et al. (2015) (German reunification) to incorporate cross-validation: the predictors' data in the training period (first part of the pre-treatment period) are used to find optimal donor weights for synthesizing the treated unit, and the predictor weights are selected such that the resulting out-of-sample error in the validation period (second part of the pre-treatment period) is minimized.

When measuring the effect of the 1990 reunification on Germany’s GDP per capita using the software package R, Abadie et al. (2015) found the following predictor weights: 44.2% (GDP per capita), 24.5% (investment rate), 13.4% (trade openness), 10.7% (amount of schooling), 7.2% (inflation rate), and 0.1% (industry share of value added). These predictor weights led to Germany being synthesized by Austria (42%), the United States (22%), Japan (16%), Switzerland (11%), and the Netherlands (9%).

*Correspondence: Statistics and Econometrics, Saarland University, Bldg. C3 1, 66123 Saarbrücken, Germany. Full list of author information is available at the end of the article.

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

When trying to replicate these results using the software package Stata, however, we found different predictor weights: 84.5% (GDP), 4.5% (investment), 5.1% (trade), 4.2% (schooling), 0.5% (inflation), and 1.2% (industry). The corresponding synthetic Germany was slightly different from the one obtained by Abadie et al. (2015): it consisted of Austria (43%), the USA (22%), Japan (15%), Switzerland (11%), and the Netherlands (9%).¹ We had sorted the countries alphabetically, while Abadie et al. (2015) had used a different ordering.² Although, in theory, neither the ordering nor the software package should have any effect on the estimation results, we recalculated all weights using the ordering that had been used by Abadie et al. (2015). Surprisingly, we got yet another set of predictor weights: 71.0% (GDP), 11.1% (investment), 7.9% (trade), 6.4% (schooling), 2.7% (inflation), and 0.9% (industry). The corresponding weights for the countries synthesizing Germany were much closer to, but still different from, the values found by Abadie et al. (2015).³

Closer inspection shows that the failure to reproduce the results of Abadie et al. (2015) is not due to software problems, but stems from the newly introduced cross-validation technique. In fact, all the above-mentioned predictor weights deliver identical values for the cross-validation criterion; thus, they are all equivalent solutions of the cross-validation approach.
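The point that different predictor weights can be equivalent follows from the structure of the criterion: it depends on the predictor weights only through the donor weights they induce in the training period. The Python sketch below illustrates this in a deliberately tiny setting with two donor units, where the training-period fit has a closed form. The function names `donor_weight_two_units` and `cv_criterion`, the two-donor simplification, and all numbers are assumptions made for illustration; this is not the R or Stata implementation and does not use the German reunification data.

```python
def donor_weight_two_units(x1, a, b, v):
    """Hypothetical helper for two donors A and B: returns the weight w on A
    (B gets 1 - w) that minimises
        sum_k v[k] * (x1[k] - w * a[k] - (1 - w) * b[k]) ** 2
    subject to 0 <= w <= 1."""
    num = sum(vk * (xk - bk) * (ak - bk) for vk, xk, ak, bk in zip(v, x1, a, b))
    den = sum(vk * (ak - bk) * (ak - bk) for vk, ak, bk in zip(v, a, b))
    if den == 0.0:
        return 0.5  # donors indistinguishable under v: any w fits equally well
    # The objective is convex in w, so clipping the unconstrained optimum is enough.
    return max(0.0, min(1.0, num / den))


def cv_criterion(v, train, valid):
    """Squared out-of-sample error in the validation period for predictor weights v."""
    w = donor_weight_two_units(train["treated"], train["A"], train["B"], v)
    return sum((y1 - w * ya - (1.0 - w) * yb) ** 2
               for y1, ya, yb in zip(valid["treated"], valid["A"], valid["B"]))


# Made-up training-period predictors (three predictors per unit).
train = {"treated": [2.0, 2.0, 2.0], "A": [1.0, 4.0, 2.0], "B": [3.0, 0.0, 2.0]}
# Made-up validation-period outcomes (two periods per unit).
valid = {"treated": [12.5, 13.5], "A": [10.0, 12.0], "B": [14.0, 16.0]}

v_first = [0.8, 0.1, 0.1]   # one candidate set of predictor weights
v_second = [0.2, 0.5, 0.3]  # a clearly different candidate

print(cv_criterion(v_first, train, valid), cv_criterion(v_second, train, valid))
```

In this toy case the treated unit is matched exactly in the training period, so both sets of predictor weights induce the same donor weights and therefore the same criterion value. An optimizer then has no basis for preferring one set over the other, and which one it reports can depend on incidental details such as starting values or data ordering.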
Hence, the cross-validation technique is (in most applications) not well-defined, since the predictor weights are not uniquely defined. As the cross-validation technique allows many different equivalent predictor weights, the results obtained by Abadie et al. (2015) are arbitrary in the sense that the authors could have obtained different results if they had used other software or organized the data differently.

We therefore investigate the corresponding ambiguity by conducting large-scale Monte Carlo studies. For the main application, the variation of the estimated post-treatment development of West German GDP is very small, with all estimates being significantly above Germany’s actual GDP. Concerning several robustness studies of Abadie et al. (2015), however, we find quite large amounts of ambiguity, in particular for the so-called in-space placebo and leave-one-out studies. Developing a rule of thumb, we can show that the amount of ambiguity depends on the difference between the number of predictors and the number of donor units that obtain positive weights in the training period. In most applications, this difference is positive. Thus, using the cross-validation technique cannot be recommended, and standard synthetic control methods should be applied instead. When doing so, we confirm the main result of Abadie et al. (2015), indicating a significant drop in West German GDP due to (...truncated)