Comparative politics and the synthetic control method revisited: a note on Abadie et al. (2015)
Klößner et al. Swiss Journal of Economics and Statistics (2018) 154:11
DOI 10.1186/s41937-017-0004-9
ORIGINAL ARTICLE
Open Access
Stefan Klößner¹*, Ashok Kaul²,³,⁴, Gregor Pfeifer⁵ and Manuel Schieler²,⁴
Abstract
Recently, Abadie et al. (Am J Polit Sci 59:495–510, 2015) have extended synthetic control methods with a so-called
cross-validation technique. We find that their results cannot be reproduced when alternative software packages
are used or when the variables’ ordering within the dataset is changed. We show that this failure stems from the
cross-validation technique relying on non-uniquely defined predictor weights. While the amount of the resulting
ambiguity is negligible for the main application of Abadie et al. (Am J Polit Sci 59:495–510, 2015), we find it to be
substantial for several of their robustness analyses. Applying well-defined, standard synthetic control methods reveals
that the authors’ results are particularly driven by a specific control country, the USA.
Keywords: Synthetic control methods, Cross-validation
JEL Classification: C23, C52
Background
As a tool for policy evaluation, Abadie and Gardeazabal
(2003) introduced so-called synthetic control methods (SCM). To estimate how the treated
unit would have developed in the absence of the treatment, the basic idea of SCM
is to find suitable donor weights which describe how the
treated unit is synthesized by a weighted mix of unaffected
control units. In this context, “suitable” means that the treated
and the synthetic unit should resemble each other as closely
as possible prior to the treatment, both with respect to
the outcome of interest and with respect to so-called economic predictors. The latter are variables of predictive
power for explaining the outcome. The data-driven SCM
approach searches for optimal predictor weights in order
to grant more importance to economic predictors with
better predictive power. Properties of the SCM estimator, like (asymptotic) unbiasedness, have been established
by Abadie et al. (2010), while Gardeazabal and Vega-Bayo (2017) find that the SCM estimator performs well
compared to alternative panel approaches.
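The donor-weight step described above can be illustrated with a minimal sketch. It assumes that the predictor weights are already fixed and solves only the inner problem: non-negative donor weights that sum to one and minimize the weighted squared distance between treated and synthetic predictors. The function names, the toy numbers, and the use of plain projected gradient descent are illustrative assumptions, not the routines of any SCM package.

```python
def project_to_simplex(y):
    """Euclidean projection of y onto {w : w >= 0, sum(w) = 1}."""
    theta, running = 0.0, 0.0
    for i, u in enumerate(sorted(y, reverse=True), start=1):
        running += u
        candidate = (running - 1.0) / i
        if u - candidate > 0:
            theta = candidate
    return [max(value - theta, 0.0) for value in y]


def donor_weights(x1, x0, v, iterations=5000):
    """Minimise sum_k v[k] * (x1[k] - sum_j x0[k][j] * w[j])**2 over the simplex.

    x1: predictor values of the treated unit (length K)
    x0: K rows (one per predictor), each holding the values of the J donors
    v:  fixed, non-negative predictor weights (length K)
    """
    n_pred, n_donors = len(x0), len(x0[0])
    # Conservative step size: the trace bounds the largest eigenvalue of X0' V X0.
    lipschitz = 2.0 * sum(v[k] * x0[k][j] ** 2
                          for k in range(n_pred) for j in range(n_donors))
    w = [1.0 / n_donors] * n_donors
    for _ in range(iterations):
        resid = [x1[k] - sum(x0[k][j] * w[j] for j in range(n_donors))
                 for k in range(n_pred)]
        grad = [-2.0 * sum(v[k] * x0[k][j] * resid[k] for k in range(n_pred))
                for j in range(n_donors)]
        w = project_to_simplex([w[j] - grad[j] / lipschitz
                                for j in range(n_donors)])
    return w


# Hypothetical toy data: 3 predictors (rows) for 3 donor units (columns).
x0 = [[1.0, 2.0, 3.0],
      [2.0, 3.0, 1.0],
      [5.0, 1.0, 3.0]]
# The treated unit is built as 0.5 * donor 1 + 0.3 * donor 2 + 0.2 * donor 3.
x1 = [1.7, 2.1, 3.4]
v = [0.5, 0.3, 0.2]  # assumed predictor weights, held fixed here

w = donor_weights(x1, x0, v)
print([round(weight, 4) for weight in w])
```

In this constructed case the treated unit is an exact convex combination of the donors, so the solver should recover the mixing weights used to build it. With real data the fit is not exact, and the data-driven SCM approach additionally searches over the predictor weights.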
*Correspondence: ¹Statistics and Econometrics, Saarland University, Bldg. C3 1, 66123 Saarbrücken, Germany
Full list of author information is available at the end of the article
Over the last few years, many studies have applied
SCM across several fields, e.g., Acemoglu et al. (2016)
(political connections), Cavallo et al. (2013) (natural disasters), Gobillon and Magnac (2016) (enterprise zones),
or Kleven et al. (2013) (taxation of athletes). Recently, the
SCM approach has been extended by Abadie et al. (2015)
(German reunification) to incorporate cross-validation:
the predictor weights are applied to predictor data from the training period
(the first part of the pre-treatment period) to find
optimal donor weights for synthesizing the treated unit,
and they are selected such that the resulting out-of-sample error in the validation period
(the second part of the pre-treatment period) is minimized.
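One compact way to write these two nested steps is given below. The notation is an assumption made for this illustration and is not taken from Abadie et al. (2015): X_1 and X_0 hold the predictors of the treated unit and of the donors, Z_1 and Z_0 hold the corresponding outcomes, V is a diagonal matrix of non-negative predictor weights, and Δ is the set of donor weight vectors with non-negative entries summing to one.

```latex
W^{*}(V) = \operatorname*{arg\,min}_{W \in \Delta}
  \left(X_1^{\text{train}} - X_0^{\text{train}} W\right)^{\top} V
  \left(X_1^{\text{train}} - X_0^{\text{train}} W\right),
\qquad
V^{*} \in \operatorname*{arg\,min}_{V}
  \left\lVert Z_1^{\text{val}} - Z_0^{\text{val}}\, W^{*}(V) \right\rVert^{2}
```

The predictor weights V enter the validation criterion only through the donor weights W*(V). The second expression is written with "∈" because nothing in this formulation guarantees that a single V attains the minimum.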
When measuring the effect of the 1990 reunification on Germany’s GDP per capita using the software
package R, Abadie et al. (2015) found the following predictor weights: 44.2% (GDP per capita), 24.5% (investment rate), 13.4% (trade openness), 10.7% (amount of
schooling), 7.2% (inflation rate), and 0.1% (industry share
of value added). These predictor weights led to Germany being synthesized by Austria (42%), the United
States (22%), Japan (16%), Switzerland (11%), and the
Netherlands (9%).
© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license, and indicate if changes were made.
When trying to replicate these results using the software package Stata, however, we found different predictor
weights: 84.5% (GDP), 4.5% (investment), 5.1% (trade),
4.2% (schooling), 0.5% (inflation), and 1.2% (industry).
The corresponding synthetic Germany was slightly different from the one obtained by Abadie et al. (2015): it
consisted of Austria (43%), the USA (22%), Japan (15%),
Switzerland (11%), and the Netherlands (9%)¹. We had
sorted the countries alphabetically, while Abadie et al.
(2015) had used a different ordering². Although, in theory, the ordering should have no effect on the estimation results (nor should the choice of software
package), we recalculated all weights using the ordering
that had been used by Abadie et al. (2015). Surprisingly, we got yet another set of predictor weights: 71.0%
(GDP), 11.1% (investment), 7.9% (trade), 6.4% (schooling),
2.7% (inflation), and 0.9% (industry). The corresponding weights for the countries synthesizing Germany were
much closer to, but still different from, the values found by
Abadie et al. (2015)³.
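To put the reported numbers side by side, the following sketch sums the absolute differences between the weight vectors quoted above. All figures are copied from the text (in percent); the variable names and the choice of total absolute difference as a summary measure are assumptions made for illustration.

```python
# Predictor order: GDP per capita, investment rate, trade openness,
# schooling, inflation rate, industry share (weights in percent).
v_original = [44.2, 24.5, 13.4, 10.7, 7.2, 0.1]     # reported by Abadie et al. (2015)
v_alphabetical = [84.5, 4.5, 5.1, 4.2, 0.5, 1.2]    # replication, alphabetical ordering
v_reordered = [71.0, 11.1, 7.9, 6.4, 2.7, 0.9]      # replication, original ordering

# Donor weights in percent; the text reports them only for the first two runs.
w_original = {"Austria": 42, "USA": 22, "Japan": 16,
              "Switzerland": 11, "Netherlands": 9}
w_alphabetical = {"Austria": 43, "USA": 22, "Japan": 15,
                  "Switzerland": 11, "Netherlands": 9}


def total_abs_diff(a, b):
    """Sum of absolute differences, in percentage points."""
    return sum(abs(x - y) for x, y in zip(a, b))


predictor_gap_alphabetical = total_abs_diff(v_original, v_alphabetical)
predictor_gap_reordered = total_abs_diff(v_original, v_reordered)
donor_gap = sum(abs(w_original[c] - w_alphabetical[c]) for c in w_original)

print(f"predictor weights, original vs alphabetical: {predictor_gap_alphabetical:.1f}")
print(f"predictor weights, original vs reordered:    {predictor_gap_reordered:.1f}")
print(f"donor weights, original vs alphabetical:     {donor_gap}")
```

The predictor weights of the alphabetical replication differ from the original ones by roughly 83 percentage points in total, while the corresponding donor weights differ by only 2.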
Closer inspection shows that the failure to reproduce the results of Abadie et al. (2015) is not due to
software problems, but stems from the newly introduced cross-validation technique. In fact, all the above-mentioned
predictor weights deliver identical values for
the cross-validation criterion, so they are all equivalent solutions of the cross-validation approach. Hence,
the cross-validation technique is (in most applications)
not well-defined, since the predictor weights are not
uniquely defined. As the cross-validation technique allows
many different equivalent predictor weights, the results
obtained by Abadie et al. (2015) are arbitrary in the sense
that the authors could have obtained different results
if they had used other software or organized the data
differently.
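The non-uniqueness can be reproduced in a deliberately small, hypothetical setting with three predictors and only two donors. With two donors, the donor weights reduce to a single share, for which the training-period problem has a closed-form solution. All numbers and function names below are invented for illustration and have no connection to the German reunification data.

```python
def donor_share(v, treated, donor_a, donor_b):
    """Share of donor A (donor B gets 1 - share) minimising the v-weighted
    squared predictor distance; closed form because only two donors exist."""
    diff = [a - b for a, b in zip(donor_a, donor_b)]
    gap = [t - b for t, b in zip(treated, donor_b)]
    numerator = sum(vk * d * g for vk, d, g in zip(v, diff, gap))
    denominator = sum(vk * d * d for vk, d in zip(v, diff))
    return min(1.0, max(0.0, numerator / denominator))


def validation_mspe(share, treated, donor_a, donor_b):
    """Mean squared prediction error of the synthetic unit's outcome."""
    errors = [t - (share * a + (1.0 - share) * b)
              for t, a, b in zip(treated, donor_a, donor_b)]
    return sum(e * e for e in errors) / len(errors)


# Hypothetical training-period predictors (three predictors per unit).
x_treated, x_a, x_b = [1.2, 2.6, 3.4], [2.0, 3.0, 4.0], [1.0, 2.0, 3.0]
# Hypothetical validation-period outcomes (three periods per unit).
z_treated, z_a, z_b = [10.0, 11.2, 12.4], [12.0, 13.0, 14.0], [9.0, 10.0, 11.0]


def criterion(v):
    """Cross-validation criterion as a function of the predictor weights."""
    share = donor_share(v, x_treated, x_a, x_b)
    return validation_mspe(share, z_treated, z_a, z_b)


equivalent = [[0.0, 0.0, 1.0], [0.5, 0.5, 0.0], [0.25, 0.25, 0.5]]
other = [1.0, 0.0, 0.0]

for v in equivalent + [other]:
    print(v, round(donor_share(v, x_treated, x_a, x_b), 6), round(criterion(v), 6))
```

The three vectors in `equivalent` are clearly different predictor weights, yet they imply the same donor share and therefore the same value of the criterion. In this constructed example that shared value is also the smallest attainable one, so an optimizer may return any of them, depending on its starting point or on the order in which it visits the predictors.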
We therefore investigate the corresponding ambiguity by conducting large-scale Monte Carlo studies. The
variation of the estimated post-treatment development
of West German GDP is very small, with all estimates
being significantly above Germany’s actual GDP. Concerning several robustness studies of Abadie et al. (2015),
however, we find quite large amounts of ambiguity, in
particular for the so-called in-space placebo and leave-one-out studies. Developing a rule of thumb, we can show
that the amount of ambiguity depends on the difference
between the number of predictors and the number of
donor units that obtain positive weights in the training
period. In most applications, this difference is positive.
Thus, using the cross-validation technique cannot be recommended
and standard synthetic control methods should be applied
instead. When doing so, we confirm the main result of
Abadie et al. (2015), indicating a significant drop
in West German GDP due to (...truncated)