Computationally intensive permutation testing

Statistical tests used in time-series intervention models assume a long time series of white-noise observations. Because these assumptions are seldom warranted in Synthetic Control Group designs, conventional significance tests are unavailable. In light of this obstacle, the exact significance of the post-intervention difference between a treated time series and its synthetic control must be calculated from a permutation test.

Although the basic concepts are due to Fisher (1922a), the application of permutation tests and exact significance to synthetic control designs is due to Abadie and Gardeazabal (2003) and Abadie, Diamond, and Hainmueller (2010). Two types of placebo test are generally used to calculate the exact significance of the post-intervention difference for Synthetic Control Group designs: (1) in-sample and (2) in-time permutation tests. An “in-sample” permutation test is performed by iteratively reassigning the treatment to each unit in the donor pool. The pre-intervention root mean-squared prediction error (RMSPE) for each donor-pool iteration is then compared to that of the treated unit. If a non-treated unit exhibits an effect of equal or greater magnitude than the treated unit, it is dropped from the donor pool and the matching procedure restarts. If the new synthetic control match suggests an effect of the same sign and similar magnitude, then one can assume that the estimated effect does not depend on the contribution, or bias, of any particular state in the synthetic control unit.
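In outline, the in-sample test can be sketched as follows. This is a minimal illustration only: it substitutes an unconstrained least-squares fit for the full constrained synthetic control optimization, and the function names (`rmspe`, `fit_weights`, `in_sample_permutation`) are hypothetical.

```python
import numpy as np

def rmspe(actual, synthetic):
    """Root mean-squared prediction error between two outcome paths."""
    return np.sqrt(np.mean((actual - synthetic) ** 2))

def fit_weights(treated_pre, donors_pre):
    # Unconstrained least-squares fit of donor weights to the treated
    # unit's pre-intervention path. A full synthetic control would
    # additionally constrain the weights to be non-negative and sum to one.
    w, *_ = np.linalg.lstsq(donors_pre, treated_pre, rcond=None)
    return w

def in_sample_permutation(series, treated_idx, t0):
    """series: (n_units, n_periods) outcome paths; t0: first
    post-intervention period. Returns each unit's post/pre RMSPE
    ratio and the exact p-value of the treated unit's ratio."""
    n_units = series.shape[0]
    ratios = np.empty(n_units)
    for i in range(n_units):
        # Iteratively reassign the "treatment" to unit i.
        donors = np.delete(np.arange(n_units), i)
        w = fit_weights(series[i, :t0], series[donors, :t0].T)
        synth = series[donors].T @ w
        ratios[i] = (rmspe(series[i, t0:], synth[t0:])
                     / rmspe(series[i, :t0], synth[:t0]))
    # Exact significance: the treated unit's rank among all ratios.
    p_value = np.mean(ratios >= ratios[treated_idx])
    return ratios, p_value
```

A unit whose post/pre RMSPE ratio exceeds that of the treated unit is the kind of placebo that, under the procedure described above, would prompt dropping it from the donor pool and re-matching.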

Although these permutation tests are unproblematic when a Synthetic Control Group model uses a small number of predictor variables, they become far more computationally intensive as predictors are added.
There are two conventional approaches to the choice of predictor variables in synthetic control models. The first includes proxy variables for characteristics and factors thought to causally influence the outcome of interest. This requires (1) perfect knowledge of the causal nexus for the outcome of interest, and (2) the assumption that the relationship between the predictors and the outcome of interest is uniform across states, years, and offense categories. Alternatively, Abadie et al. (2010) describe a “data-driven” approach that uses pre-intervention levels of the outcome of interest as predictors. The data-driven approach does not assume a static relationship between predictors and the outcome of interest across states, time, or offense categories. Further, it does not require perfect knowledge of the causal nexus of factors affecting the outcome of interest.
Our lab has adopted and extended the data-driven approach introduced by Abadie et al. (2010), using pre-intervention levels of the outcome of interest as well as first-difference scores, i.e., the change in the outcome from the previous year's observation.
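As a concrete sketch of this predictor construction (illustrative code; `build_predictors` is a hypothetical name), each unit contributes every pre-intervention level of the outcome plus every year-over-year first difference:

```python
import numpy as np

def build_predictors(series, t0):
    """series: (n_units, n_years) annual outcomes; t0: index of the
    intervention year. Each unit's predictors are its pre-intervention
    levels plus the first differences between consecutive years."""
    levels = series[:, :t0]            # t0 pre-intervention levels
    diffs = np.diff(levels, axis=1)    # t0 - 1 first-difference scores
    return np.hstack([levels, diffs])  # shape: (n_units, 2 * t0 - 1)
```

For example, an annual series beginning in 1970 with an intervention in 2011 gives t0 = 41, and hence 41 levels plus 40 differences, i.e., 81 predictors per unit.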

Thus, constructing a synthetic control group for an annual time series spanning 1970 to the present requires more than 80 predictor variables to avoid information loss. To then perform the in-sample permutation test described above, this match is repeated for each unit in the donor pool. The resulting procedure involves trillions of computations, and the physical memory and computing power required surpass those of nearly all commercially available personal computers.
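A back-of-the-envelope sketch of this scale, with all iteration counts assumed purely for illustration rather than measured:

```python
# Hypothetical cost sketch for a full in-sample permutation test.
units = 51              # a treated state plus a 50-state donor pool
predictors = 81         # pre-intervention levels plus first differences
outer_iters = 10_000    # assumed: search over predictor-importance weights
inner_iters = 1_000     # assumed: solver iterations per outer step
ops_per_iter = predictors * units  # one pass over the predictor matrix

# One fit per placebo assignment, each a nested optimization:
total_ops = units * outer_iters * inner_iters * ops_per_iter
```

Under these assumed counts, total_ops is on the order of 10^12, i.e., trillions of operations, consistent with the scale described above.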

We therefore perform all Synthetic Control Group permutation tests on the University of California's High Performance Computing Cluster, a Unix system with 7,332 cores and over 1 PB of physical memory.