Synthetic Control Group Models

To estimate the effect of an exogenous intervention on a treated unit, such as a new criminal justice policy implemented in a given state, a comparison (control) unit is necessary. Suppose the treated unit is the state of California and the outcome of interest is violent crime. Comparing California’s post-intervention time series to that of the population in which it is nested (e.g., the national time series) would not yield interpretable findings, because it would be unknown whether any difference in crime rates was caused by the intervention or by some other factor. To navigate this obstacle, California’s crime rates are instead compared to a weighted combination of other states, with weights chosen so that the combination matches California’s pre-intervention violent crime trends as closely as possible.
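The weighting step described above can be sketched as a small constrained least-squares problem. This is only an illustration under stated assumptions: the weights are restricted to be non-negative and to sum to one (a common convention, not something the text specifies), the fit uses only the pre-intervention outcome series, and the solver is a plain projected gradient descent chosen for brevity. The function names and the toy data are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index satisfying the condition
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def fit_weights(treated_pre, donors_pre, n_iter=2000):
    """Minimize ||treated_pre - donors_pre @ w||^2 over the simplex.

    treated_pre: shape (T,), pre-intervention outcomes of the treated unit.
    donors_pre:  shape (T, J), one column per donor unit.
    """
    y = np.asarray(treated_pre, dtype=float)
    X = np.asarray(donors_pre, dtype=float)
    n_donors = X.shape[1]
    w = np.full(n_donors, 1.0 / n_donors)     # start from equal weights
    # Step size 1/L, where L (largest squared singular value of X)
    # is the Lipschitz constant of the gradient X.T @ (X @ w - y).
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = project_to_simplex(w - step * grad)
    return w

# Hypothetical toy data: six pre-intervention periods, three donor states.
donors_pre = np.array([
    [1.0, 6.0, 1.0],
    [2.0, 5.0, 3.0],
    [3.0, 4.0, 1.0],
    [4.0, 3.0, 3.0],
    [5.0, 2.0, 1.0],
    [6.0, 1.0, 3.0],
])
# The "treated" series is built as 0.6 * donor 0 + 0.4 * donor 1.
treated_pre = 0.6 * donors_pre[:, 0] + 0.4 * donors_pre[:, 1]

weights = fit_weights(treated_pre, donors_pre)
print(np.round(weights, 3))
```

Because the toy treated series is an exact 0.6/0.4 mix of the first two donors, the fitted weights should land close to those values, with the third donor receiving roughly zero weight. Real applications typically also match on covariates and use a dedicated solver.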

Counterfactual California is constructed from a donor pool composed of states that did not implement a similar criminal justice policy before, or shortly after, California did. The donor pool must be limited to units that did not experience a similar intervention in the timeframe of interest, because comparison units are meant to approximate what would have happened in California had the intervention not occurred. This restriction is closely related to the Stable Unit Treatment Value Assumption (SUTVA), which requires that one unit’s treatment not affect the outcomes of other units.

When the donor pool is limited to states that have not experienced a similar intervention, the post-intervention difference between the treated and synthetic (or “counterfactual”) time series can be interpreted as the causal effect of the intervention on the outcome of interest in the treated state.
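As a minimal sketch of that comparison, the snippet below builds the synthetic series from donor outcomes and a set of weights, then takes the period-by-period gap. The weights are assumed to have been estimated already; the function name, the index t0 marking the first post-intervention period, and all numbers are hypothetical.

```python
import numpy as np

def post_intervention_gap(treated, donors, weights, t0):
    """Return the synthetic series, the per-period gap, and the mean post-period gap.

    treated: shape (T,), outcomes of the treated unit.
    donors:  shape (T, J), one column per donor unit.
    weights: shape (J,), donor weights (assumed already estimated).
    t0:      index of the first post-intervention period.
    """
    treated = np.asarray(treated, dtype=float)
    synthetic = np.asarray(donors, dtype=float) @ np.asarray(weights, dtype=float)
    gap = treated - synthetic                 # treated minus counterfactual
    return synthetic, gap, float(gap[t0:].mean())

# Hypothetical toy data: five periods, intervention takes effect at index 3.
donors = np.array([
    [8.0, 12.0],
    [9.0, 13.0],
    [10.0, 14.0],
    [11.0, 15.0],
    [12.0, 16.0],
])
treated = np.array([10.0, 11.0, 12.0, 10.0, 10.0])
weights = np.array([0.5, 0.5])

synthetic, gap, mean_effect = post_intervention_gap(treated, donors, weights, t0=3)
print(gap)          # per-period difference between treated and synthetic
print(mean_effect)  # average gap over the post-intervention periods
```

In this toy example the two series coincide before the intervention and diverge afterwards, so the post-period gap is the quantity the text interprets as the estimated effect.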

Valid causal inference assumes, however, that the actual and counterfactual time series have identical trends prior to the intervention and that those trends would have continued absent the intervention. In most instances, no naturally occurring unit provides an ideal control time series. When one cannot be found, a synthetic control group can be constructed to approximate it as a weighted combination of the donor pool states.
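One way to gauge how closely the pre-intervention trends match is the root mean squared prediction error (RMSPE) over the pre-intervention periods. The sketch below is a hedged illustration: the function name, the index t0, and the numbers are hypothetical, and a small pre-period error says nothing about whether the trends would have continued absent the intervention, which cannot be tested directly.

```python
import numpy as np

def pre_period_rmspe(treated, synthetic, t0):
    """Root mean squared gap between treated and synthetic series before index t0."""
    diff = np.asarray(treated, dtype=float)[:t0] - np.asarray(synthetic, dtype=float)[:t0]
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical toy series: three pre-intervention periods, then one post period.
treated = [10.0, 11.0, 12.0, 10.0]
synthetic = [10.0, 11.5, 12.0, 13.0]

rmspe = pre_period_rmspe(treated, synthetic, t0=3)
print(round(rmspe, 4))
```

A value near zero, relative to the scale of the outcome, suggests the weighted combination tracks the treated unit well before the intervention.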