Bayesian Spatial Synthetic Control

What does it take to estimate California's Prop 99 effect honestly?

Three estimators target the same number — California's average treatment effect on per-capita cigarette sales after Proposition 99 — under progressively weaker assumptions. Classical SCM forces donor weights onto the simplex; Bayesian horseshoe shrinks but does not constrain; Bayesian Spatial SAR additionally drops the SUTVA assumption that donors are unaffected by the treatment. This app lets you see the three approaches side by side and decide which differences are substantive and which are artifacts of the prior structure.

The three takeaways from the post, made interactive in the tabs below: the ATT is robust in sign across all three methods (−15 to −19 packs per person per year); the donor pool's shape is not robust (4 active donors → 23 → 27 as the prior relaxes); and SUTVA is empirically false for this case study (Nevada absorbs a −3.75 spillover, 16× the next state).

Simplex vs. horseshoe — why "four donors" is partly a constraint artifact

The classical synthetic control method concentrates 99% of the donor weight on Utah, Nevada, Montana, and Connecticut. The horseshoe prior, fit on the same data, spreads non-trivial posterior mass across 23 of 38 donors. The animation below cycles between the two pictures. The sparsity of the classical (Stage 1) solution is not entirely the data's fault; the optimiser is pushed there by the constraint.

Tab 2: Donor Weights

Toggle between the three stages' donor allocations. See which states carry mass under the simplex vs. the horseshoe vs. the SAR-corrected horseshoe.

Tab 3: Spillover & ρ

The SUTVA-failure evidence. See the Nevada spillover (−3.75 packs/capita, 16× larger than next) and dial ρ to feel what spatial autocorrelation does to the gap.

Tab 4: Cross-stage ATT

The post's headline comparison. A forest plot of the three ATT estimates with intervals. Hover any point for the standard error, interval bounds, and active-donor count.

Glossary (open a card if a term is unfamiliar)

ATT

Average treatment effect on the treated. The causal effect averaged over units that received the treatment, not the whole population. Here: California 1988–2000.
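As a minimal sketch of the definition above, the ATT for a single treated unit is the average post-treatment gap between the observed series and its synthetic counterfactual. The series below are hypothetical placeholder numbers, not the post's data, and the function name `att` is an illustrative assumption.

```python
from statistics import mean

def att(actual, synthetic, post_years):
    """Mean post-treatment gap between the treated unit and its synthetic control."""
    gaps = [actual[year] - synthetic[year] for year in post_years]
    return mean(gaps)

# Hypothetical per-capita pack sales, NOT the post's data.
actual = {1987: 100.0, 1988: 90.0, 1989: 82.0, 1990: 78.0}
synthetic = {1987: 100.5, 1988: 98.0, 1989: 96.0, 1990: 95.0}

# Only post-treatment years enter the average; 1987 is used for fitting, not for the ATT.
post_years = [1988, 1989, 1990]
effect = att(actual, synthetic, post_years)
```

A negative value means the treated unit sold fewer packs per person than its counterfactual, which is the direction all three estimators report for California.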

Donor pool

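The 38 US states left after excluding those that adopted their own large-scale tobacco control programmes or substantial cigarette tax increases during the study window, following Abadie et al. (2010). Each is a candidate component of a synthetic California.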

Simplex constraint

Donor weights are non-negative and sum to one. Interpretable, but acts as a heavy regulariser that pushes most weights to exactly zero.
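One way to see the constraint at work is a toy least-squares fit with weights forced onto the simplex. The sketch below uses projected gradient descent with a sort-based Euclidean projection; this is an illustrative stand-in, not the optimiser that tidysynth or the post uses, and the donor series, function names and step-size rule are all assumptions.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate  # last index that passes the test fixes the shift
    return [max(x - theta, 0.0) for x in v]

def fit_simplex_weights(donors, target, steps=5000):
    """Minimise squared pre-treatment error subject to the simplex constraint."""
    n, periods = len(donors), len(target)
    # Conservative step size: 1 / (2 * sum of squared donor values).
    lr = 1.0 / (2.0 * sum(x * x for series in donors for x in series))
    w = [1.0 / n] * n
    for _ in range(steps):
        resid = [sum(w[j] * donors[j][t] for j in range(n)) - target[t]
                 for t in range(periods)]
        grad = [2.0 * sum(resid[t] * donors[j][t] for t in range(periods))
                for j in range(n)]
        w = project_to_simplex([w[j] - lr * grad[j] for j in range(n)])
    return w

# Toy pre-treatment series (hypothetical): the target is 0.7 * donor 0 + 0.3 * donor 1.
donors = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], [2.0, 2.0, 2.0, 2.0]]
target = [1.9, 2.3, 2.7, 3.1]
weights = fit_simplex_weights(donors, target)
```

In this toy case the third donor's weight is driven to zero even though it is a perfectly plausible series. The `max(x - theta, 0.0)` step in the projection is where weights get clipped to exactly zero in less tidy problems.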

Horseshoe prior

Heavy-tailed prior on donor weights. Most are shrunk toward zero, but a few can escape — sparsity by preference, not by constraint.
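To make "heavy-tailed" concrete, the sketch below draws from a simplified horseshoe prior and compares its spread with a standard normal. It assumes a fixed global scale tau = 1 (a full horseshoe also places a prior on tau), and the helper names are hypothetical; the post's actual model specification may differ.

```python
import math
import random

def horseshoe_draw(rng, tau=1.0):
    """One draw w ~ Normal(0, (tau * lam)^2) with local scale lam ~ half-Cauchy(0, 1)."""
    lam = abs(math.tan(math.pi * (rng.random() - 0.5)))  # half-Cauchy via inverse CDF
    return rng.gauss(0.0, 1.0) * tau * lam

def quantile(sorted_values, q):
    """Simple empirical quantile on an already sorted list."""
    return sorted_values[int(q * (len(sorted_values) - 1))]

rng = random.Random(0)
n = 20000
horseshoe = sorted(abs(horseshoe_draw(rng)) for _ in range(n))
gaussian = sorted(abs(rng.gauss(0.0, 1.0)) for _ in range(n))

# Ratio of the 99th percentile to the median of |w|: a crude tail-heaviness measure.
hs_ratio = quantile(horseshoe, 0.99) / quantile(horseshoe, 0.50)
normal_ratio = quantile(gaussian, 0.99) / quantile(gaussian, 0.50)
```

The horseshoe ratio comes out far larger than the Gaussian one: typical draws sit near zero while occasional draws are very large, which is what lets a few donors escape the shrinkage.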

SUTVA

Stable unit treatment value assumption: a donor's outcome is unaffected by whether other units were treated. Fails if Californians drive to Nevada for cigarettes.

SAR (ρ)

Spatial autoregressive layer. Each unit's outcome depends on a weighted average of its neighbours' outcomes. Posterior mean ρ̂ = 0.223 here.
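For readers who prefer the algebra, a generic SAR layer can be written as follows. This is a textbook form under the assumption of a row-normalised contiguity matrix W; the symbol \mu_t (everything in the outcome that is not spatial feedback or noise) is introduced here for illustration, and the post's exact specification may differ.

```latex
y_t = \rho\, W y_t + \mu_t + \varepsilon_t
\quad\Longrightarrow\quad
y_t = (I - \rho W)^{-1} (\mu_t + \varepsilon_t)
    = \sum_{k=0}^{\infty} \rho^{k} W^{k} (\mu_t + \varepsilon_t),
\qquad |\rho| < 1 .
```

The series expansion shows why a shock to one state reaches its neighbours with weight \rho, their neighbours with weight \rho^2, and so on, so a modest \rho still produces non-zero effects on contiguous donors.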

Spillover effect

Treatment effect on a donor unit. Assumed zero under classical SCM; emerges as a derived quantity in the SAR layer.

ESS

Effective sample size. Counts effectively independent MCMC draws. Rule of thumb: ESS > 200 for trustworthy interval estimates. Here ESS(ρ) = 3 at tutorial scale — intervals are illustrative.
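A rough sketch of what ESS measures: divide the number of draws by an inflation factor built from the chain's autocorrelations. The estimator below truncates the sum at the first non-positive autocorrelation, a simplification of what samplers such as Stan report, and the two chains are simulated stand-ins rather than the post's MCMC output.

```python
import random

def effective_sample_size(chain):
    """ESS = N / (1 + 2 * sum of autocorrelations), truncated at the first non-positive lag."""
    n = len(chain)
    mean = sum(chain) / n
    centered = [x - mean for x in chain]
    var = sum(c * c for c in centered) / n
    if var == 0:
        raise ValueError("constant chain: autocorrelation is undefined")
    acf_sum = 0.0
    for lag in range(1, n // 2):
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho_lag = acov / var
        if rho_lag <= 0:
            break
        acf_sum += rho_lag
    return n / (1.0 + 2.0 * acf_sum)

rng = random.Random(1)
n = 2000

# Independent draws: ESS should be close to the nominal sample size.
iid_chain = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Strongly autocorrelated AR(1) chain: each draw carries little new information.
sticky_chain = [0.0]
for _ in range(n - 1):
    sticky_chain.append(0.95 * sticky_chain[-1] + rng.gauss(0.0, 1.0))

ess_iid = effective_sample_size(iid_chain)
ess_sticky = effective_sample_size(sticky_chain)
```

The sticky chain has the same number of draws but a much smaller ESS, which is the situation the ρ chain is in at tutorial scale.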

Donor weights across the three stages

Two pictures of "which states make up the synthetic California". Toggle between the classical simplex and the Bayesian horseshoe and watch the active-donor count grow from 4 to 23. The SAR-corrected Stage 3 inherits the horseshoe weights but adds a spatial diffusion layer (covered in Tab 3); its headline metric is a similarly expanded donor pool: 27 donors with non-negligible posterior mass after the SAR layer integrates over ρ. None of this changes the sign of the ATT, but it shows how much of the classical "four donors" story is the simplex talking, not the data.

Stage

Stage 1: Classical simplex (top-8 by weight)
Stage 2: Bayesian horseshoe (top-12 with 95% CrI)

Stage 1: 99.8% of the mass on Utah, Nevada, Montana, Connecticut. The remaining 34 donors are essentially zero.

Stage 3 (Bayesian Spatial SAR) reuses the horseshoe donor weights from Stage 2; see Tab 3 · Spillover & ρ for the SAR diffusion view.

Active donors (α > 0.01)

out of 38

Top weight

0.327 (Utah)

Top-4 share

97.5%

how concentrated is the recipe?

ATT under this stage

−18.46

packs/capita/year

What to look for

The simplex picture is sparse. Four bars dominate; everything else is essentially zero. The optimiser has no choice — non-negative weights summing to one are pushed to a corner of the simplex.
The horseshoe picture is broad. Twenty-three donors carry mean posterior weight above 0.01. But only Nevada's 95% credible interval excludes zero; every other top donor is statistically consistent with no contribution.
The ATT barely moves. −18.46 → −15.84. Whether your synthetic uses 4 donors or 23, the gap between California and its counterfactual stays large and negative.
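The concentration metrics shown for each stage (active donors with α > 0.01, top weight, top-4 share) are simple functions of a weight vector. A minimal sketch, using made-up weights for anonymous donors rather than the post's estimates, and assuming non-negative weights or posterior means:

```python
def concentration_summary(weights, threshold=0.01, top_k=4):
    """Summarise how concentrated a donor-weight vector is."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    total = sum(weights.values())
    return {
        "active_donors": sum(1 for w in weights.values() if w > threshold),
        "top_state": ranked[0][0],
        "top_weight": ranked[0][1],
        "top_share": sum(w for _, w in ranked[:top_k]) / total,
    }

# Hypothetical weights for six anonymous donors, NOT the post's estimates.
toy_weights = {"A": 0.50, "B": 0.30, "C": 0.15, "D": 0.04, "E": 0.005, "F": 0.005}
summary = concentration_summary(toy_weights)
```

Running the same helper on a simplex solution and on horseshoe posterior means gives a like-for-like view of how much the "recipe" spreads out once the constraint is relaxed.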

Spillover effects on donor states — SUTVA fails for tobacco

The most novel result of the SAR layer is the per-donor spillover. The Sakaguchi–Tagawa pipeline computes each donor's average post-treatment effect by forward-simulating the SAR data-generating process with and without California's treatment, integrating over the posterior draws of ρ. The top-8 ranking is below.
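The forward-simulation idea can be sketched on a toy network. The code below is not the Sakaguchi–Tagawa pipeline: it assumes a linear SAR model with a row-normalised contiguity matrix, a hypothetical direct effect of −20 on California, a made-up four-state adjacency, and a handful of stand-in ρ draws. Because the model is linear, baseline levels and noise cancel in the with-versus-without difference, so only the treatment shock needs to be propagated.

```python
def sar_response(W, shock, rho, iters=200):
    """Solve e = rho * W e + shock by fixed-point iteration (converges for |rho| < 1)."""
    n = len(shock)
    e = list(shock)
    for _ in range(iters):
        e = [shock[i] + rho * sum(W[i][j] * e[j] for j in range(n)) for i in range(n)]
    return e

def spillovers(W, names, treated, direct_effect, rho_draws):
    """Average treated-minus-untreated difference for each donor over the rho draws."""
    idx = names.index(treated)
    shock = [direct_effect if i == idx else 0.0 for i in range(len(names))]
    no_shock = [0.0] * len(names)
    totals = [0.0] * len(names)
    for rho in rho_draws:
        with_treatment = sar_response(W, shock, rho)
        without_treatment = sar_response(W, no_shock, rho)
        for i in range(len(names)):
            totals[i] += (with_treatment[i] - without_treatment[i]) / len(rho_draws)
    return {name: totals[i] for i, name in enumerate(names) if i != idx}

# Toy contiguity (hypothetical): CA-NV and NV-UT share borders, CT has no link.
names = ["CA", "NV", "UT", "CT"]
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
rho_draws = [0.17, 0.20, 0.223, 0.25, 0.27]  # stand-ins for posterior draws
spill = spillovers(W, names, "CA", -20.0, rho_draws)
```

In this toy setup the contiguous neighbour absorbs the largest spillover, the second-order neighbour a much smaller one, and the unconnected state none, which mirrors the qualitative pattern in the ranking.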

Spillover ranking (top-8 donor states by absolute effect)

Orange bars denote negative spillovers (per-capita cigarette sales pushed down). Nevada receives −3.75 packs/capita/year — 16× larger than the next state, more than 2,900× larger than the smallest non-zero spillover.

Posterior mean ρ

0.223

spatial autocorrelation

95% credible interval

[0.168, 0.272]

bounded away from zero

ESS(ρ)

3

tutorial budget; paper uses 100k iter

Nevada spillover

−3.75

16× the next state

What if ρ were larger? Smaller? — Stage 3 gap explorer

Slide ρ from 0 (no spatial dependence, recovering Stage 2) to 0.6 (strong neighbour co-movement) and see how the SAR-corrected gap on California shifts. The default ρ = 0.223 corresponds to the post's posterior mean. The grey band is the Stage 2 (no SAR) credible band for reference.

Spatial parameter ρ: 0.22

ρ = 0 ⇒ donor outcomes unaffected by neighbours (Stage-2 world). ρ ≈ 0.22 ⇒ posterior mean. Larger ⇒ more neighbour co-movement.

Effective Nevada exposure: −3.75

Set Nevada's post-treatment exposure manually. The dashed orange band marks the Stage-3 gap that follows from this exposure.

The animation re-attributes a fraction of the synthetic counterfactual to neighbour-driven dynamics controlled by ρ. At ρ = 0 the curve matches Stage 2; at ρ = 0.6 the synthetic is dragged sharply by the surrounding donor signal. The closer ρ gets to 1, the less of the observed gap is "California's own response" and the more is "neighbours moving together".

What to look for

Nevada is the dominant spillover-receiver by an order of magnitude. This is the empirical signature of SUTVA failure: the only donor with a heavy contiguity link to California absorbs almost all of the spillover mass.
ρ ≈ 0.22 is modest but bounded away from zero. 95% CrI [0.168, 0.272]. A 1-unit change in the neighbour-averaged outcome moves a state's own outcome by 0.22 units. The simplest SUTVA story — "donors are unaffected" — is rejected.
The Stage 3 ATT (−16.59) is between Stages 1 and 2. Adding the SAR layer attributes a small portion of the gap to spillover diffusion, leaving slightly less for California's direct response.

The cross-stage ATT comparison — robust in sign, not in interval

These three estimates come straight from r_sc_bayes_spatial_att_comparison.csv in the post's folder — the headline cross-stage table (§8) made interactive. Toggle methods to compare. Hover any point for SE, interval bounds, and the number of active donors each estimator used.

Methods to display

Classical SCM
Bayesian HS
Bayesian Spatial SAR

What to look for

All three intervals lie below zero. Whatever you believe about the simplex or about SUTVA, Prop 99 reduced California consumption. The sign is robust to the prior structure.
The interval widths tell a story. The classical interval reaches furthest toward large negative effects, the Bayesian HS interval is the widest overall (the horseshoe propagates donor-weight uncertainty), and the Bayesian Spatial SAR interval looks suspiciously narrow.
Read the SAR interval cautiously. ESS(ρ) = 3 at tutorial scale. The interval [−16.78, −16.39] is based on too few effectively independent draws. The published paper achieves usable ESS by running 100,000 iterations.

Stage 1 trajectory — the classical Abadie 2010 baseline

For completeness, the underlying classical trajectory: California's actual per-capita cigarette sales (steel blue) vs. the tidysynth-built synthetic California (dashed teal). The two are visually indistinguishable until 1988, then California falls sharply below synthetic.

Connecting the three takeaways

Sign robustness (this tab). Three intervals, three priors, none crosses zero.
Donor shape non-robustness (Tab 2). 4 → 23 → 27 active donors as the prior relaxes.
SUTVA failure (Tab 3). ρ = 0.223 with Nevada absorbing 16× the next state's spillover.

The pedagogical arc of the post is that "which states make up the synthetic California" is a far less robust finding than "how large the gap is". The reader who lands on this post asking "did Prop 99 work?" gets a robust yes. The reader who asks "what does that synthetic look like?" gets a much more nuanced answer that depends on the prior.