What does it take to estimate California's Prop 99 effect honestly?
Three estimators target the same number — California's average treatment effect on per-capita cigarette sales after Proposition 99 — under progressively weaker assumptions. Classical SCM forces donor weights onto the simplex; Bayesian horseshoe shrinks but does not constrain; Bayesian Spatial SAR additionally drops the SUTVA assumption that donors are unaffected by the treatment. This app lets you see the three approaches side by side and decide which differences are substantive and which are artifacts of the prior structure.
The three takeaways from the post, made interactive in the tabs below: the ATT is robust in sign across all three methods (−15 to −19 packs per person per year); the donor pool's shape is not robust (4 active donors → 23 → 27 as the prior relaxes); and SUTVA is empirically false for this case study (Nevada absorbs a −3.75 spillover, 16× the next state).
Simplex vs. horseshoe — why "four donors" is partly a constraint artifact
The classical synthetic control method concentrates 99.8% of the donor weight on Utah, Nevada, Montana, and Connecticut. The horseshoe prior, fit on the same data, spreads non-trivial posterior mass across 23 of 38 donors. The animation below cycles between the two pictures. The sparsity of Stage 1 is not entirely the data's fault: the optimizer is pushed there by the constraint.
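To see why a simplex constraint produces exact zeros, here is a minimal sketch of the Euclidean projection onto the simplex, a building block of many constrained solvers. It is not the optimizer that the post or tidysynth uses, and the input coefficients are hypothetical, chosen only to illustrate the mechanism.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            # The last k that passes this test defines the shift theta.
            theta = candidate
    # Shift every coordinate by theta and clip at zero.
    return [max(x - theta, 0.0) for x in v]


# Hypothetical unconstrained donor coefficients (NOT the post's estimates).
unconstrained = [0.45, 0.30, 0.20, 0.15, 0.05, 0.02, -0.03, -0.08]
weights = project_to_simplex(unconstrained)
active = sum(w > 0 for w in weights)
print([round(w, 3) for w in weights], "active donors:", active)
```

Small or negative coefficients are clipped to exactly zero while the survivors are rescaled to sum to one. A shrinkage prior such as the horseshoe pulls coefficients toward zero without this hard clipping, which is one reason it leaves more donors with non-zero mass.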
Donor Weights
Toggle between the three stages' donor allocations. See which states carry mass under the simplex vs. the horseshoe vs. the SAR-corrected horseshoe.
Spillover & ρ
The SUTVA-failure evidence. See the Nevada spillover (−3.75 packs/capita, 16× larger than the next state's) and dial ρ to feel what spatial autocorrelation does to the gap.
Cross-stage ATT
The post's headline comparison. A forest plot of the three ATT estimates with intervals. Hover any point for the standard error, interval bounds, and active-donor count.
Glossary (open a card if a term is unfamiliar)
ATT
Donor pool
Simplex constraint
Horseshoe prior
SUTVA
SAR (ρ)
Spillover effect
ESS
Donor weights across the three stages
This tab shows two pictures of "which states make up the synthetic California". Toggle between the classical simplex and the Bayesian horseshoe and watch the active-donor count grow from 4 to 23. The SAR-corrected Stage 3 inherits the horseshoe weights but adds a spatial diffusion layer (covered in Tab 3); its headline count is larger still, with 27 donors carrying non-negligible posterior mass after the SAR layer integrates over ρ. None of this changes the sign of the ATT, but it shows how much of the classical "four donors" story is the simplex talking, not the data.
Stage
Stage 1: 99.8% of the mass on Utah, Nevada, Montana, Connecticut. The remaining 34 donors are essentially zero.
Stage 3 (Bayesian Spatial SAR) reuses the horseshoe donor weights from Stage 2; see Tab 3 · Spillover & ρ for the SAR diffusion view.
What to look for
- The simplex picture is sparse. Four bars dominate; everything else is essentially zero. The optimiser has no choice — non-negative weights summing to one are pushed to a corner of the simplex.
- The horseshoe picture is broad. Twenty-three donors carry mean posterior weight above 0.01. But only Nevada's 95% credible interval excludes zero; every other top donor is statistically consistent with no contribution.
- The ATT barely moves. −18.46 → −15.84. Whether your synthetic uses 4 donors or 23, the gap between California and its counterfactual stays large and negative.
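The two criteria in the list above (posterior mean weight above 0.01, and a 95% credible interval that excludes zero) can be sketched as follows. The donor names and posterior draws are simulated placeholders, not the post's posterior, and the percentile rule is a simple assumption rather than the post's exact interval method.

```python
import random
import statistics


def summarize(draws, threshold=0.01):
    """Posterior mean, a simple 95% percentile interval, and two flags."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[int(0.025 * (n - 1))]
    upper = ordered[int(0.975 * (n - 1))]
    mean = statistics.fmean(draws)
    return {
        "mean": mean,
        "interval": (lower, upper),
        "active": mean > threshold,                # "carries weight"
        "excludes_zero": lower > 0 or upper < 0,   # "distinguishable from zero"
    }


rng = random.Random(0)
# Hypothetical posterior draws for three placeholder donors.
posterior = {
    "donor_a": [rng.gauss(0.30, 0.05) for _ in range(2000)],    # large and precise
    "donor_b": [rng.gauss(0.05, 0.10) for _ in range(2000)],    # active but uncertain
    "donor_c": [rng.gauss(0.00, 0.002) for _ in range(2000)],   # shrunk to about zero
}
summary = {name: summarize(draws) for name, draws in posterior.items()}
active_count = sum(s["active"] for s in summary.values())
print(active_count, {name: s["excludes_zero"] for name, s in summary.items()})
```

The sketch separates the two questions the horseshoe picture raises: a donor can count as active by its mean weight while its interval still covers zero.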
Spillover effects on donor states — SUTVA fails for tobacco
The most novel result of the SAR layer is the per-donor spillover. The Sakaguchi–Tagawa pipeline computes each donor's average post-treatment effect by forward-simulating the SAR data-generating process with and without California's treatment, integrating over the posterior draws of ρ. The top-8 ranking is below.
Spillover ranking (top-8 donor states by absolute effect)
Orange bars denote negative spillovers (per-capita cigarette sales pushed down). Nevada receives −3.75 packs/capita/year — 16× larger than the next state, more than 2,900× larger than the smallest non-zero spillover.
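The "with and without treatment" idea behind the spillover ranking can be illustrated with a toy forward simulation. This is a sketch under stated assumptions, not the Sakaguchi–Tagawa implementation: the four-state weights matrix, the baseline levels, and the −10 shock are all hypothetical, and ρ is fixed at the post's posterior mean instead of being integrated over posterior draws.

```python
STATES = ["California", "Nevada", "Utah", "Connecticut"]

# Hypothetical row-normalised contiguity weights for a four-state toy map:
# California-Nevada and Nevada-Utah share borders; Connecticut has no
# neighbour in this set, so its row is all zeros.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]


def solve_sar(rho, x, iterations=200):
    """Solve y = rho * W y + x by fixed-point iteration (|rho| < 1 assumed)."""
    y = list(x)
    for _ in range(iterations):
        y = [
            rho * sum(w_ij * y_j for w_ij, y_j in zip(row, y)) + x_i
            for row, x_i in zip(W, x)
        ]
    return y


rho = 0.223                            # posterior mean reported in the post
baseline = [90.0, 120.0, 60.0, 100.0]  # toy non-spatial levels, illustrative only
shock = -10.0                          # toy direct effect on California
treated = [baseline[0] + shock] + baseline[1:]

y_without = solve_sar(rho, baseline)
y_with = solve_sar(rho, treated)
spillover = {s: a - b for s, a, b in zip(STATES, y_with, y_without)}
print({s: round(v, 3) for s, v in spillover.items()})
```

In this toy map the direct neighbour absorbs most of the spillover, the second-order neighbour much less, and the unconnected state none. A fuller version would repeat the calculation for each posterior draw of ρ and average the results.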
What if ρ were larger? Smaller? — Stage 3 gap explorer
Slide ρ from 0 (no spatial dependence, recovering Stage 2) to 0.6 (strong neighbour co-movement) and see how the SAR-corrected gap on California shifts. The default ρ = 0.223 corresponds to the post's posterior mean. The grey band is the Stage 2 (no SAR) credible band for reference.
The animation re-attributes a fraction of the synthetic counterfactual to neighbour-driven dynamics controlled by ρ. At ρ = 0 the curve matches Stage 2; at ρ = 0.6 the synthetic is dragged sharply by the surrounding donor signal. The closer ρ gets to 1, the less of the observed gap is "California's own response" and the more is "neighbours moving together".
What to look for
- Nevada is the dominant spillover-receiver by an order of magnitude. This is the empirical signature of SUTVA failure: the only donor with a heavy contiguity link to California absorbs almost all of the spillover mass.
- ρ ≈ 0.22 is modest but bounded away from zero. 95% CrI [0.168, 0.272]. A 1-unit change in the neighbour-averaged outcome moves a state's own outcome by 0.22 units. The simplest SUTVA story — "donors are unaffected" — is rejected.
- The Stage 3 ATT (−16.59) lies between the Stage 1 (−18.46) and Stage 2 (−15.84) estimates. Adding the SAR layer attributes a small portion of the gap to spillover diffusion, leaving slightly less for California's direct response.
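The reading of ρ in the list above can be written out with a textbook spatial autoregressive specification. This is a sketch: the post's exact model may differ, and the symbols W (a row-standardised spatial weights matrix), μ_t (the non-spatial mean), and ε_t (noise) are assumptions introduced here.

```latex
\begin{aligned}
y_t &= \rho W y_t + \mu_t + \varepsilon_t \\
y_t &= (I - \rho W)^{-1} (\mu_t + \varepsilon_t),
\qquad (I - \rho W)^{-1} = I + \rho W + \rho^2 W^2 + \cdots \\
\frac{1}{1 - \rho} &= \frac{1}{1 - 0.223} \approx 1.287
\end{aligned}
```

The first line is the "0.22 units per unit of neighbour-averaged outcome" statement. The second shows how a shock reaches neighbours, then neighbours of neighbours, with geometrically decaying strength. The third is the total multiplier on a shock common to all states when every row of W sums to one. At ρ = 0 the expansion collapses to the identity, which is the no-spillover case.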
The cross-stage ATT comparison — robust in sign, not in interval
These three estimates come straight from r_sc_bayes_spatial_att_comparison.csv in the post's folder, the headline cross-stage table (§8) made interactive. Toggle methods to compare. Hover any point for SE, interval bounds, and the number of active donors each estimator used.
Methods to display
What to look for
- All three intervals lie below zero. Whatever you believe about the simplex or about SUTVA, Prop 99 reduced California consumption. The sign is robust to the prior structure.
- The order of intervals tells a story. The classical interval is widest at the lower end, the Bayesian HS interval is widest overall (the horseshoe propagates donor-weight uncertainty), and the Bayesian Spatial SAR interval looks suspiciously narrow.
- Read the SAR interval cautiously. ESS(ρ) = 3 at tutorial scale. The interval [−16.78, −16.39] is based on too few effectively independent draws. The published paper achieves usable ESS by running 100,000 iterations.
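To make the ESS warning concrete, here is a minimal sketch of an effective-sample-size estimate based on summed autocorrelations. It is a simplified estimator, not the rank-normalised version that MCMC packages typically report, and both chains are simulated stand-ins, not draws of ρ from the post.

```python
import random


def effective_sample_size(draws, max_lag=500):
    """ESS = n / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(draws)
    mean = sum(draws) / n
    centered = [d - mean for d in draws]
    variance = sum(c * c for c in centered) / n
    total = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        r = cov / variance
        if r <= 0:  # truncate at the first non-positive autocorrelation
            break
        total += r
    return n / (1.0 + 2.0 * total)


rng = random.Random(42)
n_draws = 2000

# Well-mixed chain: independent draws.
independent = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]

# Poorly mixed chain: each draw is almost a copy of the previous one.
sticky = [0.0]
for _ in range(n_draws - 1):
    sticky.append(0.99 * sticky[-1] + rng.gauss(0.0, 1.0))

ess_independent = effective_sample_size(independent)
ess_sticky = effective_sample_size(sticky)
print(round(ess_independent), round(ess_sticky))
```

Both chains contain 2,000 draws, but the sticky one carries the information of far fewer independent draws. An interval built from a chain like that can look narrow simply because the sampler has not moved far.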
Stage 1 trajectory — the classical Abadie 2010 baseline
For completeness, the underlying classical trajectory: California's actual per-capita cigarette sales (steel blue) vs. the tidysynth-built synthetic California (dashed teal). The two are visually indistinguishable until 1988, then California falls sharply below synthetic.
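The ATT reported for each stage is the average post-treatment gap between the actual and synthetic series. A minimal sketch with invented numbers (not the Prop 99 data) follows; treating 1989 as the first post-treatment year is an assumption made for the example.

```python
def att(actual, synthetic, first_post_year):
    """Average of (actual - synthetic) over the post-treatment years."""
    gaps = [
        actual[year] - synthetic[year]
        for year in sorted(actual)
        if year >= first_post_year
    ]
    return sum(gaps) / len(gaps)


# Toy per-capita sales series, illustrative only.
actual = {1987: 100.0, 1988: 98.0, 1989: 90.0, 1990: 85.0, 1991: 80.0}
synthetic = {1987: 100.0, 1988: 98.0, 1989: 96.0, 1990: 95.0, 1991: 94.0}

estimate = att(actual, synthetic, first_post_year=1989)
print(estimate)
```

The pre-treatment years contribute nothing to the estimate; they matter only for judging how well the synthetic series tracks the actual one before the policy.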
Connecting the three takeaways
- Sign robustness (this tab). Three intervals, three priors, none crosses zero.
- Donor shape non-robustness (Tab 2). 4 → 23 → 27 active donors as the prior relaxes.
- SUTVA failure (Tab 3). ρ = 0.223 with Nevada absorbing 16× the next state's spillover.
The pedagogical arc of the post is that any claim about which states make up the synthetic California is much weaker than the claim about the size of the gap. The reader who lands on this post asking "did Prop 99 work?" gets a robust yes. The reader who asks "what does that synthetic look like?" gets a much more nuanced answer that depends on the prior.