One method, three doors: augsynth
The Augmented Synthetic Control Method builds a credible counterfactual for a treated
unit from a weighted "recipe" of untreated donors — then adds an outcome model that
removes leftover bias when the pre-treatment fit is imperfect. The
augsynth package exposes three entry points: single_augsynth
(one treated unit), multisynth (many units, staggered adoption), and
augsynth_multiout (one unit, many outcomes).
This lab reproduces the post's findings interactively, all client-side from a
precomputed results.json. We first check that the method recovers a
known effect on simulated data, show exactly where plain SCM fails and
augmentation rescues it, then replicate Papaioannou (2021) on the euro area.
Does the method recover a known effect?
On simulated data, the true treatment effect on unit C01 is a jump plus a gentle ramp that we injected ourselves. The estimated treated-minus-synthetic gap (blue, with its conformal band) should track the true effect (white dashed) and sit near zero before treatment.
If an estimator cannot reproduce a known truth on simulated data, do not trust it on real data.
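The recovery check can be sketched in a few lines of Python. This is a minimal illustration rather than the lab's own code: the treatment date, jump size, ramp slope, tolerance, and the stand-in "estimate" are all invented for the example.

```python
# Hypothetical values: the lab's actual jump, ramp and timing are not stated here.
T0 = 10        # first treated period (assumed)
JUMP = 2.0     # immediate shift at adoption (assumed)
SLOPE = 0.25   # per-period ramp after adoption (assumed)


def true_effect(t):
    """Zero before treatment; a jump plus a linear ramp from T0 onward."""
    return 0.0 if t < T0 else JUMP + SLOPE * (t - T0)


def recovery_report(gap, tol=0.5):
    """Compare an estimated treated-minus-synthetic gap with the known truth."""
    pre = [abs(g) for t, g in enumerate(gap) if t < T0]
    post = [abs(g - true_effect(t)) for t, g in enumerate(gap) if t >= T0]
    return {
        "max_pre_gap": max(pre),          # should be near zero
        "max_post_error": max(post),      # distance from the injected effect
        "recovers_truth": max(pre) <= tol and max(post) <= tol,
    }


# A stand-in estimate: the truth plus a small deterministic wobble.
estimated_gap = [true_effect(t) + 0.1 * (-1) ** t for t in range(20)]
print(recovery_report(estimated_gap))
```

An estimator that returned a flat zero gap would fail the same check, because its post-treatment error would equal the injected effect itself.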
Single & Suitability
Switch between a well-fit unit and one outside the donor hull. Watch plain SCM break and Ridge-ASCM rescue it.
Many Units
The pooled effect path from multisynth, on simulated data (vs truth) and on the 12 euro members.
Replication
Our per-country ASCM estimates vs the paper's reported TFP contributions, country by country.
Glossary (open a card if a term is unfamiliar)
Synthetic control
Augmentation (bias correction)
Pre-fit imbalance (scaled L2)
ATT
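Two of the glossary quantities are simple enough to write down directly. The sketch below assumes that "scaled" means the L2 imbalance divided by the imbalance obtained under uniform donor weights, and that the ATT is the mean post-period gap; the function names and toy numbers are ours, not the package's.

```python
import math


def l2_imbalance(treated_pre, donors_pre, weights):
    """L2 distance between the treated pre-period path and the weighted donor path."""
    synth = [sum(w * d[t] for w, d in zip(weights, donors_pre))
             for t in range(len(treated_pre))]
    return math.sqrt(sum((y - s) ** 2 for y, s in zip(treated_pre, synth)))


def scaled_l2_imbalance(treated_pre, donors_pre, weights):
    """Imbalance relative to uniform weights (assumed definition).

    0 means a perfect pre-period fit; 1 means no better than a plain donor average.
    Undefined if the plain donor average already fits perfectly.
    """
    n = len(donors_pre)
    uniform = [1.0 / n] * n
    return (l2_imbalance(treated_pre, donors_pre, weights)
            / l2_imbalance(treated_pre, donors_pre, uniform))


def att(treated_post, synthetic_post):
    """Average treatment effect on the treated: the mean post-period gap."""
    gaps = [y - s for y, s in zip(treated_post, synthetic_post)]
    return sum(gaps) / len(gaps)


# Toy numbers, invented for illustration.
donors_pre = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [10.0, 10.0, 10.0]]
treated_pre = [2.0, 3.0, 4.0]
weights = [0.5, 0.5, 0.0]   # a convex combination that matches the treated path exactly

print(scaled_l2_imbalance(treated_pre, donors_pre, weights))
print(att([5.0, 6.0], [4.0, 4.5]))
```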
single_augsynth and the suitability test
One treated unit, fit against the donor pool. Unit C01 sits inside the donor hull, so plain SCM fits well and augmentation barely matters. Unit C05 was placed outside the hull on purpose — plain SCM cannot match its pre-period, and it takes the Ridge outcome model to recover the truth.
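A one-dimensional toy makes the hull argument concrete. The sketch below is not the augsynth implementation: it assumes a single pre-treatment covariate, finds convex donor weights by hand, and then applies the ridge correction in its textbook form (SCM prediction plus the ridge slope times the leftover imbalance). All numbers and the penalty `lam` are made up.

```python
def scm_weights_1d(x_treated, x_donors):
    """Convex weights whose weighted donor mean is as close as possible to x_treated.

    In one dimension the best convex combination reaches clamp(x_treated, min, max),
    so all weight lands on an extreme donor when the treated unit is outside the hull.
    """
    lo = min(range(len(x_donors)), key=lambda i: x_donors[i])
    hi = max(range(len(x_donors)), key=lambda i: x_donors[i])
    w = [0.0] * len(x_donors)
    if x_treated <= x_donors[lo]:
        w[lo] = 1.0
    elif x_treated >= x_donors[hi]:
        w[hi] = 1.0
    else:
        # Inside the hull: mix the two extremes to match exactly (one of many solutions).
        share = (x_treated - x_donors[lo]) / (x_donors[hi] - x_donors[lo])
        w[lo], w[hi] = 1.0 - share, share
    return w


def ridge_slope(x, y, lam):
    """Slope of a ridge regression of y on centered x with penalty lam."""
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    return sxy / (sxx + lam)


def counterfactuals(x_treated, x_donors, y_donors, lam=0.1):
    """Return (plain SCM, ridge-augmented SCM) predictions of the untreated outcome."""
    w = scm_weights_1d(x_treated, x_donors)
    scm = sum(wi * yi for wi, yi in zip(w, y_donors))
    leftover = x_treated - sum(wi * xi for wi, xi in zip(w, x_donors))
    return scm, scm + ridge_slope(x_donors, y_donors, lam) * leftover


x_donors = [1.0, 2.0, 3.0, 4.0, 5.0]
y_donors = [2.0 * x + 1.0 for x in x_donors]  # untreated outcome is linear in the pre level

inside = counterfactuals(3.5, x_donors, y_donors)    # true untreated outcome: 8.0
outside = counterfactuals(10.0, x_donors, y_donors)  # true untreated outcome: 21.0
print(inside, outside)
```

Inside the hull the leftover imbalance is zero, so the two predictions coincide. Outside the hull plain SCM is stuck at the most extreme donor, while the ridge term extrapolates along the outcome model, which is also why augmented weights can become negative.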
Actual vs synthetic control
Recovery error across all five treated units
Absolute distance between the estimate and the known truth, for plain SCM (orange) and Ridge-ASCM (teal). Augmentation helps most where the pre-treatment fit is worst.
multisynth: many treated units at once
multisynth fits one synthetic control per treated unit and partially pools
them, returning a pooled average effect and per-unit effects. Toggle between the
simulated panel (where we can check against the known truth) and the
real euro-area panel of 12 members.
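With staggered adoption, a pooled effect path only makes sense once each unit's gap is re-indexed by time since its own adoption. The sketch below shows just that alignment step with a plain unweighted average; it does not reproduce how multisynth chooses or partially pools its weights, and the unit names and gaps are invented.

```python
def pooled_effect_path(gaps, adoption):
    """Average per-unit gaps by event time k = t - adoption[unit]."""
    by_k = {}
    for unit, series in gaps.items():
        for t, g in enumerate(series):
            by_k.setdefault(t - adoption[unit], []).append(g)
    return {k: sum(v) / len(v) for k, v in sorted(by_k.items())}


# Toy treated-minus-synthetic gaps for two units with different adoption dates.
gaps = {
    "A": [0.0, 0.1, 1.0, 1.2, 1.4],    # adopts at t = 2
    "B": [0.0, -0.1, 0.0, 0.8, 1.0],   # adopts at t = 3
}
adoption = {"A": 2, "B": 3}

path = pooled_effect_path(gaps, adoption)
for k, effect in path.items():
    print(f"event time {k:+d}: pooled gap {effect:.2f}")
```

Event times covered by only one unit (here the earliest lead and the latest lag) are averages of a single gap, so the ends of a pooled path rest on less data than the middle.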
Pooled effect path
Per-unit and pooled effects with jackknife CIs
Point = estimate, bar = 95% jackknife confidence interval, orange tick = known truth. Teal = interval excludes zero (significant); grey = interval includes zero.
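For intuition, here is a leave-one-out jackknife interval for a pooled average of per-unit effects. It is a simplified sketch: a jackknife over a fitted model would refit the synthetic controls after dropping each unit, whereas this version only re-averages fixed per-unit effects. The effect values and the normal critical value 1.96 are assumptions of the example.

```python
import math


def jackknife_ci(effects, z=1.96):
    """Leave-one-out jackknife CI for the mean of per-unit effects."""
    n = len(effects)
    total = sum(effects)
    estimate = total / n
    # Mean of the remaining units after dropping each one in turn.
    loo = [(total - e) / (n - 1) for e in effects]
    loo_mean = sum(loo) / n
    se = math.sqrt((n - 1) / n * sum((m - loo_mean) ** 2 for m in loo))
    lower, upper = estimate - z * se, estimate + z * se
    excludes_zero = not (lower <= 0.0 <= upper)
    return estimate, lower, upper, excludes_zero


# Invented per-unit effects.
effects = [1.0, 1.5, 2.0, 2.5, 3.0]
estimate, lower, upper, excludes_zero = jackknife_ci(effects)
print(f"{estimate:.2f} [{lower:.2f}, {upper:.2f}] excludes zero: {excludes_zero}")
```

For a simple mean, this standard error coincides with the usual sample standard deviation divided by the square root of the number of units.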
Per-unit recovery: jackknife vs wild bootstrap
The jackknife interval (used above) excludes zero for every unit; the more conservative wild bootstrap, which also carries the counterfactual-estimation uncertainty, does not. Same estimates, different verdict — the inference method matters.
Replicating Papaioannou (2021)
Did the euro raise total factor productivity? We fit a synthetic control for each of the 12 founding members against 24 non-euro donors and compare our ASCM percentage effect (2000–2007) to the paper's reported contribution, country by country. Points on the 45-degree line agree.
ASCM vs the paper: TFP % contribution, 2000–2007
Hover a point for the exact pair. France and the Netherlands land almost on the line; Germany and Ireland diverge in magnitude but not in sign.
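Reading the scatter amounts to two checks per point: how far it sits from the 45-degree line, and whether the two estimates share a sign. A minimal sketch follows, with placeholder labels and invented numbers rather than any of the lab's or the paper's values; the tolerance `tol` is also an arbitrary choice.

```python
def compare_to_paper(pairs, tol=0.5):
    """pairs maps a label to (our_estimate_pct, paper_pct)."""
    rows = {}
    for label, (ours, paper) in pairs.items():
        rows[label] = {
            "difference": ours - paper,              # zero means on the 45-degree line
            "same_sign": (ours > 0) == (paper > 0),  # agreement in direction
            "on_line": abs(ours - paper) <= tol,     # agreement in magnitude, within tol
        }
    return rows


# Placeholder pairs, invented for illustration only.
pairs = {"unit_1": (2.1, 2.0), "unit_2": (6.0, 2.5), "unit_3": (-1.0, -1.2)}

for label, row in compare_to_paper(pairs).items():
    print(label, row)
```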
Country-by-country comparison
Greece and Portugal turn negative in 2008–17 in both our estimates and the paper's: the post-crisis reversal.
Inference: is the effect real, or could it be noise?
A point estimate is only half the story. augsynth ships a small toolbox of
inference methods, and they do not all agree. This tab shows which headline results are
statistically significant, explains the three tools the tutorial uses, and lets you feel
what drives significance with a hands-on simulator.
Significance scoreboard (from the tutorial's results)
Each headline estimate with its confidence interval or p-value, and whether it is distinguishable from zero at the 5% level.
On simulated data, where we injected a real effect, the headline estimates are significant, with one exception: C05, which sits outside the donor hull. The real euro-area pooled effect is not significant either.
Three tools, matched to three estimators
jackknife+ — single_augsynth
conformal — single_augsynth & augsynth_multiout
jackknife — multisynth (primary)
wild bootstrap — multisynth (conservative)
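To give a feel for the conformal entry in this list, the sketch below computes a permutation p-value in the spirit of conformal inference for synthetic controls. It is heavily simplified and is an assumption-laden illustration, not the package's procedure: it takes the gap series as given instead of refitting the model under each null, and it permutes residuals with cyclic shifts. The gap values are invented.

```python
def conformal_p_value(gap_pre, gap_post, tau0=0.0):
    """Permutation p-value for H0: the effect equals tau0 in every post period."""
    # Under H0, subtracting tau0 makes post-period residuals look like pre-period ones.
    resid = list(gap_pre) + [g - tau0 for g in gap_post]
    n, n_post = len(resid), len(gap_post)

    def stat(r):
        # Mean absolute residual over the slots that play the role of "post".
        return sum(abs(u) for u in r[-n_post:]) / n_post

    observed = stat(resid)
    # All cyclic shifts of the residual series, including the identity.
    shifted = [resid[k:] + resid[:k] for k in range(n)]
    return sum(stat(r) >= observed for r in shifted) / n


# Toy gap: small alternating pre-period residuals, one large post-period gap.
gap_pre = [0.1 * (-1) ** t for t in range(24)]
gap_post = [1.0]

p_zero = conformal_p_value(gap_pre, gap_post, tau0=0.0)
p_one = conformal_p_value(gap_pre, gap_post, tau0=1.0)
print(p_zero, p_one)
```

A confidence band is then the set of `tau0` values that are not rejected. Note that the smallest attainable p-value is one over the number of periods, so a short pre-period caps how significant any effect can look.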
Simulator: what makes an effect significant?
An illustrative treated-minus-control gap: flat before adoption, shifted by the true effect after.
Move the sliders and watch the confidence interval widen or narrow and the verdict flip at the 5% line.
(A teaching model — the real augsynth uses jackknife / conformal / bootstrap, not this normal approximation.)
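A normal-approximation teaching model of this kind can be written in a few lines. The formula below is our assumption about what such a simulator might compute, not a description of the page's actual code: the estimate is treated as a difference between the post-period and pre-period mean gaps, each period carrying independent noise with the same standard deviation. It does not model pre-fit quality.

```python
import math


def simulate_verdict(effect, noise_sd, n_pre, n_post, z=1.96):
    """Normal-approximation CI for a post-minus-pre difference in mean gaps."""
    # Variance of a difference of two independent means.
    se = noise_sd * math.sqrt(1.0 / n_pre + 1.0 / n_post)
    lower, upper = effect - z * se, effect + z * se
    return {
        "se": se,
        "ci": (lower, upper),
        "significant": lower > 0.0 or upper < 0.0,  # interval excludes zero
    }


# Same effect and noise, different numbers of pre-treatment periods.
long_pre = simulate_verdict(effect=1.0, noise_sd=1.0, n_pre=20, n_post=5)
short_pre = simulate_verdict(effect=1.0, noise_sd=1.0, n_pre=4, n_post=5)
print(long_pre)
print(short_pre)
```

With 20 pre-treatment periods the interval just clears zero; with 4 it does not, even though the effect and the noise are unchanged.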
What this tab teaches
- Significance is not the point estimate. A large, accurate estimate can still be indistinguishable from zero if the interval is wide.
- Three things widen the interval: more noise, fewer pre-treatment periods (a worse-pinned counterfactual), and a poorer pre-fit.
- The inference method matters. The multisynth jackknife calls the pooled effect significant; the conservative wild bootstrap does not — on the same numbers.
- Be honest on real data. The euro-area pooled effect is near zero and not significant; we report it as such rather than dressing it up.