What does it mean to control for a variable?
The Frisch–Waugh–Lovell (FWL) theorem says the coefficient on a variable in a multiple regression equals the slope from a simple bivariate regression of residuals on residuals: first regress both the outcome and that variable on every other control, then regress the outcome residuals on the variable's residuals. This is the picture: the raw scatter (orange, left) gives a misleading slope; once income is partialled out (teal, right), the true positive coupon effect emerges.
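The equivalence can be checked numerically. The sketch below is a minimal illustration, not the post's own code: it assumes a store DGP with β = 0.2, γ = 0.3, δ = −0.5 and unit-variance Gaussian noise (the noise scales, the seed, and all helper names are assumptions), then compares the coupon coefficient from the two-regressor regression with the residual-on-residual slope.

```python
import random

def center(v):
    """Subtract the mean from every element."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def slope(x, y):
    """OLS slope of y on x, intercept included."""
    xc, yc = center(x), center(y)
    return dot(xc, yc) / dot(xc, xc)

def residualize(v, control):
    """Residuals of v after regressing it on control (intercept included)."""
    b = slope(control, v)
    return [vi - b * ci for vi, ci in zip(center(v), center(control))]

def coef_in_multiple_regression(y, x, z):
    """Coefficient on x in the OLS regression of y on x and z (normal equations)."""
    yc, xc, zc = center(y), center(x), center(z)
    det = dot(xc, xc) * dot(zc, zc) - dot(xc, zc) ** 2
    return (dot(zc, zc) * dot(xc, yc) - dot(xc, zc) * dot(zc, yc)) / det

# Assumed DGP: beta = 0.2, gamma = 0.3, delta = -0.5, unit-variance noise.
rng = random.Random(0)
income = [rng.gauss(0, 1) for _ in range(150)]
coupons = [-0.5 * inc + rng.gauss(0, 1) for inc in income]
sales = [0.2 * c + 0.3 * inc + rng.gauss(0, 1) for c, inc in zip(coupons, income)]

naive = slope(coupons, sales)                                    # sales ~ coupons
multiple = coef_in_multiple_regression(sales, coupons, income)   # sales ~ coupons + income
fwl = slope(residualize(coupons, income), residualize(sales, income))

print(f"naive    {naive:+.3f}")
print(f"multiple {multiple:+.3f}")
print(f"FWL      {fwl:+.3f}")
```

The last two numbers agree to floating-point precision on any dataset; only the naive slope depends on how strongly income is tied to coupons.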
The fog lifts — watch confounding disappear
The same 150 simulated stores, plotted two ways. The left panel is the raw sales-vs-coupons scatter; its slope has the wrong sign because high-income neighborhoods get fewer coupons but spend more. The right panel morphs from the raw scatter to the FWL-residualized version: as income is partialled out, the cloud reshapes and the slope flips to its true positive value.
Confounding Lab
Slide the true causal effect, the income effect, and the confounding link. Watch the naive and FWL slopes diverge — or coincide, when there is no confounder to remove.
Forest Plot
The post's headline coefficients — store, flights, and wages — side by side. Toggle outcomes and methods. Hover any point for SE, CI, and FE count.
Panel FE
Within-person residualization in action. Click the toggle to demean a panel by individual and see the slope jump from 0.03 (pooled) to 0.12 (within-person).
Glossary — open a card if a term is unfamiliar
FWL theorem
Confounder
Residualization
Omitted-variable bias (OVB)
Added-variable plot
Within (FE) demeaning
Simpson's paradox
fwl_plot()
Confounding Lab — when does the naive slope mislead?
The post simulates a store DGP with true β = +0.2 for coupons
and an income confounder (γ = +0.3 on sales, δ = −0.5 on coupons). Slide
those three knobs and watch the naive slope (orange) drift away from the
truth while the FWL slope (teal) stays anchored. When the confounding link
δ = 0, the naive and FWL slopes coincide — that's the regime where
"controlling for" does nothing.
What to look for
- Move δ to zero. The naive and FWL slopes converge — when the confounder does not predict the treatment, there is no bias to remove.
- Crank γ up. A stronger income effect on sales amplifies the OVB (γ × δ). The naive slope drifts further from truth; the FWL slope stays put.
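- Flip the sign of β. Set β = −0.1 and make the confounding link positive (for example γ = 0.3, δ = +0.5). The bias then has the sign of γ × δ, which is positive, so the naive slope can appear positive while the truth is negative: a Simpson's-paradox-style sign reversal. With δ left at −0.5 the bias is negative, and the naive slope only exaggerates the negative effect.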
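- OVB column. The predicted bias matches the gap (naive − true) up to sampling noise, and the mismatch shrinks as n grows. Note that γ × δ is the exact population bias only when coupons and income have equal variance; in general it is γ × δ × Var(income) / Var(coupons).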
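The omitted-variable-bias arithmetic behind these bullets can be sketched as follows. In any sample, the naive slope equals the controlled coupon coefficient plus the controlled income coefficient times the slope of income on coupons; this holds exactly, not just on average. The DGP, the unit noise variances, the seed, and the names `simulate` and `cov` are assumptions made for illustration.

```python
import random

def cov(a, b):
    """Sample covariance (cov(a, a) is the sample variance)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def simulate(n, beta, gamma, delta, seed):
    """Hypothetical store DGP with unit-variance noise (an assumption)."""
    rng = random.Random(seed)
    income = [rng.gauss(0, 1) for _ in range(n)]
    coupons = [delta * inc + rng.gauss(0, 1) for inc in income]
    sales = [beta * c + gamma * inc + rng.gauss(0, 1)
             for c, inc in zip(coupons, income)]
    return income, coupons, sales

beta, gamma, delta = 0.2, 0.3, -0.5
income, coupons, sales = simulate(150, beta, gamma, delta, seed=1)

# Short regression: sales ~ coupons
naive = cov(coupons, sales) / cov(coupons, coupons)

# Auxiliary regression: income ~ coupons (confounder on treatment)
pi_hat = cov(coupons, income) / cov(coupons, coupons)

# Long regression: sales ~ coupons + income, via the 2x2 normal equations
det = cov(coupons, coupons) * cov(income, income) - cov(coupons, income) ** 2
b_long = (cov(income, income) * cov(coupons, sales)
          - cov(coupons, income) * cov(income, sales)) / det
g_long = (cov(coupons, coupons) * cov(income, sales)
          - cov(coupons, income) * cov(coupons, sales)) / det

gap = naive - b_long          # how far the naive slope sits from the controlled one
predicted = g_long * pi_hat   # in-sample OVB formula

# Population bias under the assumed variances: Var(income) = 1, Var(coupons) = delta^2 + 1
population_bias = gamma * delta * 1.0 / (delta ** 2 + 1.0)

print(f"gap {gap:+.4f}  predicted {predicted:+.4f}  population {population_bias:+.4f}")
```

Under these assumed variances the population bias is −0.12 rather than γ × δ = −0.15, because coupons have variance 1.25 while income has variance 1; a different noise scale in the lab would change that number.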
Bias vs variance over many simulations
Single draws are noisy. Run the pipeline 100 times with fresh random draws (same parameters) to see whether the bias is systematic.
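A repeated-draw experiment of this kind might look like the sketch below. It assumes the same hypothetical DGP as the lab's default parameters (β = 0.2, γ = 0.3, δ = −0.5) with unit-variance noise and n = 150; the function name `one_draw` and the seed are illustrative.

```python
import random
import statistics

def slope(x, y):
    """OLS slope of y on x, intercept included."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def one_draw(rng, n=150, beta=0.2, gamma=0.3, delta=-0.5):
    """Simulate one dataset and return (naive slope, FWL slope)."""
    income = [rng.gauss(0, 1) for _ in range(n)]
    coupons = [delta * inc + rng.gauss(0, 1) for inc in income]
    sales = [beta * c + gamma * inc + rng.gauss(0, 1)
             for c, inc in zip(coupons, income)]
    naive = slope(coupons, sales)
    # Partial income out of both variables. The leftover constant does not
    # matter because slope() centers its inputs.
    b_s, b_c = slope(income, sales), slope(income, coupons)
    sales_res = [s - b_s * inc for s, inc in zip(sales, income)]
    coupons_res = [c - b_c * inc for c, inc in zip(coupons, income)]
    return naive, slope(coupons_res, sales_res)

rng = random.Random(42)
draws = [one_draw(rng) for _ in range(100)]
naive_draws = [d[0] for d in draws]
fwl_draws = [d[1] for d in draws]

for name, vals in (("naive", naive_draws), ("FWL", fwl_draws)):
    print(f"{name:5s} mean {statistics.mean(vals):+.3f}  sd {statistics.stdev(vals):.3f}")
```

The spread (sd) of each estimator reflects sampling variance; the distance between each mean and β = 0.2 reflects bias. A systematic bias shows up as a naive mean that stays away from 0.2 no matter how many replications are run.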
The post's three FWL stories — at a glance
The numbers below come from r_fwlplot/index.md §4, §6, §7. Each column is one empirical setting in the post; each row is a progressive "control" step. Watch the coefficient march from the naive value to the fully-controlled estimate — that march is FWL in action.
What to look for
- Store DGP (left). Naive β̂ = −0.093 (wrong sign!), + Income β̂ = +0.212 (close to truth +0.2), + Inc + Day β̂ = +0.222. The sign flip is the headline confounding result.
- Flights (center). Naive β̂ = −0.003 doubles to −0.006 once origin FE absorb airport-level delay levels. With 103 destinations partialled out as well, the SE balloons (CI now includes zero) — every cross-route comparison has been removed.
- Wages (right). The pooled exper effect is +0.105; with individual FE the within-person return rises to +0.122. The ability-confounded pooled estimate understates the within-person return.
- Hover any point for the SE, the 95% CI, and the number of controls/FE that estimator used.
Outcomes
Methods (control steps)
Why does the Flights CI widen with destination FE?
Adding 103 destination dummies absorbs every cross-route comparison. The remaining variation is purely within-route: for flights on the same route, does longer-than-usual air time predict longer-than-usual delay? With so little variation left, the answer is noisy: the point estimate moves only slightly (−0.006 → −0.007), but the SE jumps from 0.0005 to 0.0034. Fixed effects absorb confounding variation and identifying variation alike, so the trade-off between bias and precision is always present.
Why does the Wages slope grow with individual FE?
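The pooled slope (+0.105) is smaller than the within-person slope (+0.122), so the unobserved component must push the pooled estimate down: workers with higher fixed earning capacity (ability) tend to have fewer years of experience in the sample, plausibly because they spent longer in school. The raw cross-section blends this negative between-person association with the positive within-person return, and the pooled slope is the attenuated result. Individual FE strip away the fixed ability component and force the slope to be identified from the same individual at different points in their career, which yields the within-person return to experience (+0.122).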
Panel FE — within-person residualization in pictures
A synthetic 60 × 8 wage panel (60 individuals, 8 years each) with strong unobserved ability and a homogeneous true within-person return to experience. Click the toggle to demean each individual's exper and lwage by their own mean; that is FWL applied to individual dummies. The pooled slope is shallow (≈ 0.03); the within slope is roughly 4× steeper (≈ 0.12). The direction matches the post's §7.3 finding, where the estimate rises from +0.105 (pooled) to +0.122 (individual FE), although the gap in the synthetic panel is much larger.
What to look for
- Raw mode. A wide fan of points — same experience level, very different wages. Each color is a person; ability differences spread them apart vertically.
- Within mode. Subtract every person's mean from both axes. The cloud collapses around zero; what's left is each person's deviation from their own typical career. The slope steepens.
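- Why steeper? In this panel, higher-ability people earn more at every experience level but sit at lower experience levels on average, so the between-person pattern pulls the pooled line flat. Pooled OLS mixes that between-person pattern with the within-person return. Demeaning removes the ability component, leaving only the within-person increment.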
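The within transformation described in these bullets can be sketched in a few lines. This is an illustrative panel, not the one behind the plot: it assumes 60 people observed for 8 years, a true within-person return of 0.12, and ability that is negatively related to starting experience (the ingredient that makes the pooled slope shallower). All coefficients, the seed, and the helper name `demean_by_group` are assumptions.

```python
import random

def slope(x, y):
    """OLS slope of y on x, intercept included."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(values, groups):
    """Subtract each group's own mean: the FWL step for group dummies."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

TRUE_RETURN = 0.12  # assumed within-person return to experience
rng = random.Random(7)
person, exper, lwage = [], [], []
for i in range(60):
    ability = rng.gauss(0, 1)
    # Assumption: higher ability goes with fewer years of starting experience.
    start = 6 - 2 * ability + rng.gauss(0, 1)
    for t in range(8):
        x = start + t
        person.append(i)
        exper.append(x)
        lwage.append(1.0 + TRUE_RETURN * x + 0.5 * ability + rng.gauss(0, 0.05))

pooled = slope(exper, lwage)
within = slope(demean_by_group(exper, person), demean_by_group(lwage, person))

print(f"pooled slope {pooled:+.3f}")
print(f"within slope {within:+.3f}")
```

Because ability is constant within a person, demeaning removes it entirely, and the within slope lands close to the assumed 0.12. Flipping the sign of the ability term in `start` would make the pooled slope steeper than the within slope instead.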
Connecting back to Tab 1
The morphing animation in Tab 1 and the toggle here do the same thing: they show the data before and after partialling-out. In Tab 1 the confounder was continuous (income); here it is categorical (each individual's identity). Both are special cases of the same FWL recipe, and both produce the same picture: the raw slope is confounded; the residualized slope is the coefficient the regression table reports.