FWL Theorem (Stata) — Interactive Lab

What does `scatterfit y x, controls(z)` actually do?

The Frisch–Waugh–Lovell (FWL) theorem is the engine inside the Stata packages scatterfit and reghdfe. It says that the coefficient on a variable in a multiple regression equals the slope of a simple bivariate regression, once the other controls have been partialled out of both the outcome and that variable. This is the picture: the raw scatter (orange, left) gives a misleading slope; once income is partialled out via controls(income), the true positive coupon effect emerges (teal, right).
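The identity can be checked with built-in commands alone. The sketch below uses Stata's bundled auto dataset as a stand-in (it is not the post's store data, and the scalar and residual names are illustrative): it estimates the weight coefficient once by multiple regression and once by regressing residualized price on residualized weight.

```stata
* FWL check on the bundled auto dataset (illustrative stand-in, not the store data)
sysuse auto, clear

* Route 1: multiple regression, keep the coefficient on weight
regress price weight mpg
scalar b_full = _b[weight]

* Route 2: partial mpg out of both price and weight,
* then regress residual on residual
regress price mpg
predict double price_res, residuals
regress weight mpg
predict double weight_res, residuals
regress price_res weight_res
scalar b_fwl = _b[weight_res]

* The two routes give the same slope up to floating-point error
display "multiple regression: " b_full "   residual-on-residual: " b_fwl
assert reldif(b_full, b_fwl) < 1e-8
```

The slopes agree, but the standard errors from the residual-on-residual regression differ slightly because that regression does not count the partialled-out control in its degrees of freedom.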

The fog lifts — watch confounding disappear

The same 150 simulated stores, plotted two ways. The left panel is the raw scatterfit sales coupons scatter: the slope is wrong-signed because high-income neighborhoods get fewer coupons but spend more. The right panel morphs from the raw scatter to the FWL-residualized version (scatterfit sales coupons, controls(income)): as income is partialled out, the cloud reshapes and the slope flips to its true positive value.

Tab 2

Confounding Lab

Slide the true causal effect, the income effect, and the confounding link. Watch the naive and FWL slopes diverge — or coincide, when there is no confounder to remove.

Tab 3

Forest Plot

The post's three Stata stories — store, NYC flights with fcontrols(), and wages with absorb() — side by side. Hover any point for SE, CI, and FE count.

Tab 4

Panel FE

Within-person residualization in action. Toggle the view to demean the panel by person, as absorb(nr) does, and see the slope jump from 0.03 (pooled) to 0.12 (within-person).

Glossary — open a card if a term is unfamiliar

FWL theorem

Coefficient on X₁ in a multivariable regression equals the simple slope of residual Y on residual X₁, where both residuals come from regressing on the other controls. Two routes, one number — the engine behind reghdfe.
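In matrix form, the statement on this card can be written as follows. The notation is an assumption introduced here, not taken from the post: X₂ collects the other controls (including the constant) and M₂ is the matrix that turns a variable into its residual from a regression on X₂.

```latex
\[
\begin{aligned}
y &= X_1\beta_1 + X_2\beta_2 + \varepsilon,
  \qquad M_2 = I - X_2\,(X_2^{\top}X_2)^{-1}X_2^{\top},\\
\tilde{y} &= M_2\,y, \qquad \tilde{X}_1 = M_2\,X_1,\\
\hat{\beta}_1 &= (\tilde{X}_1^{\top}\tilde{X}_1)^{-1}\,\tilde{X}_1^{\top}\tilde{y}.
\end{aligned}
\]
```

The last line is the OLS slope of residual Y on residual X₁, and it equals the coefficient on X₁ from the full regression of y on X₁ and X₂.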

scatterfit

Stata package (Ahrens 2024) that draws the FWL-residualized scatter automatically — turning the FWL identity into a picture. Options: controls() for continuous confounders, fcontrols() for fixed effects, binned for large data.

reghdfe

Community-contributed Stata estimator for high-dimensional fixed effects (Correia 2016). Uses iterative demeaning, a generalization of FWL, to absorb thousands of FE without inverting a giant matrix.

Confounder

A third variable correlated with both the treatment and the outcome. It distorts the raw slope. In the store DGP, income is the confounder linking coupons and sales.

Residualization

Replace each variable with the part not explained by the controls: the OLS residual. In Stata, run regress y z and then predict y_resid, residuals. Wipe the fog off the window before looking through it.

Omitted-variable bias (OVB)

If the confounder is omitted, the naive slope deviates from the truth by γ × δ — the confounder's effect on Y times its slope on the treatment. The bias is predictable.

Binned scatter

Group observations into quantile bins along x and plot the bin means. scatterfit y x, binned replaces an unreadable cloud with a few readable dots, which matters most for large datasets where a point-by-point scatter is slow to draw and hard to read.
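The binning step can be approximated by hand with built-in commands. This is a minimal sketch on synthetic data, not scatterfit's own implementation; the variable names, the 20 bins, and the 0.5 slope are all illustrative assumptions.

```stata
* Hand-rolled binned scatter on synthetic data (names and values are illustrative)
clear
set seed 2024
set obs 5000
generate x = rnormal()
generate y = 0.5*x + rnormal()

* Cut x into 20 quantile bins, then keep one (mean x, mean y) point per bin
xtile bin = x, nquantiles(20)
collapse (mean) y x, by(bin)

* 20 dots instead of 5,000 points
assert _N == 20
twoway (scatter y x) (lfit y x), legend(off)
```

To bin an FWL-residualized scatter the same way, residualize y and x on the controls first and bin the residuals.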

Within (FE) demeaning

Subtract each unit's own mean from every variable. What remains is within-unit variation — every time-invariant trait of that unit is scrubbed away in one stroke. In Stata: reghdfe ... , absorb(nr).

Confounding Lab — when does the naive slope mislead?

The post simulates a store DGP with true β = +0.2 for coupons and an income confounder (γ = +0.3 on sales, δ = −0.5 on coupons). Slide those three knobs and watch the naive slope (orange, what scatterfit sales coupons shows) drift away from the truth while the FWL slope (teal, what scatterfit sales coupons, controls(income) shows) stays anchored. When the confounding link δ = 0, the naive and FWL slopes coincide — that is the regime where "controlling for" does nothing.
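The three knobs can be reproduced in a few lines of Stata. This is a sketch under stated assumptions rather than the post's actual simulation code: income and coupons are both scaled to unit variance (which makes γ × δ the exact large-sample bias), the noise terms are standard normal, and n is set far above the lab's default so that sampling noise is negligible.

```stata
* Simulated store DGP; the variable scaling is an assumption, not taken from the post
clear
set seed 12345
set obs 100000

* beta: true coupon effect; gamma: income -> sales; delta: income -> coupons
scalar beta  = 0.2
scalar gamma = 0.3
scalar delta = -0.5

generate income = rnormal()
* Scale the coupon noise so Var(coupons) = Var(income) = 1
generate coupons = scalar(delta)*income + sqrt(1 - scalar(delta)^2)*rnormal()
generate sales = scalar(beta)*coupons + scalar(gamma)*income + rnormal()

* Naive slope: income omitted
regress sales coupons
scalar b_naive = _b[coupons]

* Controlled slope: the coefficient FWL residualization reproduces
regress sales coupons income
scalar b_fwl = _b[coupons]

display "naive = " b_naive "   FWL = " b_fwl
display "predicted bias gamma*delta = " scalar(gamma)*scalar(delta)
```

Setting delta to 0 in this sketch makes the two slopes coincide up to sampling noise, which is the no-confounding regime described above.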

Sample size n 200

More stores = tighter cloud, less sampling noise around the truth.

True effect β (coupons → sales) 0.20

The causal coefficient the FWL slope should recover.

γ (income → sales) 0.30

How strongly the confounder raises sales directly.

δ (income → coupons) -0.50

Slope of coupons on income. The confounding link. Set δ = 0 to switch off confounding entirely.

naive slope (β̂): regress sales coupons

FWL slope (β̂): regress sales coupons income

OVB = γ × δ: predicted bias

true β: 0.200, held by the slider above

What to look for

Move δ to zero. The naive and FWL slopes converge — when the confounder does not predict the treatment, there is no bias to remove. controls(income) in scatterfit would change nothing.
Crank γ up. A stronger income effect on sales amplifies the OVB (γ × δ). The naive slope drifts further from truth; the FWL slope stays put.
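Flip the sign of β. Set β = -0.1, keep γ = 0.3, and make δ positive (for example +0.5). The bias γ × δ = +0.15 then outweighs the negative effect, so the naive slope can appear positive while the truth is negative: a Simpson's-paradox-style sign reversal. With δ left at -0.5 the bias is negative and only pushes the naive slope further below the truth.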
OVB column. The predicted bias γ × δ matches the gap (naive − true) up to sampling noise. The mismatch shrinks as n grows. The post's measured OVB at the default settings is −0.148.
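The predicted-bias formula follows from the standard omitted-variable algebra. The derivation below is a sketch under assumptions not stated in the lab: ε is uncorrelated with coupons and income, and u is uncorrelated with income.

```latex
\[
\begin{aligned}
\text{sales} &= \beta\,\text{coupons} + \gamma\,\text{income} + \varepsilon,\\
\operatorname{plim}\hat{\beta}_{\text{naive}}
  &= \beta + \gamma\,
     \frac{\operatorname{Cov}(\text{coupons},\text{income})}
          {\operatorname{Var}(\text{coupons})},\\
\text{coupons} = \delta\,\text{income} + u
  \;&\Rightarrow\;
  \operatorname{plim}\hat{\beta}_{\text{naive}}
  = \beta + \gamma\,\delta\,
    \frac{\operatorname{Var}(\text{income})}
         {\operatorname{Var}(\text{coupons})}.
\end{aligned}
\]
```

The bias reduces to exactly γ × δ when income and coupons have the same variance; how the lab scales the two variables is not stated here. At the default knobs, γ × δ = 0.3 × (−0.5) = −0.15.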

Bias vs variance over many simulations

Single draws are noisy. Run the pipeline 100 times with fresh random draws (same parameters) to see whether the bias is systematic.
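A repeated-draw experiment of this kind can be sketched with Stata's simulate prefix. The program name simstore, the unit-variance scaling of coupons, and the hard-coded default parameters are assumptions made for illustration; this is not the lab's own code.

```stata
* One draw of the store DGP with n = 200, returning both slopes
capture program drop simstore
program define simstore, rclass
    drop _all
    set obs 200
    generate income = rnormal()
    * sqrt(0.75) keeps Var(coupons) = 1 when delta = -0.5
    generate coupons = -0.5*income + sqrt(0.75)*rnormal()
    generate sales = 0.2*coupons + 0.3*income + rnormal()
    regress sales coupons
    return scalar naive = _b[coupons]
    regress sales coupons income
    return scalar fwl = _b[coupons]
end

* 100 fresh draws with the same parameters; one row per draw
simulate naive=r(naive) fwl=r(fwl), reps(100) seed(2024) nodots: simstore
summarize naive fwl
```

If the bias is systematic, the mean of naive sits away from 0.2 by roughly γ × δ while the mean of fwl stays near 0.2; the standard deviations in the summarize output show the sampling noise of a single draw.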

The post's three FWL stories — at a glance

The numbers below come from stata_fwl/index.md §4, §7, §8 — measured with regress and reghdfe in Stata. Each outcome is one empirical setting; each row is a progressive "control" step handled by controls() or fcontrols() / absorb(). Watch the coefficient march from the naive value to the fully-controlled estimate — that march is FWL in action.

What to look for

Store DGP. Naive β̂ = −0.093 (wrong sign!), controls(income) gives +0.212 (close to truth +0.2), controls(income dayofweek) gives +0.222. The sign flip is the headline confounding result.
Flights (Stata sample, 5,000 obs). No-FE estimate is −0.005 with R²≈0. With fcontrols(origin_fe) the effect strengthens to −0.008; with fcontrols(origin_fe dest_fe) it jumps to −0.032 but 6 singleton routes are dropped and the SE widens (CI now includes zero).
Wages (4,360 person-years). Pooled exper effect is +0.105; with absorb(nr) the within-person return rises to +0.122. The ability-confounded pooled view underestimates the within-person return.
Hover any point for the SE, the 95% CI, and the number of controls/FE that estimator used.

Outcomes

Store: coupons → sales
Flights: air_time → delay
Wages: exper → log(wage)

Methods (control steps)

Why does the Flights CI widen with `fcontrols(origin_fe dest_fe)`?

Adding ~100 destination dummies on top of the origin dummies absorbs almost all cross-route comparisons. The variation that remains is essentially within-route: for flights on the same route, does longer-than-usual air time predict longer-than-usual delay? With so little variation left, the answer is noisy: the point estimate moves from −0.008 to −0.032 but the SE jumps from 0.003 to 0.027 (six singleton routes are dropped, so N goes from 5000 to 4994). Fixed effects absorb confounding and identifying variation alike; the trade-off is always there. This is exactly the kind of diagnostic scatterfit's fcontrols() makes visual.
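The precision cost can be seen in a small synthetic example. This sketch does not use the flights data: the 100 groups of 50 observations, the variable names, and the −0.03 slope are illustrative assumptions, and the built-in areg stands in for reghdfe. Here x varies mostly between groups, so absorbing the group dummies leaves little variation to estimate the slope from.

```stata
* Synthetic illustration (not the flights data): x varies mostly between routes
clear
set seed 321
set obs 5000
generate route = ceil(_n/50)

* One draw per route, copied to every observation on that route
bysort route: generate double route_part = rnormal() if _n == 1
bysort route: replace route_part = route_part[1]

* Small within-route variation in x, no confounding in y
generate x = route_part + 0.2*rnormal()
generate y = -0.03*x + rnormal()

* Standard error of the slope without and with route fixed effects
regress y x
scalar se_pooled = _se[x]
areg y x, absorb(route)
scalar se_fe = _se[x]

display "SE without FE: " se_pooled "   SE with route FE: " se_fe
assert se_fe > se_pooled
```

There is no confounding in this example, so the fixed effects buy nothing and only cost precision; in real data the same loss of variation is the price paid for removing route-level confounders.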

Why does the Wages slope grow with `absorb(nr)`?

Ability is unobserved, raises wages, and is correlated with experience, so the raw cross-section blends the return to experience with between-person ability differences. Because the pooled slope (+0.105) sits below the within-person one, that blend must pull the slope down here, which is what happens when higher-ability workers have fewer years of experience on average (for instance because they spent longer in school). reghdfe lwage exper expersq, absorb(nr) strips away time-invariant ability and identifies the slope from the same individual at different points in their career, giving a within-person return of +0.122. scatterfit lwage exper, fcontrols(nr) draws this picture directly.

Panel FE — within-person residualization in pictures

A synthetic 60 × 8 wage panel (60 individuals, 8 years each) with strong unobserved ability and a homogeneous true within-person return to experience. Click the toggle to demean each individual's exper and lwage by their own mean — that is FWL applied to individual dummies, exactly what scatterfit lwage exper, fcontrols(nr) does behind the scenes via reghdfe. The pooled slope is shallow (≈ 0.03); the within slope is roughly 4× steeper (≈ 0.12), matching the post's §8.3 finding.

View mode

Pooled raw (no FE): scatterfit lwage exper
Individual FE: scatterfit lwage exper, fcontrols(nr)

Same data, two viewing modes. Points are colored by individual id in the raw view.

current slope: OLS on the displayed (raw or demeaned) exper and lwage

target (raw): ≈ 0.03, pooled exper coefficient

target (within): ≈ 0.12, individual-FE exper coefficient

N × T: 60 × 8, synthetic panel

What to look for

Raw mode. A wide fan of points — same experience level, very different wages. Each color is a person; ability differences spread them apart vertically. This is the scatterfit lwage exper picture.
Within mode. Subtract every person's mean from both axes. The cloud collapses around zero; what's left is each person's deviation from their own typical career. The slope steepens. This is what fcontrols(nr) draws.
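Why steeper? Ability raises wages and is correlated with experience across people, so pooled OLS mixes between-person ability differences into the slope. A pooled slope that is shallower than the within slope means that, in this panel, the higher earners tend to have less experience, which drags the pooled line down. Demeaning removes the ability component, leaving only the within-person increment.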
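The toggle can be mimicked with built-in commands. The sketch below builds its own 60 × 8 panel, and the data-generating process is an assumption rather than the lab's: starting experience falls with ability (chosen so that the pooled slope ends up shallower than the within slope) and the within-person return is set to 0.12. It demeans by hand and checks the result against areg, which absorbs the person dummies directly.

```stata
* Synthetic 60 x 8 panel; the DGP and its numbers are illustrative assumptions
clear
set seed 99
set obs 480
generate id = ceil(_n/8)
bysort id: generate year = _n
bysort id: generate double ability = rnormal() if _n == 1
bysort id: replace ability = ability[1]

* Higher ability means less starting experience but a higher wage level
generate double exper = 5 - 0.5*ability + year
generate double lwage = 0.12*exper + ability + 0.1*rnormal()

* Pooled slope: ignores the person structure
regress lwage exper
scalar b_pooled = _b[exper]

* Within transformation: subtract each person's own mean from both variables
bysort id: egen double exper_bar = mean(exper)
bysort id: egen double lwage_bar = mean(lwage)
generate double exper_w = exper - exper_bar
generate double lwage_w = lwage - lwage_bar
regress lwage_w exper_w
scalar b_within = _b[exper_w]

* Same slope from absorbing the person dummies directly
areg lwage exper, absorb(id)
assert reldif(b_within, _b[exper]) < 1e-6
display "pooled = " b_pooled "   within = " b_within
```

The hand-demeaned regression reproduces the fixed-effects slope but not its standard error, because it does not count the 60 absorbed means in the degrees of freedom.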

Connecting back to Tab 1

The morphing animation in Tab 1 and the toggle here do the same thing: they show the data before and after partialling-out. In Tab 1 it was a continuous confounder (income, removed via controls()); here it is a categorical confounder (each individual's identity, removed via fcontrols() / absorb()). Both are special cases of the same FWL recipe, and both produce the same picture: the raw slope is confounded; the residualized slope is the one the regress / reghdfe table reports.
