What “controlling for” looks like as a scatter plot — the fwlplot package
Nagoya University (GSID)
June 11, 2026
Act I
“The effect of coupons on sales, controlling for income” is a relationship in many dimensions.
You cannot put it on a scatter plot. Or can you?
Same data, two views. Left: the raw coupons–sales scatter slopes the wrong way. Right: after partialling income out of both axes, the true positive effect appears.
fwl_plot(): “controlling for income” in one line of R
feols() to six decimals
Act II
Causal diagram: coupons → sales (true effect \(+0.2\)); income → coupons (\(-0.5\)) and income → sales (\(+0.3\)): rich areas get fewer coupons but buy more.
Income opens a backdoor path coupons ← income → sales. Block it, or the naive slope is biased.
| Pair | correlation |
|---|---|
| coupons ↔︎ sales (raw) | −0.166 |
| income ↔︎ coupons | −0.709 |
| income ↔︎ sales | +0.500 |
A negative raw coupon–sales correlation, even though the true effect is positive — Simpson’s paradox in one table.
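As a rough check using only the three correlations in the table, the standard first-order partial correlation formula gives the coupons–sales correlation with income held fixed. The symbols \(r_{cs}\), \(r_{ci}\) and \(r_{is}\) are shorthand introduced here for the three rows of the table, not notation from the slides.
\[r_{cs\cdot i}=\frac{r_{cs}-r_{ci}\,r_{is}}{\sqrt{(1-r_{ci}^2)(1-r_{is}^2)}}=\frac{-0.166-(-0.709)(0.500)}{\sqrt{(1-0.709^2)(1-0.500^2)}}\approx\frac{0.189}{0.611}\approx +0.31\]
The sign flips from negative to positive once income is held fixed.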
\[\hat\beta_1=(\tilde X_1'\tilde X_1)^{-1}\tilde X_1'\tilde Y,\qquad \tilde Y=M_{X_2}Y,\quad \tilde X_1=M_{X_2}X_1\]
\(M_{X_2}=I-X_2(X_2'X_2)^{-1}X_2'\) is the residual-maker: it strips the influence of the controls \(X_2\) from whatever it multiplies.
fwl_plot() residualizes both coupons and sales on income behind the scenes, then plots the residuals with the regression line on top.
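The residualize-then-regress step can be sketched by hand. The following is a minimal illustration in plain Python, not the package's implementation: the data are simulated under an assumed data-generating process that mirrors the diagram's coefficients (\(+0.2\), \(-0.5\), \(+0.3\)), the noise scale and sample size are arbitrary choices, and the helper names `slope`, `residualize` and `coef_two_regressors` are hypothetical.

```python
import random

def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """OLS slope of y on x, with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(v, control):
    """Residuals from regressing v on control plus an intercept."""
    b = slope(control, v)
    mv, mc = mean(v), mean(control)
    return [a - mv - b * (c - mc) for a, c in zip(v, control)]

def coef_two_regressors(y, x1, x2):
    """Coefficient on x1 in the OLS fit y ~ 1 + x1 + x2 (closed form)."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    return (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)

# Simulated data (assumption): coefficients follow the diagram, noise is arbitrary.
random.seed(1)
n = 2000
income = [random.gauss(0, 1) for _ in range(n)]
coupons = [-0.5 * i + random.gauss(0, 0.5) for i in income]
sales = [0.2 * c + 0.3 * i + random.gauss(0, 0.5)
         for c, i in zip(coupons, income)]

naive = slope(coupons, sales)                       # sales ~ coupons
full = coef_two_regressors(sales, coupons, income)  # sales ~ coupons + income
fwl = slope(residualize(coupons, income),           # residuals on residuals
            residualize(sales, income))

print(f"naive slope      : {naive:.6f}")
print(f"multiple regress.: {full:.6f}")
print(f"manual FWL       : {fwl:.6f}")
```

The last two printed numbers should agree up to floating-point error, which is the identity the residualized scatter relies on; the naive slope differs because income is left in both variables.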
| Term | Naive | Controlled |
|---|---|---|
| coupons | −0.0934 | +0.2123 |
| income | — | +0.3004 |
| R² | 0.028 | 0.321 |
Every number here is a visual feature of the fwl_plot() scatter: the slope, and how tight the cloud is.
Like comparing height for your age: judge each store’s coupons and sales relative to its income level.
0.212288
feols coefficient = manual FWL coefficient, to the sixth decimal
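\[\text{bias}=\hat\gamma\times\hat\delta \approx 0.300\times(-1.018) \approx -0.305\]
\(\hat\gamma\) = income’s effect on sales; \(\hat\delta\) = slope from regressing income on coupons (the omitted control on the included regressor; \(-1.018\) is the value implied by the regression table above), not the reverse slope of coupons on income (\(-0.494\)). True \(+0.212\) plus bias \(-0.305\) ≈ the naive \(-0.093\).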
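The decomposition can be checked numerically: in any OLS sample, the naive slope equals the controlled slope plus \(\hat\gamma\) times the slope from regressing the omitted control on the included regressor. The sketch below uses simulated data under an assumed data-generating process (coefficients from the diagram, arbitrary noise); the helper names are hypothetical.

```python
import random

def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """OLS slope of y on x, with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def coefs_two_regressors(y, x1, x2):
    """Coefficients (b1, b2) in the OLS fit y ~ 1 + x1 + x2."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Simulated data (assumption): same structure as the coupons example.
random.seed(7)
n = 2000
income = [random.gauss(0, 1) for _ in range(n)]
coupons = [-0.5 * i + random.gauss(0, 0.5) for i in income]
sales = [0.2 * c + 0.3 * i + random.gauss(0, 0.5)
         for c, i in zip(coupons, income)]

naive = slope(coupons, sales)                          # sales ~ coupons
beta, gamma = coefs_two_regressors(sales, coupons, income)
delta = slope(coupons, income)                         # income ~ coupons
bias = gamma * delta

print(f"naive             : {naive:.6f}")
print(f"controlled + bias : {beta + bias:.6f}")
```

Note the direction of the auxiliary regression: `delta` is the slope of income on coupons. Using the reverse slope breaks the equality.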
Including airport (or person) fixed effects = partialling out each group’s mean from every variable.
A race handicap: convert raw times into “faster or slower than your own average,” then compare fairly.
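The “own average” idea is the within transformation: subtract each group's mean from every variable, then regress the demeaned values. Below is a small sketch on a made-up toy dataset (three hypothetical groups, no noise) in which groups with larger `x` have lower baselines, so the pooled slope has the wrong sign while the within-group slope is exactly 1.

```python
from collections import defaultdict

def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """OLS slope of y on x, with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(values, groups):
    """Subtract each observation's group mean (the within transformation)."""
    buckets = defaultdict(list)
    for v, g in zip(values, groups):
        buckets[g].append(v)
    group_mean = {g: mean(vs) for g, vs in buckets.items()}
    return [v - group_mean[g] for v, g in zip(values, groups)]

# Toy data (assumption): y = baseline[group] + 1.0 * x, baselines fall as x rises.
groups = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
baseline = {"A": 10.0, "B": 4.0, "C": -2.0}
y = [baseline[g] + 1.0 * xi for g, xi in zip(groups, x)]

pooled = slope(x, y)                          # ignores groups
within = slope(demean_by_group(x, groups),    # group fixed effects
               demean_by_group(y, groups))

print(f"pooled slope: {pooled:.3f}")
print(f"within slope: {within:.3f}")
```

With one set of group fixed effects, this demeaning gives the same slope as including a dummy for every group; with two or more sets of fixed effects, a single pass of demeaning is generally not enough and estimation software iterates instead.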
No FE (left) → origin FE (centre) → origin + destination FE (right). Each step strips group means and shrinks the air-time spread toward within-route variation.
| Model | air_time | Within R² |
|---|---|---|
| No FE | −0.0031 | — |
| Origin FE | −0.0061 | 0.00058 |
| Origin + Dest FE | −0.0067 | 1.19e-5 |
Within-route, longer-than-usual air time predicts slightly less delay — possibly tail-wind days.
Raw pooled cross-section (left) vs. individual fixed-effects residualized scatter (right): log wage against experience.
0.122
within-person return to experience (R² 0.148 → 0.617 once individual FE are added)
Act III
Objection. If a residualized scatter recovers the “true” \(+0.212\), doesn’t FWL deliver the causal effect?
Response. No. FWL is an algebraic identity, not an identification strategy. The plot is honest only if you control for the right variables; partial out the wrong set and the scatter faithfully draws a biased number. And the identity holds exactly only for linear models estimated by least squares.