MGWFER — Interactive Lab

A pedagogical companion to MGWFER: Causal Spatially Varying Coefficients via Panel Fixed Effects

Why MGWFER? Why fixed effects in a spatial model?

Suppose you want to estimate how the relationship between covariates and an outcome varies across space — say, how education's effect on income changes from neighbourhood to neighbourhood. If some unobserved attribute of place (geographic amenities, persistent institutions, social norms) drives both the outcome and the covariate levels, your spatially varying coefficients absorb that contamination. The result: coefficient surfaces that look like local effects but are actually omitted variable bias.

MGWFER (Multiscale Geographically Weighted Fixed Effects Regression; Li & Fotheringham 2026) combines two ideas: a panel within-transformation that removes every time-invariant confounder, and Multiscale GWR that lets each covariate have its own bandwidth. In this lab you turn the dials yourself: slide a confounder coupling and watch a pooled estimator's coefficients explode; flip on the within-transformation and watch them snap back to truth; compare three local estimators on the post's own recovery metrics.

The backdoor path: spatial context as a hidden confounder

A standard regression assumes covariates are exogenous: they only act on the outcome. The paper's headline observation is that, in geography, this is rarely true. Wealthy regions invest more in transit; coastal regions have more tourism; old-industrial regions have higher unemployment. Place causes the covariate levels, and place also acts on the outcome. The animation below shrinks two coefficients as a penalty grows. It is a stand-in for the within transformation, which "shrinks" the influence of all time-invariant place attributes in the L1 (LASSO-like) sense: it sets that influence to exactly zero rather than merely pushing it toward zero, as an L2 penalty would.

Read the orange L1 line as MGWFER and the blue L2 line as PMGWR's intercept absorption: PMGWR shrinks but never eliminates the confounder, MGWFER eliminates it cleanly. The vertical axis is "remaining bias" in a stylised sense.

Tab 2

Confounder Lab

Crank up the spatial-context coupling strength and watch the indirect-channel bias appear in real time. The post's δ_k parameter, as a slider.

Tab 3

Within vs Pooled

Same simulation, two estimators. Watch the within-transformation flip the sign of α̂ back to the truth. Run 100 sims for the bias-variance picture.

Tab 4

Recovery Forest

The post's own RMSE numbers, interactively. MGWR_cs / PMGWR / MGWFER across β₁, β₂, β₃, β₄, and the spatial-context surface α.

Glossary (open a card if a term is unfamiliar)

Spatial context (SC)
The bundle of time-invariant place attributes that affect both outcomes and covariate levels. In the simulation it is a smooth exponential surface; in real data it is everything you cannot measure about a county.
Indirect contextual effect (δ_k)
The strength of the link SC → x_k. When this is non-zero, OLS / MGWR recovers β_k + δ_k, not β_k. This is the slider in Tab 2.
Within transformation
Subtract each unit's time-series mean from each observation. The time-invariant confounder α_i vanishes. The MGWFER algorithm runs MGWR on this demeaned panel.
MGWR
Multiscale Geographically Weighted Regression: each covariate gets its own optimal bandwidth. Local effects can vary at different spatial scales.
PMGWR (pooled MGWR)
Naïve panel baseline. Stacks all NT observations and runs MGWR with an intercept. Cannot remove α_i. Produces β̂_1 surfaces anti-correlated with truth.
MGWFER (this paper)
Stage 1: within-transform + standardise + MGWR + back-transform. Stage 2: recover α_i with per-unit t-tests. The proposed estimator.
Intrinsic contextual effect
The unit-level α̂_i — the part of the outcome that the covariates cannot explain at this location. Stage 2 of MGWFER hands it back as a substantive surface, not a nuisance term.
Bandwidth
The kernel smoothness parameter. Cross-validation picks it. When data contain a fixed effect, the CV criterion is contaminated and picks the wrong bandwidth — see Tab 4.
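
The within transformation defined in the glossary can be illustrated in a few lines of NumPy. This is a minimal sketch, not the paper's implementation: the function name `within_transform`, the balanced N × T array layout, and all numeric values are assumptions made for illustration.

```python
import numpy as np

def within_transform(panel):
    """Subtract each unit's time-series mean; `panel` has shape (N, T)."""
    panel = np.asarray(panel, dtype=float)
    return panel - panel.mean(axis=1, keepdims=True)

rng = np.random.default_rng(0)
N, T = 5, 4
alpha = rng.normal(10.0, 3.0, size=N)   # time-invariant unit effect (hypothetical values)
x = rng.normal(size=(N, T))             # one covariate
beta = 0.5                              # illustrative slope
y = alpha[:, None] + beta * x           # noise-free, so the algebra is exact

y_dm = within_transform(y)
x_dm = within_transform(x)

# After demeaning, alpha has dropped out: y_dm equals beta * x_dm.
max_gap = np.abs(y_dm - beta * x_dm).max()
print(max_gap)
```

The printed gap is at floating-point noise level, which is the sense in which α_i "vanishes": it is constant within each unit, so subtracting the unit mean removes it regardless of its size.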

Confounder Lab — turn the SC coupling knob yourself

The paper's data generating process is x_k = δ · sc + ν: each covariate is a noisy linear function of the unobserved spatial context. The coupling strength δ governs how badly a naïve estimator will be biased. The paper's default is δ = 0.05, which produces Cor(x_k, sc) ≈ 0.84. Slide δ below to see the bias smoothly grow with the coupling.
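
To see how δ maps to a correlation, the DGP x = δ · sc + ν can be worked through directly. The sketch below assumes ν is independent of sc; the function name `implied_correlation` and the scales `sd_sc` and `sd_nu` are illustrative choices, not the paper's values.

```python
import numpy as np

def implied_correlation(delta, sd_sc, sd_nu):
    """Population Cor(x, sc) when x = delta * sc + nu and nu is independent of sc."""
    return delta * sd_sc / np.sqrt(delta**2 * sd_sc**2 + sd_nu**2)

# Hypothetical scales, chosen only for illustration.
sd_sc, sd_nu = 1.0, 1.0
deltas = np.array([0.0, 0.5, 1.0, 2.0])
theory = implied_correlation(deltas, sd_sc, sd_nu)

# Check the delta = 1.0 case by simulation.
rng = np.random.default_rng(1)
sc = rng.normal(0.0, sd_sc, size=200_000)
nu = rng.normal(0.0, sd_nu, size=200_000)
x = 1.0 * sc + nu
simulated = np.corrcoef(x, sc)[0, 1]

print(theory, simulated)
```

Under this formula the correlation depends on δ only through the ratio δ · sd(sc) / sd(ν), so a numerically small δ can still imply a high correlation when sc varies over a much wider range than ν.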

Slider notes:

  • More observations ⇒ tighter estimates, but they do not reduce the systematic bias from omitted SC.
  • All covariates share the same hidden confounder.
  • Coupling: 0 ⇒ no confounding (covariates are exogenous). Larger ⇒ indirect-channel bias dominates. The paper's setting maps to ~0.6 here.
  • Slide left for pooled (no within-transform); right for full within-transform (MGWFER).

Readouts: covariates retained (out of the total) · β̂ (pooled / shrunk), contaminated by SC · β̂ (after within / post-OLS), refit on retained set · true β = 0.50, held fixed

What to look for

  • At δ = 0, the bias vanishes. No SC → no indirect channel → no need for the within-transformation. Both estimators recover the truth.
  • As δ grows, pooled β̂ drifts away from the true β. This is paper Eq. 8 in motion: β̂_k = β_k + δ_k. The drift is monotone.
  • After the within-transformation, post-OLS tracks truth closely. This is the MGWFER mechanism made visible on simulated 1-D data.
  • Try n = 50, δ = 1.5. Even with lots of data, large coupling makes the naïve estimator hopelessly biased. The within-transformation rescues it.
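
The pattern in these bullets can be reproduced offline with a small panel simulation. This is a sketch under an assumed DGP (sc enters y with coefficient 1, unit-variance noise in x, one covariate), so its bias formula need not match the paper's expression; the helper names are hypothetical.

```python
import numpy as np

def simulate_panel(n_units, n_periods, delta, beta, rng):
    """Assumed DGP: x = delta * sc + nu, y = sc + beta * x + eps."""
    sc = rng.normal(size=n_units)                      # time-invariant spatial context
    x = delta * sc[:, None] + rng.normal(size=(n_units, n_periods))
    y = sc[:, None] + beta * x + 0.5 * rng.normal(size=(n_units, n_periods))
    return x, y

def pooled_slope(x, y):
    """OLS slope on the stacked panel (grand-mean centred, i.e. with an intercept)."""
    xc = x.ravel() - x.mean()
    yc = y.ravel() - y.mean()
    return float(xc @ yc / (xc @ xc))

def within_slope(x, y):
    """OLS slope after subtracting each unit's time mean."""
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return float((xd * yd).sum() / (xd**2).sum())

rng = np.random.default_rng(42)
beta_true = 0.5
results = {}
for delta in (0.0, 0.5, 1.0):
    x, y = simulate_panel(2000, 5, delta, beta_true, rng)
    results[delta] = (pooled_slope(x, y), within_slope(x, y))

for delta, (pooled, within) in results.items():
    print(f"delta={delta:.1f}  pooled={pooled:.3f}  within={within:.3f}")
```

In this assumed setup the pooled slope sits near the true 0.5 only at δ = 0 and drifts upward as δ grows, while the within slope stays close to 0.5 at every δ tried.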

Within vs Pooled — the sign-flip on α̂

Same simulated DGP. The only difference is whether the within-transformation is applied before estimation. Pooled OLS stacks the panel and absorbs the confounder into the slopes; FE / within demeans first and recovers the truth. The paper's headline finding in Section 6 (Table 2) is that this single move pulls β̂_1 from ≈ 6.1 (pooled) down to ≈ 1.57 (FE), a roughly 4× reduction toward the true 1.5.

Slider notes:

  • Capped at 300 so the "Run 100 sims" button finishes quickly.
  • Capped at 50 for the 100-sim run.
  • True β magnitude (held the same for pooled and within).
  • 0 = SC affects only y · 1 = SC affects only the covariates · > 0.5 = the regime the paper is built for.

Within / FE estimator

Demeans by unit · removes time-invariant SC · MGWFER's Stage 1 mechanism

Readouts: α̂ · SE(α̂) · |I_y| · |I_d| · union · λ_y, λ_d

Pooled estimator

Stacks all NT obs · cannot remove α_i · the PMGWR / pooled-OLS baseline

Readouts: α̂ · SE(α̂) · |I_y| · |I_d| · union · λ_y, λ_d

Why does this happen?

  • The pooled estimator has nowhere to put SC's influence except into the slopes. When SC drives the covariate levels (Eqs. 40–43 of the paper), every coefficient absorbs a piece of δ_k.
  • The within-transformation removes the time-invariant part of SC entirely. Since SC is assumed time-invariant (the paper's Assumption 1), it vanishes by construction — no residual bias.
  • The variance penalty for using within is small. You lose one degree of freedom per unit; you gain identification.
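
The second bullet can be written out explicitly. The derivation below is a sketch in notation chosen to match the Stage 2 formula quoted later on this page; it assumes the local coefficients β_k(u_i, v_i) are constant over time, and the symbol ε_it for the idiosyncratic error is an assumption.

```latex
\begin{aligned}
y_{it} &= \alpha_i + \sum_k \beta_k(u_i, v_i)\, x_{ikt} + \varepsilon_{it} \\
\bar{y}_i &= \alpha_i + \sum_k \beta_k(u_i, v_i)\, \bar{x}_{ik} + \bar{\varepsilon}_i \\
y_{it} - \bar{y}_i &= \sum_k \beta_k(u_i, v_i)\,\bigl(x_{ikt} - \bar{x}_{ik}\bigr) + \bigl(\varepsilon_{it} - \bar{\varepsilon}_i\bigr)
\end{aligned}
```

Subtracting the second line from the first cancels α_i exactly, whatever its value. The cost is that the N unit means are estimated implicitly, which is the one degree of freedom per unit mentioned above.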

Bias vs variance over many simulations

Single runs are noisy. Run the whole pipeline 100 times with fresh draws (same parameters, different ε and ν) to see whether the pooled bias is systematic — and whether the within estimator's noise is centred on truth.
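
A bare-bones version of that repeated-simulation experiment might look like the following. It is a sketch under an assumed DGP (one covariate, sc entering y with coefficient 1, unit-variance noise); the function name `one_draw` and all parameter values are illustrative.

```python
import numpy as np

def one_draw(rng, n_units=300, n_periods=5, delta=0.8, beta=0.5):
    """Simulate one panel and return (pooled slope, within slope)."""
    sc = rng.normal(size=(n_units, 1))                       # time-invariant confounder
    x = delta * sc + rng.normal(size=(n_units, n_periods))   # fresh nu each draw
    y = sc + beta * x + rng.normal(size=(n_units, n_periods))  # fresh eps each draw

    xc, yc = x - x.mean(), y - y.mean()                      # pooled: centre on the grand mean
    pooled = (xc * yc).sum() / (xc**2).sum()

    xd = x - x.mean(axis=1, keepdims=True)                   # within: centre on unit means
    yd = y - y.mean(axis=1, keepdims=True)
    within = (xd * yd).sum() / (xd**2).sum()
    return pooled, within

rng = np.random.default_rng(7)
draws = np.array([one_draw(rng) for _ in range(100)])        # shape (100, 2)

bias = draws.mean(axis=0) - 0.5          # average error of each estimator
spread = draws.std(axis=0, ddof=1)       # run-to-run standard deviation

print("bias   (pooled, within):", bias)
print("spread (pooled, within):", spread)
```

Averaging over draws separates the two failure modes: a bias that does not shrink with more repetitions is systematic, while the spread shows how much any single run can wander.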

Recovery forest — the post's own RMSE numbers

These numbers come straight from model_comparison.csv in the post's folder — the recovery RMSE (lower is better) of three local estimators against the known true coefficient surfaces. Toggle outcomes and methods to compare. Hover a point to see the bandwidth the estimator picked.
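
For readers who want to compute the same kind of recovery metric on their own surfaces, here is a minimal sketch. It does not read model_comparison.csv (its column layout is not described here); the function name `recovery_metrics` and the toy dome surface are assumptions for illustration.

```python
import numpy as np

def recovery_metrics(estimate, truth):
    """Return (RMSE, Pearson correlation) between an estimated and a true surface."""
    estimate = np.asarray(estimate, dtype=float).ravel()
    truth = np.asarray(truth, dtype=float).ravel()
    rmse = float(np.sqrt(np.mean((estimate - truth) ** 2)))
    corr = float(np.corrcoef(estimate, truth)[0, 1])
    return rmse, corr

# Toy "true" surface on a hypothetical 15 x 15 grid: a quadratic dome.
u, v = np.meshgrid(np.arange(15), np.arange(15))
truth = 3.0 - ((u - 7) ** 2 + (v - 7) ** 2) / 49.0

shifted = truth - 5.0      # right shape, wrong level
flipped = -truth           # wrong shape (anti-correlated)

rmse_shifted, corr_shifted = recovery_metrics(shifted, truth)
rmse_flipped, corr_flipped = recovery_metrics(flipped, truth)
print(rmse_shifted, corr_shifted)
print(rmse_flipped, corr_flipped)
```

The two toy cases show why RMSE and correlation are worth reading together: a surface can correlate perfectly with the truth while its magnitudes are far off, and RMSE alone does not reveal whether the spatial pattern is reversed.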

What to look for

  • Toggle off "alpha (sc)" first: its RMSE values (up to 25.6) dominate the axis. Once it is hidden, the β coefficient gap becomes visible: every MGWFER RMSE is at least an order of magnitude smaller than its PMGWR / MGWR_cs counterpart.
  • Hover the bandwidth bar chart below to see why PMGWR fails: every bandwidth collapses into 44–50, because under SC coupling every covariate looks like the same noisy proxy. MGWFER spreads to [50, 91, 116, 62] — recovering true process scales.
  • Look at β_4 (null effect). Truth is zero. PMGWR's RMSE is 1.86; MGWFER's is 0.14. That difference is exactly the indirect-channel bias mechanism.
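
To make the role of a bandwidth concrete, the sketch below computes kernel weights and one local regression at a single location. It assumes an adaptive bisquare kernel in which the bandwidth is a nearest-neighbour count; the page does not state which kernel the estimators use, so treat this as one common GWR convention rather than the post's setup. The function names and the toy grid are hypothetical.

```python
import numpy as np

def bisquare_weights(coords, focal, n_neighbors):
    """Adaptive bisquare weights around coords[focal]; zero at and beyond the cutoff."""
    d = np.linalg.norm(coords - coords[focal], axis=1)
    h = np.sort(d)[n_neighbors - 1]        # distance to the n-th nearest point (self included)
    w = (1.0 - (d / h) ** 2) ** 2
    w[d >= h] = 0.0
    return w

def local_slope(coords, x, y, focal, n_neighbors):
    """Weighted least squares of y on [1, x] using the kernel weights at one location."""
    w = bisquare_weights(coords, focal, n_neighbors)
    X = np.column_stack([np.ones_like(x), x])
    XtW = X.T * w                          # scale each observation by its weight
    coef = np.linalg.solve(XtW @ X, XtW @ y)
    return float(coef[1])

# Hypothetical 15 x 15 grid of units.
u, v = np.meshgrid(np.arange(15), np.arange(15))
coords = np.column_stack([u.ravel(), v.ravel()]).astype(float)
focal = 112                                # centre cell of the grid

w_narrow = bisquare_weights(coords, focal, 50)
w_wide = bisquare_weights(coords, focal, 116)
print((w_narrow > 0).sum(), (w_wide > 0).sum())

# Sanity check on noise-free data with a constant slope of 2.
rng = np.random.default_rng(3)
x = rng.normal(size=coords.shape[0])
y = 1.0 + 2.0 * x
slope = local_slope(coords, x, y, focal, 50)
print(slope)
```

A larger bandwidth pulls in more neighbours and therefore smooths over a wider area; giving each covariate its own bandwidth is what lets a multiscale model represent processes that operate at different spatial scales.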

Outcomes (coefficient or surface)

Methods

Why is PMGWR's α̂ surface so far from the truth?

The local intercept of PMGWR has to absorb whatever residual variation in y the slopes do not capture. With the indirect channel active, the slopes are systematically inflated to absorb SC's contribution to y. The intercept then receives a negative shift to balance the books. Result: PMGWR's intercept ranges over [-11.27, 10.04] when the truth is [2.07, 51.55]. The Pearson correlation with truth is 0.98 — the shape is right — but the magnitudes are wildly wrong.

MGWFER's Stage 2 reconstructs α̂_i directly from the unit means and the Stage 1 slopes (paper Eq. 30): α̂_i = ȳ_i − Σ_k β̂_bwk(u_i, v_i) · x̄_{ik}. The result is a near-perfect recovery: Corr ≈ 1.000, range [1.45, 51.62].
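
The Stage 2 formula above translates almost line for line into array code. The sketch below covers only the point reconstruction of α̂_i, not the per-unit t-tests; the function name `recover_alpha`, the array shapes, and the simulated inputs are assumptions, and the true slopes stand in for the Stage 1 estimates.

```python
import numpy as np

def recover_alpha(y, x, beta_hat):
    """alpha_hat_i = y_bar_i - sum_k beta_hat[i, k] * x_bar[i, k].

    y: (N, T) outcomes; x: (N, T, K) covariates;
    beta_hat: (N, K) local slopes evaluated at each unit's location.
    """
    y_bar = y.mean(axis=1)                 # (N,)
    x_bar = x.mean(axis=1)                 # (N, K)
    return y_bar - (beta_hat * x_bar).sum(axis=1)

rng = np.random.default_rng(11)
N, T, K = 225, 3, 2                        # panel size as on this page; K is illustrative
alpha = rng.uniform(2.0, 50.0, size=N)     # hypothetical unit effects
beta = rng.normal(1.0, 0.3, size=(N, K))   # hypothetical local slopes
x = rng.normal(size=(N, T, K))
y = alpha[:, None] + np.einsum("ntk,nk->nt", x, beta)   # noise-free for an exact check

alpha_hat = recover_alpha(y, x, beta)
print(np.abs(alpha_hat - alpha).max())
```

With noise-free data and the true slopes the reconstruction is exact, which makes clear that any error in α̂_i in practice comes from error in the Stage 1 slopes and from the averaged noise term.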

Connecting back to Tab 3

The Pooled-vs-Within contrast you just experimented with on a simulated panel is exactly the gap between PMGWR and MGWFER on the post's spatial simulation:

  • β₁ (the quadratic dome): PMGWR RMSE 2.30, MGWFER RMSE 0.18. PMGWR's correlation with truth is −0.46 — anti-correlated. MGWFER's is +0.82.
  • β₄ (the null): PMGWR RMSE 1.86 (against a true 0), MGWFER RMSE 0.14. PMGWR detects spurious spatial structure aligned with SC's column gradient.
  • α (spatial context): PMGWR RMSE 25.6, MGWFER RMSE 0.54. A 47-fold reduction. The post's Figure 5 is this number, mapped.
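
The fold reductions implied by these bullets are simple ratios of the quoted RMSE values, which can be checked as follows. Only the numbers listed above are used; the dictionary keys are labels chosen for this sketch.

```python
# (PMGWR RMSE, MGWFER RMSE) as quoted in the bullets above.
rmse = {
    "beta_1": (2.30, 0.18),
    "beta_4": (1.86, 0.14),
    "alpha": (25.6, 0.54),
}

# Fold reduction = PMGWR RMSE divided by MGWFER RMSE.
fold = {name: pmgwr / mgwfer for name, (pmgwr, mgwfer) in rmse.items()}

for name, ratio in fold.items():
    print(f"{name}: {ratio:.1f}x")
```

The ratios come out at roughly 13 for β₁ and β₄ and roughly 47 for α, consistent with the "order of magnitude" and "47-fold" descriptions on this page.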

The takeaway from §8–§11 of the post is therefore visible twice: once on a controlled 1-D simulation where you set the truth (Tab 3), and once on the 225-unit × 3-period spatial panel that motivates the whole exercise.