Why spatial models? Crime does not stop at the border.
A neighborhood's crime rate depends not only on its own income and housing values but also on conditions in adjacent neighborhoods — through displacement (criminals move next door), diffusion (networks span borders), and shared exposure to risk factors. Ordinary least squares treats each neighborhood as independent and misses these spatial spillovers.
The Columbus crime data has Moran's I = 0.222 (p = 0.005),
confirming that OLS residuals cluster geographically. This app lets you turn
the dials yourself. Sweep ρ and λ in a SAR / SEM / SDM simulator, watch a
shock ripple through the spatial multiplier (I − ρW)⁻¹,
and compare direct/indirect/total effects across the full eight-model
taxonomy on the post's actual estimates.
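As a rough illustration of what Moran's I measures, the sketch below computes it on a 7×7 lattice with a row-standardised rook weight matrix. The lattice, the helper names `rook_weights` and `morans_i`, and the two test patterns are assumptions made for this example; they are not the Columbus data or the app's own code.

```python
import numpy as np

def rook_weights(rows, cols):
    """Row-standardised rook-contiguity weights for a rows x cols lattice."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    W[i, rr * cols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def morans_i(values, W):
    """Moran's I = (n / S0) * (z' W z) / (z' z), with z the demeaned values."""
    z = values - values.mean()
    n = len(z)
    return (n / W.sum()) * (z @ W @ z) / (z @ z)

W = rook_weights(7, 7)
rows, cols = np.divmod(np.arange(49), 7)

gradient = cols.astype(float)               # smooth west-to-east trend
checkerboard = ((rows + cols) % 2).astype(float)  # neighbours always differ

i_gradient = morans_i(gradient, W)
i_checker = morans_i(checkerboard, W)
print("gradient:", round(i_gradient, 3), "checkerboard:", round(i_checker, 3))
```

The smooth gradient yields a strongly positive value (similar values sit next to each other), while the checkerboard yields −1 (every neighbour differs). A value such as 0.222 sits between no clustering and the smooth-gradient extreme.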
The shrinkage analogy — why ρ matters
Both ρ (spatial lag) and λ (spatial error) govern how much of a shock's "extra" effect survives as it propagates through neighbours. The animation below sweeps a penalty-style knob (here, the spatial parameter) from zero to one. The orange curve mimics the SAR-type multiplier 1 / (1 − ρ), which amplifies the shock as ρ grows; the steel dashed curve shows the complementary direction (compression).
Spillover Animation
Drop a shock on one cell of a 7×7 lattice and watch the spatial multiplier (I − ρW)⁻¹ propagate it. Vary ρ to see how a ρ = 0.43 world differs from a ρ = 0.80 world.
SAR / SEM / SDM Simulator
Generate fake Columbus-like crime data with known ρ, λ, and θ. Estimate ρ̂ and λ̂ back from the sample. See how SAR and SEM compete on the same data.
Direct / Indirect / Total Effects
The post's headline table, as a forest plot. Toggle the eight models and the two regressors. See why SDM and SDEM win — and why SAR forces the wrong sign on HOVAL spillovers.
The three key takeaways this app is built around
- Spatial autocorrelation is real and substantively important. Moran's I = 0.222 (p = 0.005); ignoring it produces a misspecified OLS. The post's LM tests favour the error form (LM = 5.33 vs 3.40), but the full taxonomy is needed to disentangle local from global spillovers.
- SDM and SDEM are the preferred specifications. Both allow the indirect-to-direct ratio to differ across regressors. The SAR forces a constant ratio (≈ 0.69, i.e. −0.76 / −1.10 for INC), which the data does not support for HOVAL. SDEM gives a significant negative spillover of neighbour income on local crime (θ̂W·INC = −1.20, p = 0.036).
- The total income effect is 40–55% larger than OLS. OLS estimates a coefficient of −1.60 per \$1,000. SDM and SDEM give total effects of −2.26 to −2.52 — meaning OLS understates the income–crime relationship by ignoring the substantial spillover from neighbouring tracts' wealth.
Glossary (open a card if a term is unfamiliar)
Spatial weight matrix W
Spatial autocorrelation
ρ (rho) — spatial lag of y
Wy in SAR. Measures global feedback: a shock to one tract propagates through the network and partially returns to itself. Estimated as 0.428 in the Columbus SAR.
θ (theta) — spatial lag of X
WX in SLX / SDM / SDEM. A local spillover: neighbour income or housing affects this tract directly, without feedback. SDEM gives θ̂INC = −1.20 (p = 0.036).
λ (lambda) — spatial error
Wu in SEM / SAC / GNS. Captures spatially correlated unobservables. Substantive at 0.562 in SEM; weakens to 0.166 in SAC when ρ is also fitted.
Spatial multiplier (I − ρW)⁻¹
Direct effect
Indirect (spillover) effect
SAR vs SEM
SDM vs SDEM
Spillover Animation — watch the multiplier propagate
A 7×7 lattice of "tracts" (49 cells, matching the Columbus n). Click any
cell to drop a unit shock on it. The animation iteratively spreads it
through a row-standardised rook spatial weight matrix at rate
ρ. After many steps, the system stabilises at
(I − ρW)⁻¹ times the initial shock. Slide ρ to see
the difference between a weak (ρ = 0.2) and strong (ρ = 0.8) spatial world.
Heat colour shows the steady-state response of crime in each tract. Brighter = stronger spillover received from the shocked cell.
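The iterative spreading described above can be sketched as a fixed-point iteration, x ← shock + ρ·W·x, and checked against the closed-form steady state. This is a minimal sketch under assumptions: the update rule, the helper `rook_weights`, and the choice of the centre cell are illustrative and may differ from how the app animates the shock.

```python
import numpy as np

def rook_weights(rows, cols):
    """Row-standardised rook-contiguity weights for a rows x cols lattice."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    W[i, rr * cols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

rho = 0.43                      # the Columbus SAR estimate quoted in the text
W = rook_weights(7, 7)
shock = np.zeros(49)
shock[24] = 1.0                 # unit shock on the centre cell (row 3, col 3)

# Each step adds one more "round" of neighbour feedback: x <- shock + rho * W x
x = shock.copy()
for _ in range(200):
    x = shock + rho * (W @ x)

# Closed form: solve (I - rho W) x = shock instead of inverting explicitly
steady = np.linalg.solve(np.eye(49) - rho * W, shock)

print("max gap between iteration and closed form:", np.max(np.abs(x - steady)))
print(steady.reshape(7, 7).round(3))
```

Because |ρ| < 1 and W is row-standardised, each extra round contributes a geometrically smaller term, so the loop converges to the same vector that the linear solve returns.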
What to look for
- At ρ = 0, the shock stays in the originating cell. No spillover. This is OLS.
- At ρ = 0.43 (the Columbus SAR estimate), the shock visibly bleeds into the 4 immediate neighbours, then their neighbours, decaying geometrically. The diagonal of (I − ρW)⁻¹ is about 1.13, the own-tract amplification.
- At ρ = 0.8 or above, the entire 7×7 grid lights up. Every tract is meaningfully affected by every other tract's shock. This is the regime where spatial models must be used.
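- The multiplier 1 / (1 − ρ) is the row sum of (I − ρW)⁻¹ when W is row-standardised, so it gives the total (direct plus indirect) amplification; the average diagonal is only the direct part. At ρ = 0.43, the total is ≈ 1.75. The post's §5.1 estat impact output shows the direct part for INC: direct effect = −1.10, bare coefficient = −1.03, so the average diagonal is about 1.07.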
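The link between 1 / (1 − ρ) and the multiplier can be written out in a few lines. The derivation below assumes |ρ| < 1 and a row-standardised W (every row sums to one, so W·1 = 1); the symbol 1 for the vector of ones is notation introduced here.

```latex
\begin{aligned}
(I - \rho W)^{-1} &= I + \rho W + \rho^{2} W^{2} + \cdots
                   = \sum_{k=0}^{\infty} \rho^{k} W^{k}, \qquad |\rho| < 1, \\
(I - \rho W)^{-1}\,\mathbf{1} &= \sum_{k=0}^{\infty} \rho^{k}\, W^{k}\mathbf{1}
                   = \sum_{k=0}^{\infty} \rho^{k}\,\mathbf{1}
                   = \frac{1}{1-\rho}\,\mathbf{1}, \\
\frac{1}{1 - 0.43} &\approx 1.75 .
\end{aligned}
```

Every row of the multiplier therefore sums to 1 / (1 − ρ). The diagonal entry of a row is the share that stays in the shocked tract (direct), and the rest of the row is what reaches other tracts (indirect).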
SAR / SEM / SDM Simulator — generate data, estimate it back
Below, we simulate fake Columbus-like data on a 7×7 lattice (n = 49) with known true parameters ρ (SAR-style feedback), λ (SEM-style error correlation), and θ (SLX-style covariate spillover). We then estimate ρ̂ and λ̂ back from the simulated sample to see what we recover. Slide ρ, λ, or θ and watch Moran's I shift — and watch the estimated parameters track (or fail to track) the truth.
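One common way to write a data-generating process that contains all three parameters is the general nesting form below. This is an assumed form for illustration; the app's exact simulation equations are not shown on this page.

```latex
\begin{aligned}
y &= \rho W y + X\beta + W X\theta + u, \\
u &= \lambda W u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^{2} I), \\
y &= (I - \rho W)^{-1}\bigl[\,X\beta + W X\theta + (I - \lambda W)^{-1}\varepsilon\,\bigr].
\end{aligned}
```

Setting θ = λ = 0 leaves a SAR, setting ρ = θ = 0 leaves a SEM, and setting λ = 0 leaves an SDM, which is why a single simulator with three sliders can cover all of them.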
Truth: what we simulated
Estimates: what we recovered
ρ̂ comes from a quick concentrated likelihood evaluated on a grid; λ̂ comes from the residual Moran-statistic mapping. Both are pedagogical approximations, not as accurate as Stata's spregress, ml.
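A grid search over the SAR concentrated log-likelihood might look like the sketch below. It is an illustration under assumptions, not the app's code: the function names, the β values, the grid range, and the single-regressor design are all hypothetical, and only ρ̂ is estimated (the Moran-based λ̂ mapping is not reproduced).

```python
import numpy as np

def rook_weights(rows, cols):
    """Row-standardised rook-contiguity weights for a rows x cols lattice."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    W[i, rr * cols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def simulate_sar(rho, W, rng, beta=(10.0, -1.5)):
    """Draw y = (I - rho W)^-1 (X beta + eps); beta values are hypothetical."""
    n = W.shape[0]
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    eps = rng.normal(size=n)
    y = np.linalg.solve(np.eye(n) - rho * W, X @ np.asarray(beta) + eps)
    return y, X

def estimate_rho(y, X, W, grid=None):
    """Pick the rho on a grid that maximises the concentrated log-likelihood.

    For each rho: regress (y - rho W y) on X, take sigma^2 = RSS / n, and score
    log|I - rho W| - (n / 2) log(sigma^2).
    """
    if grid is None:
        grid = np.arange(-0.90, 0.96, 0.01)
    n = len(y)
    eye = np.eye(n)
    Wy = W @ y
    best_rho, best_ll = grid[0], -np.inf
    for rho in grid:
        y_tilde = y - rho * Wy
        beta_hat, *_ = np.linalg.lstsq(X, y_tilde, rcond=None)
        resid = y_tilde - X @ beta_hat
        sigma2 = resid @ resid / n
        _, logdet = np.linalg.slogdet(eye - rho * W)
        ll = logdet - 0.5 * n * np.log(sigma2)
        if ll > best_ll:
            best_rho, best_ll = rho, ll
    return float(best_rho)

W = rook_weights(7, 7)
rng = np.random.default_rng(42)
y, X = simulate_sar(0.6, W, rng)
print("true rho = 0.6, estimated rho =", round(estimate_rho(y, X, W), 2))
```

With only 49 observations a single draw can land noticeably away from the truth, so averaging ρ̂ over repeated draws gives a fairer picture of how well the estimator tracks ρ.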
What to look for
- Set ρ = 0 and λ = 0: Moran's I should hover near zero. OLS is correct. No spatial model needed.
- Set ρ = 0.6, λ = 0: Strong SAR data. ρ̂ should track ρ. OLS β̂INC looks biased compared to the SDM β̂INC.
- Set ρ = 0, λ = 0.6: Strong SEM data. ρ̂ from a misspecified SAR will be biased (it will catch some of λ). This is exactly the LM-test motivation in §4.3 of the post.
- Set ρ = 0.4, λ = 0.4 simultaneously: The GNS regime. Both parameters become very hard to identify (large standard errors), reproducing the post's overparameterization warning in §8.3.
The post's headline numbers — interactively
These estimates come straight from the post — §5 (SAR, SEM), §6 (SLX, SDM), §8 (SDEM, SAC, GNS), and the §9.2 comparison table. The y-axis lists the eight spatial models. The facets are the three impact components: direct, indirect (spillover), and total. Toggle which models and which regressor (INC or HOVAL) to display.
What to look for
- Toggle OLS only: watch the indirect-effect facet collapse to zero. OLS has no spillover channel by construction. SEM is similar: zero indirect by construction (the error structure does not propagate to outcomes).
- Toggle SLX, SDM, SDEM together: for INC, all three give large negative indirect effects (−1.20 to −1.50). For HOVAL, all three give small positive insignificant indirect effects. This agreement across non-nested models is the §9.1 robustness argument.
- Switch the regressor to HOVAL: notice that SAR forces a negative indirect effect (−0.20), proportional to its direct effect. The SLX, SDM, and SDEM give positive indirect effects. The SAR's proportional-ratio constraint is doing harm here.
- Compare total effects: for INC, OLS gives −1.60. SDM/SDEM give −2.26 to −2.52. That is the 40–55% understatement the post highlights as the headline finding.
Why does SAR force a constant indirect/direct ratio?
In SAR, all spillovers come from the spatial multiplier (I − ρW)⁻¹.
That matrix is the same for every regressor — so the ratio of off-diagonal
to diagonal terms is identical across INC and HOVAL. The bare coefficient
multiplies through both. In Columbus, this forces HOVAL's indirect effect
to be ≈ 0.69 × HOVAL's direct effect, with the same sign (negative). But
the SLX, SDM, and SDEM all estimate the HOVAL indirect freely — and find
it small and positive. The data prefers the unrestricted ratio.
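The constant-ratio argument can be checked numerically. The sketch below uses a 7×7 rook lattice rather than the Columbus weight matrix, and the (β, θ) pairs are hypothetical values chosen for illustration, not the post's estimates, so the printed ratios will not match the 0.69 quoted above.

```python
import numpy as np

def rook_weights(rows, cols):
    """Row-standardised rook-contiguity weights for a rows x cols lattice."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    W[i, rr * cols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def impacts(S):
    """Average direct, indirect, and total effects from an n x n impact matrix."""
    n = S.shape[0]
    direct = np.trace(S) / n        # mean of the diagonal
    total = S.sum() / n             # mean row sum
    return direct, total - direct, total

rho = 0.43
W = rook_weights(7, 7)
eye = np.eye(49)
M = np.linalg.inv(eye - rho * W)    # spatial multiplier

# Hypothetical (beta, theta) pairs, NOT the post's estimates
coefs = {"INC": (-1.0, -0.8), "HOVAL": (-0.3, 0.2)}

sar_ratio, sdm_ratio, sar_total = {}, {}, {}
for name, (beta, theta) in coefs.items():
    # SAR: impact matrix is beta * M, so beta cancels in indirect / direct
    d, i, t = impacts(beta * M)
    sar_ratio[name], sar_total[name] = i / d, t
    # SDM: impact matrix is M (beta I + theta W), so theta shifts the ratio
    d, i, t = impacts(M @ (beta * eye + theta * W))
    sdm_ratio[name] = i / d

print("SAR indirect/direct:", {k: round(v, 3) for k, v in sar_ratio.items()})
print("SDM indirect/direct:", {k: round(v, 3) for k, v in sdm_ratio.items()})
```

In the SAR rows the two ratios coincide because β scales the diagonal and off-diagonal parts of the multiplier equally. In the SDM rows the θ·W term enters the impact matrix separately, so the ratio can differ across regressors and can even change sign, as it does for the hypothetical HOVAL pair here.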
Connecting back to Tab 3
The DGP simulator in Tab 3 generates data with known ρ, λ, θ. When you set θ ≠ 0, an OLS estimator hides the spillover entirely. When you set ρ ≠ 0, a SAR recovers the spillover but forces it to be proportional across regressors. The forest plot above is the real-data counterpart: every row shows what each estimator finds when the truth is unknown but spatial.
- INC total effect: OLS = −1.60; SAR = −1.86; SDM = −2.52; SDEM = −2.26. Spread of 0.92 across models.
- INC direct effect: all spatial models cluster between −0.94 and −1.10. The direct effect is robust to specification choice.
- INC indirect effect: 0 (OLS, SEM), −0.76 (SAR), −1.20 to −1.50 (SLX, SDM, SDEM). Identifying the indirect effect is the value-added of spatial econometrics.