Crime in Columbus neighborhoods — the SAR/SEM/SDM taxonomy
Nagoya University (GSID)
June 11, 2026
Act I
A tract’s crime depends on its own income and housing — and on what happens next door: displacement, diffusion, shared risk.
Treat 49 neighborhoods as 49 independent observations and you may bias every coefficient. So we must test that assumption.
\[I = \frac{N}{S_0}\cdot\frac{e' W e}{e' e}\]
On the OLS residuals, \(I = 0.222\) (\(z = 2.84\), \(p = 0.005\)): high-crime tracts neighbour high-crime tracts. The i.i.d.-error assumption fails.
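The statistic above can be sketched in a few lines. This is a minimal illustration on a hypothetical four-tract line graph, not the Columbus weights; the function name `morans_i` and the toy residual vectors are assumptions made for the example.

```python
def morans_i(e, W):
    """Moran's I = (N / S0) * (e' W e) / (e' e) for a residual vector e."""
    n = len(e)
    s0 = sum(sum(row) for row in W)  # S0: sum of all weights
    num = sum(e[i] * W[i][j] * e[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in e)
    return (n / s0) * num / den

# Hypothetical example: four tracts on a line (1-2-3-4), row-standardized.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
clustered = [2.0, 1.0, -1.0, -2.0]    # similar residuals sit next to each other
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbours have opposite signs

print(morans_i(clustered, W))    # 0.5  (positive spatial autocorrelation)
print(morans_i(alternating, W))  # -1.0 (negative spatial autocorrelation)
```

A positive value, as with the \(I = 0.222\) reported here, means like residuals cluster together.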
Act II
CRIME: burglaries + vehicle thefts per 1,000 households (mean 35.1)
INC (income, $1k; mean 14.4) and HOVAL (house value, $1k; mean 38.4)
Row-standardized means each row of \(W\) sums to 1, so the spatial lag \(W y\) is the weighted average among a tract’s neighbours, about 4.8 of them on average.
\(W\cdot INC\) and \(W\cdot HOVAL\) are pre-computed in Mata, then entered as ordinary regressors in SLX/SDM/SDEM/GNS.
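The deck does this step in Mata; the same idea can be sketched in plain Python. The contiguity matrix `A`, the income values, and the helper names `row_standardize` and `spatial_lag` are hypothetical, chosen only to show the mechanics.

```python
def row_standardize(A):
    """Divide each row of a 0/1 contiguity matrix by its row sum."""
    W = []
    for row in A:
        total = sum(row)
        W.append([v / total if total else 0.0 for v in row])
    return W

def spatial_lag(W, x):
    """Return W x: for each tract, the weighted average of x over its neighbours."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

# Hypothetical 4-tract contiguity (not the Columbus weights).
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
inc = [10.0, 14.0, 18.0, 22.0]  # made-up incomes in $1k

W = row_standardize(A)
w_inc = spatial_lag(W, inc)  # enters the regression as an ordinary column
print(w_inc)
```

Tract 1 has neighbours 2 and 3, so its lagged income is the average of 14 and 18, that is, 16.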
| Test | Statistic | \(p\) | Reading |
|---|---|---|---|
| LM-error (\(\lambda\)) | 5.33 | 0.021 | favours SEM |
| LM-lag (\(\rho\)) | 3.40 | 0.065 | weaker |
| Robust LM-error | 2.19 | 0.139 | survives |
| Robust LM-lag | 0.26 | 0.612 | fades |
Anselin’s rule favours SEM here — but the full taxonomy will tell a subtler story.
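The \(p\)-values in the table can be checked by hand, assuming each LM statistic is asymptotically \(\chi^2\) with one degree of freedom under its null. A minimal sketch using only the standard library (the helper name `chi2_1_pvalue` is an assumption for this example):

```python
import math

def chi2_1_pvalue(stat):
    """Upper-tail p-value of a chi-square(1) variable: erfc(sqrt(stat / 2))."""
    return math.erfc(math.sqrt(stat / 2.0))

# Statistics as reported in the table (rounded to two decimals).
lm_tests = {
    "LM-error": 5.33,
    "LM-lag": 3.40,
    "Robust LM-error": 2.19,
    "Robust LM-lag": 0.26,
}
p_values = {name: chi2_1_pvalue(stat) for name, stat in lm_tests.items()}

for name, p in p_values.items():
    print(f"{name:16s} stat={lm_tests[name]:.2f}  p={p:.3f}")
```

Because the statistics are rounded, the recomputed values can differ from the table in the third decimal.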
\[y = \rho W y + X\beta + \varepsilon\]
\(\rho = 0.428\) (\(z = 3.49\), \(p < 0.001\)): a contagion channel. Because \(W y\) is endogenous, we fit by maximum likelihood, not OLS.
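The contagion reading follows from the reduced form, a standard result when \(|\rho| < 1\) and \(W\) is row-standardized:
\[y = (I - \rho W)^{-1}(X\beta + \varepsilon), \qquad (I - \rho W)^{-1} = I + \rho W + \rho^2 W^2 + \cdots\]
Each row of \(W\) sums to 1, so each row of \((I - \rho W)^{-1}\) sums to \(1/(1-\rho)\). With \(\rho = 0.428\), a unit change applied to every tract is amplified by about \(1/(1 - 0.428) \approx 1.75\).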
\[y = X\beta + u, \quad u = \lambda W u + \varepsilon\]
\(\lambda = 0.562\) (\(z = 4.23\), \(p < 0.001\)). Spatial dependence is a nuisance in the error — so the SEM yields zero spillover effects by construction.
\[y = X\beta + W X\theta + \varepsilon\]
| Variable | Direct | Indirect | Total |
|---|---|---|---|
| INC | −1.10*** | −1.40** | −2.50*** |
| HOVAL | −0.29*** | +0.21 | −0.08 |
The coefficient on \(W\cdot INC\) is \(-1.40\) (\(p = 0.016\)): richer neighbours mean less own crime. SLX has no spatial multiplier, so \(\theta\) is itself the indirect effect.
\[y = \rho W y + X\beta + W X\theta + \varepsilon\]
Two spillover channels at once: global feedback through \(\rho W y\) and local spillover through \(W X\theta\).
The SDM nests SAR (\(\theta = 0\)), SLX (\(\rho = 0\)), and SEM (common-factor) — so we can test down from it.
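The common-factor restriction can be seen by pre-multiplying the SEM by \((I - \lambda W)\), a standard derivation sketched here:
\[(I - \lambda W)\,y = (I - \lambda W)X\beta + \varepsilon \;\Longrightarrow\; y = \lambda W y + X\beta - \lambda W X\beta + \varepsilon\]
This is an SDM with \(\rho = \lambda\) and \(\theta = -\rho\beta\), so the SEM is the SDM under the nonlinear restriction \(\theta + \rho\beta = 0\).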
| Variable | Direct | Indirect | Total |
|---|---|---|---|
| INC | −1.03*** | −1.50* | −2.52*** |
| HOVAL | −0.28*** | +0.22 | −0.07 |
Direct effects stay close to those of the other models; the indirect income effect is larger than in SAR (\(-0.76\)) because SDM counts both channels.
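How the two channels combine into direct, indirect, and total effects can be sketched as follows. This is an illustration under stated assumptions, not the estimation code behind the table: the weights `W_toy` are a hypothetical four-tract line, `beta = -1.0` is a placeholder rather than a reported estimate, and the inverse is approximated by a truncated power series. Only \(\rho = 0.40\) and \(\theta = -0.58\) come from the deck.

```python
def matmul(A, B):
    """Plain-Python matrix product."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def neumann_inverse(W, rho, terms=200):
    """Approximate (I - rho W)^{-1} by I + rho W + rho^2 W^2 + ... (needs |rho| < 1)."""
    n = len(W)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for _ in range(terms):
        term = [[rho * v for v in row] for row in matmul(term, W)]
        total = [[t + s for t, s in zip(rt, rs)] for rt, rs in zip(total, term)]
    return total

def sdm_effects(W, rho, beta, theta):
    """Average direct, indirect and total effects of one regressor in an SDM."""
    n = len(W)
    # S = (I - rho W)^{-1} (beta I + theta W) holds dy_i / dx_j.
    B = [[(beta if i == j else 0.0) + theta * W[i][j] for j in range(n)]
         for i in range(n)]
    S = matmul(neumann_inverse(W, rho), B)
    direct = sum(S[i][i] for i in range(n)) / n   # mean own-tract response
    total = sum(sum(row) for row in S) / n        # mean row sum
    return direct, total - direct, total

# Hypothetical row-standardized weights for four tracts on a line.
W_toy = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
rho, beta, theta = 0.40, -1.0, -0.58  # beta is a placeholder, not an estimate
direct, indirect, total = sdm_effects(W_toy, rho, beta, theta)
print(direct, indirect, total)
```

Setting `theta = 0` gives the SAR effects, and setting `rho = 0` gives SLX, where the indirect effect collapses to \(\theta\). With row-standardized weights the total is \((\beta + \theta)/(1 - \rho)\) whatever the shape of \(W\); only the split between direct and indirect depends on the weights.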
Only \(\rho\) is indispensable; \(\theta\) and \(\lambda\) survive on economics, not on a Wald test.
Act III
\(-1.20\): SDEM income spillover, \(W\cdot INC\) (\(z = -2.10\), \(p = 0.036\)), significant even after a spatial error term
| Effect | OLS | SAR | SEM | SLX | SDM | SDEM | GNS |
|---|---|---|---|---|---|---|---|
| INC direct | −1.60 | −1.10 | −0.94 | −1.10 | −1.03 | −1.05 | −1.03 |
| INC indirect | 0 | −0.76 | 0 | −1.40 | −1.50 | −1.20 | −1.37 |
| INC total | −1.60 | −1.86 | −0.94 | −2.50 | −2.52 | −2.26 | −2.40 |
Direct effects are stable everywhere; the spillover is where the models disagree — and SLX/SDM/SDEM/GNS all say it is large and negative.
Roughly 40–55% larger: the SDM/SDEM total income effect (\(-2.3\) to \(-2.5\)) vs the OLS estimate of \(-1.60\)
| Param | SDM | SDEM | GNS |
|---|---|---|---|
| \(\rho\) | 0.40** | — | 0.32 (p=.74) |
| \(\lambda\) | — | 0.40** | 0.15 (p=.88) |
| \(W\cdot INC\) | −0.58 | −1.20** | −0.69 (p=.68) |
Seven parameters chasing 49 observations: \(\rho\), \(\lambda\), and the \(\theta\)’s are only weakly identified together, so every one goes insignificant.
Objection. You’ve shown neighbours’ income predicts crime — surely that’s a policy lever.
Response. It is a robust spatial pattern, not a causal effect.
Treat \(-1.20\) as a well-measured association, not a treatment effect.