Cross-Sectional Spatial Regression in Stata

Crime in Columbus neighborhoods — the SAR/SEM/SDM taxonomy

0.428: SAR spatial lag, rho
−1.20: income spillover, SDEM
40–55%: larger total effect vs OLS

Carlos Mendez

Nagoya University (GSID)

June 11, 2026

The Tension

Act I

Crime does not respect neighborhood boundaries — but OLS pretends it does

A tract’s crime depends on its own income and housing — and on what happens next door: displacement, diffusion, shared risk.

Treat 49 neighborhoods as 49 independent observations and you may bias every coefficient. So we must test that assumption.

OLS residuals cluster in space: Moran’s I = 0.222, p = 0.005

\[I = \frac{N}{S_0}\cdot\frac{e' W e}{e' e}\]

On the OLS residuals, \(I = 0.222\) (\(z = 2.84\), \(p = 0.005\)): high-crime tracts neighbour high-crime tracts. The i.i.d.-error assumption fails.
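The formula can be checked by hand on a toy case. The Mata sketch below is illustrative only: four units on a line and made-up residuals, not the Columbus data. The names C, e, and moran are hypothetical.

```stata
mata:
// Toy contiguity: four units on a line (1-2-3-4); NOT the Columbus tracts
C = (0,1,0,0 \ 1,0,1,0 \ 0,1,0,1 \ 0,0,1,0)
W = C :/ rowsum(C)              // row-standardize: each row sums to 1
e = (2 \ 1 \ -1 \ -2)           // made-up residuals with mean zero
N  = rows(W)
S0 = sum(W)                     // sum of all weights; equals N after row-standardization
moran = (N / S0) * (e' * W * e) / (e' * e)
moran                           // positive: like residuals sit next to like
end
```

With a row-standardized \(W\), \(S_0 = N\), so the leading ratio drops out and \(I\) reduces to \(e' W e / e' e\).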

Where we’re going

  • The lab: 49 tracts, a Queen-contiguity weight matrix \(W\)
  • The eight-model taxonomy, from OLS up to GNS
  • Three spatial channels: \(\rho W y\), \(W X\theta\), \(\lambda W u\)
  • Direct vs indirect (spillover) effects — and which model to trust

The Investigation

Act II

The lab: 49 Columbus tracts linked by a Queen-contiguity matrix \(W\)

  • Outcome: CRIME, burglaries + vehicle thefts per 1,000 households (mean 35.1)
  • Covariates: INC (income, $1k; mean 14.4) and HOVAL (house value, $1k; mean 38.4)
  • Neighbours: Queen contiguity (share a border or a vertex), row-standardized

Row-standardized means each row of \(W\) sums to 1, so the spatial lag \(W y\) is the weighted average among a tract’s neighbours — about 4.8 of them.
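A minimal Mata sketch of what row-standardization does, using a hypothetical three-unit map rather than the 49 tracts. The names C, y, and Wy and the values in y are invented for illustration.

```stata
mata:
// Toy map: unit 2 touches units 1 and 3; units 1 and 3 touch only unit 2
C = (0,1,0 \ 1,0,1 \ 0,1,0)
W = C :/ rowsum(C)       // divide each row by its neighbour count
rowsum(W)                // every row now sums to 1
y  = (10 \ 20 \ 60)      // made-up outcome values
Wy = W * y               // spatial lag: average of each unit's neighbours
Wy
end
```

Unit 2 has two neighbours, so its lag is the mean of 10 and 60; units 1 and 3 have one neighbour each, so their lag is simply unit 2's value.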

One W matrix, two lines of Mata, and the spatial lags are ready

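spmatrix fromdata W = v*, normalize(row) replace
spmatrix matafromsp W_mata id_vec = W        // copy W and its ID vector into Mata
* rows of W_mata follow id_vec, so the data must be sorted in the same order
gen double W_INC = .
mata: st_store(., "W_INC", W_mata * st_data(., "INC"))       // neighbours' avg income
gen double W_HOVAL = .
mata: st_store(., "W_HOVAL", W_mata * st_data(., "HOVAL"))   // neighbours' avg house value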

\(W\cdot INC\) and \(W\cdot HOVAL\) are pre-computed in Mata, then entered as ordinary regressors in SLX/SDM/SDEM/GNS.

Eight models, three switches: turn \(\rho\), \(\theta\), \(\lambda\) on or off

The three channels

  • \(\rho W y\) — neighbours’ outcomes feed back (lag of \(y\))
  • \(W X\theta\) — neighbours’ covariates matter (lag of \(X\))
  • \(\lambda W u\) — neighbours’ errors correlate (spatial error)

The nested family

  • OLS — none on
  • SAR \(\rho\) · SEM \(\lambda\) · SLX \(\theta\)
  • SDM \(\rho{+}\theta\) · SDEM \(\theta{+}\lambda\) · SAC \(\rho{+}\lambda\)
  • GNS — all three on
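One way the eight switches could map onto Stata commands is sketched below. It assumes the data are already spset, that W was built with spmatrix, and that W_INC and W_HOVAL are the pre-computed lags. The stored-estimate names are hypothetical, and the deck's own command lines may differ.

```stata
* Sketch only: assumes spset data, spmatrix W, and pre-computed W_INC, W_HOVAL
regress   CRIME INC HOVAL                                        // OLS: no channel
estimates store m_ols
spregress CRIME INC HOVAL, ml dvarlag(W)                         // SAR: rho
estimates store m_sar
spregress CRIME INC HOVAL, ml errorlag(W)                        // SEM: lambda
estimates store m_sem
regress   CRIME INC HOVAL W_INC W_HOVAL                          // SLX: theta
estimates store m_slx
spregress CRIME INC HOVAL W_INC W_HOVAL, ml dvarlag(W)           // SDM: rho + theta
estimates store m_sdm
spregress CRIME INC HOVAL W_INC W_HOVAL, ml errorlag(W)          // SDEM: theta + lambda
estimates store m_sdem
spregress CRIME INC HOVAL, ml dvarlag(W) errorlag(W)             // SAC: rho + lambda
estimates store m_sac
spregress CRIME INC HOVAL W_INC W_HOVAL, ml dvarlag(W) errorlag(W)   // GNS: all three
estimates store m_gns
```

An alternative is to let spregress build the covariate lags itself through its ivarlag() option instead of entering pre-computed variables; which route the effect tables in this deck rely on is not stated here.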

LM tests point first to the error model — but it’s only a hint

Test Statistic \(p\) Reading
LM-error (\(\lambda\)) 5.33 0.021 favours SEM
LM-lag (\(\rho\)) 3.40 0.065 weaker
Robust LM-error 2.19 0.139 survives
Robust LM-lag 0.26 0.612 fades

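Anselin’s rule favours SEM here, since LM-error is the larger and more significant of the two statistics. But neither robust test is significant at the 10% level, so the full taxonomy will tell a subtler story.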

SAR: a tract’s crime depends directly on its neighbours’ crime, \(\rho = 0.428\)

\[y = \rho W y + X\beta + \varepsilon\]

\(\rho = 0.428\) (\(z = 3.49\), \(p < 0.001\)): a contagion channel. Because \(W y\) is endogenous, we fit by maximum likelihood, not OLS.
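Why \(\rho\) acts as a multiplier can be seen by solving the SAR equation for \(y\). This is the standard reduced form, assuming \(|\rho| < 1\) and a row-standardized \(W\):

\[y = (I - \rho W)^{-1}(X\beta + \varepsilon), \qquad (I - \rho W)^{-1} = I + \rho W + \rho^2 W^2 + \cdots\]

Because every row of \(W\) sums to 1, every row of \((I - \rho W)^{-1}\) sums to \(1/(1-\rho)\), so the average total effect of covariate \(k\) is

\[\frac{\beta_k}{1 - \rho}, \qquad \frac{1}{1 - 0.428} \approx 1.75\]

In other words, the estimated \(\rho\) implies a total effect about 1.75 times the SAR coefficient.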

SEM: spatial dependence hides in the errors, \(\lambda = 0.562\)

\[y = X\beta + u, \quad u = \lambda W u + \varepsilon\]

\(\lambda = 0.562\) (\(z = 4.23\), \(p < 0.001\)). Spatial dependence is a nuisance in the error — so the SEM yields zero spillover effects by construction.

SLX: neighbours’ income lowers your crime — a local spillover of \(-1.40\)

\[y = X\beta + W X\theta + \varepsilon\]

Variable Direct Indirect Total
INC −1.10*** −1.40** −2.50***
HOVAL −0.29*** +0.21 −0.08

\(W\cdot INC = -1.40\) (\(p = 0.016\)): richer neighbours mean less own crime. No spatial multiplier, so \(\theta\) is the indirect effect.
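Since SLX has no multiplier, the total effect is just the sum of two coefficients, which lincom can report with a standard error. A sketch, assuming W_INC and W_HOVAL are the pre-computed lags from the Mata step:

```stata
* Sketch only: assumes W_INC and W_HOVAL already exist in the data
regress CRIME INC HOVAL W_INC W_HOVAL
lincom INC + W_INC          // total income effect = direct + indirect
lincom HOVAL + W_HOVAL      // total house-value effect
```

The first lincom line corresponds to the Total column for INC in the table above.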

SDM: combine lag-of-\(y\) and lag-of-\(X\) — the general-purpose hub

\[y = \rho W y + X\beta + W X\theta + \varepsilon\]

Two spillover channels at once: global feedback through \(\rho W y\) and local spillover through \(W X\theta\).

The SDM nests SAR (\(\theta = 0\)), SLX (\(\rho = 0\)), and SEM (the common-factor restriction \(\theta = -\rho\beta\)), so we can test down from it.
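The common-factor link between SEM and SDM follows from a short standard derivation. Start from the SEM, substitute \(u = (I - \lambda W)^{-1}\varepsilon\), and pre-multiply by \((I - \lambda W)\):

\[(I - \lambda W)\,y = (I - \lambda W)\,X\beta + \varepsilon\]

\[y = \lambda W y + X\beta + W X(-\lambda\beta) + \varepsilon\]

This is an SDM with \(\rho = \lambda\) and \(\theta = -\lambda\beta\), so the SEM is the SDM with \(\theta\) tied to \(\rho\) and \(\beta\) rather than left free.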

In the SDM the income spillover swells to \(-1.50\) once feedback is counted

Variable Direct Indirect Total
INC −1.03*** −1.50* −2.52***
HOVAL −0.28*** +0.22 −0.07

Direct effects stay near the other models; the indirect income effect is larger than in SAR (\(-0.76\)) because SDM counts both channels.
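How the two channels combine into direct, indirect, and total effects can be sketched in Mata on a toy map. Everything here is made up for illustration (four units on a line, and the values of rho, beta, theta); it follows the usual convention of averaging the diagonal and the row sums of the effects matrix.

```stata
mata:
// Toy map and made-up parameters; NOT the Columbus estimates
C = (0,1,0,0 \ 1,0,1,0 \ 0,1,0,1 \ 0,0,1,0)
W = C :/ rowsum(C)
n     = rows(W)
rho   = 0.4
beta  = -1.0
theta = -0.6
// Effects matrix: dy/dx' = (I - rho*W)^-1 * (beta*I + theta*W)
S = luinv(I(n) - rho*W) * (beta*I(n) + theta*W)
direct   = trace(S) / n        // average own-unit effect
total    = sum(S) / n          // average row sum
indirect = total - direct      // average spillover
(direct, indirect, total)
end
```

With a row-standardized \(W\) the total equals \((\beta + \theta)/(1 - \rho)\), so the indirect effect reflects both \(\theta\) and the feedback through \(\rho\).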

Test down from the SDM: only the lag-of-\(y\) refuses to be dropped

Cannot drop

  • SLX restriction \(\rho = 0\)
  • LR \(\approx 7.4\), 1 df — rejected
  • the global feedback is real

Can’t reject dropping

  • SAR restriction \(\theta = 0\) (LR \(\approx 2.0\))
  • SEM common factor (LR \(\approx 4.0\))
  • but power is thin at \(n = 49\)

Only \(\rho\) is indispensable; \(\theta\) and \(\lambda\) survive on economics, not on these LR tests.
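The p-values behind these verdicts can be recovered from the rounded LR statistics. The sketch below assumes 2 degrees of freedom for the SAR and SEM restrictions, one per lagged covariate; the slide states only the 1 df for the SLX restriction.

```stata
* LR statistics as rounded on the slide; the 2-df entries are an assumption
display "SLX (rho = 0),       1 df: p = " %6.4f chi2tail(1, 7.4)
display "SAR (theta = 0),     2 df: p = " %6.4f chi2tail(2, 2.0)
display "SEM (common factor), 2 df: p = " %6.4f chi2tail(2, 4.0)
```

Under those assumptions only the first p-value falls below 0.01; the other two are well above 0.10.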

The Resolution

Act III

A $1,000 rise in neighbours’ income cuts your crime rate by 1.20 per 1,000 households

−1.20

SDEM income spillover, \(W\cdot INC\) (\(z = -2.10\), \(p = 0.036\)) — significant even after a spatial error term

The four \(\theta\)-models agree: income spillovers are large and negative

Effect OLS SAR SEM SLX SDM SDEM GNS
INC direct −1.60 −1.10 −0.94 −1.10 −1.03 −1.05 −1.03
INC indirect 0 −0.76 0 −1.40 −1.50 −1.20 −1.37
INC total −1.60 −1.86 −0.94 −2.50 −2.52 −2.26 −2.40

Direct effects are stable everywhere; the spillover is where the models disagree — and SLX/SDM/SDEM/GNS all say it is large and negative.

Ignore spillovers and you miss a total income effect that is 40–55% larger than OLS reports

40–55%

SDM/SDEM total income effect (\(-2.3\) to \(-2.5\)) vs the OLS estimate of \(-1.60\)

Turn on all three channels and the GNS collapses into noise

Param SDM SDEM GNS
\(\rho\) 0.40** n/a 0.32 (p=.74)
\(\lambda\) n/a 0.40** 0.15 (p=.88)
\(W\cdot INC\) −0.58 −1.20** −0.69 (p=.68)

Seven coefficients, four of them spatial (\(\rho\), \(\lambda\), and two \(\theta\)’s), chasing 49 observations: the spatial parameters are only weakly identified together, so every one goes insignificant.

Does the spillover make this causal? No — it disciplines description, not identification

Objection. You’ve shown neighbours’ income predicts crime — surely that’s a policy lever.

Response. It is a robust spatial pattern, not a causal effect.

Treat \(-1.20\) as a well-measured association, not a treatment effect.

Let the data choose the channel — but never forget your neighbours.