Regional Inequality from Outer Space

Nighttime lights become a global income map — and reveal an N-shaped Kuznets curve

0.102 · light elasticity · lights → income

0.925 · predicted vs observed · calibration fit

0.071 · ethnic Gini · strongest driver

Carlos Mendez

Nagoya University (GSID)

July 8, 2026

The Problem

Act I

Almost every country reports one GDP number — and nothing about its insides

A government can tell you its national GDP, but rarely the income of each province inside it.

Two countries with the same national income can look completely different on the inside — one a single booming capital ringed by poor hinterlands, the other broadly shared. Without subnational data, that gap is invisible.

The idea: let satellites do the accounting — brighter places are, on average, richer

Lessmann and Seidel (2017) use nighttime light as a stand-in for income: electricity, roads, and activity all glow, so brightness correlates with output where statistics do not exist.

Their pipeline, rebuilt here in Python end to end:

Predict regional GDP from light + a few controls
Construct population-weighted inequality indices from those predictions
Estimate how regional inequality moves with development

Where we’re going

Calibration — turn light into income; how good is the fit?
Construction — five inequality indices from scratch; why population weights matter
The curve — an N-shaped regional Kuznets relationship
Drivers & robustness — ethnic inequality, and a spatial-HAC stress test

The Calibration

Act II

The lab: 5,258 region-years calibrate the model; 180 countries get measured

Calibration sample — 1,504 subnational regions in 81 countries that have both observed GDP and a light reading (5,258 region-years)
Country panel — 180 countries, 1992–2012, each carrying inequality indices built from its regions
Two units, kept straight — predict at the region level, measure inequality at the country level

Mean regional Gini $= 0.064$ (SD $0.033$, max $0.163$): most countries are internally fairly equal, with a long unequal tail.

Light becomes income through a calibrated elasticity, net of national income and geography

\[y_r = \beta_0 + \beta_1 \ell_r + \beta_2 g_c + \gamma' X_r + \mu_g + \tau_s + \varepsilon_r\]

A region’s log income $y_r$ = baseline, plus elasticity $\beta_1$ on its log light $\ell_r$, plus a near-one-for-one adjustment $\beta_2$ for its country’s income $g_c$, plus geography $X_r$, world-region $\mu_g$ and satellite $\tau_s$ effects. The number we care about is $\beta_1$.

The calibrated light elasticity is 0.102, with regional income tracking national income nearly one-for-one

0.102

random-effects light elasticity (col 7) · national-GDP elasticity $= 0.889$ · matches the paper exactly
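To read the elasticity in levels: both income and light enter the calibration equation in logs, so scaling a region's light by a factor $k$ scales its predicted income by $k^{\beta_1}$, holding the other terms fixed. The snippet below is only a sketch of that arithmetic; the function name `income_ratio` is illustrative and does not come from the paper.

```python
def income_ratio(light_ratio: float, elasticity: float = 0.102) -> float:
    """Multiplicative change in predicted income when light is scaled by light_ratio.

    Follows from the log-log form: d(log income) = elasticity * d(log light),
    with every other term in the equation held fixed.
    """
    if light_ratio <= 0:
        raise ValueError("light_ratio must be positive")
    return light_ratio ** elasticity


# Doubling a region's light with the reported elasticity of 0.102
pct_light = (income_ratio(2.0) - 1) * 100
print(f"Doubling light: {pct_light:.1f}% higher predicted income")

# Same arithmetic for the national-GDP term (reported elasticity 0.889)
pct_national = (income_ratio(2.0, elasticity=0.889) - 1) * 100
print(f"Doubling national GDP: {pct_national:.1f}% higher predicted income")
```

Doubling light raises predicted regional income by roughly 7 percent, which is why the national-income term does most of the work in setting a region's level and light mainly sorts regions within a country.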

The predictions hug the 45° line across four orders of magnitude of income

Predicted vs observed log regional GDP per capita, 5,258 region-years. The scatter tracks the 45° line from the poorest regions to the richest — the calibration generalises rather than fitting one income band ($r = 0.925$).

The Construction

Act II

Five inequality indices, built from scratch and weighted by population

\[\bar y = \frac{\sum_i w_i y_i}{\sum_i w_i}, \qquad p_i = \frac{w_i}{\sum_j w_j}, \qquad r_i = \frac{y_i}{\bar y}\]

Each country’s regional incomes $y_i$ and populations $w_i$ collapse to five summary indices: the Gini, three generalized-entropy indices GE($-1$)/GE($0$)/GE($1$), and the coefficient of variation, with every region counting in proportion to its people.
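The definitions above can be sketched in a few lines of plain Python. The formulas used here are the standard textbook population-weighted versions of each index, written in terms of $p_i$ and $r_i$; they are an assumption about the construction, not a transcription of the paper's code, and the function name and example numbers are hypothetical.

```python
import math


def weighted_inequality(incomes, populations):
    """Population-weighted Gini, GE(-1), GE(0), GE(1) and CV for one country.

    incomes: regional income per capita (must be positive)
    populations: regional population weights (must be positive)
    """
    if len(incomes) != len(populations) or not incomes:
        raise ValueError("need equally long, non-empty inputs")
    if min(incomes) <= 0 or min(populations) <= 0:
        raise ValueError("incomes and populations must be positive")

    total = sum(populations)
    p = [w / total for w in populations]              # population shares p_i
    ybar = sum(pi * yi for pi, yi in zip(p, incomes))  # weighted mean income
    r = [yi / ybar for yi in incomes]                  # relative incomes r_i

    # Gini: half the weighted mean absolute difference of relative incomes
    gini = sum(pi * pj * abs(ri - rj)
               for pi, ri in zip(p, r)
               for pj, rj in zip(p, r)) / 2
    # Generalized entropy: GE(a) = sum p_i (r_i^a - 1) / (a (a - 1)),
    # with the usual limiting forms at a = 0 and a = 1
    ge_m1 = 0.5 * sum(pi * (1 / ri - 1) for pi, ri in zip(p, r))
    ge_0 = -sum(pi * math.log(ri) for pi, ri in zip(p, r))
    ge_1 = sum(pi * ri * math.log(ri) for pi, ri in zip(p, r))
    # Coefficient of variation: weighted standard deviation of relative incomes
    cv = math.sqrt(sum(pi * (ri - 1) ** 2 for pi, ri in zip(p, r)))

    return {"gini": gini, "ge_m1": ge_m1, "ge_0": ge_0, "ge_1": ge_1, "cv": cv}


# Hypothetical country: two large, similar regions and one tiny rich enclave
incomes = [1000.0, 1200.0, 5000.0]
populations = [5_000_000, 4_000_000, 50_000]

weighted = weighted_inequality(incomes, populations)
equal = weighted_inequality(incomes, [1, 1, 1])  # every region counts the same
print("population-weighted Gini:", round(weighted["gini"], 3))
print("equal-weight Gini:       ", round(equal["gini"], 3))
```

In this made-up example the enclave holds well under one percent of the population, so it barely moves the weighted indices but dominates the equal-weight ones.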

Population weights are not cosmetic: the weighted Gini correlates only 0.75 with its equal-weight counterpart

Population-weighted vs equal-weight Gini across country-years. Most points sit below the 45° line: weighting lowers measured inequality (mean gap $-0.0034$) because tiny income-extreme regions lose influence (corr $= 0.75$).

The Result

Act III

With country and period fixed effects, the regional Kuznets curve is an N — not a single hump

\[\text{GINIW}_{ct} = \beta_1 \ln Y_{ct} + \beta_2 (\ln Y_{ct})^2 + \beta_3 (\ln Y_{ct})^3 + \alpha_c + \delta_t + u_{ct}\]

Term	Estimate	Interpretation
$\beta_1$ (linear)	$0.293$	rises with early development
$\beta_2$ (quadratic)	$-0.032$	then bends down
$\beta_3$ (cubic)	$0.001$	faint upturn at the very top

$N = 879$, 180 countries, 5-year periods. Same sign pattern across all five indices — the N is not an artefact of the Gini.
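An N shape means the fitted cubic has two turning points: a peak followed by a trough. They sit where the derivative $\beta_1 + 2\beta_2 x + 3\beta_3 x^2$ equals zero, with $x = \ln Y$. The helper below is a sketch of that calculation; its name is hypothetical. Note that the coefficients in the table are rounded to three decimals, and the cubic term in particular is tiny, so plugging them in need not reproduce the turning points of the unrounded fit.

```python
import math


def cubic_turning_points(b1, b2, b3):
    """Turning points of f(x) = b1*x + b2*x**2 + b3*x**3.

    Solves f'(x) = b1 + 2*b2*x + 3*b3*x**2 = 0 and labels each root
    as a peak or trough using the sign of f''(x) = 2*b2 + 6*b3*x.
    Returns a list of (x, kind) sorted by x; empty if f is monotone.
    """
    a, b, c = 3 * b3, 2 * b2, b1
    if a == 0:
        raise ValueError("b3 must be non-zero for a cubic")
    disc = b * b - 4 * a * c
    if disc <= 0:
        return []  # no sign change in the slope: no peak or trough
    root = math.sqrt(disc)
    xs = sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])
    return [(x, "peak" if 2 * b2 + 6 * b3 * x < 0 else "trough") for x in xs]


# Rounded coefficients from the table (illustrative only, see caveat above)
for x, kind in cubic_turning_points(0.293, -0.032, 0.001):
    print(f"{kind}: ln Y = {x:.2f}, Y = {math.exp(x):,.0f}")
```

A positive cubic coefficient guarantees the ordering peak then trough, which is the rise, fall, rise pattern the slide calls an N.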

Three development phases, one descriptive association

Regional inequality (net of period effects) against log development, with the fitted cubic overlaid. The curve rises to a gentle peak near $3,000 per capita, declines through middle income, and ticks faintly upward at the top.

Early development — activity concentrates; inequality rises
Middle income — lagging regions catch up; inequality falls
The very richest — agglomeration re-concentrates; a faint rise

The Drivers

Act III

Beyond income, ethnic inequality is the strongest driver — by far

0.071

ethnic-Gini coefficient ($p < 0.001$, $N = 844$) · vs a mean regional Gini of $0.064$

Ranked side by side: ethnicity towers; farmland pulls the other way

Determinants of regional inequality. Ethnic inequality (0.071) dwarfs resource rents (+0.018), aid (+0.015), and trade (+0.005); arable land (−0.053) is the largest equalizing force.

Robustness

Act III

Allowing neighbours to share shocks at least doubles the standard error, yet the elasticity still holds

Conley spatial-HAC standard errors for the clean light elasticity ($\beta = 0.190$). The confidence interval widens with the radius — SE rises from 0.013 (iid) to 0.026/0.034/0.037 at 1,000/2,500/5,000 km — while the point estimate stays fixed and far from zero.

Inference	SE	$t \approx$
Naive (iid)	$0.013$	$14$
Conley 1,000 km	$0.026$	$7$
Conley 5,000 km	$0.037$	$5$

The point estimate $\beta = 0.190$ never moves; only the honest uncertainty grows.
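The arithmetic behind the table is simply $t = \beta / \text{SE}$. The sketch below recomputes it for every radius quoted in the caption and adds a 95% interval under a normal approximation; the interval and the helper name `t_and_ci` are assumptions added for illustration, not results reported in the source.

```python
def t_and_ci(beta: float, se: float, z: float = 1.96):
    """Return the t-ratio and a normal-approximation confidence interval."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return beta / se, (beta - z * se, beta + z * se)


BETA = 0.190  # clean light elasticity from the slide
STANDARD_ERRORS = {
    "Naive (iid)": 0.013,
    "Conley 1,000 km": 0.026,
    "Conley 2,500 km": 0.034,
    "Conley 5,000 km": 0.037,
}

for label, se in STANDARD_ERRORS.items():
    t, (low, high) = t_and_ci(BETA, se)
    print(f"{label:<16} SE={se:.3f}  t={t:5.1f}  95% CI=({low:.3f}, {high:.3f})")
```

Even at the widest radius the lower bound of the interval stays well above zero, which is the sense in which the elasticity survives the stress test.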

Does machine-assembled satellite data make this causal? No

Objection. You absorbed country and period effects and survived a spatial-HAC test — surely development causes this inequality path?

Response. No. The lights→GDP step is a prediction model, not a structural relationship; the Kuznets and determinant regressions are within-country associations conditional on the FE, not causal effects. The income figures are predictions — accurate on average, wrong for any single unusual region.

You can now see inequality inside a country with no statistical office.

Predict income from light · weight by people · let the curve bend twice.
