Multiscale Geographically Weighted Regression

Spatially varying economic convergence across Indonesia’s 514 districts

0.762 MGWR R² vs 0.214 global

44 districts per local window

149 / 514 significant catching-up (29%)

Carlos Mendez

Nagoya University (GSID)

July 8, 2026

The Tension

Act I

One national convergence number forces 514 different places onto a single line

A standard convergence test asks: do poorer regions catch up to richer ones? For Indonesia it answers with one coefficient: \(\beta = -0.195\).

But Indonesia spans 17,000 islands across 5,000 km of ocean. Should Sumatra and Papua really share the same catching-up rate?

The global fit is so weak that it explains barely a fifth of the variation in growth

Global \(\beta\)-convergence regression: log GDP per capita 2010 vs growth 2010–2018. One orange OLS line through 514 districts.

Where we’re going

The data: 514 Indonesian districts, initial income and subsequent growth
The problem: a single coefficient hides spatial heterogeneity
The method: GWR, then MGWR — each variable at its own spatial scale
The payoff: R² jumps from 0.214 to 0.762, and only 29% of districts truly converge

The Investigation

Act II

The lab: 514 districts, one outcome, one predictor, a 5,000-km archipelago

Outcome \(g_i\) — GDP growth rate, 2010–2018 (mean 0.39, range \(-2.05\) to \(+2.06\))
Predictor \(\ln(y_{i,2010})\) — log GDP per capita in 2010 (range 7.17 to 13.44)
Geography — 514 districts across 17,000 islands, from dense Java to remote Papua

Cross-section from the QuaRCS repository; convergence means a negative slope on initial income.

The naked-eye maps already show geography is organized, not random

Two-panel choropleth: (a) log GDP per capita 2010 and (b) growth rate 2010–2018. Patterns cluster — they are not scattered.

\(\beta\)-convergence is one global slope on initial income

\[g_i = \alpha + \beta \cdot \ln(y_{i,2010}) + \varepsilon_i\]

A negative \(\beta\) means poorer districts grow faster — the gap shrinks. Here \(\beta = -0.195\), but it is a single number for the whole country.
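As a minimal sketch of the regression above, the snippet below computes the OLS slope and \(R^2\) in closed form. The data are synthetic stand-ins: the names `ln_y0` and `g` and every number in the generator are illustrative assumptions, not the Indonesian cross-section.

```python
import random

random.seed(42)

# Synthetic stand-in for the cross-section (NOT the Indonesian data):
# initial log income and a growth rate that declines with it.
n = 514
ln_y0 = [random.uniform(7.0, 13.5) for _ in range(n)]
g = [2.3 - 0.2 * x + random.gauss(0, 0.5) for x in ln_y0]

def ols(x, y):
    """Closed-form bivariate OLS: returns (alpha, beta, r_squared)."""
    m = len(x)
    x_bar, y_bar = sum(x) / m, sum(y) / m
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return alpha, beta, 1 - rss / tss

alpha_hat, beta_hat, r2 = ols(ln_y0, g)
print(f"beta = {beta_hat:.3f}, R^2 = {r2:.3f}")  # beta < 0 signals convergence
```

However the data look, this procedure returns exactly one slope for all districts, which is the limitation the rest of the deck addresses.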

GWR lets every district keep its own slope and intercept

\[g_i = \alpha(u_i, v_i) + \beta(u_i, v_i) \cdot \ln(y_{i,2010}) + \varepsilon_i\]

Now \(\alpha\) and \(\beta\) are functions of location \((u_i, v_i)\) — one regression per place.
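"One regression per place" can be sketched as a weighted least-squares fit repeated at every location. The sketch assumes an adaptive bisquare kernel (a common choice; the deck does not state which kernel was used), a one-dimensional west-to-east coordinate, and synthetic data whose true slope drifts across space. The neighbour count `k = 40` is a hypothetical value.

```python
import random

random.seed(1)

# Synthetic districts on a west-to-east line (illustrative only):
# the true slope drifts from -1.5 in the west to 0 in the east.
n = 200
u = [i / (n - 1) for i in range(n)]
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.5 + (-1.5 + 1.5 * ui) * xi + random.gauss(0, 0.1) for ui, xi in zip(u, x)]

def bisquare_weights(dists, k):
    """Adaptive bisquare kernel: bandwidth h is the distance to the k-th nearest point."""
    h = sorted(dists)[k - 1]
    return [(1 - (d / h) ** 2) ** 2 if d < h else 0.0 for d in dists]

def weighted_fit(x, y, w):
    """Weighted least squares for y = a + b * x; returns (a, b)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b

k = 40  # neighbours per local regression (hypothetical choice)
local = []
for i in range(n):
    dists = [abs(u[i] - uj) for uj in u]
    local.append(weighted_fit(x, y, bisquare_weights(dists, k)))

slopes = [b for _, b in local]
print(f"west slope = {slopes[0]:.2f}, east slope = {slopes[-1]:.2f}")
```

Nearby observations receive the largest weights and anything beyond the k-th neighbour receives none, so each location gets its own intercept and slope.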

GWR’s flaw: one bandwidth for all variables — MGWR removes it

GWR

One bandwidth \(h\) for all coefficients
Intercept and slope forced to vary at the same spatial scale

MGWR

A separate bandwidth per variable
Back-fitting cycles through the variables, re-optimizing each \(h\) while the others are held fixed

The “multiscale” point: baseline conditions may vary over large regions while convergence speed shifts sharply between neighbours.
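The back-fitting idea can be sketched in a few dozen lines. This is a simplified illustration rather than the MGWR algorithm itself: it uses a one-dimensional map, synthetic data, a small hypothetical grid of candidate bandwidths, leave-one-out cross-validation in place of AICc, zero starting values, and a fixed number of passes instead of a convergence test.

```python
import random

random.seed(7)

# Synthetic 1-D "map" (illustrative only): a constant baseline and a
# convergence slope that shifts sharply halfway across.
n = 120
u = [i / (n - 1) for i in range(n)]
x = [random.gauss(0, 1) for _ in range(n)]
beta_true = [-1.5 if ui < 0.5 else 0.0 for ui in u]
y = [0.5 + b * xi + random.gauss(0, 0.1) for b, xi in zip(beta_true, x)]

candidates = [15, 40, 120]  # hypothetical bandwidth grid (nearest neighbours)

def kernel_row(i, k):
    """Adaptive bisquare weights around location i using k nearest neighbours."""
    d = [abs(u[i] - uj) for uj in u]
    h = sorted(d)[k - 1]
    return [(1 - (dj / h) ** 2) ** 2 if dj < h else 0.0 for dj in d]

W = {k: [kernel_row(i, k) for i in range(n)] for k in candidates}

def smooth(col, target, k, leave_one_out=False):
    """Local regression of target on one column (no intercept) at every location."""
    coefs = []
    for i in range(n):
        w = list(W[k][i])
        if leave_one_out:
            w[i] = 0.0  # drop the focal point so the score is out-of-sample
        num = sum(wj * cj * tj for wj, cj, tj in zip(w, col, target))
        den = sum(wj * cj * cj for wj, cj in zip(w, col))
        coefs.append(num / den)
    return coefs

def cv_score(col, target, k):
    """Leave-one-out squared prediction error for bandwidth k."""
    c = smooth(col, target, k, leave_one_out=True)
    return sum((t - ci * xi) ** 2 for t, ci, xi in zip(target, c, col))

cols = {"intercept": [1.0] * n, "slope": x}
coef = {name: [0.0] * n for name in cols}
bw = {}

for _ in range(5):  # fixed number of passes instead of a convergence test
    for name, col in cols.items():
        # Partial residual: y minus the fitted contribution of every other term.
        target = [
            y[i] - sum(coef[o][i] * cols[o][i] for o in cols if o != name)
            for i in range(n)
        ]
        bw[name] = min(candidates, key=lambda k: cv_score(col, target, k))
        coef[name] = smooth(col, target, bw[name])

fitted = [sum(coef[o][i] * cols[o][i] for o in cols) for i in range(n)]
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
y_bar = sum(y) / n
tss = sum((yi - y_bar) ** 2 for yi in y)
print(bw, f"R^2 = {1 - rss / tss:.3f}")
```

Each term is smoothed against its own partial residual, so each can settle on a different bandwidth.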

A few lines fit MGWR on standardized variables in Python

from mgwr.gwr import MGWR
from mgwr.sel_bw import Sel_BW

# gdf holds one row per district: growth, initial income, and coordinates
y = gdf["g"].values.reshape((-1, 1))
X = gdf[["ln_gdppc2010"]].values
coords = list(zip(gdf["COORD_X"], gdf["COORD_Y"]))

Zy = (y - y.mean(0)) / y.std(0)          # standardize (recommended for MGWR)
ZX = (X - X.mean(0)) / X.std(0)          # makes bandwidths comparable across variables

# spherical=True uses great-circle distances, so coords should be lon/lat
selector = Sel_BW(coords, Zy, ZX, multi=True, spherical=True)
bw = selector.search()                    # back-fitting finds per-variable h
results = MGWR(coords, Zy, ZX, selector, spherical=True).fit()

MGWR picks a tight window of 44 districts — about 8.6% of the sample

Variable	Bandwidth	\(ENP_j\)	Adj. \(t(95\%)\)
Intercept	44	26.81	3.13
Convergence slope	44	25.27	3.11

Both variables converge on the same window here; with more covariates MGWR could assign each a very different scale.
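The table's numbers can be cross-checked with a little arithmetic. The sketch assumes the adjusted critical values come from dividing \(\alpha = 0.05\) by each term's \(ENP_j\) before taking a two-sided quantile. That is an assumption about how the column was produced, and the normal quantile here is only a large-sample stand-in for the \(t\) distribution.

```python
from statistics import NormalDist

n = 514
bandwidth = 44
enp = {"Intercept": 26.81, "Convergence slope": 25.27}  # ENP_j from the table

share = bandwidth / n                 # fraction of districts in each local window
total_enp = sum(enp.values())         # overall effective number of parameters

# Assumed correction: divide alpha by each term's ENP_j, then take a two-sided
# critical value. The normal quantile is used as a large-sample stand-in for t.
alpha = 0.05
critical = {
    name: NormalDist().inv_cdf(1 - (alpha / e) / 2) for name, e in enp.items()
}

print(f"window share = {share:.1%}, total ENP = {total_enp:.2f}")
for name, z in critical.items():
    print(f"{name}: approx. critical value = {z:.2f}")
```

The normal approximation should land slightly below the tabulated 3.13 and 3.11, as expected when a normal quantile replaces a \(t\) quantile with finite degrees of freedom. The two \(ENP_j\) values sum to about 52, the effective parameter count quoted later.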

The intercept reveals an east–west growth gradient

MGWR intercept map (bandwidth = 44). Western districts negative; eastern districts positive.

Catching-up is intense in western Sumatra, absent across most of the country

MGWR convergence-coefficient map. Deep blue (\(-1.74\)) = strong catching-up; light pink = no convergence.

The local slope ranges from −1.74 to +0.42, far too wide for a single −0.195 to summarize

−1.74 → +0.42

range of the local convergence coefficient \(\hat\beta(u_i,v_i)\) · global OLS reports just \(-0.195\)

The Resolution

Act III

Going local more than triples the explained variance and cuts AICc by about 500

Metric	Global OLS	MGWR
\(R^2\)	0.214	0.762
Adj. \(R^2\)	0.212	0.736
AICc	1341.25	838.41
Bandwidth	all (514)	44

Adjusted \(R^2\) of 0.736 already nets out the 52 effective parameters — the gain is real, not overfitting.
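A quick check of that penalty, assuming the textbook adjusted-\(R^2\) formula with the effective parameter count in place of the usual parameter count. The inputs are the rounded figures from the tables, so agreement to the third decimal is approximate.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 with p (effective) parameters, intercept included."""
    return 1 - (1 - r2) * (n - 1) / (n - p)

n = 514
adj_ols = adjusted_r2(0.214, n, 2)                # intercept + slope
adj_mgwr = adjusted_r2(0.762, n, 26.81 + 25.27)   # sum of ENP_j from the bandwidth table

print(f"OLS: {adj_ols:.3f}, MGWR: {adj_mgwr:.3f}")
```

Under this assumption the penalty for roughly 52 effective parameters costs MGWR under three points of \(R^2\), which leaves most of the gap to OLS intact.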

Only 149 of 514 districts show significant convergence, and none show significant divergence

Significance map: blue = significant catching-up (149 districts), grey = not significant (365), no significant divergence.
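The three map categories follow from a simple rule on each district's local slope and \(t\)-statistic. The sketch below applies that rule to four made-up districts. The \((\hat\beta, t)\) pairs are hypothetical and purely illustrative; only the critical value 3.11 and the 149 / 514 count come from the deck.

```python
def classify(beta, t_value, t_crit):
    """Label one district from its local slope and t-statistic."""
    if abs(t_value) <= t_crit:
        return "not significant"
    return "catching-up" if beta < 0 else "divergence"

# Hypothetical (beta, t) pairs for four districts, purely for illustration.
districts = [(-1.2, -5.0), (-0.3, -1.9), (0.4, 1.1), (-0.8, -3.4)]
t_crit = 3.11  # adjusted critical value for the convergence slope (from the table)

labels = [classify(b, t, t_crit) for b, t in districts]
share = 149 / 514  # reported count of significant catching-up districts
print(labels, f"{share:.0%}")
```

A negative slope alone is not enough: the second toy district has \(\hat\beta < 0\) but falls short of the adjusted threshold, so it would be drawn grey.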

Indonesia’s apparent national convergence is concentrated in 29% of districts

29%

of districts (149 / 514) show statistically significant catching-up; the remaining 365 show no significant convergence

Did MGWR just overfit its way to a high R²? No

Objection. A model with 52 effective parameters can always beat a 2-parameter line on \(R^2\).

Response. The adjusted \(R^2\) (0.736) already penalizes those parameters, and AICc — which explicitly penalizes complexity — falls by over 500. The bandwidth of 44 is data-selected, not tuned to inflate fit. The gain reflects genuine spatial structure, not flexibility.
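To see how the complexity penalty enters, the sketch below rebuilds both AICc values from \(R^2\) and the effective parameter count. It assumes one common Gaussian form of AICc and a standardized outcome (so that \(\text{RSS}/n = 1 - R^2\)) for both models. These are assumptions, and the inputs are rounded, so the results should match the table only approximately.

```python
import math

def aicc_gaussian(r2, n, k):
    """Corrected AIC for a Gaussian model with k effective parameters,
    assuming a standardized outcome so that RSS / n = 1 - R^2."""
    sigma2 = 1 - r2
    return n * math.log(sigma2) + n * math.log(2 * math.pi) + n * (n + k) / (n - 2 - k)

n = 514
aicc_ols = aicc_gaussian(0.214, n, 2)
aicc_mgwr = aicc_gaussian(0.762, n, 26.81 + 25.27)
print(f"OLS ~ {aicc_ols:.1f}, MGWR ~ {aicc_mgwr:.1f}, drop ~ {aicc_ols - aicc_mgwr:.1f}")
```

The last term grows with \(k\), so the extra effective parameters raise MGWR's AICc. The much smaller residual variance more than offsets that penalty.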

MGWR does not identify causes — it maps where a relationship lives

Objection. Does a strongly negative local \(\beta\) prove that low initial income caused faster growth there?

Response. No. MGWR is descriptive: it shows where the income–growth association is strong, not why. The bivariate model omits human capital, infrastructure, and institutions — extending it is the natural next step.

Let geography, not a single national coefficient, tell you where catching-up happens.