Spatial inequality | Carlos Mendez

Beta and Sigma Convergence Across Countries: A Stata Tutorial

Wed, 29 Apr 2026 00:00:00 +0000

1. Overview

Are poorer countries catching up to richer ones? This is one of the most fundamental questions in development economics. If convergence holds, then the vast income gaps we observe today should eventually close on their own as low-income economies grow faster than high-income ones. If it does not hold, then without deliberate policy intervention, the gap will persist — or even widen.

For decades, the empirical evidence was discouraging. From 1960 to 2000, there was no sign that poorer countries were growing faster. If anything, richer countries pulled further ahead. But Patel, Sandefur, and Subramanian (2021) documented a striking reversal: since around the year 2000, the world has entered a new era of unconditional convergence, with poorer countries finally growing faster than richer ones — no controls for institutions, human capital, or policy needed.

This tutorial walks through the complete convergence toolkit in Stata, from the simplest two-period regression to advanced heatmaps covering every possible time window. We use Penn World Tables 10.0 data for a balanced panel of 84 countries with data available since 1960 and ask: How fast is convergence happening, and is the global income distribution actually narrowing? The answer involves two distinct concepts — beta convergence (do poor countries grow faster?) and sigma convergence (is the income spread shrinking?) — and the surprising finding that one does not guarantee the other.

A distinctive feature of this tutorial is its comparative approach to measuring convergence speed. We first show how to extract the speed of convergence from standard OLS output using a simple algebraic conversion, then introduce Nonlinear Least Squares (NLS) as a direct estimation method. Students learn that both approaches yield the same structural parameter — building intuition before complexity.

Learning objectives

Estimate beta convergence using OLS and interpret the sign of the slope coefficient
Identify the structural break between the era of divergence (1960–2000) and the era of convergence (2000–2019)
Compute the speed of convergence and half-life from OLS output using an algebraic conversion
Understand what Nonlinear Least Squares (NLS) is, why it is needed, and how to estimate it in Stata
Compare OLS-derived and NLS-derived convergence estimates
Construct rolling-window visualizations for both OLS and NLS to assess robustness
Measure sigma convergence using the variance of log GDP per capita
Understand why beta convergence is necessary but not sufficient for sigma convergence
Build convergence heatmaps to visualize every possible time window

Key concepts at a glance

The post leans on a small vocabulary repeatedly. The rest of the tutorial assumes you can move between these terms quickly. Each concept below has three parts: a definition, an example, and an analogy. Read the definitions for a quick scan and come back to the examples and analogies when you need them. If a later section mentions “structural break” or “half-life” and the term feels slippery, this is the section to re-read.

1. Beta convergence $\lambda$. The OLS slope coefficient when annualized growth is regressed on log initial income. A negative $\lambda$ means poorer countries grew faster than richer ones — they “caught up”. A positive $\lambda$ means the opposite: divergence.

Example

Over 2000–2019, $\lambda = -0.00352$ (p = 0.019). Convergence has emerged. Over 1960–2000, $\lambda = +0.00437$ (p = 0.007) — divergence. The full-period (1960–2019) coefficient is essentially zero (0.00057, p = 0.661). The two regimes cancel.

Analogy

A catching-up race. If the runner who started at the back is moving faster, the gap to the leader is closing. Beta convergence asks whether poor countries are running faster than rich ones — does the rear runner have more horsepower?

2. Sigma convergence $\sigma_t^2$. The variance (or standard deviation) of log GDP per capita across countries at time $t$. Convergence in the sigma sense means $\sigma_t$ is falling over time — the cross-country distribution of incomes is narrowing.

Example

In our 84-country sample, the variance of log gdppc rose from 0.924 in 1960 to 1.918 in 2008 (peak), then eased to 1.764 by 2019. The world did not sigma-converge over 1960–2019. Beta convergence after 2000 is a necessary precondition for future sigma convergence, not a guarantee.

Analogy

A flock of birds. Sigma convergence asks whether the flock is tightening — are the laggards catching the leaders? The flock can briefly tighten even when individual birds are accelerating away from each other.

3. Speed of convergence $\beta$. The structural parameter from the Barro–Sala-i-Martin model. Different from the OLS $\lambda$. Computed via $\beta = -\ln(1 + \lambda T)/T$, where $T$ is the period length. Bigger $\beta$ means a faster catch-up engine.

Example

Plugging $\lambda = -0.00352$ and $T = 19$ years into the conversion gives $\beta = 0.00365$. Less than half a percent per year. The catching-up engine, once it turned on after 2000, runs at idle.

Analogy

Horsepower of the catch-up engine. The OLS slope $\lambda$ is the speedometer reading. The structural $\beta$ is what the engine can actually deliver — the underlying capacity to close gaps.

4. Half-life $\tau = \ln(2)/\beta$. The number of years required to close half of the existing income gap at the current convergence speed. A natural reading of $\beta$ on a human time scale.

Example

With $\beta = 0.00365$, the half-life is 190 years. Half of the world’s current income gap will close in 190 years if convergence continues at this pace. Compare this with the canonical 35-year half-life implied by the $\beta \approx 0.02$ benchmark from the cross-country growth regressions of the 1990s (a benchmark for conditional convergence); the modern world converges much more slowly.

Analogy

Radioactive decay’s half-life. After one half-life, half the atoms are gone; after two, three-quarters; and so on. Income-gap half-life works the same way — but at 190 years, even a generation makes only a small dent.

5. Structural break. A point in time where the convergence coefficient changes its sign or magnitude. Identified by Chow tests, by visual inspection of rolling estimates, or by direct interaction with a year dummy.

Example

This dataset shows a clear break around 2000. Before: $\lambda = +0.00437$ (divergence). After: $\lambda = -0.00352$ (convergence). The full-period $\lambda$ averages the two regimes and looks like nothing happened — a textbook example of why pooled estimates can mislead.

Analogy

A thermostat flipping. Before the flip, the heater is on and the room is warming. After, the cooler is on and the room is cooling. Averaging the two periods reads as “no temperature change” — the flip is the story.

6. Nonlinear Least Squares (NLS). A direct estimator of the structural $\beta$ when it appears inside an exponential. Avoids the OLS-to-$\beta$ algebraic conversion. Stata’s nl command fits the nonlinear regression $g_i = \alpha - \frac{1 - e^{-\beta T}}{T} \cdot \ln(y_{i,0}) + \varepsilon_i$ in one shot.

Example

NLS on the 2000–2019 sample returns $\beta = 0.00365$ — the same as the OLS conversion. When the relationship is well-behaved, both routes coincide; the gap is a useful sanity check.

Analogy

Direct measurement vs proxy measurement. OLS-then-convert is the proxy: measure something simple ($\lambda$), then compute the structural quantity. NLS is the direct route: measure $\beta$ in one step.

7. Rolling window. Re-estimate the regression over every possible start year, holding the end year fixed. Each window produces one estimate. The sequence of estimates traces out how convergence has evolved.

Example

This post’s rolling window for $\lambda$ slides the start year from 1960 to 2010 with the end year fixed at 2019. The line crosses zero around 1990, becomes solidly negative after 2000, and sits between roughly $-0.003$ and $-0.004$ for the most recent windows.

Analogy

A sliding microscope across a slide. At each position you take a snapshot. The full sequence of snapshots is the rolling estimate — it shows how the local picture changes as you move along.

8. Cross-country dispersion $\sigma_t$. The standard deviation of log GDP per capita across countries at time $t$. The “$\sigma$” in $\sigma$-convergence. Tracks the width of the world income distribution year by year.

Example

The variance of log gdppc rose 90.8% from 0.924 in 1960 to 1.764 in 2019, with a peak of 1.918 in 2008. The dispersion narrative is the opposite of the post-2000 beta-convergence narrative: the rear runner is now faster, but the flock has not yet tightened.

Analogy

Standard deviation of incomes in a class. If everyone earns roughly the same, $\sigma$ is small. If a few earn very much and many earn very little, $\sigma$ is large. Sigma convergence asks whether $\sigma$ is shrinking over time.

2. Analytical roadmap

The tutorial progresses from the simplest possible convergence test to the most comprehensive. Each section builds on the previous one, adding complexity and robustness.

graph LR
A["<b>Simple OLS</b><br/>1960-2019<br/><i>Section 4</i>"]
B["<b>Two Eras</b><br/>Structural Break<br/><i>Section 5</i>"]
C["<b>Speed from OLS</b><br/>λ → β conversion<br/><i>Section 6</i>"]
D["<b>NLS Framework</b><br/>Direct estimation<br/><i>Sections 7-9</i>"]
E["<b>Rolling Windows</b><br/>λ, then β<br/><i>Sections 10-11</i>"]
F["<b>Sigma</b><br/>Convergence<br/><i>Sections 12-14</i>"]
G["<b>Heatmaps</b><br/>OLS & NLS<br/><i>Section 15</i>"]
A --> B --> C --> D --> E --> F --> G
style A fill:#6a9bcc,stroke:#141413,color:#fff
style B fill:#d97757,stroke:#141413,color:#fff
style C fill:#00d4c8,stroke:#141413,color:#141413
style D fill:#6a9bcc,stroke:#141413,color:#fff
style E fill:#d97757,stroke:#141413,color:#fff
style F fill:#00d4c8,stroke:#141413,color:#141413
style G fill:#6a9bcc,stroke:#141413,color:#fff

We start with the simplest OLS test (does initial income predict growth?), then split the sample to reveal a structural break. Next, we show how to extract the speed of convergence from OLS output using a straightforward algebraic conversion. We then introduce Nonlinear Least Squares (NLS) as a direct estimation method and compare the two approaches. A pedagogical introduction to rolling windows starts with the raw OLS coefficient $\lambda$ before progressing to the structural $\beta$, including a full walkthrough of how confidence intervals are constructed and transformed. We then shift from beta to sigma convergence, show why one does not imply the other, and track the income distribution over time. Finally, convergence heatmaps covering every possible time window provide the most comprehensive robustness check.

3. Setup and data preparation

We use the Penn World Tables version 10.0 (Feenstra, Inklaar, and Timmer, 2015), the standard dataset for cross-country income comparisons. It provides expenditure-side real GDP in purchasing power parity (PPP) terms, which makes incomes comparable across countries with different price levels. Following Patel et al. (2021), we exclude oil-producing countries (whose income reflects resource rents rather than productive convergence) and very small countries (population under 1 million). We further restrict the sample to a balanced panel of 84 countries with GDP per capita data available since 1960, ensuring that the same set of countries is used consistently across all sections of the tutorial.

* Load Penn World Tables 10.0
use "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/stata_convergence/pwt100.dta", clear
rename countrycode ccode
keep country ccode year pop rgdpe
* Compute GDP per capita (PPP, 2017 US$)
gen gdppc = rgdpe / pop
drop if missing(gdppc) | missing(pop)
* Exclude oil-producing countries (IMF classification; the list below has 27 country codes)
gen oil = inlist(ccode, "DZA", "AGO", "AZE", "BHR", "BRN", "TCD", "COG") | ///
inlist(ccode, "ECU", "GNQ", "GAB", "IRN", "IRQ", "KAZ", "KWT") | ///
inlist(ccode, "NGA", "OMN", "QAT", "RUS", "SAU", "TTO", "TKM") | ///
inlist(ccode, "ARE", "VEN", "YEM", "LBY", "TLS", "SDN")
drop if oil == 1
drop oil
* Exclude small countries (population < 1 million; PWT reports pop in millions)
drop if pop < 1
* Restrict to 1960 onwards
drop if year < 1960
* Restrict to balanced panel: countries with data in 1960
bys ccode: egen has1960 = max(year == 1960 & !missing(gdppc))
keep if has1960 == 1
drop has1960
summarize gdppc, detail

 Real GDP per capita (PPP, 2017 US$)
-------------------------------------------------------------
Percentiles Smallest
1% 498.6677 368.2704
5% 805.8461 425.7048
10% 1048.736 498.6677 Obs 5,040
25% 1927.449 523.0073 Sum of wgt. 5,040
50% 4873.137 Mean 10811.48
Largest Std. dev. 14375.5
75% 14282.34 88681.06
90% 30734.83 89403.9 Variance 2.07e+08
95% 35014 90413.35 Skewness 2.158023
99% 55579.96 102937.7 Kurtosis 8.099127
Number of unique countries: 84

The cleaned dataset contains 5,040 country-year observations across 84 unique countries spanning 1960–2019. GDP per capita ranges from \$368 (the poorest country-year) to \$102,938 (the richest), with a median of \$4,873 and a mean of \$10,811. The large gap between mean and median — reinforced by a skewness of 2.16 — reflects the heavy right tail of the world income distribution: a small number of very rich countries pull the average far above the typical country. Because we restrict to countries with data available since 1960, this is a balanced panel: the same 84 countries appear in every year, eliminating composition effects that would arise if the sample grew over time.
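The balanced-panel claim is easy to verify before moving on. The following is a minimal sketch that assumes the cleaned long-format data from the block above is still in memory. The helper variables n_years and first_row are introduced here for illustration only and are dropped at the end.

```stata
* Assumes the cleaned long-format panel (ccode, year, gdppc) is in memory
isid ccode year                          // errors out if a country-year is duplicated
bysort ccode: gen n_years = _N           // yearly observations per country
assert n_years == 60                     // 1960-2019 inclusive is 60 years
bysort ccode: gen first_row = (_n == 1)  // flag one row per country
count if first_row                       // number of distinct countries
assert r(N) == 84
drop n_years first_row
```

If either assert fails, the sample is not the balanced 84-country panel described above, and later sections would mix composition effects into the estimates.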

4. Beta convergence: the simplest test

Beta convergence — sometimes called absolute or unconditional convergence — asks a simple question: do countries that start poorer grow faster? If they do, the income gap should eventually close without any need to control for differences in institutions, education, or policy. We test this using ordinary least squares (OLS) regression of the average annual growth rate on the log of initial income. Think of it like a race: if the runners at the back are faster than those at the front, the pack will eventually bunch together.

The regression equation is:

$$g_i = \alpha + \lambda \cdot \ln(y_{i,0}) + \varepsilon_i$$

In words, this says that the annualized growth rate of country $i$ ($g_i$) depends linearly on the log of its initial GDP per capita ($\ln(y_{i,0})$). A negative $\lambda$ means convergence: countries that start with lower income grow faster. A positive or zero $\lambda$ means divergence or no convergence. In the code, $g_i$ corresponds to the variable growth and $\ln(y_{i,0})$ corresponds to initial.

* Reshape to wide: one row per country
reshape wide gdppc, i(ccode country) j(year)
* Annualized growth rate over 59 years
local s = 2019 - 1960
gen growth = (1/`s') * ln(gdppc2019 / gdppc1960)
* Log initial income
gen initial = ln(gdppc1960)
drop if missing(growth) | missing(initial)
* OLS regression with robust standard errors
reg growth initial, robust

Linear regression Number of obs = 84
F(1, 82) = 0.19
Prob > F = 0.6606
R-squared = 0.0013
Root MSE = .01502
------------------------------------------------------------------------------
| Robust
growth | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
initial | .0005689 .0012908 0.44 0.661 -.0019988 .0031366
_cons | .0176868 .0112996 1.57 0.121 -.0047917 .0401653
------------------------------------------------------------------------------

Over the full 1960–2019 period, the OLS coefficient on initial income is 0.00057 — positive, tiny, and statistically insignificant (p = 0.661, t = 0.44). The R-squared is just 0.13%, meaning initial income in 1960 has essentially zero predictive power for subsequent growth. The 84 countries grew at an average rate of about 2.2% per year, but this growth was completely unrelated to starting income levels. In the scatter plot, the fitted line is essentially flat. This “null result” seems to settle the question: no convergence over six decades. But this conclusion is misleading, because it masks a dramatic structural break that the next section reveals.
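The scatter plot mentioned above is not reproduced in this text. One way to draw a comparable figure is sketched below, assuming the variables growth, initial, and ccode from the regression block are in memory. The titles and marker options are illustrative choices, not the original figure's styling.

```stata
* Assumes growth, initial, and ccode exist from the block above
twoway (scatter growth initial, mlabel(ccode) mlabsize(tiny) msize(small)) ///
       (lfit growth initial),                                              ///
       xtitle("Log GDP per capita, 1960")                                  ///
       ytitle("Annualized growth, 1960-2019")                              ///
       title("Unconditional convergence, 1960-2019") legend(off)
```

The lfit layer draws the same fitted line as the regression, so a nearly horizontal line is the visual counterpart of the insignificant slope.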

5. The structural break: divergence vs. convergence

A single regression over 60 years hides a crucial story. The world changed in the mid-1990s. By splitting the sample at the year 2000, we can see two distinct eras: one where the income gap widened (divergence) and one where it began to close (convergence).

* Era of Divergence: 1960 to 2000
gen growth_era1 = (1/40) * ln(gdppc2000 / gdppc1960)
gen initial_era1 = ln(gdppc1960)
reg growth_era1 initial_era1, robust
* Era of Convergence: 2000 to 2019
gen growth_era2 = (1/19) * ln(gdppc2019 / gdppc2000)
gen initial_era2 = ln(gdppc2000)
reg growth_era2 initial_era2, robust

--- Era 1: 1960 to 2000 (the 'divergence era') ---
Linear regression Number of obs = 84
Prob > F = 0.0072
R-squared = 0.0436
growth_era1 | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
initial_era1 | .004366 .0015843 2.76 0.007 .0012143 .0075176
--- Era 2: 2000 to 2019 (the 'convergence era') ---
Linear regression Number of obs = 84
Prob > F = 0.0187
R-squared = 0.0688
growth_era2 | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
initial_era2 | -.0035228 .0014686 -2.40 0.019 -.0064442 -.0006013

The results reveal a dramatic reversal. During 1960–2000, the OLS coefficient is positive and significant ($\lambda$ = 0.00437, p = 0.007): richer countries grew faster, and the income gap widened. During 2000–2019, the coefficient flips to negative and significant ($\lambda$ = -0.00352, p = 0.019): poorer countries are now growing faster. The total swing of 0.0079 represents a complete reversal from divergence to convergence. This is what Patel et al. (2021) call “the new era of unconditional convergence.” But how fast is this convergence happening? The next section shows how to measure speed and half-life using nothing more than the OLS coefficient we already have.
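The two separate regressions show the sign flip but do not test whether the slopes differ. A minimal sketch of one such test follows: stack the two eras and interact initial income with a post-2000 dummy. It assumes the four era variables created above are in memory. The names g, x, post2000, and country_id are introduced here for illustration, and clustering by country is my assumption rather than a choice made in the text.

```stata
* Assumes growth_era1, initial_era1, growth_era2, initial_era2 are in memory
preserve
keep ccode growth_era1 initial_era1 growth_era2 initial_era2
rename (growth_era1 initial_era1 growth_era2 initial_era2) (g1 x1 g2 x2)
reshape long g x, i(ccode) j(era)        // two rows per country: era 1 and era 2
gen post2000 = (era == 2)
egen country_id = group(ccode)           // numeric id for clustering
* Fully interacted model: the interaction is lambda(era 2) - lambda(era 1)
reg g c.x##i.post2000, vce(cluster country_id)
test 1.post2000#c.x                      // H0: same slope in both eras
restore
```

Because the model is fully interacted, the interaction coefficient equals the difference between the two era slopes, so it should reproduce the swing of about -0.0079 in magnitude and sign. Only the standard error depends on the clustering assumption.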

6. Speed of convergence and half-life from OLS

Knowing that convergence exists is only the first step. We also want to know: how fast are poor countries catching up? The OLS coefficient $\lambda$ tells us the direction, but its magnitude depends on the length of the growth period ($s$), making it hard to compare across time windows. We need a structural parameter $\beta$ — the speed of convergence — that is invariant to period length.

The good news: we can extract $\beta$ directly from the OLS coefficient using a simple algebraic conversion. The relationship comes from the Barro and Sala-i-Martin (1992) convergence model, which implies that the OLS coefficient $\lambda$ and the structural speed $\beta$ are related by:

$$\lambda = -\frac{1 - e^{-\beta s}}{s}$$

In words, the OLS slope is a nonlinear function of the speed of convergence $\beta$ and the time span $s$. Three steps solve this equation for $\beta$, and a fourth converts $\beta$ into a half-life:

Step 1. Multiply both sides by $s$:

$$\lambda s = -(1 - e^{-\beta s})$$

Step 2. Rearrange:

$$e^{-\beta s} = 1 + \lambda s$$

Step 3. Take the natural log and solve for $\beta$:

$$\beta = \frac{-\ln(1 + \lambda s)}{s}$$

Step 4. Compute the half-life — how many years to close half the income gap:

$$\tau = \frac{\ln(2)}{\beta}$$

The classic benchmark from the convergence literature is $\beta \approx 0.02$ (2% per year) with a half-life of about 35 years (Barro and Sala-i-Martin, 1992; Sala-i-Martin, 1996). But that was for conditional convergence — controlling for human capital, institutions, and other factors. Unconditional convergence, which requires no controls, is much slower.
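Before looping over periods, it can help to run the four steps once by hand. The sketch below uses Stata scalars with the 2000–2019 slope from Section 5 typed in as a literal, so it needs no data in memory. The scalar names are illustrative.

```stata
* Inputs copied from the 2000-2019 regression in Section 5
scalar lambda = -0.0035228
scalar s      = 19
scalar beta     = -ln(1 + lambda * s) / s    // Step 3: structural speed
scalar halflife = ln(2) / beta               // Step 4: years to close half the gap
display "beta      = " %9.7f beta
display "half-life = " %6.1f halflife " years"
* Benchmark check: beta = 0.02 implies a half-life of about 35 years
display "benchmark = " %6.1f ln(2)/0.02 " years"
```

The first two lines of output should agree with a speed of roughly 0.36% per year and a half-life of roughly 190 years. The last line confirms that the 35-year benchmark is simply $\ln(2)/0.02$.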

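* For each period: run OLS, get λ, convert to β = -ln(1+λs)/s, compute half-life
foreach period in "1960-2019" "1960-2000" "1980-2019" "1990-2019" "1995-2019" "2000-2019" {
    * Parse start year, end year, and span from the period label
    local y0 = substr("`period'", 1, 4)
    local y1 = substr("`period'", 6, 4)
    local s = `y1' - `y0'
    gen outcome = (1/`s') * ln(gdppc`y1' / gdppc`y0')
    gen initial_inc = ln(gdppc`y0')
    quietly reg outcome initial_inc, robust
    local lambda = _b[initial_inc]
    * Convert OLS λ to structural β
    local beta = -ln(1 + `lambda' * `s') / `s'
    * Half-life is defined only under convergence (β > 0)
    local halflife = cond(`beta' > 0, ln(2) / `beta', .)
    display "`period'  lambda = " %10.8f `lambda' "  beta = " %10.8f `beta' "  half-life = " %9.2f `halflife'
    drop outcome initial_inc
}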

Speed of Convergence from OLS: λ → β → Half-Life
period lambda_ols beta_ols speed_ols halflife_ols n
1960-2000 .00436597 -.00402402 -.4024021 . 84
1960-2019 .00056889 -.00055955 -.0559547 . 84
1980-2019 .00113216 -.00110461 -.110461 . 84
1990-2019 -.00008191 .00008131 .0081305 8525.66 84
1995-2019 -.00178267 .00181768 .1817678 381.3365 84
2000-2019 -.00352278 .0036462 .3646201 190.0984 84
Benchmarks (Barro & Sala-i-Martin 1992, conditional convergence):
Speed: 2.00% per year
Half-life: 35 years

The table reveals a clear acceleration. For 1960–2000, the structural $\beta$ is negative (-0.00402), confirming divergence at a rate of 0.40% per year — incomes were spreading apart. As the start year moves forward, convergence emerges and strengthens: essentially zero for 1990–2019, 0.18% per year for 1995–2019, and 0.36% for 2000–2019. The 2000–2019 estimate of $\beta$ = 0.00365 with a half-life of 190 years means that at the current pace, the average developing country would close only half the gap to its steady-state income in nearly two centuries. This is roughly five times slower than the 35-year benchmark for conditional convergence. Unconditional convergence is statistically real, but it is extremely slow.

We computed these results using nothing more than OLS and an algebraic formula. But there is a more direct way to estimate $\beta$ — one that does not require any conversion. The next section introduces Nonlinear Least Squares.

7. What is Nonlinear Least Squares (NLS)?

The OLS-to-$\beta$ conversion in Section 6 works, but it goes backwards: we estimate $\lambda$ first, then convert to $\beta$. Can we estimate $\beta$ directly? Yes — using Nonlinear Least Squares (NLS).

Why can’t OLS estimate $\beta$ directly?

The Barro-Sala-i-Martin (1992) convergence equation is:

$$\frac{1}{s} \ln\left(\frac{y_{i,t+s}}{y_{i,t}}\right) = \alpha - \frac{1 - e^{-\beta s}}{s} \cdot \ln(y_{i,t}) + \varepsilon_i$$

The parameter $\beta$ appears inside an exponential: $e^{-\beta s}$. OLS requires that parameters enter the equation linearly — as coefficients that multiply variables. Since $\beta$ is trapped inside $\exp()$, OLS cannot estimate it directly. Instead, OLS estimates the entire expression $-\frac{1 - e^{-\beta s}}{s}$ as a single coefficient $\lambda$, and we must back out $\beta$ algebraically.

What does NLS do?

Like OLS, NLS minimizes the sum of squared residuals:

$$\min_{\alpha, \beta} \sum_{i=1}^{N} \left[ g_i - f(\ln y_{i,0}; \alpha, \beta) \right]^2$$

But unlike OLS, the function $f()$ can be any nonlinear function of the parameters. NLS uses an iterative algorithm:

Start with an initial guess for $\beta$ (e.g., $\beta_0 = 0.02$, the classic benchmark)
Compute predicted values and residuals given the current guess
Adjust $\beta$ in the direction that reduces the sum of squared residuals
Repeat until the improvement is negligible (the algorithm has “converged”)
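The idea of searching for the $\beta$ that minimizes the sum of squared residuals can be made concrete with a deliberately crude substitute: a grid search. This is not the derivative-based routine that nl runs. It is a sketch that tries 101 candidate values of $\beta$ for the 2000–2019 window, concentrates out the intercept (for a fixed $\beta$, the best $\alpha$ is the mean of the partial residual), and keeps the candidate with the smallest SSR. It assumes the wide dataset with gdppc2000 and gdppc2019 is in memory. The grid range and step are arbitrary choices.

```stata
* Assumes gdppc2000 and gdppc2019 exist in the wide dataset
local s = 19
tempvar g x r
gen `g' = (1/`s') * ln(gdppc2019 / gdppc2000)
gen `x' = ln(gdppc2000)
local best_ssr = .                        // missing compares as larger than any number
forvalues i = 0/100 {
    local b = `i' / 10000                 // candidate beta from 0 to 0.01
    capture drop `r'
    gen `r' = `g' + (1 - exp(-`b' * `s')) / `s' * `x'
    quietly summarize `r'
    local ssr = r(Var) * (r(N) - 1)       // SSR once alpha is set to the mean of r
    if `ssr' < `best_ssr' {
        local best_ssr = `ssr'
        local best_b = `b'
    }
}
display "grid minimum at beta = " `best_b'
```

The grid minimum can only be as precise as the step size (0.0001 here). An iterative algorithm reaches the same minimum far more precisely and with fewer function evaluations, which is why nl is the practical choice.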

How to estimate NLS in Stata

Stata’s nl command performs NLS estimation. The syntax places the entire nonlinear equation inside parentheses, with parameters in curly braces:

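* NLS estimation for 2000-2019
local s = 19
gen outcome = (1/`s') * ln(gdppc2019 / gdppc2000)
gen initial_inc = ln(gdppc2000)
nl (outcome = {b0=1} - (1 - exp(-1*{b1=0.02}*`s'))/`s' * initial_inc), vce(robust)
drop outcome initial_inc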

Reading the syntax:

{b0=1} — the intercept $\alpha$, with initial guess = 1
{b1=0.02} — the speed of convergence $\beta$, with initial guess = 0.02 (the 2% benchmark)
*19 — $s$ = 19 years (2000 to 2019)
initial_inc — $\ln(y_{2000})$, the independent variable
vce(robust) — heteroskedasticity-robust standard errors

Nonlinear regression Number of obs = 84
R-squared = 0.0704
Root MSE = .0215709
------------------------------------------------------------------------------
| Robust
outcome | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
/b0 | .0580907 .014098 4.12 0.000 .0300452 .0861362
/b1 | .0036462 .0015739 2.32 0.023 .0005152 .0067772
------------------------------------------------------------------------------
HOW TO READ THE OUTPUT:
/b1 = 0.00365 → This is β (speed of convergence)
Speed = 0.36% per year
Half-life = 190.1 years
COMPARISON with OLS conversion:
OLS λ = -0.00352
OLS → β = -ln(1 + -0.00352 × 19) / 19 = 0.00365
NLS β = 0.00365
Difference = 0.0000000

Why use NLS?

The advantage of NLS is that standard errors and p-values apply directly to $\beta$ itself. With OLS, the standard error applies to $\lambda$, and transforming it to $\beta$ requires the delta method — an additional mathematical step. NLS gives you $\beta$, its standard error, and a p-value in one shot. The advantage of OLS is simplicity: it is faster, always converges, and gives identical point estimates after conversion.
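The delta-method step mentioned above does not have to be done by hand. As a sketch, Stata's nlcom command can apply it to the OLS slope directly. The example assumes growth_era2 and initial_era2 from Section 5 are in memory and hard-codes $s = 19$.

```stata
* Assumes growth_era2 and initial_era2 from Section 5 are in memory
reg growth_era2 initial_era2, robust
* Delta-method standard error for beta = -ln(1 + lambda*s)/s with s = 19
nlcom (beta: -ln(1 + _b[initial_era2] * 19) / 19)
```

Since $d\beta/d\lambda = -1/(1 + \lambda s)$, the delta-method standard error is $\text{SE}(\lambda)/(1 + \lambda s) = 0.0014686/0.9331 \approx 0.00157$, which lines up with the NLS standard error. The reported p-value may differ slightly from the nl output because nlcom uses the normal distribution by default rather than the t-distribution.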

8. Speed of convergence and half-life from NLS

Now we estimate $\beta$ directly via NLS for the same six periods as Section 6. The results should match the OLS conversion, confirming that both methods recover the same structural parameter.

* NLS estimation for each period
foreach period in "1960-2019" ... "2000-2019" {
nl (outcome = {b0=1} - (1 - exp(-1*{b1=0.00}*`s'))/`s' * initial_inc), vce(robust)
}

Speed of Convergence from NLS (Direct Estimation of β):
period beta_nls se_nls speed_nls halflife_nls n
1960-2000 -.00402402 .0013502 -.4024021 . 84
1960-2019 -.00055955 .0012508 -.0559547 . 84
1980-2019 -.00110461 .0013178 -.110461 . 84
1990-2019 .00008131 .0014044 .0081305 8525.66 84
1995-2019 .00181768 .0014633 .1817678 381.3365 84
2000-2019 .00364620 .0015739 .3646201 190.0984 84
Benchmarks (Barro & Sala-i-Martin 1992, conditional convergence):
Speed: 2.00% per year
Half-life: 35 years

The NLS results confirm the same pattern as the OLS conversion. For 2000–2019, NLS estimates $\beta$ = 0.00365 (SE = 0.00157, p = 0.023), identical to the OLS-derived value. The speed of 0.36% per year and half-life of 190 years are consistent across both methods. Notice that NLS provides a direct p-value for $\beta$: p = 0.023 confirms that unconditional convergence since 2000 is statistically significant at the 5% level. For 1960–2000, the NLS estimate of $\beta$ = -0.00402 (p = 0.004) confirms statistically significant divergence.

9. OLS vs NLS comparison

How do the two methods compare side by side? The point estimates should be nearly identical, since both minimize the same sum of squared residuals — the only difference is whether $\beta$ is estimated directly (NLS) or recovered algebraically from $\lambda$ (OLS).

OLS vs NLS: Side-by-Side Comparison
period lambda_ols beta_ols beta_nls diff speed_ols speed_nls n
1960-2000 .00436597 -.00402402 -.00402402 1.110e-16 -.4024021 -.4024021 84
1960-2019 .00056889 -.00055955 -.00055955 1.388e-17 -.0559547 -.0559547 84
1980-2019 .00113216 -.00110461 -.00110461 4.337e-17 -.110461 -.110461 84
1990-2019 -.00008191 .00008131 .00008131 1.735e-17 .0081305 .0081305 84
1995-2019 -.00178267 .00181768 .00181768 4.337e-17 .1817678 .1817678 84
2000-2019 -.00352278 .00364620 .00364620 4.337e-17 .3646201 .3646201 84

The differences are on the order of $10^{-17}$ — effectively zero, confirming that the OLS conversion $\beta = -\ln(1 + \lambda s)/s$ and NLS direct estimation recover the same structural parameter. This equivalence holds because the Barro-Sala-i-Martin equation is a reparameterization of the linear model, not a fundamentally different specification. The choice between OLS and NLS is therefore about convenience, not correctness:

Use OLS when you want simplicity, speed, and guaranteed convergence of the estimation algorithm.
Use NLS when you want standard errors and p-values directly for $\beta$ without applying the delta method.

Both approaches are correct. In the rolling-window and heatmap sections that follow, we present results from both methods.

10. Introduction to rolling windows

So far we have estimated convergence for specific time periods (1960–2019, 1960–2000, 2000–2019). But convergence is not a fixed property — it evolves over time. A rolling window lets us watch this evolution by estimating a separate regression for every possible start year, always ending in 2019. Each start year produces one regression, one coefficient, and one dot on the plot.

graph TD
A["Start = 1960, End = 2019<br/>(59 years)"] --> R1["OLS → λ₁"]
B["Start = 1961, End = 2019<br/>(58 years)"] --> R2["OLS → λ₂"]
C["Start = 1962, End = 2019<br/>(57 years)"] --> R3["OLS → λ₃"]
D["..."] --> R4["..."]
E["Start = 2010, End = 2019<br/>(9 years)"] --> R5["OLS → λ₅₁"]
R1 --> P["Plot all 51 λ values<br/>against start year"]
R2 --> P
R3 --> P
R4 --> P
R5 --> P
style A fill:#6a9bcc,stroke:#141413,color:#fff
style B fill:#6a9bcc,stroke:#141413,color:#fff
style C fill:#6a9bcc,stroke:#141413,color:#fff
style E fill:#6a9bcc,stroke:#141413,color:#fff
style P fill:#d97757,stroke:#141413,color:#fff

We start with the simplest rolling window: the raw OLS slope coefficient $\lambda$. This requires nothing beyond the reg command we already know.

Rolling OLS lambda

For each start year from 1960 to 2010, we run the same OLS regression as in Section 4 — growth on initial income — and collect the slope coefficient $\lambda$ along with its 95% confidence interval. The CI uses the standard OLS formula:

$$\lambda \pm t_{N-2, 0.025} \times \text{SE}(\lambda)$$

where $t_{N-2, 0.025}$ is the critical value from the t-distribution with $N-2$ degrees of freedom (82 for our 84-country sample).

* For each start year, run OLS and store lambda + CI
forval startyear = 1960(1)2010 {
local s = 2019 - `startyear'
gen outcome = (1/`s') * ln(gdppc2019 / gdppc`startyear')
gen initial_inc = ln(gdppc`startyear')
reg outcome initial_inc, robust
* Store lambda and its 95% CI
local lambda = _b[initial_inc]
local se = _se[initial_inc]
local lambda_lb = `lambda' - invttail(e(df_r), 0.025) * `se'
local lambda_ub = `lambda' + invttail(e(df_r), 0.025) * `se'
drop outcome initial_inc
}

Rolling OLS Lambda: Key Findings
startyear lambda se lower upper n
1960 .0005689 .0012908 -.0019988 .0031366 84
1970 .0009814 .0012959 -.0015964 .0035592 84
1980 .0011322 .0013758 -.0016047 .0038690 84
1990 -.0000819 .0014043 -.0028757 .0027119 84
1995 -.0017827 .0014030 -.0045739 .0010085 84
2000 -.0035228 .0014686 -.0064442 -.0006013 84
2005 -.0041503 .0017255 -.0075825 -.0007181 84
2010 -.0030074 .0018375 -.0066619 .0006471 84

The rolling $\lambda$ tells the convergence story in its rawest form. For start years in the 1960s–1980s, $\lambda$ is positive (above the dashed zero line): richer countries grew faster, meaning divergence. Around 1990, $\lambda$ crosses zero and then turns increasingly negative: poorer countries are now growing faster. The 95% CI bars show that $\lambda$ is statistically distinguishable from zero (the entire CI is below zero) for start years from about 1998, as in the 2000 and 2005 rows above. With only nine years of growth, the 2010 window is noisier and its interval again includes zero (upper bound 0.00065). Notice that the sign convention for $\lambda$ is the opposite of $\beta$: negative $\lambda$ means convergence, while positive $\beta$ means convergence.
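The loop shown earlier keeps each estimate in local macros, which are overwritten on the next pass. One way to keep all 51 estimates and plot them is Stata's postfile mechanism, sketched below under the assumption that the wide dataset (gdppc1960 through gdppc2019) is in memory. The handle name memhold, the tempfile rolling, and the graph options are illustrative.

```stata
* Assumes the wide dataset (gdppc1960 ... gdppc2019) is in memory
tempname memhold
tempfile rolling
postfile `memhold' startyear lambda se lower upper using `rolling'
forvalues startyear = 1960/2010 {
    local s = 2019 - `startyear'
    tempvar g x
    quietly gen `g' = (1/`s') * ln(gdppc2019 / gdppc`startyear')
    quietly gen `x' = ln(gdppc`startyear')
    quietly reg `g' `x', robust
    local crit = invttail(e(df_r), 0.025)
    * One row per window: slope, SE, and 95% CI bounds
    post `memhold' (`startyear') (_b[`x']) (_se[`x'])              ///
        (_b[`x'] - `crit' * _se[`x']) (_b[`x'] + `crit' * _se[`x'])
    drop `g' `x'
}
postclose `memhold'
preserve
use `rolling', clear
twoway (rcap lower upper startyear) (scatter lambda startyear),    ///
    yline(0, lpattern(dash)) xtitle("Start year")                  ///
    ytitle("OLS slope (lambda)") legend(off)
restore
```

Wrapping the plot in preserve and restore leaves the country-level data untouched, so the later sections can continue from the same dataset.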

From lambda to beta: transforming the confidence interval

To convert the rolling $\lambda$ to the structural speed of convergence $\beta$, we apply the formula from Section 6: $\beta = -\ln(1 + \lambda s)/s$. But what about the confidence interval? We can pass each endpoint of the $\lambda$ interval through the same formula, but we cannot keep the endpoints in the same order: the transformation is nonlinear and monotone decreasing, so a more negative $\lambda$ (stronger convergence) maps to a larger positive $\beta$. This means the bounds flip during transformation.

Let’s walk through this with the actual 2000–2019 estimates:

Step 1. The OLS CI for $\lambda$ (from the regression output):

$$\lambda = -0.00352, \quad \text{SE} = 0.00147, \quad s = 19$$

$$\text{CI for } \lambda: \quad [-0.00352 - 1.989 \times 0.00147, \quad -0.00352 + 1.989 \times 0.00147] = [-0.00644, \quad -0.00060]$$

Step 2. Transform each bound through $\beta = -\ln(1 + \lambda s)/s$:

$$\text{Lower } \lambda = -0.00644 \quad \Rightarrow \quad \beta = \frac{-\ln(1 + (-0.00644)(19))}{19} = \frac{-\ln(0.8776)}{19} = \frac{0.1305}{19} = 0.00687$$

$$\text{Upper } \lambda = -0.00060 \quad \Rightarrow \quad \beta = \frac{-\ln(1 + (-0.00060)(19))}{19} = \frac{-\ln(0.9886)}{19} = \frac{0.01147}{19} = 0.00060$$

Step 3. Notice the flip: the lower $\lambda$ bound (-0.00644) produced the upper $\beta$ bound (0.00687), and the upper $\lambda$ bound (-0.00060) produced the lower $\beta$ bound (0.00060). So:

$$\text{CI for } \beta: \quad [0.00060, \quad 0.00687]$$

This happens because $\beta = -\ln(1 + \lambda s)/s$ is a monotone decreasing function of $\lambda$: as $\lambda$ decreases (becomes more negative), $\beta$ increases (stronger convergence). In the code, we handle this by simply swapping the transformed bounds:

* Transform lambda CI to beta CI (bounds flip)
local beta_lb = -ln(1 + `lambda_ub' * `s') / `s' // upper lambda → lower beta
local beta_ub = -ln(1 + `lambda_lb' * `s') / `s' // lower lambda → upper beta
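
As a quick check on the arithmetic, the sketch below reproduces the 2000–2019 transformation from scalars typed in from the rolling $\lambda$ table, so it runs without the dataset. The scalar names and the hard-coded 82 residual degrees of freedom (84 countries minus two estimated coefficients) are illustrative assumptions, not part of the tutorial's own code.

```stata
* Inputs typed in from the rolling-lambda table (start year 2000)
scalar lambda = -0.0035228
scalar se     =  0.0014686
scalar s      = 19
scalar tcrit  = invttail(82, 0.025)   // assumption: df = 84 - 2

* Symmetric CI for lambda
scalar lambda_lb = lambda - tcrit*se
scalar lambda_ub = lambda + tcrit*se

* Transform the point estimate and each bound; the decreasing map swaps the bounds
scalar beta    = -ln(1 + lambda*s)/s
scalar beta_lb = -ln(1 + lambda_ub*s)/s   // upper lambda -> lower beta
scalar beta_ub = -ln(1 + lambda_lb*s)/s   // lower lambda -> upper beta

display "beta = " %8.5f beta "   CI = [" %8.5f beta_lb ", " %8.5f beta_ub "]"
```

The printed interval should agree with the hand calculation above up to rounding. It is not symmetric around $\beta$, which is a direct consequence of the logarithm in the conversion.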

With this understanding, we can now construct rolling windows for the structural speed $\beta$ using both OLS (with the conversion) and NLS (direct estimation).

11. Rolling beta convergence over time

We now apply the rolling-window approach to the structural speed of convergence $\beta$, using both methods from Sections 6–9. For each start year from 1960 to 2010, with end year fixed at 2019, we estimate $\beta$ via:

OLS: estimate $\lambda$, convert to $\beta = -\ln(1+\lambda s)/s$, transform CI bounds (with the flip)
NLS: estimate $\beta$ directly, CI comes straight from the standard error

OLS rolling beta

* For each start year, estimate OLS and convert λ → β
forval startyear = 1960(1)2010 {
    local s = 2019 - `startyear'
    * (outcome and initial_inc are rebuilt for each window; that step is omitted here)
    reg outcome initial_inc, robust
    local lambda = _b[initial_inc]
    local se = _se[initial_inc]
    * Convert lambda to beta
    local beta = -ln(1 + `lambda' * `s') / `s'
    * Convert CI (bounds flip due to monotone decreasing transformation)
    local lambda_lb = `lambda' - invttail(e(df_r), 0.025) * `se'
    local lambda_ub = `lambda' + invttail(e(df_r), 0.025) * `se'
    local beta_lb = -ln(1 + `lambda_ub' * `s') / `s' // upper λ → lower β
    local beta_ub = -ln(1 + `lambda_lb' * `s') / `s' // lower λ → upper β
}

Rolling OLS Beta Convergence: Key Findings
startyear beta speed_pct halflife n
1960 -.0005596 -.0559555 . 84
1970 -.000968 -.0967977 . 84
1980 -.001105 -.1104993 . 84
1990 .0000813 .0081259 8530.064 84
1995 .0018177 .1817691 381.334 84
2000 .0036462 .3646227 190.100 84
2005 .0044101 .4410113 157.172 84
2010 .0030897 .3089731 224.339 84
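
The speed_pct and halflife columns follow directly from $\beta$. A minimal sketch, assuming the standard definitions speed = 100 × $\beta$ and half-life = $\ln(2)/\beta$; the scalar names are illustrative, and the two $\beta$ values are typed in from the table rather than estimated:

```stata
* Betas typed in from the table above (start years 1980 and 2000)
foreach b in -0.001105 0.0036462 {
    scalar speed_pct = 100*(`b')
    * Half-life is only defined under convergence (beta > 0); otherwise missing
    scalar halflife = cond(`b' > 0, ln(2)/(`b'), .)
    display "beta = " %9.6f (`b') "   speed = " %6.3f speed_pct "%/yr   half-life = " %8.1f halflife
}
```

Rows with a negative $\beta$ show a missing half-life because there is no gap-closing process to time under divergence.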

NLS rolling beta

* For each start year, estimate NLS β directly
forval startyear = 1960(1)2010 {
local s = 2019 - `startyear'
nl (outcome = {b0=1} - (1 - exp(-1*{b1=0.00}*`s'))/`s' * initial_inc), vce(robust)
* CI comes directly: beta ± t × SE(beta)
}

Rolling NLS Beta Convergence: Key Findings
startyear beta speed_pct halflife n
1960 -.0005596 -.0559555 . 84
1970 -.000968 -.0967977 . 84
1980 -.001105 -.1104993 . 84
1990 .0000813 .0081259 8530.064 84
1995 .0018177 .1817691 381.334 84
2000 .0036462 .3646227 190.100 84
2005 .0044101 .4410113 157.172 84
2010 .0030897 .3089731 224.339 84

The rolling $\beta$ tells a clear story of transition, and the OLS and NLS results are identical in every row. For start years in the 1960s through the 1980s, $\beta$ is negative (divergence). It crosses zero around 1990, climbs steadily through the 1990s, and peaks at 0.00441 for start year 2005 (speed = 0.44%/yr, half-life = 157 years). For the most recent start years (2009–2010), the coefficient pulls back slightly to 0.00309 (half-life = 224 years), suggesting that convergence may have moderated, possibly reflecting effects of the 2008 financial crisis. The two figures look identical because the OLS conversion and NLS give the same point estimates; the only difference is that the NLS confidence intervals are derived directly from $\beta$’s standard error, while the OLS intervals are transformed from $\lambda$’s (with the bound-flipping described in Section 10). With convergence dynamics established, we now turn to a different question: is the actual spread of income across countries narrowing?
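
The equivalence of the two point estimates can be checked on simulated data. The sketch below is a stand-in, not the tutorial's analysis: the variable names mirror the loops above, but the 84 observations are random draws, and the coefficients used to generate them are arbitrary assumptions.

```stata
clear
set seed 12345
set obs 84
local s = 19
* Simulated stand-ins for the tutorial's variables (not PWT data)
generate initial_inc = rnormal(8.5, 1.2)
generate outcome = 0.05 - 0.0035*initial_inc + rnormal(0, 0.015)

* OLS, then the algebraic conversion from lambda to beta
regress outcome initial_inc, vce(robust)
scalar beta_ols = -ln(1 + _b[initial_inc]*`s')/`s'

* NLS with the same specification as in the rolling loop above
nl (outcome = {b0=1} - (1 - exp(-1*{b1=0.00}*`s'))/`s' * initial_inc), vce(robust)
matrix bhat = e(b)
scalar beta_nls = bhat[1,2]   // b1 is the second parameter in e(b)

display "OLS-converted beta = " %10.7f beta_ols "   NLS beta = " %10.7f beta_nls
```

The two numbers should agree up to the optimizer's tolerance, because NLS minimizes the same sum of squared residuals under a one-to-one reparameterization of the slope.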

12. Sigma convergence: is the spread narrowing?

Beta convergence asks whether poorer countries grow faster. Sigma convergence asks a different question: is the dispersion of income across countries getting smaller? We measure dispersion using the variance of log GDP per capita. If the variance decreases over time, incomes are bunching together (sigma convergence). If it increases, incomes are spreading apart (sigma divergence).

* Variance of log GDP per capita in 1960
gen logy = ln(gdppc)
ci variances logy if year == 1960
* Variance of log GDP per capita in 2019
ci variances logy if year == 2019

--- Cross-country dispersion in 1960 ---
Variable | Obs Variance [95% conf. interval]
logy | 84 .9244376 .6969585 1.285409
Std. Dev. = 0.9615
--- Cross-country dispersion in 2019 ---
Variable | Obs Variance [95% conf. interval]
logy | 84 1.763502 1.329631 2.452057
Std. Dev. = 1.3280
Sigma Convergence Test: 1960 vs 2019:
Change in variance: 0.8391 ( 90.8%)
Variance INCREASED: evidence of sigma-DIVERGENCE.

The error bars in the figure show 95% confidence intervals for the variance, computed using the chi-squared distribution. Stata’s ci variances command uses the formula:

$$\text{CI for } \sigma^2 = \left[\frac{(N-1) s^2}{\chi^2_{\alpha/2, N-1}}, \quad \frac{(N-1) s^2}{\chi^2_{1-\alpha/2, N-1}}\right]$$

where $s^2$ is the sample variance, $N$ = 84 countries, and $\chi^2_{\alpha/2, N-1}$ is the upper-tail critical value (the point with probability $\alpha/2$ to its right) of the chi-squared distribution with $N-1$ = 83 degrees of freedom, so the larger critical value sits in the denominator of the lower bound. This is the standard CI for a variance under the assumption that the data (log GDP per capita) are approximately normally distributed. Unlike the symmetric OLS confidence interval ($\hat{\theta} \pm t \times \text{SE}$), the chi-squared CI is asymmetric: the upper tail extends further than the lower tail, reflecting the right-skewed nature of the chi-squared distribution. This asymmetry is visible in the error bars: the upper whisker is longer than the lower one.
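
To make the formula concrete, the sketch below rebuilds the 1960 interval by hand from the variance and sample size reported in the output above. The scalar names are illustrative; `invchi2(df, p)` is Stata's inverse cumulative chi-squared function, so the upper-tail critical value at $\alpha/2$ corresponds to p = 0.975.

```stata
* Inputs typed in from the 1960 output above
scalar N  = 84
scalar s2 = 0.9244376
scalar df = N - 1

* invchi2(df, p) returns the value with cumulative probability p
scalar ci_lo = df*s2 / invchi2(df, 0.975)   // larger critical value -> lower bound
scalar ci_hi = df*s2 / invchi2(df, 0.025)   // smaller critical value -> upper bound

display "95% CI for the variance: [" %8.6f ci_lo ", " %8.6f ci_hi "]"
display "lower gap = " %6.4f (s2 - ci_lo) "   upper gap = " %6.4f (ci_hi - s2)
```

The second line of output makes the asymmetry explicit: the distance from the estimate to the upper bound exceeds the distance to the lower bound.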

Comparing the two endpoints, the variance of log GDP per capita increased by 90.8%, from 0.924 in 1960 to 1.764 in 2019. The standard deviation rose from 0.96 to 1.33. In 2019, a one-standard-deviation move along the world income distribution corresponds to a roughly 3.8-fold difference in living standards ($e^{1.33}$ = 3.78), up from a 2.6-fold difference in 1960 ($e^{0.96}$ = 2.61). This is clear evidence of sigma divergence over the full period: the world income distribution widened substantially, even though beta convergence exists in the recent era. How can poorer countries be growing faster and the income spread be widening at the same time? The next section explains this apparent paradox.

13. Why beta convergence is not enough

The seeming contradiction — beta convergence without sigma convergence — is not a paradox but a well-known theoretical result. Young, Higgins, and Levy (2008) proved that beta convergence is necessary but not sufficient for sigma convergence. Think of it like a race with wind gusts: even if the runners at the back are faster on average (beta convergence), random gusts can push some runners forward and others backward, keeping the pack spread out (no sigma convergence). The catch-up tendency must be strong enough to overcome the dispersing force of random shocks before the distribution actually narrows.
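
A stylized derivation shows the mechanism. It assumes a simple first-order law of motion for log income that is not estimated anywhere in this tutorial; the symbols $a$, $b$, $u_{i,t}$, and $\sigma_u^2$ are illustrative, and $b$ plays the role of the catch-up force:

$$y_{i,t+1} = a + (1-b)\, y_{i,t} + u_{i,t+1}, \qquad 0 < b < 1, \quad \operatorname{Var}(u_{i,t+1}) = \sigma_u^2$$

If the shocks are uncorrelated with current income, taking the cross-country variance of both sides gives

$$\sigma_{t+1}^2 = (1-b)^2 \sigma_t^2 + \sigma_u^2$$

This recursion has a steady state, and the distance to it shrinks geometrically:

$$\sigma_*^2 = \frac{\sigma_u^2}{1-(1-b)^2}, \qquad \sigma_{t+1}^2 - \sigma_*^2 = (1-b)^2 \left(\sigma_t^2 - \sigma_*^2\right)$$

So dispersion falls only when $\sigma_t^2 > \sigma_*^2$. With $b > 0$ (beta convergence) but shocks large enough that $\sigma_*^2$ exceeds the current variance, dispersion keeps rising toward $\sigma_*^2$. With $b \le 0$ the variance can never fall, which is why beta convergence is necessary but not sufficient.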

* Decade-by-decade OLS λ and variance of log income
foreach decade in 1960 1970 1980 1990 2000 2010 {
* OLS slope of growth on initial income
reg g_temp i_temp, robust
* Variance of log income at start of decade
summarize logy_temp
}

 Decade | OLS λ | σ² start | Interpretation
1960-1970 | 0.00594 | 0.9244 | λ≥0: divergence
1970-1980 | 0.00555 | 1.0818 | λ≥0: divergence
1980-1990 | 0.00686 | 1.2893 | λ≥0: divergence
1990-2000 | 0.00882 | 1.5384 | λ≥0: divergence
2000-2010 | -0.00379 | 1.8937 | λ<0: convergence
2010-2019 | -0.00305 | 1.8262 | λ<0: convergence

The decade-by-decade view confirms the theory in action. The OLS $\lambda$ turns negative (convergence) in 2000–2010, but the variance of log income does not begin declining until after 2008 — it peaks at 1.918 in 2008 before falling to 1.826 by 2010 and 1.764 by 2019. This creates an approximately 8-year lag: poorer countries started growing faster around 2000, but the overall income distribution only began narrowing around 2008. For nearly a decade, random growth shocks — economic crises, commodity price swings, conflict — offset the systematic catch-up tendency before the convergence force became strong enough to dominate. Now that we have established both the existence and the timing of convergence, the next section tracks sigma convergence year by year.

14. Sigma convergence over time

We now track the dispersion of income every year from 1960 to 2019. Because we use a balanced panel of 84 countries, the sample composition is constant throughout — there is no need for a separate “fixed sample” series to control for changing coverage.

* Variance of log GDP per capita each year (84-country balanced panel)
* Reuses logy and year from Section 12 (long format: one row per country-year)
forval yr = 1960(1)2019 {
    ci variances logy if year == `yr'
}

Sigma Convergence Over Time: Key Years
year variance n
1960 .9244376 84
1970 1.081847 84
1980 1.289282 84
1990 1.53844 84
2000 1.893675 84
2008 1.918209 84 (peak)
2010 1.826223 84
2019 1.763502 84

The error bars at each year are the chi-squared confidence intervals described in Section 12. Because our balanced panel has a constant $N$ = 84, the width of the CI at each year depends only on the variance itself: years with larger variance have wider bars in absolute terms. The bars do not reflect changes in sample size (which is constant throughout).

The variance series tells a two-act story. Act one (1960–2008): variance rose almost continuously from 0.924 to a peak of 1.918, an increase of 108% over nearly five decades. Act two (2008–2019): variance declined from 1.918 to 1.764, a drop of 8.1%. Sigma convergence is a genuinely recent phenomenon, emerging only after 2008. Even so, the 2019 variance (1.764) remains 91% higher than the 1960 value (0.924). The recent narrowing is real but has barely begun to undo decades of divergence. The next section provides the most comprehensive view of convergence by examining every possible time window.

15. The convergence heatmap

The heatmap is the most comprehensive visualization of convergence dynamics. For every possible start-year and end-year combination from 1960 to 2019, we estimate a separate regression, 1,770 windows in total (60 × 59 / 2), and color-code the result. Blue indicates convergence ($\beta > 0$) and red indicates divergence ($\beta < 0$). We produce two heatmaps: one using the OLS $\lambda \to \beta$ conversion and one using NLS direct estimation, following Patel et al. (2021) Figure 2.

* Loop over ALL start/end year combinations
forval startyear = 1960(1)2018 {
    local first = `startyear' + 1
    forval outcomeyear = `first'(1)2019 {
        local s = `outcomeyear' - `startyear'   // window length in years
        * OLS: estimate λ, convert to β = -ln(1+λs)/s
        reg outcome initial_inc, robust
        * NLS: estimate β directly
        nl (outcome = {b0=1} - (1 - exp(-1*{b1=0.00}*`s'))/`s' * initial_inc), vce(robust)
    }
}
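
The number of windows the double loop visits can be confirmed without any data. This is only a counting sketch; the local names `first` and `n_windows` are illustrative.

```stata
* Count the start/end windows visited by the double loop
local n_windows = 0
forvalues startyear = 1960(1)2018 {
    local first = `startyear' + 1
    forvalues outcomeyear = `first'(1)2019 {
        local ++n_windows
    }
}
display "Number of windows: `n_windows'"
```

Each window is estimated twice (once by OLS and once by NLS), so the heatmaps together rest on twice this many estimations.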

OLS heatmap

NLS heatmap

The pattern is strikingly clear and identical across both methods. The upper-right triangle (periods ending in 2010–2019) is dominated by blue, while the central and lower-left regions (periods ending before 2000) are dominated by red. The deepest red ($\beta < -0.0055$) is concentrated in short windows during the 1970s–1980s, when divergence was strongest. The deepest blue ($\beta > 0.0035$) appears for windows ending in 2015–2019 and starting after 1990. The transition from red to blue occurs gradually along diagonals, with the crossover point moving from the upper right toward the center. This confirms that the convergence finding is not an artifact of choosing specific endpoints — it appears robustly across many time windows. Along the diagonal (short intervals), estimates are noisier due to shorter periods. The two heatmaps are virtually indistinguishable, providing a final confirmation that OLS conversion and NLS direct estimation yield the same results.

16. Discussion

We set out to ask whether the world has entered a new era of unconditional convergence and how fast it is happening. The evidence is clear: yes, unconditional convergence is real since approximately 2000, but it is very slow.

The speed of convergence for 2000–2019 is 0.36% per year ($\beta$ = 0.00365, p = 0.023), with a half-life of 190 years — both OLS conversion and NLS direct estimation give this identical result. To put this in perspective, at this pace, a country currently at one-tenth of US income per capita would need nearly two centuries to close just half the gap — not to catch up entirely, but merely to halve the distance. This is roughly five times slower than the classic 2%/year benchmark for conditional convergence (Barro and Sala-i-Martin, 1992), which controls for human capital, institutions, and savings rates. The fact that unconditional convergence exists at all is remarkable, but its pace should temper optimism about automatic catch-up.

The sigma convergence results add an important nuance. Even though poorer countries have been growing faster since around 2000, the actual spread of world incomes only began narrowing after 2008, an 8-year lag. And even with this recent narrowing, the 2019 variance of log income is still 91% above its 1960 level. A policymaker looking at these results would conclude that convergence forces alone are far too slow to eliminate global poverty or close income gaps within any reasonable planning horizon. Active investment in education, infrastructure, institutions, and technology transfer remains essential.

A methodological contribution of this tutorial is demonstrating that the OLS $\lambda \to \beta$ conversion and NLS direct estimation are algebraically equivalent, producing identical point estimates. The choice between methods is one of convenience: OLS for simplicity, NLS for direct inference on $\beta$. Students can start with the familiar OLS framework and add NLS when they need standard errors for the structural parameter.

17. Summary and next steps

Key takeaways

No convergence over 1960–2019 as a whole (OLS $\lambda$ = 0.00057, p = 0.661), but this null result conceals a dramatic structural break around the year 2000.
Unconditional convergence since 2000 at a speed of 0.36% per year ($\beta$ = 0.00365, half-life = 190 years, N = 84, p = 0.023). This is statistically significant but five times slower than conditional convergence.
OLS and NLS give identical results. The algebraic conversion $\beta = -\ln(1 + \lambda s)/s$ recovers the same structural parameter as direct NLS estimation, confirming both methods are valid.
Sigma convergence lags beta convergence by ~8 years. The income variance peaked at 1.918 in 2008 and declined 8.1% by 2019. Random growth shocks delayed the narrowing of the distribution even as poorer countries grew faster on average.
The income distribution remains far wider than in 1960. Despite post-2008 sigma convergence, the 2019 variance of log GDP per capita (1.764) is 91% above the 1960 value (0.924). A one-standard-deviation move in the 2019 distribution corresponds to a 3.8-fold difference in living standards.

Limitations

The analysis uses a balanced panel of 84 countries with data available since 1960, excluding 40 countries that entered PWT coverage after 1960. These excluded countries are disproportionately from Africa and small island states, so the results may not generalize to the full set of developing countries.
The convergence regressions explain very little of the cross-country growth variation (R-squared from 0.001 to 0.069). The research question is about the sign and significance of the relationship, not prediction.
The most recent rolling-window estimates (start years 2009–2010) show some moderation in convergence speed, but shorter growth windows also mean more noise.
Results depend on the choice of income measure (expenditure-side real GDP at chained PPPs) and sample restrictions (excluding oil producers and small countries).

Next steps

Conditional convergence: Add controls for human capital, institutional quality, and savings rates to see whether the speed approaches the 2% benchmark.
Club convergence: Test whether countries converge to different steady states rather than a single global equilibrium (Phillips and Sul, 2007).
Within-country convergence: Apply the same framework to regions within a country to study subnational income dynamics.
Post-COVID update: Extend the analysis past 2019 to assess whether the pandemic disrupted or accelerated convergence.

18. Exercises

Change the breakpoint. Instead of splitting at the year 2000, try splitting at 1990 or 1995. Does the convergence coefficient in the recent era change? At what breakpoint does the coefficient first become significantly negative?
Conditional convergence. Add log population and a measure of education (the human capital index, built from years of schooling and returns to education, available in PWT 10.0 as hc) as controls to the NLS specification. How much does the speed of convergence increase? Does the half-life approach the 35-year conditional benchmark?
Alternative samples. Re-run the 2000–2019 NLS regression including oil producers. Then try including small countries. How sensitive is the convergence result to these sample restrictions?

19. References

Converging to Convergence: Understanding the Main Ideas of the Convergence Literature

Wed, 29 Apr 2026 00:00:00 +0000

1. Overview

For decades, one of the most important questions in economics has been: are poor countries catching up to rich ones? The answer has changed dramatically over time. In the 1960s, richer countries actually grew faster than poorer ones — a pattern called divergence. By the 2000s, this had reversed: poor countries were growing significantly faster, a phenomenon known as unconditional convergence (also called absolute convergence). What caused this shift?

This tutorial walks through the key ideas of the convergence literature by reproducing the main findings of Kremer, Willis, and You (2021), “Converging to Convergence.” The paper provides an elegant explanation: the world has “converged to convergence” because growth correlates — the policies, institutions, and human capital variables that predict economic growth — have themselves converged across countries. As poor countries improved their institutions and policies, the gap between unconditional convergence (a simple comparison of growth rates across income levels) and conditional convergence (controlling for institutions) closed. The central tool for understanding this is the omitted variable bias (OVB) formula, which decomposes exactly how much each growth correlate contributes to the convergence gap.

We use the authors' replication dataset, which combines Penn World Table 10.0 GDP data with over 50 institutional, policy, and cultural variables for approximately 160 countries from 1960 to 2017. The analysis is entirely descriptive — we document cross-country correlations and trends, but do not make causal claims.

Learning objectives

Understand beta-convergence and sigma-convergence and how to test for each
Track the trend in convergence over time using year-interacted regressions
Decompose convergence into contributions from income quartiles and geographic regions
Apply the omitted variable bias (OVB) formula to explain why unconditional convergence emerged
Distinguish between correlate-income slopes (delta), growth-correlate slopes (lambda), and their product
Evaluate whether the 1990s growth regression literature holds up as an out-of-sample test

Analytical roadmap

The diagram below shows the logical progression of the tutorial. We first establish the facts, then explain them.

graph LR
A["<b>Establish the<br/>Facts</b><br/><i>Sections 3--6</i>"]
B["<b>Correlate<br/>Convergence</b><br/><i>Section 7</i>"]
C["<b>OVB<br/>Framework</b><br/><i>Sections 8--10</i>"]
D["<b>The<br/>Punchline</b><br/><i>Section 11</i>"]
A --> B
B --> C
C --> D
style A fill:#6a9bcc,stroke:#141413,color:#fff
style B fill:#d97757,stroke:#141413,color:#fff
style C fill:#00d4c8,stroke:#141413,color:#141413
style D fill:#141413,stroke:#d97757,color:#fff

We start by documenting the emergence of convergence (scatter plots, rolling coefficients, sigma-convergence, quartile decompositions). Then we show that growth correlates have themselves converged. Finally, the OVB framework links these two facts, revealing that the gap between unconditional and conditional convergence closed because growth regression coefficients for policy variables collapsed.

Key concepts at a glance

The post leans on a small vocabulary repeatedly, and the rest of the tutorial assumes you can move between these terms quickly. Each concept below has three parts: a definition, an example, and an analogy. If a later section mentions “OVB decomposition” or “lambda flattening” and the term feels slippery, this is the section to re-read.

1. Beta convergence: unconditional $\beta$ vs conditional $\beta^*$. The unconditional $\beta$ is the slope of growth on log initial income with no controls. The conditional $\beta^*$ is the same slope after controlling for growth correlates. When both are negative, poorer countries are catching up, even relative to countries with similar institutions.

Example

For the polity2 sample in 2005, the unconditional $\beta = -0.767$ and the conditional $\beta^* = -0.807$. The two are within 0.04 of each other. Twenty years earlier (1985), the gap was 0.44 — institutions explained most of the apparent divergence.

Analogy

“Catching up overall” vs “catching up given the same institutions”. Imagine two race tracks: one mixes all runners, the other separates them by training regimen. If both show poor runners gaining, the catching-up is real.

2. Sigma convergence $\sigma_t$. The cross-country standard deviation of log GDP per capita at year $t$. Tracks the width of the world income distribution. A narrowing distribution is sigma convergence.

Example

$\sigma$ rose from 0.947 in 1960 to 1.217 in 2000 (peak), then eased to 1.173 by 2017. Income dispersion is no longer widening but has not yet narrowed substantially. Beta convergence has just begun the work that sigma convergence will eventually reflect.

Analogy

A flock of birds. Sigma asks whether the flock is tightening. Beta tells you which birds are flying faster. They are related but not the same: the laggard birds can accelerate without the flock yet looking tighter.

3. OVB decomposition $\beta - \beta^* = \delta \cdot \lambda$. The omitted-variable-bias identity. The gap between unconditional and conditional convergence equals the product of two slopes: $\delta$ (the correlate regressed on income) and $\lambda$ (growth regressed on the correlate, holding income fixed). When the gap closes, at least one of $\delta$ or $\lambda$ must have shrunk.

Example

For the polity2 example, the gap closed from 0.440 (1985) to 0.040 (2005). The product $\delta \cdot \lambda$ went from $0.440$ to $0.040$. Inspecting the components: $\lambda$ collapsed from 0.891 to 0.183 — the growth regression coefficient flattened.

Analogy

Double-entry bookkeeping. The total bias on the convergence books equals the product of two ledger entries, $\delta$ and $\lambda$. If the total drops, at least one of the entries must have dropped, and inspecting the two slopes separately tells you which one.
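
The OVB identity holds exactly in any sample, which is easy to verify on simulated data. The sketch below is an illustration under assumed values: the variable names `income`, `correlate`, and `growth` are hypothetical stand-ins, and the coefficients used to generate them are arbitrary rather than taken from the replication dataset.

```stata
clear
set seed 2021
set obs 500
* Simulated stand-ins (hypothetical names, not the replication data)
generate income    = rnormal(8.7, 1.2)              // log initial income
generate correlate = 0.5*income + rnormal()         // richer => more of the correlate
generate growth    = -0.8*income + 0.9*correlate + rnormal()

regress growth income                 // unconditional beta
scalar beta = _b[income]
regress growth income correlate       // conditional beta* and lambda
scalar beta_star = _b[income]
scalar lambda    = _b[correlate]
regress correlate income              // delta
scalar delta = _b[income]

display "beta - beta*  = " %9.6f (beta - beta_star)
display "delta*lambda  = " %9.6f (delta*lambda)
```

The two displayed numbers match to numerical precision, whatever seed or coefficients are chosen, because the identity is a property of least squares and not of the data-generating process.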

4. Growth correlates. The policy and institutional variables economists used to put on the right-hand side of growth regressions in the 1990s: inflation, investment, schooling, openness, political rights, rule of law, and so on. Each is meant to capture a “fundamental” of long-run growth.

Example

This post tracks polity2, FH_political_rights, investment, inflation, and barrolee2060 (schooling) as the headline correlates. Each has a story in the post: investment shows the strongest cross-country correlation with income; political rights show the most pronounced correlate-income flattening.

Analogy

Ingredients in a recipe. Some recipes call for many ingredients (high-inflation, low-savings, weak-rights), others for few. Growth correlates are the ingredients we suspect explain why some economies cook up more output than others.

5. Correlate–income slope $\delta$. The slope from regressing a correlate on log income. It measures how much more of the correlate richer countries have. A large positive $\delta$ for polity2 means richer countries are more democratic.

Example

For polity2, $\delta$ has stayed around 0.5–0.6 over decades. Richer countries have always tended to be more democratic. The correlate-income slope is not what flattened in the 1990s–2000s; it is the other half of the OVB product.

Analogy

How well-stocked the kitchen is. A wealthy kitchen has more ingredients on hand. The correlate-income slope $\delta$ measures the kitchen-stocking gradient: as a country gets richer, how much better-stocked does its kitchen become?

6. Growth-regression slope $\lambda$. The coefficient on a correlate when growth is regressed on the correlate (controlling for log income). How much each correlate contributes to growth, holding initial income fixed. A large $\lambda$ means the correlate matters; a small $\lambda$ means it does not.

Example

For polity2 in 1985, $\lambda = 0.891$. By 2005, $\lambda = 0.183$. The growth payoff to good political institutions has flattened dramatically over two decades.

Analogy

How much each ingredient matters in the recipe. A pinch of saffron used to be transformative. Now everyone uses it; the marginal effect is much smaller. Lambda is the marginal effect of the ingredient, not the amount of ingredient on hand.

7. Lambda flattening. The empirical observation that growth-regression coefficients $\lambda$ on short-run correlates have collapsed since the 1990s. The collapse is the real story: it is what made unconditional convergence emerge.

Example

Across the post’s correlate set, $\lambda$ for several short-run policy variables fell from 0.5–1.0 (1985) to 0.1–0.3 (2005). The longer-run correlates (like schooling) are stickier. The lambda flattening shrinks the OVB product and brings $\beta$ and $\beta^*$ into alignment.

Analogy

Ingredients losing their punch as kitchens equalize. When every kitchen has good knives and a working oven, the kitchens with the best knives no longer dominate. Lambda flattening is that universal-baseline effect.

8. Quartile and regional decomposition. A descriptive break-down of beta convergence by initial-income quartile or by region. Asks: which subgroup is doing the catching-up? A few quartiles or regions usually do most of the work.

Example

This post’s regional decomposition (Sub-Saharan Africa, East Asia, Latin America, OECD, etc.) attributes most of the post-2000 catch-up to East Asia and parts of South Asia. Within-quartile, the bottom two quartiles drive the recent convergence; the top two have stayed flat.

Analogy

Breaking the average down by income tier. The class average improved; was it because everyone improved, or because the bottom of the class caught up? Quartile decomposition answers exactly that question.

2. Setup and data loading

We begin by loading the Kremer et al. (2021) replication dataset, which has already been cleaned to exclude very small countries (population below 200,000) and resource-dependent economies (natural resource rents above 75% of GDP). We also merge regional classifications from the World Development Indicators.

clear all
set more off
set seed 42
set scheme s2color
* Load the main dataset
use "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/stata_convergence2/main_data.dta", clear
* Display panel structure
codebook country_id, compact
tab year if loggdp != ., missing
summarize loggdp loggdp_growth_10

Panel structure:
country_id: 174 unique countries, range 2--218
Years covered: 1960 to 2017
Countries with GDP data: 160
Key income variables:
Variable | Obs Mean Std. dev. Min Max
-----------+---------------------------------------------------------
loggdp | 8,328 8.712741 1.186573 5.368557 12.61823
loggdp_g~10| 6,888 1.962031 2.78512 -12.33628 22.12787

The dataset is an unbalanced panel of 160 countries observed over 58 years (1960–2017), with 8,328 country-year observations containing GDP data. The panel expands in two jumps — from 109 countries in 1960 to 137 in 1970 (decolonization) and to 160 in 1990 (post-Soviet states). Average log GDP per capita is 8.71, with a standard deviation of 1.19 log points reflecting enormous cross-country income inequality. The 10-year forward-looking growth rate — the main outcome variable — averages 1.96% per year with a range from -12.3% (economic collapses) to 22.1% (growth miracles).

We then define variable groups following the paper’s classification of growth correlates into four categories.

* Solow fundamentals (steady-state determinants)
local solow investment population_growth barrolee2060
* Short-run correlates (policies/institutions that can change quickly)
local short_run polity2 FH_political_rights FH_civil_liberties ///
pri_inv gov_spending inflation WDI_credit credit /* +19 more */
* Long-run correlates (geography and historical institutions)
local long_run population_1900 legor_uk legor_fr logem4 meantemp /* +7 more */
* Culture (Hofstede cultural dimensions)
local culture VSM_power_dist VSM_individualism VSM_masculinity /* +3 more */

The classification matters because the paper’s central finding is that short-run correlates behave very differently from Solow fundamentals in growth regressions. We will return to this distinction in Sections 9 and 10.

3. Has the world been converging? Scatter plots by decade

The simplest test for convergence is visual: plot 10-year economic growth against initial income level and check the slope. Beta-convergence — named after the slope coefficient $\beta$ in the regression of growth on income — means that poorer countries grow faster. A negative slope indicates convergence; a positive slope indicates divergence.

We run this regression for each decade separately, from the 1960s through 2007.

foreach yr in 1960 1970 1980 1990 2000 2007 {
quietly reg loggdp_growth_10 loggdp if year == `yr', robust
* Store coefficients for each decade
}
* Combine 6 scatter panels into one figure
graph combine G1 G2 G3 G4 G5 G6, rows(2) cols(3) ///
graphregion(color(white)) ///
title("Income Convergence by Decade", size(medium))
graph export "stata_convergence2_scatter_by_decade.png", replace width(2400)

Beta by decade:
decade | beta se pval n_obs
--------+----------------------------------------
1960 | 0.532 0.191 0.006 109
1970 | -0.075 0.292 0.799 137
1980 | 0.106 0.246 0.667 137
1990 | -0.127 0.220 0.564 160
2000 | -0.651 0.168 0.000 160
2007 | -0.764 0.146 0.000 160

The scatter plots reveal a dramatic historical reversal. In the 1960s, $\beta = +0.53$ (p = 0.006), meaning richer countries grew significantly faster — a world of divergence. Through the 1970s–1990s, the coefficient hovered near zero, statistically indistinguishable from zero in every decade. By the 2000s, a strongly negative $\beta = -0.65$ (p < 0.001) emerged, deepening to -0.76 by 2007. This shift from divergence to convergence — spanning roughly 1.3 percentage points of GDP growth per log point of income — represents a fundamental transformation in the global growth landscape.

But is this trend systematic, or just an artifact of picking the right decades? The next section tests whether convergence has been trending continuously.

4. The trend in beta-convergence

Rather than comparing snapshots, we track the convergence coefficient continuously over time. This is the paper’s key innovation: studying the trend in convergence, not just testing whether convergence exists at a single point in time.

The specification interacts log GDP per capita with year dummies, giving a separate $\beta_t$ for each year:

$$\text{Growth}_{i,t \to t+10} = \beta_t \cdot \log(\text{GDPpc}_{i,t}) + \mu_t + \varepsilon_{i,t}$$

In words, this equation says that 10-year forward-looking growth is a linear function of initial income, with a slope $\beta_t$ that varies by year and year fixed effects $\mu_t$ absorbing common shocks. A negative $\beta_t$ means convergence in year $t$; a positive $\beta_t$ means divergence.

* Estimate year-by-year beta coefficients using year-interacted regression
areg loggdp_growth_10 c.loggdp#i.year, absorb(year) robust cluster(country_id)
* Extract coefficients and plot with 95% CI
twoway (rarea ci_upper ci_lower year, fcolor("106 155 204%30") lwidth(none)) ///
(line beta year, lcolor("106 155 204") lwidth(medthick)) ///
(function y = 0, range(1960 2009) lcolor("217 119 87") lpattern(dash)), ///
xtitle("Year") ytitle("Beta-convergence coefficient") ///
title("Trend in Beta-Convergence, 1960-2007", size(medium))
graph export "stata_convergence2_beta_trend.png", replace width(2400)
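
The plotting command above expects variables named beta, ci_lower, and ci_upper, but the extraction step is only indicated by a comment. One way to build them, sketched here on simulated data, is to run the regression year by year with statsby. This is an assumption about the workflow, not the author's code. With year fixed effects and a full set of year interactions, the year-specific slopes equal those from separate yearly regressions, but the robust standard errors below are not clustered by country and so will differ somewhat from the areg output. All simulated values are hypothetical.

```stata
* Simulated stand-in for the tutorial's panel (all values hypothetical)
clear
set seed 12345
set obs 50
gen country_id = _n
expand 48
bysort country_id: gen year = 1959 + _n
gen double loggdp = rnormal(8.5, 1)
* Built-in slope drifts from +0.5 in 1960 to about -0.7 in 2007
gen double loggdp_growth_10 = (0.5 - 0.025*(year - 1960))*loggdp + rnormal(0, 2)

* One regression per year; keep the slope and its robust standard error
statsby beta=_b[loggdp] se=_se[loggdp], by(year) clear: ///
    regress loggdp_growth_10 loggdp, vce(robust)

* 95% confidence band (normal approximation)
gen ci_lower = beta - 1.96*se
gen ci_upper = beta + 1.96*se
list year beta ci_lower ci_upper if inlist(year, 1960, 1985, 2007), noobs
```

After this step the dataset holds one row per year, which is the shape the twoway rarea and line layers need.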

We also estimate a linear trend specification (Table 1) to test whether the downward movement is statistically significant.

Table 1: Converging to Convergence
------------------------------------------------------------
                  (1)            (2)            (3)
                  Pooled         Trend          By Decade
------------------------------------------------------------
loggdp            -0.270**       0.449**
                  (0.118)        (0.224)
loggdp_X~r                       -0.025***
                                 (0.006)
loggdp~60s                                      0.532***
                                                (0.191)
loggdp~00s                                      -0.651***
                                                (0.168)
loggdp~07s                                      -0.764***
                                                (0.146)
------------------------------------------------------------
N                 863            863            863
Year FE           Y              Y              Y
------------------------------------------------------------

The trend coefficient of -0.025 per year (p < 0.01) confirms that convergence has been a systematic trend, not just a snapshot. The convergence coefficient has decreased by 0.025 per year since 1960, which cumulates to about 1.25 over 50 years (0.025 × 50). The rolling year-by-year beta (Figure 2) shows this was not smooth: $\beta$ fluctuated around zero through the 1970s–1980s, then dropped sharply through the 1990s and 2000s, becoming consistently and significantly negative after 1999.
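
As a rough check on the trend column of Table 1, the fitted slope can be written out for individual years. This assumes the trend variable is coded as years since 1960, which the table does not state but which the size of the uninteracted coefficient suggests:

$$\hat{\beta}_t = 0.449 - 0.025 \times (t - 1960)$$

$$\hat{\beta}_{1960} = 0.449, \qquad \hat{\beta}_{2007} = 0.449 - 0.025 \times 47 = -0.726$$

$$\hat{\beta}_t = 0 \quad \text{at} \quad t = 1960 + \frac{0.449}{0.025} \approx 1978$$

Under that assumption the linear trend implies endpoint values close to the by-decade estimates in column (3), namely 0.532 for 1960 and -0.764 for 2007. A straight line cannot capture a path that hovered near zero for three decades before falling sharply, so the implied crossing date should not be read literally.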

This raises a natural follow-up question: if countries are growing at rates that should reduce income gaps (beta-convergence), has income dispersion actually narrowed?

5. Sigma-convergence: is income dispersion narrowing?

Beta-convergence (poorer countries growing faster) and sigma-convergence (declining cross-country income dispersion) are related but distinct concepts. Beta-convergence is necessary but not sufficient for sigma-convergence: catch-up growth must be strong enough to outweigh the random shocks that push countries apart. We measure sigma as the standard deviation of log GDP per capita across countries in each year.

preserve
collapse (sd) sigma = loggdp, by(year)
twoway (line sigma year, lcolor("106 155 204") lwidth(medthick)), ///
xtitle("Year") ytitle("SD of log GDP per capita") ///
title("Sigma-Convergence: Cross-Country Income Dispersion", size(medium))
graph export "stata_convergence2_sigma.png", replace width(2400)
restore

Sigma (SD of log GDP per capita):
Year | Sigma
-------+---------
1960 | 0.947
1970 | 1.086
1980 | 1.139
1990 | 1.146
2000 | 1.217 (peak)
2010 | 1.173
2017 | 1.173

The standard deviation of log GDP per capita rose steadily from 0.95 in 1960 to a peak of 1.22 in 2000, reflecting four decades of widening global inequality. After 2000, sigma began declining, reaching 1.13 by 2015 before ticking back up slightly to 1.17 in 2017. This pattern is consistent with beta-convergence leading sigma-convergence: the rolling beta started falling in the 1990s and turned significantly negative around 1999, while sigma only began declining after 2000. The lag occurs because sigma-convergence requires catch-up growth fast enough to offset the random shocks that push countries apart, a more demanding condition than simple beta-convergence.
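
The claim that beta-convergence is necessary but not sufficient can be made precise with a standard variance calculation. The notation is introduced here for illustration and is not from the tutorial: $y_{i,t}$ is log GDP per capita, $b$ is the slope of the one-period change in log income on initial log income, and $\sigma^2_\varepsilon$ is the variance of shocks, assumed uncorrelated with initial income.

$$y_{i,t+1} - y_{i,t} = a + b\, y_{i,t} + \varepsilon_{i,t+1}$$

$$\sigma^2_{t+1} = (1+b)^2 \sigma^2_t + \sigma^2_\varepsilon$$

$$\sigma^2_{t+1} < \sigma^2_t \iff \sigma^2_t > \frac{\sigma^2_\varepsilon}{1-(1+b)^2} \qquad (-1 < b < 0)$$

With $b \ge 0$ dispersion cannot fall, so a negative slope is necessary. It is not sufficient, because dispersion still rises whenever current dispersion sits below the threshold on the right-hand side. Note that $b$ is in log points per period, whereas the tutorial's $\beta$ is in percentage points of annual growth over a 10-year window, so over one decade $b = \beta / 10$ (for example, $\beta = -0.76$ corresponds to $b \approx -0.076$).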

Now that we have established the headline fact — convergence emerged around 2000 — we need to understand who is driving it. Is it catch-up growth at the bottom, stagnation at the top, or both?

6. Who drives convergence?

6.1 Income quartile decomposition

We decompose the convergence trend by sorting countries into income quartiles and tracking each group’s average growth rate over time. This reveals whether convergence reflects catch-up growth by the poorest countries, a growth slowdown among the richest, or both.

* Compute mean 10-year growth by income quartile and year
* Note: xtile on the full panel takes its cutoffs from all years pooled
* preserve/restore keeps the country-level panel intact for later sections
preserve
xtile quartile = loggdp, nq(4)
collapse (mean) mean_growth = loggdp_growth_10, by(quartile year)
* Plot 4 lines, one per quartile
twoway (line mean_growth year if quartile == 1, lcolor("255 141 61")) ///
    (line mean_growth year if quartile == 2, lcolor("246 199 0")) ///
    (line mean_growth year if quartile == 3, lcolor("146 195 51")) ///
    (line mean_growth year if quartile == 4, lcolor("106 155 204")), ///
    legend(label(1 "Q1 (Poorest)") label(2 "Q2") label(3 "Q3") label(4 "Q4 (Richest)"))
graph export "stata_convergence2_growth_by_quartile.png", replace width(2400)
restore

Mean 10-year growth by quartile:
Year | Q1(Poorest)     Q2      Q3   Q4(Richest)
-----+-----------------------------------------
1960 |     2.46      2.20    2.93      3.49
1985 |     0.49      0.99    1.46      1.76
2000 |     3.31      3.60    3.29      1.26
2007 |     3.02      2.18    1.60      0.31

Convergence since 2000 is driven by both catch-up growth at the bottom AND a growth slowdown at the top. In the 1960s, the richest quartile (Q4) grew fastest at 3.49% per year, while the poorest (Q1) grew at only 2.46%. By 2007, this ordering had completely reversed: Q1 grew at 3.02% while Q4 grew at just 0.31%. The richest quartile experienced the most dramatic decline, going from the fastest-growing group in the 1960s to the slowest by the 2000s. Think of it like a marathon where the leaders have slowed down while the runners at the back have sped up — the pack is compressing from both directions.
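
One caveat about the grouping step: xtile applied to the whole panel computes its cutoffs from all country-years pooled together, so a country's quartile reflects its position in the full-period distribution rather than among its contemporaries. If the intent is to rank countries within each year, a loop such as the following sketch would do it. Whether the figures above were produced this way is not stated, so treat this as an alternative rather than the author's method. The simulated data and the helper variable q_tmp are hypothetical.

```stata
* Simulated panel (hypothetical values) with upward-drifting income
clear
set seed 2021
set obs 40
gen country_id = _n
expand 3
bysort country_id: gen year = 1960 + 20*(_n - 1)
gen double loggdp = rnormal(8, 1) + 0.02*(year - 1960)

* Assign quartiles separately within each year
gen quartile = .
levelsof year, local(years)
foreach y of local years {
    xtile q_tmp = loggdp if year == `y', nq(4)
    replace quartile = q_tmp if year == `y'
    drop q_tmp
}

* Inspect group sizes by year
tabulate year quartile
```

With within-year cutoffs, every year contributes countries to all four groups, which a pooled cutoff does not guarantee when incomes trend upward.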

6.2 Regional robustness

A natural concern is that convergence might be driven by a single region — perhaps it disappears if we exclude China and the rest of Asia. We check by estimating the rolling beta trend while excluding each major region one at a time.

* For each region, estimate beta trend excluding that region
foreach reg in 1 2 3 4 {
areg loggdp_growth_10 c.loggdp#i.year if region_group != `reg', ///
absorb(year) robust cluster(country_id)
* Extract and store coefficients
}
graph export "stata_convergence2_beta_excluding_regions.png", replace width(2400)

Convergence holds when excluding any single region. Excluding Sub-Saharan Africa makes convergence even stronger ($\beta$ reaches -1.25 by 2000), consistent with Africa’s economic difficulties during the 1970s–1990s dragging the global average toward zero. Excluding Europe/North America yields a somewhat weaker but still clearly negative trend. The finding is genuinely global.

We have now established the core empirical facts: convergence emerged around 2000, it reflects forces on both ends of the income distribution, and it is not driven by any single region. The next step is to ask why. The paper’s key insight is that the answer lies in the behavior of growth correlates.

7. Have growth correlates converged?

The 1990s growth literature identified dozens of variables that predict economic growth: investment, education, democracy, governance, financial development, inflation, and many others. A key insight of Kremer et al. (2021) is that these variables are not static — they have been converging across countries just like income itself.

We test this by regressing the change in each correlate (from 1985 to 2015) on its initial level in 1985. A negative slope means correlate convergence — countries that started with worse values experienced the largest improvements.

* For each correlate: change = beta * initial_level + epsilon
* Example for Polity 2 (democracy score)
gen change = 100 * ((polity2_2015 - polity2_1985) / 30)
reg change polity2_1985, robust

Correlate beta-convergence (change 1985-2015 regressed on level 1985):
Variable | beta se n_obs pval
-----------------------+------------------------------------
investment | -2.978 0.395 118 0.000
population_growth | -1.530 0.277 172 0.000
polity2 | -2.029 0.168 131 0.000
FH_political_rights | -1.394 0.206 139 0.000
gov_spending | -1.611 0.305 114 0.000
inflation | -3.070 0.103 128 0.000
barrolee2060 | -0.158 0.105 136 0.136

Growth correlates have themselves been converging since 1985. The strongest convergence is in inflation ($\beta = -3.07$), investment ($\beta = -2.98$), and democracy as measured by Polity 2 ($\beta = -2.03$) — all significant at the 0.1% level. This means that the cross-country distribution of policies and institutions has been compressing: countries with initially worse institutions experienced the largest improvements. The notable exception is Barro-Lee education ($\beta = -0.16$, p = 0.14), where convergence is slower and not statistically significant.

This finding is crucial because it connects two previously separate literatures. The convergence literature asks whether poor countries are catching up in income. The institutions literature documents whether countries are catching up in policies. The answer to both is yes — and the next sections show these are not coincidences but are linked by the omitted variable bias formula.

8. The OVB framework: why does convergence emerge?

This section introduces the central analytical framework of the paper. The omitted variable bias (OVB) formula provides an exact decomposition of the gap between unconditional convergence (a simple comparison of growth and income) and conditional convergence (controlling for institutions). Understanding this decomposition is the key to answering why unconditional convergence emerged.

8.1 Three regressions

Consider any growth correlate — say, democracy (Polity 2 score). Three regressions define the framework:

Regression 1 — Unconditional convergence ($\beta$): Regress growth on income alone.

$$\text{Growth}_i = \alpha + \beta \cdot \log(\text{GDPpc}_i) + \varepsilon_i$$

If $\beta < 0$, poorer countries grow faster (convergence). If $\beta > 0$, richer countries grow faster (divergence).

Regression 2 — Conditional convergence ($\beta^{\ast}$): Regress growth on income and the correlate.

$$\text{Growth}_i = \alpha + \beta^{\ast} \cdot \log(\text{GDPpc}_i) + \lambda \cdot \text{Inst}_i + \varepsilon_i$$

$\beta^{\ast}$ is the convergence coefficient controlling for institutions. The coefficient $\lambda$ captures how much the correlate predicts growth, holding income constant. In the 1990s, $\beta^{\ast}$ was typically negative (conditional convergence) even when $\beta$ was not (no unconditional convergence).

Regression 3 — Correlate-income slope ($\delta$): Regress the correlate on income.

$$\text{Inst}_i = \nu + \delta \cdot \log(\text{GDPpc}_i) + u_i$$

$\delta$ captures how strongly the correlate varies with income. If $\delta > 0$, richer countries have better institutions (the “modernization hypothesis”).

8.2 The key equation

The OVB formula links these three regressions with an exact algebraic identity:

$$\beta - \beta^{\ast} = \delta \times \lambda$$

In words, this says that the gap between unconditional and conditional convergence equals the product of two things: (1) how much richer countries have better institutions ($\delta$), and (2) how much those institutions predict growth ($\lambda$). This is not an approximation. It is an algebraic identity that holds exactly for OLS estimates, provided all three regressions are run on the same sample.
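
The identity is easy to confirm numerically. The sketch below uses simulated data, and every variable name and coefficient in it is hypothetical. It runs the three regressions on one common sample and compares the two sides of the formula.

```stata
* Simulated cross-section; variable names and coefficients are hypothetical
clear
set seed 42
set obs 200
gen double loggdp = rnormal(8.5, 1.2)
* Richer countries score higher on the correlate
gen double inst   = 0.5*loggdp + rnormal()
gen double growth = -0.3*loggdp + 0.9*inst + rnormal(0, 1.5)

* Regression 1: unconditional slope (beta)
quietly regress growth loggdp
scalar b_uncond = _b[loggdp]
* Regression 2: conditional slope (beta*) and correlate coefficient (lambda)
quietly regress growth loggdp inst
scalar b_cond = _b[loggdp]
scalar lam    = _b[inst]
* Regression 3: correlate-income slope (delta)
quietly regress inst loggdp
scalar del = _b[loggdp]

* Both sides of the OVB identity
scalar gap  = b_uncond - b_cond
scalar prod = del*lam
display "beta - beta*   = " %9.6f scalar(gap)
display "delta x lambda = " %9.6f scalar(prod)
```

The two displayed numbers agree up to floating-point error for any seed, because the equality is a property of the OLS algebra rather than of the data-generating process.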

Why this matters. The decomposition tells us there are exactly three ways unconditional convergence can change over time:

(1) Conditional convergence itself changes ($\beta^{\ast}$ shifts), e.g., technology diffusion accelerates
(2) Correlate-income slopes change ($\delta$ shifts), e.g., rich and poor countries become equally democratic
(3) Growth regression coefficients change ($\lambda$ shifts), e.g., democracy stops predicting growth

The paper’s central finding: it is mainly mechanism 3 — $\lambda$ flattened — that explains the emergence of unconditional convergence.

8.3 Worked example: democracy (Polity 2)

Before generalizing, we build intuition with one correlate. Polity 2 measures democracy on a scale from -10 (autocracy) to +10 (full democracy). We divide it by its 1985 standard deviation so that coefficients are in comparable units across correlates.

* Normalize polity2 by its 1985 SD
quietly summarize polity2 if year == 1985
local sd_polity2 = r(sd)
gen polity2_norm = polity2 / `sd_polity2'
* --- Period: 1985 ---
* Common sample for all three regressions, so the OVB identity holds exactly
gen byte s1985 = year == 1985 & !missing(loggdp_growth_10, loggdp, polity2_norm)
* Regression 1 (Unconditional):
reg loggdp_growth_10 loggdp if s1985, robust
* Regression 2 (Conditional):
reg loggdp_growth_10 loggdp polity2_norm if s1985, robust
* Regression 3 (Income-Institution slope):
reg polity2_norm loggdp if s1985, robust
* Repeat for 2005

---- Period: 1985 ----
Regression 1 (Unconditional): beta = 0.328 (SE = 0.199, N = 124)
Regression 2 (Conditional): beta* = -0.111, lambda = 0.891
Regression 3 (Income-Inst): delta = 0.494
OVB DECOMPOSITION:
beta - beta* = 0.440 (actual gap)
delta x lambda = 0.440 (predicted by OVB formula)
delta = 0.494 (richer countries more democratic?)
lambda = 0.891 (democracy predicts growth?)
---- Period: 2005 ----
Regression 1 (Unconditional): beta = -0.767 (SE = 0.149, N = 147)
Regression 2 (Conditional): beta* = -0.807, lambda = 0.183
Regression 3 (Income-Inst): delta = 0.216
OVB DECOMPOSITION:
beta - beta* = 0.040 (actual gap)
delta x lambda = 0.040 (predicted by OVB formula)
delta = 0.216 (richer countries more democratic?)
lambda = 0.183 (democracy predicts growth?)
COMPARISON ACROSS TIME:
delta (1985) = 0.494 --> delta (2005) = 0.216 [STABLE]
lambda (1985) = 0.891 --> lambda (2005) = 0.183 [SHRANK]
gap (1985) = 0.440 --> gap (2005) = 0.040 [CLOSED]

This single example encapsulates the paper’s entire argument. In 1985, unconditional $\beta$ was +0.33 (divergence), but controlling for democracy revealed conditional convergence at $\beta^{\ast} = -0.11$. The gap of 0.44 is exactly predicted by $\delta \times \lambda = 0.494 \times 0.891 = 0.44$, because the OVB formula is an algebraic identity. By 2005, $\lambda$ had collapsed from 0.89 to 0.18: democracy went from being a powerful growth predictor (a one-SD higher Polity 2 score was associated with 0.89 percentage points faster annual growth) to a near-zero predictor. The resulting gap shrank from 0.44 to 0.04, a 91% reduction. The correlate-income slope $\delta$ also fell (from 0.49 to 0.22), but the primary driver was the collapse in $\lambda$.
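
The relative roles of $\delta$ and $\lambda$ can be checked directly from the printed estimates. The midpoint split below is one common way to apportion the change in a product. It is an illustration added here, not a calculation from the paper.

$$\Delta(\delta\lambda) = \bar{\lambda}\,\Delta\delta + \bar{\delta}\,\Delta\lambda, \qquad \bar{\lambda} = \tfrac{1}{2}(\lambda_{1985}+\lambda_{2005}), \quad \bar{\delta} = \tfrac{1}{2}(\delta_{1985}+\delta_{2005})$$

$$\bar{\lambda}\,\Delta\delta = 0.537 \times (0.216 - 0.494) = -0.149$$

$$\bar{\delta}\,\Delta\lambda = 0.355 \times (0.183 - 0.891) = -0.251$$

$$\Delta(\delta\lambda) = -0.149 - 0.251 = -0.400$$

On this accounting, the fall in $\lambda$ explains roughly 63% of the closing gap and the fall in $\delta$ roughly 37%.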

Think of it like a recipe that calls for two ingredients. The gap ($\delta \times \lambda$) was large in 1985 because both ingredients were present: richer countries had much better democracy ($\delta$ large) and democracy strongly predicted growth ($\lambda$ large). By 2005, the second ingredient ($\lambda$) had nearly vanished — it no longer mattered for growth predictions whether a country was democratic or not — so the recipe produced almost nothing.

Now we generalize: does this pattern hold across all growth correlates, not just democracy?

9. Are correlate-income slopes stable? (Delta)

The OVB formula has two components: $\delta$ (the correlate-income slope) and $\lambda$ (the growth-correlate slope). We examine each in turn. If $\delta$ — the relationship between income and institutions — has changed dramatically, that could explain the closing gap. But the paper finds that $\delta$ has been remarkably stable.

For each correlate, we compute $\delta$ in 1985 and in 2015, then scatter the 2015 value against the 1985 value. Points on the 45-degree line mean $\delta$ has not changed; points that lie closer to the horizontal axis than the line mean the relationship weakened.

* For each correlate: regress Inst on loggdp in 1985 and 2015
* All correlates normalized by their 1985 SD
* Panel A: Solow fundamentals + short-run correlates
* Panel B: Long-run correlates + culture
graph combine delta_A delta_B, rows(1) cols(2) ///
graphregion(color(white)) ///
title("Stability of Correlate-Income Slopes", size(medium))
graph export "stata_convergence2_delta_stability.png", replace width(2400)

Delta fitted line slopes (delta_2015 vs delta_1985):
Solow fundamentals: slope = 0.878
Short-Run correlates: slope = 0.886
Long-Run correlates: slope = 1.024
Culture: slope = 0.884

The correlate-income relationships are remarkably stable. Fitted lines cluster tightly around the 45-degree line: Solow fundamentals 0.88, short-run correlates 0.89, long-run correlates 1.02, culture 0.88. This means the cross-country association between income and institutions has barely changed over 30 years. Richer countries still have better democracy, more investment, lower population growth, and stronger financial sectors in essentially the same proportions as in 1985. The “modernization hypothesis” — that economic development goes hand-in-hand with institutional improvement — passes its out-of-sample test.

Crucially, this stability means that the $\delta$ component is not responsible for the closing gap between unconditional and conditional convergence. The answer must lie in the other component: $\lambda$.

10. Growth regressions then vs. now: the lambda flattening

In the 1990s, a massive literature ran growth regressions of the form: Growth = $\alpha + \beta^{\ast} \times$ Income $+ \lambda \times$ Correlate $+ \varepsilon$. These regressions identified which policies and institutions predict growth and formed the empirical backbone of the “Washington Consensus” — the set of policy recommendations that international institutions gave to developing countries. The key question: do these regressions hold up with 25 years of new data?

For each correlate, we estimate $\lambda$ (the growth-correlate slope) in the base year (~1985) and in 2005, using a fixed sample of countries with data in both periods.

* For each correlate, run the growth regression in base year and 2005
* Growth = alpha + beta* x loggdp + lambda x correlate + epsilon
* Fixed country sample per correlate
* Scatter lambda_2005 vs lambda_1985
reg lambda_2005 lambda_1985 if flag_solow == 1
* -> slope = 0.861, R-sq = 0.947
reg lambda_2005 lambda_1985 if flag_solow == 0 & flag_long_run == 0
* -> slope = 0.189, R-sq = 0.063

Lambda fitted line slopes (lambda_2005 vs lambda_1985):
Solow fundamentals: slope = 0.861, R-sq = 0.947
Short-run correlates: slope = 0.189, R-sq = 0.063
Long-Run correlates: slope = 0.296
Culture: slope = 0.685

This is the most striking empirical result of the paper. Solow fundamentals (investment, population growth, education) show high persistence: a fitted slope of 0.86 with R-squared of 0.95, meaning these deep structural variables predict growth almost as well in 2005 as in 1985. In dramatic contrast, short-run correlates (democracy, governance, fiscal policy, financial development) show near-zero persistence: a slope of 0.19 with R-squared of only 0.06. There is essentially no correlation between which policy variables predicted growth in 1985 and which predict growth in 2005.

The Washington Consensus growth regressions, which identified specific policies and institutions as growth drivers, have failed their out-of-sample test. Variables like Polity 2 ($\lambda$ fell from 0.89 to 0.34 in this fixed-sample comparison; the worked example in Section 8.3, whose 1985 and 2005 samples differ, reports 0.18 for 2005), FH Political Rights (1.11 to 0.19), and FH Civil Liberties (0.96 to 0.17) went from strong growth predictors to much weaker ones. Long-run correlates and culture occupy an intermediate position (slopes 0.30 and 0.69 respectively).

Why did this happen? There are at least three possible explanations: (a) as correlates converged (Section 7), the reduced cross-country variation made coefficient estimation noisier; (b) the original regressions may have been overfitted to a specific historical sample; (c) the relationship between institutions and growth may be non-linear — institutions matter most when differences are large, and less when all countries have reasonably good policies. The analysis cannot distinguish between these, but the empirical fact is clear: $\lambda$ collapsed.
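
Explanation (a) follows from the textbook sampling variance of an OLS coefficient. Under homoskedastic errors (a simplifying assumption, since the tutorial's regressions use robust standard errors), with $N$ countries, error variance $\sigma^2$, and $R^2_{\text{Inst}\mid\text{GDP}}$ denoting the R-squared from regressing the correlate on log income:

$$\operatorname{Var}(\hat{\lambda}) = \frac{\sigma^2}{N \cdot \widehat{\operatorname{Var}}(\text{Inst}) \cdot \left(1 - R^2_{\text{Inst}\mid\text{GDP}}\right)}$$

If the cross-country variance of a correlate halves, the variance of $\hat{\lambda}$ doubles, all else equal, so estimates scatter more widely around whatever the true coefficient is.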

Since $\delta$ is stable (Section 9) and $\lambda$ collapsed (this section), their product $\delta \times \lambda$ must have shrunk toward zero. The next section confirms this.

11. The punchline: absolute convergence converges to conditional

11.1 The OVB gap is closing

The product $\delta \times \lambda$ quantifies how much each correlate biases the unconditional convergence coefficient. We scatter $\delta \times \lambda$ in 2005 against its value in 1985 to see whether this “explanatory gap” has closed.

* Scatter delta*lambda in 2005 vs 1985
reg dl_2005 dl_1985 if flag_solow == 0 & flag_long_run == 0
* -> slope = 0.090 (short-run correlates: gap essentially vanished)
reg dl_2005 dl_1985 if flag_solow == 1
* -> slope = 0.740 (Solow fundamentals: gap partially retained)

OVB gap fitted line slopes (dl_2005 vs dl_1985):
Panel A:
Solow fundamentals: slope = 0.740
Short-Run correlates: slope = 0.090
Panel B:
Long-Run correlates: slope = 0.480
Culture: slope = 0.739

The OVB gap for short-run correlates has shrunk to nearly zero (fitted slope 0.09). In 1985, omitting these policy and institutional variables made unconditional convergence look substantially worse than conditional convergence. By 2005, the two are nearly identical. Solow fundamentals retained more of their explanatory power (slope 0.74), reflecting the stability of both their $\delta$ and $\lambda$ components. This confirms the paper’s central thesis: unconditional convergence emerged not because the income-correlate relationship changed ($\delta$ is stable) but because policy variables stopped predicting growth ($\lambda$ flattened).

11.2 The closing gap over time

The definitive test uses multivariate regressions. We fix a sample of 73 countries with complete data on 10 correlates (Polity 2, FH political rights, FH civil liberties, private investment, government spending, inflation, WDI credit, credit by financial sector, Barro-Lee education, and education gender gap). For each year from 1985 to 2007, we estimate both unconditional $\beta$ (income only) and conditional $\beta^{\ast}$ (income plus all 10 correlates).

* Fix sample: 73 countries with complete data on all 10 correlates in 1985
local var_all polity2 FH_political_rights FH_civil_liberties pri_inv ///
gov_spending inflation WDI_credit credit barrolee2060 edugap
forval yr = 1985/2007 {
* Unconditional: reg growth loggdp, robust cluster(country_id)
* Conditional: reg growth loggdp `var_all', robust cluster(country_id)
}
* Plot the closing gap
twoway (line beta_unconditional year, lcolor("20 20 19") lwidth(medthick)) ///
(line beta_conditional year, lcolor("106 155 204") lwidth(medthick)) ///
(line zero year, lcolor("217 119 87") lpattern(dot)), ///
legend(label(1 "Absolute Convergence") label(2 "Conditional Convergence"))
graph export "stata_convergence2_absolute_vs_conditional.png", replace width(2400)

Year | beta_unconditional beta_conditional gap
------+-------------------------------------------
1985 | 0.420 -1.072 1.492
1990 | 0.377 -0.560 0.937
1995 | 0.081 -0.155 0.236
2000 | -0.387 -0.540 0.153
2005 | -0.556 -0.969 0.413
2007 | -0.646 -1.274 0.629

This is the paper’s title finding. In 1985, unconditional $\beta$ was +0.42 (divergence) while conditional $\beta^{\ast}$ was -1.07 (strong convergence when controlling for institutions), a gap of 1.49. By 2000, unconditional $\beta$ had fallen to -0.39 while conditional $\beta^{\ast}$ was -0.54, narrowing the gap to just 0.15. The gap then widened again, to 0.41 in 2005 and 0.63 in 2007, as conditional $\beta^{\ast}$ deepened faster than unconditional $\beta$, but both coefficients are firmly negative from 2000 onward.

The Solow model’s prediction of conditional convergence held all along — what changed is that the real world caught up. As the OVB from excluding correlates shrank toward zero, unconditional convergence “converged to” conditional convergence.
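
The loop in this subsection is shown with its two regressions commented out. A minimal way to fill it in, sketched on simulated data, is to post both slopes to a results file with postfile. The single correlate inst, the fading coefficient lam_true, and all magnitudes are hypothetical stand-ins for the ten real correlates, and this is not the author's code.

```stata
* Simulated panel; names and magnitudes are hypothetical
clear
set seed 7
set obs 73
gen country_id = _n
expand 23
bysort country_id: gen year = 1984 + _n
gen double loggdp = rnormal(8.5, 1.2)
gen double inst   = 0.5*loggdp + rnormal()
* The correlate's growth payoff (lambda) fades linearly to zero by 2007
gen double lam_true = 0.9*(2007 - year)/22
gen double growth = -0.5*loggdp + lam_true*inst + rnormal(0, 1.5)

* Collect both slopes for every start year
tempname post_handle
tempfile results
postfile `post_handle' year beta_unconditional beta_conditional using `results'
forvalues yr = 1985/2007 {
    quietly regress growth loggdp if year == `yr', vce(robust)
    local b_u = _b[loggdp]
    quietly regress growth loggdp inst if year == `yr', vce(robust)
    post `post_handle' (`yr') (`b_u') (_b[loggdp])
}
postclose `post_handle'

* One row per year, ready for the twoway line plot
use `results', clear
gen gap = beta_unconditional - beta_conditional
list year beta_unconditional beta_conditional gap if inlist(year, 1985, 1995, 2007), noobs
```

Because the simulated lambda shrinks over time while delta stays fixed, the gap column should drift toward zero, mirroring the mechanism the section describes.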

11.3 Multivariate evidence (Table 5)

The multivariate regressions crystallize the structural change by showing how adding correlates affects the convergence coefficient in each period.

          abs_1985  solow_1985  short_1985  full_1985  abs_2005  solow_2005  short_2005  full_2005
loggdp       0.420      -0.447      -0.435     -0.816    -0.556      -1.176      -0.557     -1.040
           (0.252)     (0.661)     (0.457)    (0.619)   (0.203)     (0.309)     (0.327)    (0.393)
R2           0.028       0.155       0.152      0.228     0.101       0.247       0.258      0.355
N               73          73          73         73        73          73          73         73

In 1985, absolute convergence alone gives $\beta = +0.42$ (divergence, R-squared = 0.03, essentially no linear relationship). Adding Solow fundamentals flips the sign to $\beta^{\ast} = -0.45$, and the full model gives $\beta^{\ast} = -0.82$. In 2005, the picture changes fundamentally: absolute convergence is already strong at $\beta = -0.56$ (R-squared = 0.10). Adding short-run correlates alone barely changes the coefficient (from -0.556 to -0.557), confirming that omitting policy variables no longer biases the convergence coefficient. They still improve overall fit (R-squared rises from 0.10 to 0.26), but they leave the slope on income unchanged. Solow fundamentals, by contrast, still shift it (to -1.18), and the full model gives -1.04 with an R-squared of 0.35.

12. Robustness: does the averaging period matter?

The main results use 10-year forward-looking growth rates. One concern is that 10-year averaging may smooth out noise in a way that creates artificial trends. We check by re-estimating the rolling beta-convergence trend using 1-year, 2-year, 5-year, and 10-year growth averages.

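* For each averaging period t = 1, 2, 5, 10 (new names growth_avg1 ... growth_avg10 are illustrative)
* Lead operators need a declared panel; this assumes country_id is numeric
xtset country_id year
foreach t in 1 2 5 10 {
    gen growth_avg`t' = 100 * ((F`t'.logrgdpna - logrgdpna) / `t')
    areg growth_avg`t' c.loggdp#i.year, absorb(year) robust cluster(country_id)
}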

Results:
1-year average: high noise, downward trend visible but obscured by fluctuations
2-year average: moderate noise, downward trend clearer
5-year average: smooth, clear downward trend from ~0 to ~-0.5 by late 2000s
10-year average: smoothest, clearest trend from +0.5 to -0.76 by 2007

The convergence trend is robust across all averaging periods. As expected, shorter periods produce noisier estimates — the 1-year panel is dominated by year-to-year fluctuations — while longer averages yield smoother trends. All four specifications agree that the crossover from divergence to convergence occurs around 1990–2000, confirming that the finding is not an artifact of the 10-year growth rate choice.

13. Discussion

Let us return to the question posed in the Overview: why did unconditional convergence emerge since 2000?

The OVB framework provides a clear and quantitative answer. The gap between unconditional convergence ($\beta$) and conditional convergence ($\beta^{\ast}$) is exactly equal to the product $\delta \times \lambda$. This gap closed because $\lambda$, the coefficient on growth correlates in growth regressions, collapsed for short-run policy and institutional variables (slope = 0.19, R-squared = 0.06). Meanwhile, $\delta$, the relationship between income and institutions, remained remarkably stable (fitted slopes between 0.88 and 1.02, close to the 45-degree line). In concrete terms: richer countries still have better institutions in the same proportions as 30 years ago, but those institutional advantages no longer translate into faster growth. As a result, unconditional convergence caught up to conditional convergence.

This has important implications for how we think about economic development. The 1990s “Washington Consensus” was built on the empirical finding that good policies and institutions predict faster growth. Our out-of-sample test shows that many of these relationships did not persist into the 2000s — at least not for short-run policy variables. Solow fundamentals (investment, population growth, education) remained robust growth predictors, consistent with the Solow model’s enduring relevance. But governance indices, fiscal indicators, and financial variables that were “significant” in 1990s regressions no longer predict growth. This raises questions about the stability of policy advice based on cross-country growth regressions.

Caveats. Several important limitations apply. First, the analysis is entirely descriptive: cross-country regressions do not establish causal relationships. The flattening of $\lambda$ could reflect genuine changes in causal relationships, convergence in unobserved variables, or reduced cross-country variation making coefficient estimation noisier. Second, the panel is unbalanced (109 countries in 1960 vs. 160 by 1990), and sample composition changes could mechanically affect estimates. Third, some correlates have small samples (fewer than 60 observations), limiting statistical precision. Finally, the 10-year growth variable is forward-looking, so the last usable start year is 2007/2008, and that final window already spans the Global Financial Crisis. What the analysis misses is any growth window that begins after the crisis, as well as COVID-19. Whether convergence persisted through these shocks is an open question.

14. Summary and key takeaways

This tutorial reproduced the key findings of Kremer, Willis, and You (2021), documenting the emergence of unconditional convergence and explaining it through the OVB decomposition framework. The analysis used 160 countries over 58 years with 50+ growth correlates.

The story in four facts

Unconditional convergence emerged around 2000. The $\beta$-convergence coefficient shifted from +0.53 in the 1960s (divergence, p = 0.006) to -0.76 by 2007 (convergence, p < 0.001), with a systematic trend of -0.025 per year.
Growth correlates converged. Inflation ($\beta = -3.07$), investment ($\beta = -2.98$), and democracy ($\beta = -2.03$) all showed strong convergence. Countries with initially worse institutions experienced the largest improvements.
Growth regression coefficients collapsed for policy variables. Solow fundamentals maintained high stability ($\lambda$ slope = 0.86, R-squared = 0.95), but short-run correlates showed near-zero persistence ($\lambda$ slope = 0.19, R-squared = 0.06). The 1990s growth regressions failed their out-of-sample test.
The gap between absolute and conditional convergence closed. The Polity 2 worked example shows the gap fell from 0.44 to 0.04 (a 91% reduction). In the multivariate analysis, the gap narrowed from 1.49 (1985) to 0.15 (2000).

Limitations

Descriptive, not causal: The OVB framework decomposes observed correlations, not causal relationships
Pre-2008 endpoint: The last 10-year growth window starts in 2007/2008, so the analysis includes no window beginning after the Global Financial Crisis and none that covers COVID-19
Small samples for some correlates: Culture and tariff variables have fewer than 60 observations
Normalization sensitivity: All correlate coefficients are normalized by their 1985 standard deviation

Next steps

Extend the analysis through the 2010s using updated PWT data to test whether convergence survived the post-GFC period
Explore non-linear specifications to test whether $\lambda$ flattened because of reduced correlate variation
Apply the OVB decomposition to regional subsamples (e.g., does the mechanism differ for Sub-Saharan Africa vs. East Asia?)

15. Exercises

Your own worked example. Choose a different correlate from the dataset (e.g., investment or FH political rights) and replicate the OVB worked example from Section 8.3. Compute $\beta$, $\beta^{\ast}$, $\delta$, $\lambda$, and verify the identity $\beta - \beta^{\ast} = \delta \times \lambda$ for both 1985 and 2005. Did the gap close for your chosen variable? Was the primary driver the change in $\delta$ or $\lambda$?
Balanced panel sensitivity. Re-estimate the rolling beta-convergence trend (Section 4) using only countries that have GDP data from 1960 onward (a balanced panel of approximately 109 countries). Does the convergence trend look different when you exclude countries that enter the sample later? What does this tell you about the role of sample composition changes?
Alternative classification. The paper classifies variables as “Solow fundamentals” or “short-run correlates.” Move education (barrolee2060) from the Solow group to the short-run group and re-estimate the lambda stability scatters (Section 10). Does the Solow fitted line slope change substantially? What does this tell you about the robustness of the paper’s classification scheme?

Acknowledgements

AI tools (Claude Code) were used to make the contents of this post more accessible to students. Nevertheless, the content in this post may still have errors. Caution is needed when applying the contents of this post to true research projects.

Regional Inequality and the Kuznets Curve: Panel Fixed Effects in Python

Mon, 27 Apr 2026 00:00:00 +0000

1. Overview

Does economic growth reduce inequality within countries, or does it make some regions richer while others fall behind? In 1955, Simon Kuznets hypothesized an inverted-U relationship: inequality rises during early industrialization as workers move from farms to factories, then falls as the benefits of growth diffuse more broadly. This “Kuznets curve” became one of the most tested hypotheses in development economics — and one of the most debated.

Using satellite nighttime light data to measure regional inequality across 180 countries from 1992 to 2012, Lessmann and Seidel (2017) found something surprising: the relationship is not an inverted-U at all. It is N-shaped. Inequality rises at low income levels, falls through middle-income development, then rises again at the very highest income levels. The classic Kuznets curve misses this second upturn because most early studies lacked data from the richest nations.

In this tutorial we replicate their key findings using PyFixest for panel fixed effects estimation and Great Tables for publication-quality regression tables. We progress from naive pooled OLS — which mixes between-country and within-country variation — through two-way fixed effects (TWFE) that isolate how inequality changes as the same country develops over time. We then compute turning points of the fitted N-shaped polynomial and investigate what determinants — resources, trade, mobility, education, and ethnicity — drive regional inequality beyond the Kuznets curve.

The case study question is: Is the relationship between regional inequality and economic development inverted-U or N-shaped, and what factors beyond income drive regional disparities?

Learning objectives:

Understand why polynomial specifications are necessary for testing the Kuznets hypothesis
Implement pooled OLS and two-way fixed effects regressions using PyFixest
Compute and interpret turning points of a cubic polynomial in the context of development economics
Compare pooled OLS and TWFE estimates to assess the impact of omitted variable bias
Identify the key determinants of regional inequality using panel fixed effects with clustered standard errors

The following diagram outlines the analytical pipeline:

graph TD
A["<b>Data</b><br/>180 countries, 1992-2012"] --> B["<b>Visual EDA</b><br/>Scatter plots + polynomial fits"]
B --> C["<b>Pooled OLS</b><br/>Linear / Quadratic / Cubic"]
C --> D["<b>Why Fixed Effects?</b><br/>Country trajectories differ"]
D --> E["<b>Two-Way FE</b><br/>Country + Year FE"]
E --> F["<b>Turning Points</b><br/>3 development phases"]
E --> G["<b>Determinants</b><br/>What drives inequality?"]
G --> H["<b>Robustness</b><br/>Coefficient stability"]
style A fill:#6a9bcc,stroke:#141413,color:#fff
style B fill:#6a9bcc,stroke:#141413,color:#fff
style C fill:#d97757,stroke:#141413,color:#fff
style D fill:#d97757,stroke:#141413,color:#fff
style E fill:#00d4c8,stroke:#141413,color:#fff
style F fill:#00d4c8,stroke:#141413,color:#fff
style G fill:#1a3a8a,stroke:#141413,color:#fff
style H fill:#1a3a8a,stroke:#141413,color:#fff

The pipeline progresses from exploratory analysis (blue) through baseline estimation (orange) to the core fixed effects results (teal) and determinant analysis (dark blue). Each stage builds on the previous: the visual patterns motivate the polynomial specification, the spaghetti plot motivates fixed effects, and the robust N-shape motivates the search for determinants.

Key concepts at a glance

The post leans on a small vocabulary repeatedly. The rest of the tutorial assumes you can move between these terms quickly. Each concept below has three parts: a definition, an example, and an analogy. If a later section mentions “turning points” or “within R²” and the term feels slippery, this is the section to re-read.

1. Kuznets curve. The theoretical inverted-U relationship between economic development and income inequality, proposed by Simon Kuznets in 1955. Inequality should rise as countries industrialize, peak at intermediate income levels, then fall as services and welfare states emerge. The post tests whether modern panel data confirm or refute this pattern.

Example

Plotting gini against log_GDPpc for the 880 country-period observations, the unconditional pattern is closer to N-shaped than to a clean inverted-U. The Kuznets prediction is the null the post tests against.

Analogy

The textbook story. Like the Phillips curve in macroeconomics — a famous theoretical curve that the data sometimes confirm and sometimes contradict. Modern data is the audit on whether the curve still holds.

2. N-shaped relationship $\beta_1 + 2\beta_2 \ln Y + 3\beta_3 (\ln Y)^2 = 0$. A non-monotonic pattern with two turning points. Inequality rises with development, falls, then rises again at very high incomes. Captured by a cubic polynomial in log GDP. The N-shape is the post’s headline finding once fixed effects are imposed.

Example

The cubic TWFE estimates yield $\beta_1 = 0.293$, $\beta_2 = -0.032$, $\beta_3 = 0.001$. The derivative crosses zero twice, producing two turning points at \$2,287 and \$77,205. Below the first and above the second turning point, inequality is rising in income.

Analogy

A story with two acts. Act 1: inequality rises through industrialization. Act 2: inequality falls through welfare expansion. Modern data adds Act 3: at very high incomes, inequality rises again. The N captures all three acts.

3. Two-Way Fixed Effects (TWFE) $\alpha_i + \gamma_t$. A panel estimator that absorbs both country fixed effects $\alpha_i$ and time-period fixed effects $\gamma_t$. Identification comes from within-country deviations from country and period means. Removes time-invariant country features and global period shocks.

Example

This post’s headline cubic specification is TWFE. The estimator absorbs 180 country effects and 5 period effects, leaving only within-country, within-period variation to identify the polynomial coefficients.

Analogy

Wiping the negative twice. The first wipe removes country-specific stains (geography, institutions, culture). The second wipe removes period-specific glare (a global recession, a global commodity boom). What remains is the country’s change relative to its own typical trajectory.

4. Polynomial specification $\beta_1 \ln Y + \beta_2 (\ln Y)^2 + \beta_3 (\ln Y)^3$. Including powers of the regressor lets the relationship bend. Linear (just $\ln Y$) imposes monotonicity. Quadratic ($\ln Y$ and $(\ln Y)^2$) imposes a single inverted-U. Cubic adds a second turn. The post compares all three.

Example

The post fits linear, quadratic, and cubic versions of the TWFE model. The cubic is preferred on AIC and on coefficient significance: all three of $\beta_1, \beta_2, \beta_3$ are significant at $p < 0.001$, $p < 0.001$, and $p = 0.001$ respectively.

Analogy

Trying first-, second-, and third-order curves to fit a scatter. A line fits a straight road. A parabola fits a hill. A cubic fits a roller-coaster with a crest followed by a dip. You pick the simplest curve that the data actually demand.

5. Turning points $\partial \mathrm{Gini} / \partial \ln Y = 0$. Income levels where the polynomial derivative crosses zero. The slope of inequality with respect to income changes sign at each turning point. Computed by solving the quadratic $\beta_1 + 2\beta_2 \ln Y + 3\beta_3 (\ln Y)^2 = 0$.

Example

With the cubic estimates, the two turning points sit at $\ln Y = 7.735$ (≈ \$2,287) and $\ln Y = 11.254$ (≈ \$77,205). Below \$2,287 inequality rises with income; between \$2,287 and \$77,205 it falls; above \$77,205 it rises again.

Analogy

Where the rollercoaster changes direction. Two turning points means two crests-or-troughs in the ride. The N-shape says: rise, fall, rise. Each turning point is a moment where the cart momentarily stops climbing or falling.

6. Within R² vs overall R². Two ways to summarize the fit of a panel regression. Overall R² uses both within and between variation in $y$. Within R² uses only the variation that survives demeaning. The within R² is what the FE model actually explains.

Example

The cubic TWFE has overall R² = 0.975 — most of which comes from the unit and time fixed effects mechanically explaining variation in gini. The within R² is 0.142 — the polynomial in log_GDPpc explains 14% of the within-country, within-period variation. The within R² is the honest number.

Analogy

How well you predict the changes. A great forecast of the past does not mean you understand what makes the future different. Within R² is the forecast on actual changes. Overall R² flatters the model with the easy parts.

7. Omitted variable bias (OVB). Bias from leaving out a confounder that correlates with both $\ln Y$ and gini. Pooled OLS ignores fixed country traits that drive both. TWFE removes time-invariant country traits. The shift in the coefficients between POLS and TWFE (0.241 to 0.293 on the linear term, with much tighter standard errors) is an OVB diagnostic.

Example

Pooled OLS R² is 0.176, but that fit mixes within-country variation with confounded between-country variation. TWFE within R² is 0.142, and it reflects within-country variation only. The OVB hidden in pooled OLS is what motivates the FE specification.

Analogy

A stain on the camera lens. Pooled OLS thinks the dark spot in every photo is part of the subject. TWFE recognizes it is on the lens and wipes it off. What was attributed to “low GDP per capita” was actually country-specific shadow.

2. Setup and imports

Before running the analysis, install the required packages if needed:

pip install pyfixest great_tables

The following code imports PyFixest and standard data science libraries. pf.feols() is the main estimation function, accepting R-style formulas with a pipe | separator for fixed effects. Great Tables creates publication-quality tables rendered as PNG images.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pyfixest as pf
from great_tables import GT, md, style, loc
# Reproducibility
RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)
# Site color palette
STEEL_BLUE = "#6a9bcc"
WARM_ORANGE = "#d97757"
NEAR_BLACK = "#141413"
TEAL = "#00d4c8"
# Data URLs
URL_TAB03 = "https://github.com/quarcs-lab/data-open/raw/master/pGDP/simpleTAB03.dta"
URL_TAB04 = "https://github.com/quarcs-lab/data-open/raw/master/pGDP/simpleTAB04.dta"

Dark theme figure styling:

# Dark theme palette (consistent with site navbar/dark sections)
DARK_NAVY = "#0f1729"
GRID_LINE = "#1f2b5e"
LIGHT_TEXT = "#c8d0e0"
WHITE_TEXT = "#e8ecf2"
# Plot defaults — minimal, spine-free, dark background
plt.rcParams.update({
"figure.facecolor": DARK_NAVY,
"axes.facecolor": DARK_NAVY,
"axes.edgecolor": DARK_NAVY,
"axes.linewidth": 0,
"axes.labelcolor": LIGHT_TEXT,
"axes.titlecolor": WHITE_TEXT,
"axes.spines.top": False,
"axes.spines.right": False,
"axes.spines.left": False,
"axes.spines.bottom": False,
"axes.grid": True,
"grid.color": GRID_LINE,
"grid.linewidth": 0.6,
"grid.alpha": 0.8,
"xtick.color": LIGHT_TEXT,
"ytick.color": LIGHT_TEXT,
"xtick.major.size": 0,
"ytick.major.size": 0,
"text.color": WHITE_TEXT,
"font.size": 12,
"legend.frameon": False,
"legend.fontsize": 11,
"legend.labelcolor": LIGHT_TEXT,
"figure.edgecolor": DARK_NAVY,
"savefig.facecolor": DARK_NAVY,
"savefig.edgecolor": DARK_NAVY,
})

3. Data loading and panel structure

3.1 The Kuznets curve dataset

The dataset comes from Lessmann and Seidel (2017), who measured regional inequality within countries using satellite nighttime light data. The dependent variable is a population-weighted Gini coefficient — a number between 0 (perfect equality across regions) and 1 (all income concentrated in one region) — computed from subnational GDP estimates derived from nighttime light intensity. We load it directly from a Stata .dta file hosted on GitHub using pd.read_stata().

df3 = pd.read_stata(URL_TAB03)
print(f"Shape: {df3.shape}")
print(f"Columns: {list(df3.columns)}")
print(f"\nDescriptive statistics:")
print(df3.describe().round(4))
print(f"\nPanel structure:")
print(f" Countries: {df3['id'].nunique()}")
print(f" Time periods: {sorted(df3['year'].unique())}")
print(f"\nObservations per period:")
print(df3.groupby('year')['id'].count())

Shape: (880, 7)
Columns: ['id', 'year', 'country', 'gini', 'log_GDPpc', 'log_GDPpc2', 'log_GDPpc3']
Descriptive statistics:
             id      year      gini  log_GDPpc  log_GDPpc2  log_GDPpc3
count  880.0000  880.0000  880.0000   880.0000    880.0000    880.0000
mean    89.9932    3.0318    0.0641     8.7599     78.2732    712.3774
std     51.9770    1.4090    0.0332     1.2403     21.6226    288.5019
min      1.0000    1.0000    0.0019     5.2458     27.5184    144.3558
25%     45.0000    2.0000    0.0381     7.7617     60.2448    467.6052
50%     89.5000    3.0000    0.0605     8.8514     78.3474    693.4843
75%    134.0000    4.0000    0.0847     9.7595     95.2473    929.5637
max    180.0000    5.0000    0.1601    11.6716    136.2253   1589.9617
Panel structure:
Countries: 180
Time periods: [1.0, 2.0, 3.0, 4.0, 5.0]
Observations per period:
Period 1: 168 | Period 2: 175 | Period 3: 178 | Period 4: 179 | Period 5: 180

The dataset contains 880 country-period observations spanning 180 countries across 5 time periods (5-year averages from 1990–1994 through 2010–2013, covering data from 1992–2012). The panel is slightly unbalanced — meaning not every country is observed in every period — with 168 countries in the first period growing to 180 by the last. The mean regional Gini is 0.064 with substantial variation (SD = 0.033, range 0.002 to 0.160), indicating that some countries have highly equal regional income distributions while others show pronounced disparities. Log GDP per capita ranges from 5.25 (about \$190, the poorest nations) to 11.67 (about \$117,000, oil-rich Gulf states), capturing the full development spectrum. The polynomial terms (log_GDPpc2, log_GDPpc3) are pre-computed in the dataset to ensure consistency with the original Stata analysis. Let us now visualize the data to see if the Kuznets pattern is visible.
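As a quick sanity check on the description above, the following minimal sketch uses only the minimum and maximum of log_GDPpc printed in the summary table. It checks two things: that the pre-computed polynomial terms are plain powers of log_GDPpc, and that the dollar figures quoted in the text follow from exponentiating the logs. The variable names are illustrative and are not part of the original analysis.

```python
import numpy as np

# Values copied from the describe() output above (min and max rows)
log_gdp_min, log_gdp_max = 5.2458, 11.6716
sq_min_reported, cube_min_reported = 27.5184, 144.3558

# The polynomial terms should be simple powers of log GDP per capita
sq_min = log_gdp_min ** 2
cube_min = log_gdp_min ** 3
print(f"min squared: {sq_min:.4f} (reported {sq_min_reported})")
print(f"min cubed:   {cube_min:.4f} (reported {cube_min_reported})")

# Exponentiating converts log GDP per capita back to dollars
usd_min, usd_max = np.exp([log_gdp_min, log_gdp_max])
print(f"income range: ${usd_min:,.0f} to ${usd_max:,.0f}")
```

Small gaps in the last decimal are expected, because the table itself reports values rounded to four decimals.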

3.2 The determinants dataset

A second dataset adds 14 covariates capturing resources, trade, mobility, governance, and ethnicity — the factors that may drive regional inequality beyond the Kuznets curve.

df4 = pd.read_stata(URL_TAB04)
print(f"Shape: {df4.shape}")
print(f"Key variables: gini, lnGDPpc (+ squared/cubed), rents, land, trade,")
print(f" fdi, gasoline, areaXgasoline, aid, school, ethnic_gini")
print(f"\nNotable missing values:")
print(f" aid: {df4['aid'].notna().sum()} / 880 ({df4['aid'].isna().mean():.0%} missing)")
print(f" school: {df4['school'].notna().sum()} / 880 ({df4['school'].isna().mean():.0%} missing)")
print(f" ethnic_gini: {df4['ethnic_gini'].notna().sum()} / 880 "
f"({df4['ethnic_gini'].isna().mean():.0%} missing)")

Shape: (880, 21)
Key variables: gini, lnGDPpc (+ squared/cubed), rents, land, trade,
fdi, gasoline, areaXgasoline, aid, school, ethnic_gini
Notable missing values:
aid: 711 / 880 (19% missing)
school: 748 / 880 (15% missing)
ethnic_gini: 845 / 880 (4% missing)

The determinants dataset includes the same 880 observations but adds 14 covariates. Missing data is most pronounced for foreign aid (19% missing) and school enrollment (15% missing), which will reduce sample sizes in some determinant models. We return to this dataset after establishing the Kuznets curve with fixed effects.
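Because a regression drops any row with a missing covariate, the non-missing counts printed above already bound the usable sample. The sketch below works out those bounds for a hypothetical model that includes aid, school, and ethnic_gini together, assuming every other variable in the model is fully observed. It is plain arithmetic on the reported counts, not a re-analysis of the data, and the actual complete-case count depends on how the gaps overlap.

```python
# Non-missing counts reported above for the 880-row determinants dataset
n_total = 880
non_missing = {"aid": 711, "school": 748, "ethnic_gini": 845}

# Share of rows missing for each covariate
for name, n_obs in non_missing.items():
    print(f"{name:<12} missing share: {(n_total - n_obs) / n_total:.1%}")

# Best case: the gaps overlap completely, so the sparsest variable sets the limit
upper_bound = min(non_missing.values())

# Worst case: the gaps never overlap, so every missing cell removes a different row
total_missing = sum(n_total - n_obs for n_obs in non_missing.values())
lower_bound = max(n_total - total_missing, 0)

print(f"complete cases lie between {lower_bound} and {upper_bound}")
```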

4. Visual exploration: Is there a Kuznets curve?

4.1 Pooled scatter with polynomial fits

Before estimating any regression, it helps to see the raw data. We plot every country-period observation of regional inequality against log GDP per capita, overlaying three polynomial fit lines: linear (dashed gray), quadratic (dashed teal), and cubic (solid orange). If the classic Kuznets inverted-U holds, the quadratic should capture the pattern. If the relationship bends twice — first up, then down, then up again — we need the cubic.

Think of a cubic polynomial as fitting a roller coaster track through the data: it can climb, descend, and rise again, capturing patterns that a straight line or simple curve would miss entirely.

fig, ax = plt.subplots(figsize=(10, 6))
x = df3["log_GDPpc"].values
y = df3["gini"].values
ax.scatter(x, y, alpha=0.35, s=18, color=STEEL_BLUE, edgecolors=DARK_NAVY)
# Fit and overlay three polynomial curves
x_grid = np.linspace(x.min(), x.max(), 200)
for deg, color, ls, lw, label in [
    (1, LIGHT_TEXT, "--", 1.5, "Linear"),
    (2, TEAL, "--", 1.8, "Quadratic (inverted-U)"),
    (3, WARM_ORANGE, "-", 2.5, "Cubic (N-shape)"),
]:
    coeffs = np.polyfit(x, y, deg)
    ax.plot(x_grid, np.polyval(coeffs, x_grid), color=color, ls=ls, lw=lw, label=label)
ax.set_xlabel("Log GDP per capita (PPP, constant US$)")
ax.set_ylabel("Regional Inequality (Population-weighted Gini)")
ax.set_title("Regional Inequality vs National Development\n"
"180 Countries, 1992-2012 (pooled)")
ax.legend()
plt.savefig("kuznets_scatter_pooled.png", dpi=300, bbox_inches="tight")

The scatter reveals a clear pattern: regional inequality is highest among the poorest and richest nations, with lower inequality in the middle-income range. The linear fit (dashed gray) captures a downward trend but misses the curvature entirely. The quadratic fit (dashed teal) bends once but does not capture the upturn at high incomes. The cubic fit (solid orange) traces an N-shape — rising, falling, then rising again — that most closely follows the data cloud. This visual evidence motivates testing a cubic polynomial specification formally. But is this pattern stable across time periods?

4.2 Stability across periods

To check whether the N-shape is a persistent feature of the data or an artifact of a single time window, we plot the same scatter separately for each of the five periods:

periods = sorted(df3["year"].unique())
fig, axes = plt.subplots(1, len(periods), figsize=(20, 5), sharey=True)
# Map numeric periods to actual year ranges (Lessmann & Seidel 2017)
period_labels = {1: "1990--1994", 2: "1995--1999", 3: "2000--2004",
4: "2005--2009", 5: "2010--2013"}
for ax, period in zip(axes, periods):
    sub = df3[df3["year"] == period]
    ax.scatter(sub["log_GDPpc"], sub["gini"], alpha=0.4, s=20, color=STEEL_BLUE)
    cp = np.polyfit(sub["log_GDPpc"], sub["gini"], 3)
    xg = np.linspace(sub["log_GDPpc"].min(), sub["log_GDPpc"].max(), 100)
    ax.plot(xg, np.polyval(cp, xg), color=WARM_ORANGE, lw=2)
    ax.set_title(period_labels.get(int(period), f"Period {int(period)}"))
    ax.set_xlabel("Log GDP pc")
axes[0].set_ylabel("Regional Gini")
plt.savefig("kuznets_scatter_by_period.png", dpi=300, bbox_inches="tight")

The N-shaped pattern appears in all five periods from 1990–1994 through 2010–2013, ruling out the possibility that the result is driven by a single unusual time window. The cubic fit line bends in the same direction across every panel, suggesting a stable structural relationship. Now let us formalize this with regression analysis, starting with the simplest pooled OLS specification.

5. Pooled OLS baseline: Linear, quadratic, and cubic

We begin by estimating three pooled OLS regressions of increasing polynomial complexity. The pooled specification treats every country-period observation as an independent draw, ignoring the panel structure entirely. This serves as a baseline that we will improve upon with fixed effects.

The cubic polynomial specification is:

$$\text{Gini}_{it} = \beta_0 + \beta_1 \ln(\text{GDP}_{it}) + \beta_2 [\ln(\text{GDP}_{it})]^2 + \beta_3 [\ln(\text{GDP}_{it})]^3 + \epsilon_{it}$$

In words, this equation models regional inequality as a polynomial function of log GDP per capita. The coefficient $\beta_1$ captures the linear association. The term $\beta_2$ allows the relationship to bend once (inverted-U if negative), and $\beta_3$ allows it to bend a second time (N-shape if positive). In the code, these correspond to log_GDPpc, log_GDPpc2, and log_GDPpc3.

We use pf.feols() to estimate all three models with clustered standard errors — standard errors that account for the fact that observations from the same country are not independent. The vcov={"CRV1": "id"} argument clusters at the country level.
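To make the clustering idea concrete, here is a minimal NumPy sketch of a cluster-robust sandwich variance on synthetic data. The function name `cluster_robust_se`, the simulated panel, and the small-sample factor G/(G-1) × (N-1)/(N-K) are assumptions chosen for illustration. PyFixest's own CRV1 implementation may apply its adjustment differently, so treat this as a sketch of the idea and not as a reproduction of its internals.

```python
import numpy as np

def cluster_robust_se(X, y, groups):
    """OLS coefficients with a CR1-style cluster-robust sandwich (illustrative)."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    labels = np.unique(groups)
    meat = np.zeros((k, k))
    for g in labels:
        rows = groups == g
        score = X[rows].T @ resid[rows]      # sum of x_i * u_i within the cluster
        meat += np.outer(score, score)
    n_clusters = len(labels)
    adj = n_clusters / (n_clusters - 1) * (n - 1) / (n - k)   # assumed small-sample factor
    cov = adj * XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Synthetic panel: 40 hypothetical countries, 5 periods, errors share a country shock
rng = np.random.default_rng(0)
groups = np.repeat(np.arange(40), 5)
x = rng.normal(size=groups.size) + rng.normal(size=40)[groups]
u = rng.normal(size=groups.size) + 2.0 * rng.normal(size=40)[groups]
y = 1.0 + 0.5 * x + u
X = np.column_stack([np.ones_like(x), x])

beta, se_cluster = cluster_robust_se(X, y, groups)
# Treating every row as its own cluster collapses the formula to the HC1 robust estimator
_, se_hc1 = cluster_robust_se(X, y, np.arange(groups.size))
print("coefficients:", beta.round(3))
print("clustered SE:", se_cluster.round(3))
print("HC1 SE:      ", se_hc1.round(3))
```

When errors are correlated within a country, as they are by construction here, the clustered standard errors are typically larger than the ones that treat each row as independent.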

# Pooled OLS: linear, quadratic, cubic
ols_linear = pf.feols("gini ~ log_GDPpc", data=df3, vcov={"CRV1": "id"})
ols_quad = pf.feols("gini ~ log_GDPpc + log_GDPpc2", data=df3, vcov={"CRV1": "id"})
ols_cubic = pf.feols("gini ~ log_GDPpc + log_GDPpc2 + log_GDPpc3", data=df3,
vcov={"CRV1": "id"})
# Compare coefficients across specifications
print("Pooled OLS Coefficient Comparison:")
print(f"{'Variable':<14} {'Linear':>10} {'Quadratic':>12} {'Cubic':>10}")
print("-" * 48)
for var in ["log_GDPpc", "log_GDPpc2", "log_GDPpc3"]:
    vals = []
    for m in [ols_linear, ols_quad, ols_cubic]:
        vals.append(f"{m.coef()[var]:.4f}" if var in m.coef().index else "---")
    print(f"{var:<14} {vals[0]:>10} {vals[1]:>12} {vals[2]:>10}")

Pooled OLS Coefficient Comparison:
Variable           Linear    Quadratic      Cubic
------------------------------------------------
log_GDPpc         -0.0108       0.0148     0.2405
log_GDPpc2            ---      -0.0015    -0.0279
log_GDPpc3            ---          ---     0.0010
R-squared:          0.164        0.170      0.176

The linear model shows a significant negative association between development and inequality (coefficient -0.011, p < 0.001), but explains only 16.4% of the variation. Adding the quadratic term barely improves fit (R-squared rises to 0.170) and neither term is individually significant, suggesting the simple inverted-U does not hold in the pooled data. The cubic specification reveals the N-shaped pattern (coefficients: 0.241, -0.028, 0.001) with all terms marginally significant (p-values around 0.07–0.09), but these are pooled estimates that confound between-country and within-country variation. Even the cubic leaves more than 80% of the variation in regional inequality unexplained (R-squared of 0.176). Why does pooled OLS produce such noisy estimates? The answer lies in country heterogeneity.

6. Why fixed effects? The omitted variable problem

Pooled OLS treats all country-period observations as independent draws. But countries differ in geography, institutions, colonial history, and culture — factors that affect both inequality and development. If these unobserved factors correlate with GDP per capita, the pooled OLS coefficients are biased. This is called omitted variable bias — the regression attributes variation to GDP that is really driven by unobserved country characteristics.

Think of it this way: if you want to measure whether nutrition affects height, you cannot just compare children from different families — taller families tend to eat differently from shorter ones. You need to look at how height changes within the same family when nutrition changes. Fixed effects does exactly this for countries: it adds a separate intercept for each country, effectively controlling for all time-invariant country characteristics.
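The "separate intercept for each country" idea can be sketched on synthetic data. In the simulation below, a hypothetical time-invariant country trait raises both the regressor and the outcome, so pooled OLS gets the slope badly wrong. Subtracting each country's own mean (the within transformation) and adding one dummy per country give the same slope. All names and parameter values are illustrative assumptions, not taken from the Lessmann and Seidel data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_countries, n_periods = 50, 5
country = np.repeat(np.arange(n_countries), n_periods)

# Hypothetical time-invariant country trait that raises both x and y
alpha = rng.normal(0.0, 2.0, n_countries)
x = alpha[country] + rng.normal(size=country.size)
true_slope = -0.5
y = 3.0 * alpha[country] + true_slope * x + rng.normal(0.0, 0.5, country.size)

# Pooled OLS ignores the country trait, so the slope absorbs its influence
slope_pooled = np.polyfit(x, y, 1)[0]

def demean_by_group(values, groups):
    """Subtract each group's mean from its own observations."""
    group_means = np.bincount(groups, weights=values) / np.bincount(groups)
    return values - group_means[groups]

# Within estimator: regress demeaned y on demeaned x
x_w = demean_by_group(x, country)
y_w = demean_by_group(y, country)
slope_within = (x_w @ y_w) / (x_w @ x_w)

# Dummy-variable estimator: one intercept per country
dummies = (country[:, None] == np.arange(n_countries)).astype(float)
slope_lsdv = np.linalg.lstsq(np.column_stack([x, dummies]), y, rcond=None)[0][0]

print(f"true slope:   {true_slope:.3f}")
print(f"pooled OLS:   {slope_pooled:.3f}")
print(f"within:       {slope_within:.3f}")
print(f"dummies:      {slope_lsdv:.3f}")
```

The within and dummy-variable slopes agree up to floating-point error, which is why software can absorb fixed effects by demeaning instead of building hundreds of dummy columns.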

The spaghetti plot below makes this concrete. Each line traces a single country’s trajectory over time, while the dashed curve shows the pooled cross-sectional pattern.

# Select 20 countries spread across the GDP distribution
country_obs = df3.groupby("id").agg(
n_periods=("year", "count"), mean_gdp=("log_GDPpc", "mean")
).reset_index()
country_obs = country_obs[country_obs["n_periods"] >= 3].sort_values("mean_gdp")
idx = np.linspace(0, len(country_obs) - 1, 20, dtype=int)
selected_ids = country_obs.iloc[idx]["id"].values
fig, ax = plt.subplots(figsize=(10, 6))
for cid in selected_ids:
    sub = df3[df3["id"] == cid].sort_values("log_GDPpc")
    ax.plot(sub["log_GDPpc"], sub["gini"], color=LIGHT_TEXT, alpha=0.25,
            lw=1.2, marker="o", ms=3)
# Highlight 6 diverse countries
highlight_ids = country_obs.iloc[
    np.linspace(0, len(country_obs) - 1, 6, dtype=int)
]["id"].values
colors = [WARM_ORANGE, TEAL, STEEL_BLUE, "#e8956a", "#8ec8e8", "#66e8df"]
for i, cid in enumerate(highlight_ids):
    sub = df3[df3["id"] == cid].sort_values("log_GDPpc")
    ax.plot(sub["log_GDPpc"], sub["gini"], color=colors[i], lw=2.5,
            marker="o", ms=5, label=sub["country"].iloc[0])
ax.set_xlabel("Log GDP per capita")
ax.set_ylabel("Regional Gini")
ax.set_title("Individual Country Trajectories vs Pooled Pattern\n"
"Each line = one country over time")
ax.legend(ncol=2)
plt.savefig("kuznets_spaghetti.png", dpi=300, bbox_inches="tight")

The spaghetti plot reveals the key insight: individual countries follow their own trajectories that differ substantially from the cross-sectional pattern. Liberia (far left) has high inequality at low GDP, while Qatar (far right) has high inequality at high GDP — but within each country, the trajectory over time looks nothing like the pooled cubic fit. A country at log GDP = 8 may have very different inequality than another at the same GDP level because of country-specific factors like geography, ethnic composition, and colonial history. Fixed effects remove these country-specific levels and focus only on how inequality changes within each country as it develops. Let us now estimate the fixed effects models.

7. Two-way fixed effects: Replicating Table 3

Two-way fixed effects (TWFE) adds two sets of dummy variables to the regression: country fixed effects ($\alpha_i$) absorb all time-invariant country characteristics, and year fixed effects ($\gamma_t$) absorb common global shocks like financial crises or commodity price swings. The model becomes:

$$\text{Gini}_{it} = \beta_1 \ln(\text{GDP}_{it}) + \beta_2 [\ln(\text{GDP}_{it})]^2 + \beta_3 [\ln(\text{GDP}_{it})]^3 + \alpha_i + \gamma_t + \epsilon_{it}$$

In words, this equation isolates the within-country, within-time-period relationship between development and inequality. The country fixed effects $\alpha_i$ ensure we compare each country to itself over time, not to other countries. The year fixed effects $\gamma_t$ ensure we do not conflate the Kuznets relationship with global trends. In PyFixest, we specify fixed effects after a pipe | in the formula: gini ~ log_GDPpc | id + year means regress gini on log_GDPpc, absorbing id (country) and year fixed effects.
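For intuition about what "absorbing" two sets of fixed effects means, the sketch below applies the textbook two-way within transformation to a synthetic balanced panel and checks it against an explicit dummy-variable regression. The simulated data and every name in it are assumptions for illustration. The closed-form demeaning used here is only valid for balanced panels; the panel in this tutorial is slightly unbalanced, which is one reason to rely on a dedicated estimator such as pf.feols().

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 40, 5                                  # hypothetical countries and periods
i = np.repeat(np.arange(N), T)                # country index
t = np.tile(np.arange(T), N)                  # period index

alpha = rng.normal(size=N)                    # country effects
gamma = rng.normal(size=T)                    # period effects
x = 0.8 * alpha[i] + 0.5 * gamma[t] + rng.normal(size=N * T)
y = alpha[i] + gamma[t] + 0.3 * x + rng.normal(scale=0.2, size=N * T)

def two_way_demean(v):
    """v_it - mean_i - mean_t + grand mean (balanced panels only)."""
    m = v.reshape(N, T)                       # rows = countries, columns = periods
    return (m - m.mean(axis=1, keepdims=True)
              - m.mean(axis=0, keepdims=True) + m.mean()).ravel()

x_d, y_d = two_way_demean(x), two_way_demean(y)
beta_within = (x_d @ y_d) / (x_d @ x_d)

# Same model with explicit dummies (one period dropped to avoid collinearity)
D_country = (i[:, None] == np.arange(N)).astype(float)
D_period = (t[:, None] == np.arange(1, T)).astype(float)
design = np.column_stack([x, D_country, D_period])
beta_lsdv = np.linalg.lstsq(design, y, rcond=None)[0][0]

print(f"two-way demeaned slope: {beta_within:.4f}")
print(f"dummy-variable slope:   {beta_lsdv:.4f}")
```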

# Three TWFE specifications: linear, quadratic, cubic
fe_linear = pf.feols("gini ~ log_GDPpc | id + year", data=df3, vcov={"CRV1": "id"})
fe_quad = pf.feols("gini ~ log_GDPpc + log_GDPpc2 | id + year", data=df3,
vcov={"CRV1": "id"})
fe_cubic = pf.feols("gini ~ log_GDPpc + log_GDPpc2 + log_GDPpc3 | id + year",
data=df3, vcov={"CRV1": "id"})
print("TWFE Cubic Model (Model 3):")
print(f" log_GDPpc: {fe_cubic.coef()['log_GDPpc']:.3f} "
f"(SE {fe_cubic.se()['log_GDPpc']:.3f}, p < 0.001) ***")
print(f" log_GDPpc2: {fe_cubic.coef()['log_GDPpc2']:.3f} "
f"(SE {fe_cubic.se()['log_GDPpc2']:.3f}, p < 0.001) ***")
print(f" log_GDPpc3: {fe_cubic.coef()['log_GDPpc3']:.3f} "
f"(SE {fe_cubic.se()['log_GDPpc3']:.3f}, p = 0.001) ***")
print(f" R-squared: {fe_cubic._r2:.3f} | R-squared Within: {fe_cubic._r2_within:.3f}")
print(f" Observations: {fe_cubic._N}")

TWFE Cubic Model (Model 3):
log_GDPpc: 0.293 (SE 0.078, p < 0.001) ***
log_GDPpc2: -0.032 (SE 0.009, p < 0.001) ***
log_GDPpc3: 0.001 (SE 0.000, p = 0.001) ***
R-squared: 0.975 | R-squared Within: 0.142
Observations: 879

Adding country and year fixed effects transforms the results dramatically. All three polynomial terms become highly significant (p < 0.001 for the linear and quadratic terms, p = 0.001 for the cubic term), confirming the N-shaped relationship within countries over time. The overall R-squared of 0.975 indicates that the fixed effects absorb the vast majority of the variation: 97.5% of total variation is explained once we account for which country and which period we are observing. The within-R-squared of 0.142 tells us that the cubic polynomial explains about 14.2% of the within-country variation in inequality, which is substantial given the short time dimension (5 periods). Compared to pooled OLS, the TWFE coefficients are slightly larger in magnitude (0.293 vs 0.241 for the linear term) and, crucially, the significance improves from marginal (p ~ 0.07) to highly significant (p ≤ 0.001), consistent with fixed effects removing the omitted variable bias that contaminates the pooled estimates.
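The gap between the two R-squared figures can be reproduced in a small simulation. The sketch below assumes a one-way (country only) fixed effects model on synthetic data with large country effects. Both statistics use the same residuals and differ only in the denominator: total variation in the outcome for the overall figure, and demeaned variation for the within figure. The names and parameter values are illustrative, and the definitions follow the usual textbook convention, which may differ in detail from what a given library reports.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 60, 5                                   # hypothetical countries and periods
i = np.repeat(np.arange(N), T)

alpha = rng.normal(0.0, 3.0, N)                # large country effects
x = rng.normal(size=N * T)
y = alpha[i] + 0.4 * x + rng.normal(size=N * T)

def demean(v):
    """Subtract each country's own mean (balanced panel)."""
    m = v.reshape(N, T)
    return (m - m.mean(axis=1, keepdims=True)).ravel()

x_d, y_d = demean(x), demean(y)
beta = (x_d @ y_d) / (x_d @ x_d)
resid = y_d - beta * x_d
ssr = resid @ resid

# Same residuals, different benchmarks
r2_within = 1 - ssr / (y_d @ y_d)                      # variation left after demeaning
r2_overall = 1 - ssr / ((y - y.mean()) @ (y - y.mean()))  # all variation in y

print(f"within R-squared:  {r2_within:.3f}")
print(f"overall R-squared: {r2_overall:.3f}")
```

The overall figure is high mostly because the country intercepts explain the level differences between countries, not because the regressor does.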

The Great Tables regression table below summarizes all three TWFE specifications in publication-quality format:

7.1 The linear TWFE model is uninformative

A key pedagogical finding emerges when we compare the three TWFE specifications side by side:

print("Linear TWFE:")
print(f" log_GDPpc: {fe_linear.coef()['log_GDPpc']:.3f} "
f"(SE {fe_linear.se()['log_GDPpc']:.3f}, "
f"p = {fe_linear.pvalue()['log_GDPpc']:.3f})")
print(f" R-squared Within: {fe_linear._r2_within:.3f}")

Linear TWFE:
log_GDPpc: -0.003 (SE 0.003, p = 0.265)
R-squared Within: 0.009

The linear TWFE model yields a coefficient of -0.003 that is statistically insignificant (p = 0.265) with a within-R-squared of only 0.009. A researcher who only estimated the linear specification would conclude that development has no relationship with inequality within countries — a misleading result. The true relationship is nonlinear: inequality rises with early development and falls later, so the linear approximation averages these opposing effects to roughly zero. This demonstrates why polynomial specifications are essential when testing the Kuznets hypothesis. Now let us compute where exactly the N-shaped curve bends.
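The averaging argument is easy to verify on a stylized example. The sketch below fits a line and a parabola to a perfectly symmetric inverted-U with no noise. The data are synthetic and purely illustrative: the linear slope comes out at essentially zero even though the outcome depends strongly on x.

```python
import numpy as np

# Symmetric inverted-U: rises on the left half, falls on the right half
x = np.linspace(-1.0, 1.0, 201)
y = 1.0 - x ** 2

# A straight line averages the rising and falling halves
slope_linear = np.polyfit(x, y, 1)[0]

# A quadratic recovers the curvature exactly
quad_coeffs = np.polyfit(x, y, 2)

print(f"linear slope:         {slope_linear:.6f}")
print(f"quadratic curvature:  {quad_coeffs[0]:.6f}")
```

A near-zero linear coefficient therefore says little about whether a relationship exists. It only says that any rising and falling segments roughly cancel over the observed range.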

8. The N-shaped curve: Computing turning points

The cubic TWFE model implies that inequality first rises, then falls, then rises again with development. To find where the curve changes direction, we take the first derivative of the polynomial and set it to zero:

$$\frac{\partial \text{Gini}}{\partial \ln(\text{GDP})} = \beta_1 + 2\beta_2 \ln(\text{GDP}) + 3\beta_3 [\ln(\text{GDP})]^2 = 0$$

In words, this equation asks: at what income level does the slope of the inequality-development relationship switch sign? Solving this quadratic equation yields two turning points — the first where inequality peaks and the second where it reaches a trough before rising again.

# Extract cubic TWFE coefficients
b1 = fe_cubic.coef()["log_GDPpc"] # 0.2931
b2 = fe_cubic.coef()["log_GDPpc2"] # -0.0320
b3 = fe_cubic.coef()["log_GDPpc3"] # 0.0011
# Solve: 3*b3*x^2 + 2*b2*x + b1 = 0
roots = np.roots([3 * b3, 2 * b2, b1])
real_roots = np.sort(roots[np.isreal(roots)].real)
turning_usd = np.exp(real_roots)
print(f"Cubic TWFE coefficients: b1 = {b1:.6f}, b2 = {b2:.6f}, b3 = {b3:.6f}")
print(f"Turning points (log scale): [{real_roots[0]:.3f}, {real_roots[1]:.3f}]")
print(f"Turning points (USD PPP): [${turning_usd[0]:,.0f}, ${turning_usd[1]:,.0f}]")

Cubic TWFE coefficients: b1 = 0.293112, b2 = -0.031969, b3 = 0.001122
Turning points (log scale): [7.735, 11.254]
Turning points (USD PPP): [$2,287, $77,205]

The two turning points define three development phases. The first turning point at \$2,287 GDP per capita marks where regional inequality peaks: below this threshold — very poor countries like Liberia and the DRC — development initially concentrates income in a leading region, widening the gap. Between \$2,287 and \$77,205 — the vast majority of countries, from Kenya through most of Europe — further development is associated with falling regional inequality as lagging regions catch up. The second turning point at \$77,205 suggests that the richest nations (essentially Qatar, Luxembourg, and similar outliers) may see inequality rise again as knowledge-economy agglomeration re-concentrates activity. These values closely replicate the paper’s reported thresholds of approximately \$2,288 and \$77,128, with minor differences due to rounding in the original Stata analysis.
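Which turning point is the peak and which is the trough follows from the sign of the second derivative, $2\beta_2 + 6\beta_3 \ln(\text{GDP})$. The sketch below checks this using the coefficients exactly as printed above. Because those printed values are rounded to six decimals and the roots are sensitive to small changes in the coefficients, the turning points it prints may differ slightly from the ones computed from the full-precision estimates.

```python
import numpy as np

# Coefficients as printed above (rounded to six decimals)
b1, b2, b3 = 0.293112, -0.031969, 0.001122

# First-order condition: 3*b3*x^2 + 2*b2*x + b1 = 0
# (the discriminant is positive here, so both roots are real)
roots = np.sort(np.roots([3 * b3, 2 * b2, b1]).real)

for x_star in roots:
    curvature = 2 * b2 + 6 * b3 * x_star     # second derivative at the turning point
    kind = "local maximum (peak)" if curvature < 0 else "local minimum (trough)"
    print(f"ln(GDP) = {x_star:.3f}, about ${np.exp(x_star):,.0f}: {kind}")
```

A negative second derivative at the first root and a positive one at the second confirm the rise, fall, rise ordering of the three phases.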

The figure below visualizes the fitted N-shaped polynomial with shaded regions marking each development phase:

The three development phases are visually clear: rising inequality for the poorest nations (left orange region), convergence through middle income (blue region), and a secondary upturn at very high income (right orange region). The dual x-axis lets the reader map from log GDP — the scale used in the regression — to familiar dollar amounts. Next, let us compare the pooled OLS and TWFE estimates side by side.

9. Pooled OLS vs TWFE: Correcting for omitted variable bias

How much does controlling for country heterogeneity change the estimates? The table below compares the cubic polynomial coefficients from pooled OLS and TWFE:

print("Pooled OLS vs TWFE (cubic):")
print(f"{'Variable':<14} {'Pooled OLS':>12} {'TWFE':>12}")
print("-" * 40)
for var in ["log_GDPpc", "log_GDPpc2", "log_GDPpc3"]:
    print(f"{var:<14} {ols_cubic.coef()[var]:>12.4f} {fe_cubic.coef()[var]:>12.4f}")

Pooled OLS vs TWFE (cubic):
Variable Pooled OLS TWFE
----------------------------------------
log_GDPpc 0.2405 0.2931
log_GDPpc2 -0.0279 -0.0320
log_GDPpc3 0.0010 0.0011

TWFE coefficients are slightly larger in magnitude than their pooled OLS counterparts (0.293 vs 0.241 for the linear term), and the confidence intervals are substantially tighter. The pooled OLS estimates are only marginally significant (p ~ 0.07), while the TWFE estimates are all significant at the 0.1% level. This demonstrates that fixed effects both correct bias (by removing confounding from time-invariant country characteristics) and improve precision (by reducing residual variance). The N-shape is not a cross-sectional artifact — it is a robust within-country phenomenon. Having established the Kuznets curve, we now turn to a broader question: what factors beyond income drive regional inequality?
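The bias-correction argument can be illustrated on simulated data. The sketch below does not use the tutorial's dataset: it builds a hypothetical balanced panel in which an unobserved country effect is correlated with the regressor, then compares the pooled OLS slope with the slope obtained after two-way demeaning (the within transformation that country and year fixed effects apply in a balanced panel). All names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_countries, n_years, true_slope = 200, 5, 0.5

# Unobserved, time-invariant country effect that also shifts the regressor
alpha = rng.normal(size=(n_countries, 1))
year_shock = rng.normal(size=(1, n_years))
x = alpha + rng.normal(size=(n_countries, n_years))
y = true_slope * x + 2.0 * alpha + year_shock + 0.5 * rng.normal(size=x.shape)


def ols_slope(x, y):
    """Slope of a simple regression of y on x (with intercept)."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())


def two_way_demean(a):
    """Remove country means and year means from a balanced panel array."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()


pooled = ols_slope(x, y)
within = ols_slope(two_way_demean(x), two_way_demean(y))
print(f"true slope = {true_slope}, pooled OLS = {pooled:.3f}, two-way within = {within:.3f}")
```

In this simulation the pooled slope absorbs the country effect and lands far from the true value, while the demeaned regression recovers it. The direction and size of the bias in real data depend on how the omitted country characteristics correlate with income.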

10. Determinants of regional inequality

10.1 Exploring correlations

The determinants dataset adds nine variables capturing different channels through which factors might affect regional inequality: resource wealth, international trade, factor mobility, human capital, and ethnic composition. Before running regressions, we examine the correlation structure:

det_vars = ["gini", "lnGDPpc", "rents", "land", "trade", "fdi",
"gasoline", "aid", "school", "ethnic_gini"]
corr = df4[det_vars].corr()
fig, ax = plt.subplots(figsize=(10, 8))
im = ax.imshow(corr.values, cmap="RdBu_r", vmin=-1, vmax=1, aspect="auto")
for i in range(len(det_vars)):
    for j in range(len(det_vars)):
        ax.text(j, i, f"{corr.values[i, j]:.2f}", ha="center", va="center",
                fontsize=8)
ax.set_title("Correlation Matrix: Determinants of Regional Inequality")
plt.savefig("kuznets_correlation_heatmap.png", dpi=300, bbox_inches="tight")

The ethnic Gini has the strongest positive correlation with regional inequality (r = 0.49), suggesting that countries with large income gaps between ethnic groups also tend to have large income gaps between regions. School enrollment has the strongest negative correlation (r = -0.41), consistent with education promoting regional convergence. Trade openness and GDP per capita are positively correlated (r = 0.38), which means pooled regressions of inequality on trade may partly reflect development effects. The fixed effects regressions below address this by controlling for the Kuznets polynomial and country heterogeneity simultaneously.

10.2 Determinant regressions: Replicating Table 4

We estimate five TWFE models, each adding a different group of determinants while keeping the cubic polynomial and country/year fixed effects. This replicates Table 4 of Lessmann and Seidel (2017):

det1 = pf.feols("gini ~ lnGDPpc + lnGDPpc2 + lnGDPpc3 + rents + land | id + year",
data=df4, vcov={"CRV1": "id"}) # Resources
det2 = pf.feols("gini ~ lnGDPpc + lnGDPpc2 + lnGDPpc3 + trade + fdi | id + year",
data=df4, vcov={"CRV1": "id"}) # Trade
det3 = pf.feols("gini ~ lnGDPpc + lnGDPpc2 + lnGDPpc3 + gasoline + areaXgasoline "
"| id + year", data=df4, vcov={"CRV1": "id"}) # Mobility
det4 = pf.feols("gini ~ lnGDPpc + lnGDPpc2 + lnGDPpc3 + aid + school | id + year",
data=df4, vcov={"CRV1": "id"}) # Aid/Education
det5 = pf.feols("gini ~ lnGDPpc + lnGDPpc2 + lnGDPpc3 + ethnic_gini | id + year",
data=df4, vcov={"CRV1": "id"}) # Ethnicity

Seven of nine determinants are statistically significant at the 10% level. Ethnic income inequality is the single strongest driver (coefficient 0.071, p < 0.001): a one-unit increase in the ethnic Gini is associated with a 0.071 increase in the regional Gini, holding the Kuznets polynomial terms fixed. This is economically large given that the mean regional Gini is only 0.064. Arable land has the second-largest effect in absolute value but with the opposite sign (-0.053, p < 0.001), indicating that agricultural economies tend toward more equal regional development, likely because farming activity is geographically dispersed.

Resource rents increase inequality (0.018, p = 0.008), consistent with the “resource curse” — the pattern where natural resource wealth concentrates extractive income in specific regions. Trade openness modestly increases inequality (0.005, p = 0.007), suggesting that internationally connected regions pull ahead. Foreign aid increases inequality (0.015, p = 0.028), possibly because aid flows concentrate in capital cities. School enrollment reduces inequality (-0.014, p = 0.053), consistent with human capital diffusion promoting convergence.

FDI and gasoline price alone are not significant, though the interaction of gasoline price with country area is (0.006, p = 0.049), indicating that transport costs matter more in geographically large countries. But do these additional controls change the Kuznets curve itself?

10.3 Coefficient stability across specifications

A critical robustness check is whether the N-shaped Kuznets curve survives the addition of controls. If the polynomial coefficients change dramatically when we add determinants, the N-shape may be spurious — driven by omitted variables that correlate with both GDP and inequality:

specs = ["Baseline (Table 3)", "Resources", "Trade",
"Mobility", "Aid/Educ.", "Ethnicity"]
print(f"{'Specification':<20} {'ln(GDP)':>10} {'ln(GDP)^2':>12} {'ln(GDP)^3':>12}")
print("-" * 56)
for name, coefs in zip(specs, [
    (0.2931, -0.0320, 0.0011), (0.3498, -0.0380, 0.0013),
    (0.2054, -0.0222, 0.0008), (0.1711, -0.0186, 0.0007),
    (0.2264, -0.0232, 0.0007), (0.1492, -0.0153, 0.0005),
]):
    print(f"{name:<20} {coefs[0]:>10.4f} {coefs[1]:>12.4f} {coefs[2]:>12.4f}")

Specification ln(GDP) ln(GDP)^2 ln(GDP)^3
--------------------------------------------------------
Baseline (Table 3) 0.2931 -0.0320 0.0011
Resources 0.3498 -0.0380 0.0013
Trade 0.2054 -0.0222 0.0008
Mobility 0.1711 -0.0186 0.0007
Aid/Educ. 0.2264 -0.0232 0.0007
Ethnicity 0.1492 -0.0153 0.0005

The sign pattern (+, -, +) for the three polynomial terms is preserved across all six specifications, confirming the robustness of the N-shaped Kuznets curve. However, the magnitudes attenuate noticeably when ethnic inequality is included: the linear term drops from 0.293 to 0.149, and the cubic term halves from 0.0011 to 0.0005. This suggests that part of what appears as a “development effect” on regional inequality is actually driven by ethnic income disparities that correlate with development levels. The Resources specification actually strengthens the polynomial coefficients (0.350, -0.038, 0.001), indicating that controlling for resource rents and arable land sharpens the Kuznets curve rather than weakening it. The cubic term remains positive in all specifications but loses significance in the Aid/Education model (p = 0.180), where the smaller sample (N = 585) reduces statistical power.
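The attenuation described above can be quantified directly from the reported coefficients. The following sketch simply reuses the numbers in the table, checks the sign pattern, and expresses each specification's coefficients as a share of the baseline. The variable names are illustrative.

```python
# Polynomial coefficients copied from the table above
coefs = {
    "Baseline (Table 3)": (0.2931, -0.0320, 0.0011),
    "Resources": (0.3498, -0.0380, 0.0013),
    "Trade": (0.2054, -0.0222, 0.0008),
    "Mobility": (0.1711, -0.0186, 0.0007),
    "Aid/Educ.": (0.2264, -0.0232, 0.0007),
    "Ethnicity": (0.1492, -0.0153, 0.0005),
}
baseline = coefs["Baseline (Table 3)"]

# The N-shape requires the sign pattern (+, -, +)
n_shape_everywhere = all(b1 > 0 and b2 < 0 and b3 > 0 for b1, b2, b3 in coefs.values())

# Each coefficient as a share of its baseline counterpart
ratios = {
    name: tuple(c / b for c, b in zip(values, baseline))
    for name, values in coefs.items()
}

print(f"Sign pattern (+, -, +) in all specifications: {n_shape_everywhere}")
for name, (r1, r2, r3) in ratios.items():
    print(f"{name:<20} {r1:>6.2f} {r2:>6.2f} {r3:>6.2f}")
```

The ratios for the cubic term are coarse because the table reports that coefficient to only four decimals, so they are best read as rough orders of magnitude.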

10.4 Determinant effects at a glance

Finally, the bar chart below ranks all nine determinants by their coefficient magnitude, color-coded by whether they increase (orange) or decrease (blue) regional inequality:

The ethnic Gini dominates all other determinants, with a coefficient (0.071) that is about 3.9 times the next-largest positive effect (resource rents at 0.018) and about 1.3 times the magnitude of the largest negative effect (arable land at -0.053). Arable land and school enrollment are the only factors that significantly reduce regional inequality, suggesting that geographically dispersed economic activity and broad-based human capital investment are the two channels through which countries can promote more equal regional development. Read as associations rather than causal effects, these results suggest that governments concerned about regional disparities should invest in education and be cautious about over-relying on resource extraction or trade liberalization, which tend to concentrate economic activity in specific regions.

11. Discussion

We can now answer the case study question: the relationship between regional inequality and economic development is N-shaped, not inverted-U. The cubic TWFE model yields highly significant coefficients (0.293, -0.032, 0.001, all p < 0.001) that define three development phases. Below \$2,287 GDP per capita, initial development concentrates economic activity and widens regional gaps. Between \$2,287 and \$77,205, the convergence story dominates — lagging regions catch up as infrastructure, education, and market access spread. Above \$77,205, inequality may rise again as knowledge-economy agglomeration re-concentrates activity, though this second upturn is estimated from very few observations (Qatar, Luxembourg, Norway).

The fixed effects framework proved essential. A researcher who estimated only the linear specification would conclude that development has no effect on inequality (coefficient -0.003, p = 0.265). This is wrong — the true relationship is nonlinear, and the opposing effects at different development stages cancel out in a linear model.

Among determinants, ethnic income inequality stands out as the most powerful driver of regional disparities. When ethnic inequality is controlled, the Kuznets polynomial attenuates substantially (the linear term drops from 0.293 to 0.149), raising the question of whether the Kuznets curve is partly an artifact of ethnic composition correlating with income levels. This finding has direct policy relevance: addressing ethnic income gaps may be a more effective lever for reducing regional inequality than broad economic growth alone.

Several caveats apply. The second turning point at \$77,205 is beyond most of the data and should be interpreted cautiously. Missing data reduces sample sizes for some determinant models (the Aid/Education model drops to 585 observations from 880). The within-R-squared ranges from 0.01 to 0.28 depending on the specification, meaning that a substantial share of within-country inequality variation remains unexplained. Most importantly, this analysis is descriptive, not causal. Fixed effects control for time-invariant confounders but cannot address time-varying confounders. The “determinants” should be interpreted as associations conditional on the Kuznets curve and country/year fixed effects, not as causal effects.

12. Summary and next steps

Key takeaways:

The Kuznets curve is N-shaped, not inverted-U. The cubic TWFE model with country and year fixed effects yields coefficients of 0.293, -0.032, and 0.001 (all p < 0.001), with a within-R-squared of 0.142 compared to just 0.009 for the linear specification. The N-shape is robust across all six model specifications.
Turning points anchor three development phases. Regional inequality peaks at \$2,287 GDP per capita and reaches a trough at \$77,205, defining a broad convergence zone where most of the world’s countries fall. The pattern is stable across all five time periods in the data.
Ethnic income inequality is the strongest determinant of regional disparities. With a coefficient of 0.071 (p < 0.001), it is 3.9 times larger than the next biggest positive effect. Controlling for it halves the Kuznets polynomial coefficients, suggesting that ethnic composition partly drives the apparent development-inequality relationship.
Fixed effects are essential for uncovering the Kuznets relationship. Pooled OLS cubic coefficients are only marginally significant (p ~ 0.07), while TWFE coefficients are highly significant (p < 0.001). The linear TWFE model is completely uninformative (p = 0.265), demonstrating that both the polynomial specification and the fixed effects are needed to reveal the true pattern.

Limitations: The analysis covers 1992–2012; patterns may differ with more recent data. The second turning point (\$77,205) relies on very few observations. The panel has only 5 periods, limiting within-country variation. All results are associations, not causal effects.

Next steps: Extend the analysis to more recent satellite data (e.g., VIIRS nighttime lights). Test whether the N-shape holds at the subnational level within individual countries. Explore instrumental variables or shift-share designs to identify causal effects of trade, FDI, or aid on regional inequality.

13. Exercises

Quadratic vs cubic test. Re-estimate the TWFE model with just the quadratic polynomial (gini ~ log_GDPpc + log_GDPpc2 | id + year). How does the within-R-squared compare to the cubic model (0.142)? In the cubic model, is the cubic term ($\beta_3$) individually significant? What would you conclude about the Kuznets hypothesis from the quadratic specification alone?
Subsample analysis. Split the sample into OECD and non-OECD countries. Re-estimate the cubic TWFE model for each subsample. Does the N-shape hold in both groups, or is it driven primarily by one? What happens to the turning points?
Full determinants model. Estimate a single TWFE model that includes all nine determinants simultaneously (rather than in separate models). How do the coefficients change compared to Table 4? Which variables remain significant? What does multicollinearity among the determinants do to the standard errors?

14. References

Exploratory Spatial Data Analysis: Spatial Clusters and Dynamics of Human Development in South America

Sun, 22 Mar 2026 00:00:00 +0000

1. Overview

When we look at a map of human development across South America, a pattern immediately stands out: prosperous regions tend to cluster together, and so do lagging regions. But is this clustering statistically significant, or could it arise by chance? And how have these spatial clusters evolved over time?

Exploratory Spatial Data Analysis (ESDA) provides the tools to answer these questions. ESDA is a set of techniques for visualizing spatial distributions, identifying patterns of spatial clustering, and detecting spatial outliers. Unlike standard exploratory data analysis, which treats observations as independent, ESDA explicitly accounts for the geographic location of each observation and the relationships between neighbors.

This tutorial uses the Subnational Human Development Index (SHDI) from Smits and Permanyer (2019) for 153 sub-national regions across 12 South American countries in 2013 and 2019 — the same dataset from the Pooled PCA tutorial. We progress from simple scatter plots and choropleth maps to formal tests of spatial dependence (Moran’s I), local cluster identification (LISA maps), and space-time dynamics. By the end, you will be able to answer: do nearby regions in South America share similar development levels, and how have these spatial clusters evolved between 2013 and 2019?

Learning objectives:

Understand the concept of spatial autocorrelation and why it matters for regional analysis
Create choropleth maps and scatter plots to visualize spatial distributions
Build and interpret a spatial weights matrix using Queen contiguity
Compute and interpret global Moran’s I for spatial dependence testing
Identify local spatial clusters (HH, LL) and outliers (HL, LH) using LISA statistics
Explore space-time dynamics of spatial clusters using directional Moran scatter plots
Compare country-level development trajectories within the spatial framework

Key concepts at a glance

The post leans on a small vocabulary repeatedly, and the rest of the tutorial assumes you can move between these terms quickly. Each concept below has three parts: a definition, an example, and an analogy. If a later section mentions “Moran’s I” or “LISA” and the term feels slippery, this is the section to re-read.

1. Spatial weights matrix $W$, $w_{ij}$. An $n \times n$ matrix encoding which units are “neighbours” of which. Queen contiguity sets $w_{ij} = 1$ if regions $i$ and $j$ share an edge or vertex.

Example

In this post, libpysal.weights.Queen.from_dataframe(gdf) builds a Queen-contiguity weights matrix for the 153 South American regions. Most regions have 4–6 neighbours; islands have zero.

Analogy

A friendship graph between regions — who shares a fence with whom.
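As a minimal sketch of what Queen contiguity means, the snippet below builds a binary weights matrix for a hypothetical 3 × 3 grid of square regions using only NumPy (it does not call libpysal). Two cells are neighbours when they touch along an edge or at a corner. The function name `queen_grid_weights` is an assumption made for this illustration.

```python
import numpy as np


def queen_grid_weights(nrows, ncols):
    """Binary Queen-contiguity matrix for a regular grid (row-major cell ids)."""
    n = nrows * ncols
    W = np.zeros((n, n), dtype=int)
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            # Visit the eight surrounding cells: shared edges and shared corners
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr, dc) != (0, 0) and 0 <= rr < nrows and 0 <= cc < ncols:
                        W[i, rr * ncols + cc] = 1
    return W


W = queen_grid_weights(3, 3)
neighbour_counts = W.sum(axis=1)
print(neighbour_counts.reshape(3, 3))
```

On this grid the corner cells have three neighbours, the edge cells five, and the centre cell eight. Real administrative regions have irregular shapes, so their neighbour counts vary more.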

2. Global spatial autocorrelation (Moran’s I) $I = \frac{n}{\sum_i \sum_j w_{ij}} \cdot \frac{\sum_i \sum_j w_{ij}(y_i - \bar y)(y_j - \bar y)}{\sum_i (y_i - \bar y)^2}$. A scalar summary of how strongly similar values cluster geographically. Positive $I$ = clustering; near zero = random; negative = checkerboard.

Example

Moran’s I on SHDI is 0.5680 in 2013 and 0.6320 in 2019. Strong positive autocorrelation in both years — and the clustering strengthened. Permutation $p$ = 0.0010 for both: extremely unlikely under a null of randomness.

Analogy

How strongly opinions cluster among friends.
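The formula can be checked by hand on a tiny example. The sketch below implements global Moran's I with NumPy on a hypothetical 4 × 4 grid, using Rook contiguity (shared edges only) to keep the arithmetic transparent, whereas the tutorial itself uses Queen contiguity via PySAL. The helper names are assumptions.

```python
import numpy as np


def rook_grid_weights(nrows, ncols):
    """Binary Rook-contiguity matrix for a regular grid (row-major cell ids)."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if c + 1 < ncols:  # right-hand neighbour
                W[i, i + 1] = W[i + 1, i] = 1
            if r + 1 < nrows:  # neighbour below
                W[i, i + ncols] = W[i + ncols, i] = 1
    return W


def morans_i(y, W):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z = y - mean(y)."""
    z = np.asarray(y, dtype=float) - np.mean(y)
    return float(len(z) / W.sum() * (z @ W @ z) / (z @ z))


W = rook_grid_weights(4, 4)
rows, cols = np.indices((4, 4))
clustered = (cols < 2).astype(float).ravel()               # high values on the left half
checkerboard = ((rows + cols) % 2).astype(float).ravel()   # alternating values

i_clustered = morans_i(clustered, W)
i_checker = morans_i(checkerboard, W)
print(f"clustered: {i_clustered:.3f}, checkerboard: {i_checker:.3f}")
```

The clustered pattern yields a clearly positive statistic and the checkerboard yields the most negative value possible under these weights, matching the interpretation given in the definition.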

3. Local spatial autocorrelation (LISA) $I_i = z_i \sum_j w_{ij} z_j$. Decomposes the global Moran’s I into a per-unit local statistic. Identifies which regions belong to clusters, not just whether clusters exist.

Example

LISA in 2019 flags 30 high-high regions, 37 low-low regions, and 6 outliers (5 HL, 1 LH). 80 are statistically not significant. The clusters cover roughly half the map.

Analogy

The local cliques inside the social network — who gathers, not just whether anyone does.

4. Cluster typology HH, LL, HL, LH. Each significant LISA observation belongs to one of four types: HH (high value, high neighbours = hot spot), LL (cold spot), HL (high value, low neighbours = high outlier), LH (low value, high neighbours = low outlier).

Example

The post’s LISA map shows HH clusters concentrated in southern Chile and southeast Brazil, LL clusters in Guyana and northern Bolivia. Outliers are rare (5 HL, 1 LH in 2019).

Analogy

Popular kids surrounded by popular kids vs the lone rebel surrounded by the in-crowd.
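To make the four labels concrete, the sketch below computes the local statistic and its Moran-scatter-plot quadrant for six hypothetical regions arranged in a line, where each region neighbours the one before and after it. It uses row-standardised weights and skips the permutation step, so it assigns a quadrant to every region, whereas a LISA map labels only the statistically significant ones. The data and names are assumptions.

```python
import numpy as np

values = np.array([8.0, 9.0, 8.0, 1.0, 2.0, 1.0])  # hypothetical outcomes
n = len(values)

# Chain contiguity: region i neighbours i-1 and i+1
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1
W = W / W.sum(axis=1, keepdims=True)  # row-standardise

z = (values - values.mean()) / values.std()  # standardised values
lag = W @ z                                   # average z among neighbours
local_i = z * lag                             # local Moran statistic


def quadrant(z_i, lag_i):
    """Map (own value, neighbours' average) to a cluster or outlier label."""
    if z_i > 0:
        return "HH" if lag_i > 0 else "HL"
    return "LL" if lag_i < 0 else "LH"


labels = [quadrant(zi, li) for zi, li in zip(z, lag)]
for v, li, lab in zip(values, local_i, labels):
    print(f"value={v:.0f}  local I={li:+.3f}  quadrant={lab}")
```

The fourth region has a low value but sits next to a high one, so it falls in the LH quadrant with a negative local statistic. With row-standardised weights and values standardised this way, the average of the local statistics equals the global Moran's I.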

5. Choropleth map. A map where each region is shaded by the value of a variable. Quantile and equal-interval are the two most common classification schemes.

Example

This post draws choropleths of SHDI for 2013 and 2019 with 5-class Fisher-Jenks breaks computed from the 2013 values. The colour scale exposes regional inequality at a glance.

Analogy

A heat-map of the country.

6. Spatial spillover. The phenomenon that a region’s outcome is shaped by its neighbours' outcomes (or covariates). The reason regions are not independent observations.

Example

In this post, regions adjacent to Venezuelan ones saw their SHDI change by -0.0653 on average, as the crisis spilled into neighbouring economies. Bolivia, by contrast, gained +0.0333.

Analogy

Your neighbours' garage band wakes you up too — your sleep is not independent of theirs.

7. Space-time dynamics. Comparing LISA results at $t_1$ vs $t_2$ to see how clusters move, expand, or fade. The directional Moran scatter plot summarizes the transitions.

Example

Between 2013 and 2019, 88% of Venezuelan regions moved into the LL cluster — a hot-spot collapse. The number of LL regions grew from 29 to 37; HH stayed roughly constant at 30–31.

Analogy

The social network changes year to year — cliques form and dissolve.

8. Permutation inference. $p$-values computed by randomly shuffling the outcome across regions many times (999 in this post) and asking how often the simulated Moran’s I is at least as extreme as the observed one. No normality assumption needed.

Example

For both 2013 and 2019, 999 random permutations of SHDI yield $p$ = 0.0010 (the smallest possible with 999 draws). The observed clustering is statistically extreme by any standard.

Analogy

Shuffling the seating chart at random to ask whether cliques would form by chance.
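A permutation test is short enough to write out. The sketch below shuffles a hypothetical clustered pattern on a 6 × 6 Rook-contiguity grid 999 times and computes a one-sided pseudo p-value as (number of simulated statistics at least as large as the observed one + 1) / (permutations + 1), the convention under which 0.001 is the smallest attainable value with 999 draws. The grid, data, and helper names are assumptions, and PySAL's own implementation may differ in details.

```python
import numpy as np


def rook_grid_weights(nrows, ncols):
    """Binary Rook-contiguity matrix for a regular grid (row-major cell ids)."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if c + 1 < ncols:  # right-hand neighbour
                W[i, i + 1] = W[i + 1, i] = 1
            if r + 1 < nrows:  # neighbour below
                W[i, i + ncols] = W[i + ncols, i] = 1
    return W


def morans_i(y, W):
    """Global Moran's I for values y and weights matrix W."""
    z = np.asarray(y, dtype=float) - np.mean(y)
    return float(len(z) / W.sum() * (z @ W @ z) / (z @ z))


rng = np.random.default_rng(42)
W = rook_grid_weights(6, 6)
_, cols = np.indices((6, 6))
y = (cols < 3).astype(float).ravel()  # high values on the left half of the grid

observed = morans_i(y, W)
n_perm = 999
# Each permutation reassigns the same values to random locations
simulated = np.array([morans_i(rng.permutation(y), W) for _ in range(n_perm)])

# One-sided pseudo p-value; the +1 counts the observed arrangement itself
p_value = (np.sum(simulated >= observed) + 1) / (n_perm + 1)
print(f"observed I = {observed:.3f}, pseudo p-value = {p_value:.3f}")
```

Shuffling breaks the link between values and locations while keeping the distribution of values intact, which is why no normality assumption is required.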

2. The ESDA pipeline

The analysis follows a natural progression from visualization to formal testing. Each step builds on the previous one, moving from “what does the data look like?” to “is the spatial pattern statistically significant?” to “where exactly are the clusters?”

graph LR
A["<b>Step 1</b><br/>Load &<br/>Explore"] --> B["<b>Step 2</b><br/>Visualize<br/>Maps"]
B --> C["<b>Step 3</b><br/>Spatial<br/>Weights"]
C --> D["<b>Step 4</b><br/>Global<br/>Moran's I"]
D --> E["<b>Step 5</b><br/>Local<br/>LISA"]
E --> F["<b>Step 6</b><br/>Space-Time<br/>Dynamics"]
style A fill:#141413,stroke:#6a9bcc,color:#fff
style B fill:#d97757,stroke:#141413,color:#fff
style C fill:#6a9bcc,stroke:#141413,color:#fff
style D fill:#6a9bcc,stroke:#141413,color:#fff
style E fill:#00d4c8,stroke:#141413,color:#fff
style F fill:#1a3a8a,stroke:#141413,color:#fff

Steps 1–2 are purely visual — they build intuition about where high and low values are concentrated. Step 3 formalizes the notion of “neighbors” through a spatial weights matrix. Steps 4–5 use that matrix to compute statistics that quantify spatial clustering, first globally (one number for the whole map) and then locally (one number per region). Step 6 connects the spatial and temporal dimensions by tracking how regions move through the Moran scatter plot between periods.

3. Setup and imports

The analysis uses GeoPandas for spatial data handling, PySAL for spatial statistics, and splot for specialized spatial visualizations.

import numpy as np
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
from libpysal.weights import Queen
from libpysal.weights import lag_spatial
from esda.moran import Moran, Moran_Local
from splot.esda import moran_scatterplot, lisa_cluster
from splot.libpysal import plot_spatial_weights
from adjustText import adjust_text
import mapclassify
# Reproducibility
RANDOM_SEED = 42
# Site color palette
STEEL_BLUE = "#6a9bcc"
WARM_ORANGE = "#d97757"
NEAR_BLACK = "#141413"
TEAL = "#00d4c8"

Dark theme figure styling

# Dark theme palette (consistent with site navbar/dark sections)
DARK_NAVY = "#0f1729"
GRID_LINE = "#1f2b5e"
LIGHT_TEXT = "#c8d0e0"
WHITE_TEXT = "#e8ecf2"
# Plot defaults — minimal, spine-free, dark background
plt.rcParams.update({
"figure.facecolor": DARK_NAVY,
"axes.facecolor": DARK_NAVY,
"axes.edgecolor": DARK_NAVY,
"axes.linewidth": 0,
"axes.labelcolor": LIGHT_TEXT,
"axes.titlecolor": WHITE_TEXT,
"axes.spines.top": False,
"axes.spines.right": False,
"axes.spines.left": False,
"axes.spines.bottom": False,
"axes.grid": True,
"grid.color": GRID_LINE,
"grid.linewidth": 0.6,
"grid.alpha": 0.8,
"xtick.color": LIGHT_TEXT,
"ytick.color": LIGHT_TEXT,
"xtick.major.size": 0,
"ytick.major.size": 0,
"text.color": WHITE_TEXT,
"font.size": 12,
"legend.frameon": False,
"legend.fontsize": 11,
"legend.labelcolor": LIGHT_TEXT,
"figure.edgecolor": DARK_NAVY,
"savefig.facecolor": DARK_NAVY,
"savefig.edgecolor": DARK_NAVY,
})

4. Data loading and exploration

The dataset is a GeoJSON file containing polygon geometries and development indicators for 153 sub-national regions across South America. It is a spatial version of the data from the Pooled PCA tutorial, sourced from the Global Data Lab (Smits and Permanyer, 2019). Each region has the Subnational Human Development Index (SHDI) and its three component indices — Health, Education, and Income — for 2013 and 2019.

DATA_URL = "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/python_esda2/data.geojson"
gdf = gpd.read_file(DATA_URL)
print(f"Loaded: {gdf.shape[0]} rows, {gdf.shape[1]} columns")
print(f"Countries: {gdf['country'].nunique()}")
print(f"CRS: {gdf.crs}")

Loaded: 153 rows, 25 columns
Countries: 12
CRS: EPSG:4326

Before computing change columns, we prepare the data for labeling. Some region names in the raw data are very long (e.g., “Chubut, Neuquen, Rio Negro, Santa Cruz, Tierra del Fuego”), so we simplify them. We also create a region_country column that appends the ISO country code to each region name — this makes labels immediately informative when regions from different countries appear on the same plot.

# Country name → ISO 3166-1 alpha-3 code
COUNTRY_ISO = {
"Argentina": "ARG", "Bolivia": "BOL", "Brazil": "BRA",
"Chili": "CHL", "Colombia": "COL", "Ecuador": "ECU",
"Guyana": "GUY", "Paraguay": "PRY", "Peru": "PER",
"Suriname": "SUR", "Uruguay": "URY", "Venezuela": "VEN",
}
gdf["country_iso"] = gdf["country"].map(COUNTRY_ISO)
# Simplify long region names
RENAME = {
"Catamarca, La Rioja, San Juan": "Catamarca-La Rioja",
"Corrientes, Entre Rios, Misiones": "Corrientes-Misiones",
"Chubut, Neuquen, Rio Negro, Santa Cruz, Tierra del Fuego": "Patagonia",
"La Pampa, San Luis, Mendoza": "La Pampa-Mendoza",
"Santiago del Estero, Tucuman": "Tucuman-Sgo Estero",
"Tarapaca (incl Arica and Parinacota)": "Tarapaca",
"Valparaiso (former Aconcagua)": "Valparaiso",
"Los Lagos (incl Los Rios)": "Los Lagos",
"Magallanes and La Antartica Chilena": "Magallanes",
"Antioquia (incl Medellin)": "Antioquia",
"Atlantico (incl Barranquilla)": "Atlantico",
"Bolivar (Sur and Norte)": "Bolivar",
"Essequibo Islands-West Demerara": "Essequibo-W Demerara",
"East Berbice-Corentyne": "E Berbice-Corentyne",
"Upper Takutu-Upper Essequibo": "Upper Takutu-Essequibo",
"Upper Demerara-Berbice": "Upper Demerara",
"Cuyuni-Mazaruni-Upper Essequibo": "Cuyuni-Mazaruni",
"Region Metropolitana": "R. Metropolitana",
"Federal District": "Federal Dist.",
"City of Buenos Aires": "C. Buenos Aires",
"Brokopondo and Sipaliwini": "Brokopondo-Sipaliwini",
"Montevideo and Metropolitan area": "Montevideo",
}
gdf["region"] = gdf["region"].replace(RENAME)
# Create region_country label column
gdf["region_country"] = gdf["region"] + " (" + gdf["country_iso"] + ")"

We then compute the change in SHDI and its components between the two periods.

gdf["shdi_change"] = gdf["shdi2019"] - gdf["shdi2013"]
gdf["health_change"] = gdf["healthindex2019"] - gdf["healthindex2013"]
gdf["educ_change"] = gdf["edindex2019"] - gdf["edindex2013"]
gdf["income_change"] = gdf["incindex2019"] - gdf["incindex2013"]
print(gdf[["shdi2013", "shdi2019", "shdi_change"]].describe().round(4).to_string())

 shdi2013 shdi2019 shdi_change
count 153.0000 153.0000 153.0000
mean 0.7424 0.7477 0.0053
std 0.0594 0.0613 0.0319
min 0.5540 0.5580 -0.0670
25% 0.7070 0.7150 0.0090
50% 0.7430 0.7440 0.0150
75% 0.7740 0.7840 0.0250
max 0.8780 0.8830 0.0450

The dataset covers 153 regions across 12 South American countries. Mean SHDI increased modestly from 0.7424 in 2013 to 0.7477 in 2019 (+0.0053), but the change varied widely: from a maximum decline of -0.0670 to a maximum improvement of +0.0450. The standard deviation of SHDI also increased slightly (0.0594 to 0.0613), hinting that regional disparities may have widened.

5. Exploratory scatter plots

5.1 HDI scatter: 2013 vs 2019

A scatter plot of SHDI in 2013 against SHDI in 2019 provides a quick overview of temporal dynamics. Points above the 45-degree line represent regions that improved; points below represent regions that declined.

fig, ax = plt.subplots(figsize=(8, 7))
ax.scatter(gdf["shdi2013"], gdf["shdi2019"],
color=STEEL_BLUE, edgecolors=DARK_NAVY, s=45, alpha=0.75, zorder=3)
lims = [min(gdf["shdi2013"].min(), gdf["shdi2019"].min()) - 0.01,
max(gdf["shdi2013"].max(), gdf["shdi2019"].max()) + 0.01]
ax.plot(lims, lims, color=WARM_ORANGE, linewidth=1.5, linestyle="--",
label="45° line (no change)", zorder=2)
ax.set_xlabel("SHDI 2013")
ax.set_ylabel("SHDI 2019")
ax.set_title("Subnational HDI: 2013 vs 2019")
ax.legend()
# Label extreme regions (biggest gains, biggest losses, highest, lowest)
residual = gdf["shdi2019"] - gdf["shdi2013"]
extremes = set()
extremes.update(residual.nlargest(3).index.tolist())
extremes.update(residual.nsmallest(3).index.tolist())
extremes.update(gdf["shdi2019"].nlargest(2).index.tolist())
extremes.update(gdf["shdi2019"].nsmallest(2).index.tolist())
texts = []
for i in extremes:
    texts.append(ax.text(gdf.loc[i, "shdi2013"], gdf.loc[i, "shdi2019"],
                         gdf.loc[i, "region_country"], fontsize=8, color=LIGHT_TEXT))
adjust_text(texts, ax=ax, arrowprops=dict(arrowstyle="-", color=LIGHT_TEXT,
alpha=0.5, lw=0.5))
plt.savefig("esda2_scatter_hdi.png", dpi=300, bbox_inches="tight")
plt.show()

Of 153 regions, 126 improved their SHDI between 2013 and 2019, while 27 declined. The labels identify key cases: at the top, C. Buenos Aires (ARG) and R. Metropolitana (CHL) lead with SHDI above 0.88. At the bottom, Potaro-Siparuni (GUY) and Barima-Waini (GUY) remain the least developed. The biggest decliners — Federal Dist. (VEN), Carabobo (VEN), and Aragua (VEN) — are all Venezuelan states, falling well below the 45-degree line. The biggest improvers — Meta (COL), Vichada (COL), and Brokopondo-Sipaliwini (SUR) — rose above the line, with gains up to +0.045 points.
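Counts like these follow from comparing the two columns directly. Here is a minimal sketch with a small hypothetical DataFrame standing in for `gdf` (the values are made up, and only the column names follow the tutorial):

```python
import pandas as pd

# Hypothetical stand-in for gdf with the tutorial's column names
toy = pd.DataFrame({
    "region_country": ["A (XXX)", "B (XXX)", "C (YYY)", "D (YYY)", "E (ZZZ)"],
    "shdi2013": [0.70, 0.74, 0.81, 0.65, 0.77],
    "shdi2019": [0.72, 0.71, 0.83, 0.66, 0.70],
})
toy["shdi_change"] = toy["shdi2019"] - toy["shdi2013"]

n_improved = int((toy["shdi_change"] > 0).sum())   # points above the 45-degree line
n_declined = int((toy["shdi_change"] < 0).sum())   # points below the line
biggest_decliners = toy.nsmallest(2, "shdi_change")["region_country"].tolist()

print(f"improved: {n_improved}, declined: {n_declined}")
print(f"largest declines: {biggest_decliners}")
```

Applied to the real data, the same comparisons on `gdf["shdi_change"]` would give the region counts and the labelled extremes discussed above.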

5.2 Component scatter plots

The SHDI is a composite of three sub-indices: Health, Education, and Income. Breaking down the change by component reveals which dimensions drove the aggregate patterns.

fig, axes = plt.subplots(1, 3, figsize=(18, 5.5))
components = [
("healthindex2013", "healthindex2019", "Health Index"),
("edindex2013", "edindex2019", "Education Index"),
("incindex2013", "incindex2019", "Income Index"),
]
for ax, (col13, col19, label) in zip(axes, components):
    ax.scatter(gdf[col13], gdf[col19],
               color=STEEL_BLUE, edgecolors=DARK_NAVY, s=40, alpha=0.7, zorder=3)
    lims = [min(gdf[col13].min(), gdf[col19].min()) - 0.02,
            max(gdf[col13].max(), gdf[col19].max()) + 0.02]
    ax.plot(lims, lims, color=WARM_ORANGE, linewidth=1.5, linestyle="--", zorder=2)
    ax.set_xlabel(f"{label} 2013")
    ax.set_ylabel(f"{label} 2019")
    ax.set_title(label)
    # Label extreme regions per component
    comp_residual = gdf[col19] - gdf[col13]
    comp_extremes = set()
    comp_extremes.update(comp_residual.nlargest(2).index.tolist())
    comp_extremes.update(comp_residual.nsmallest(2).index.tolist())
    texts = []
    for i in comp_extremes:
        texts.append(ax.text(gdf.loc[i, col13], gdf.loc[i, col19],
                             gdf.loc[i, "region_country"], fontsize=7, color=LIGHT_TEXT))
    adjust_text(texts, ax=ax, arrowprops=dict(arrowstyle="-", color=LIGHT_TEXT,
                                              alpha=0.5, lw=0.5))
fig.suptitle("HDI components: 2013 vs 2019", fontsize=14, y=1.02)
plt.tight_layout()
plt.savefig("esda2_scatter_components.png", dpi=300, bbox_inches="tight")
plt.show()

The three components tell very different stories. Health and Education improved almost universally — the vast majority of points lie above the 45-degree line. Income, however, tells a starkly different story: 71 of 153 regions (46.4%) experienced a decline in their income index between 2013 and 2019. This mixed signal — education and health gains partially offset by income losses — explains why the aggregate SHDI improvement was so modest (+0.005 on average). The income panel also shows wider scatter, indicating greater heterogeneity in economic trajectories across the continent.

6. Choropleth maps

6.1 HDI levels across South America

The scatter plots tell us what changed, but not where. Choropleth maps add the geographic dimension by coloring each region according to its SHDI value. To make the two years directly comparable, we use Fisher-Jenks natural breaks computed from 2013 and held constant for 2019. Fisher-Jenks is a classification method that finds natural groupings in data by minimizing within-class variance — it places break points where the data naturally separates into clusters. This way, a color change between maps reflects a genuine shift in development class, not a shifting classification scheme. The legend shows the number of regions in each class, making it easy to see how the distribution shifted.

import mapclassify
from matplotlib.patches import Patch
# Fisher-Jenks breaks from 2013 (5 classes)
fj = mapclassify.FisherJenks(gdf["shdi2013"].values, k=5)
breaks = fj.bins.tolist()
# Extend upper break to cover 2019 max
max_val = max(gdf["shdi2013"].max(), gdf["shdi2019"].max())
if max_val > breaks[-1]:
    breaks[-1] = float(round(max_val + 0.001, 3))
# Apply same breaks to 2019
fj_2019 = mapclassify.UserDefined(gdf["shdi2019"].values, bins=breaks)
# Class transitions
classes_2013 = fj.yb
classes_2019 = fj_2019.yb
improved = (classes_2019 > classes_2013).sum()
stayed = (classes_2019 == classes_2013).sum()
declined = (classes_2019 < classes_2013).sum()
print(f"Breaks (from 2013): {[round(b, 3) for b in breaks]}")
print(f" Improved (moved up): {improved}")
print(f" Stayed same: {stayed}")
print(f" Declined (moved down): {declined}")

Breaks (from 2013): [0.622, 0.693, 0.734, 0.789, 0.884]
Improved (moved up): 43
Stayed same: 86
Declined (moved down): 24

# Class labels
class_labels = []
lower = round(gdf["shdi2013"].min(), 2)
for b in breaks:
    class_labels.append(f"{lower:.2f} – {b:.2f}")
    lower = round(b, 2)
fig, axes = plt.subplots(1, 2, figsize=(16, 12))
cmap = plt.cm.coolwarm
norm = plt.Normalize(vmin=0, vmax=len(breaks) - 1)
for ax, year_col, title, year_fj in [
    (axes[0], "shdi2013", "SHDI 2013", fj),
    (axes[1], "shdi2019", "SHDI 2019", fj_2019),
]:
    colors = [cmap(norm(c)) for c in year_fj.yb]
    gdf.plot(ax=ax, color=colors, edgecolor=GRID_LINE, linewidth=0.3)
    ax.set_title(title, fontsize=14, pad=10)
    ax.set_axis_off()
    # Legend with region counts per class
    counts = np.bincount(year_fj.yb, minlength=len(breaks))
    handles = [Patch(facecolor=cmap(norm(i)), edgecolor=GRID_LINE,
                     label=f"{cl} (n={c})")
               for i, (cl, c) in enumerate(zip(class_labels, counts))]
    ax.legend(handles=handles, title="SHDI Class", loc="lower right",
              fontsize=10, title_fontsize=11)
# Label extreme regions on both maps
map_extremes = gdf["shdi2019"].nlargest(3).index.tolist() + \
    gdf["shdi2019"].nsmallest(3).index.tolist()
for ax_map in axes:
    texts = []
    for i in map_extremes:
        centroid = gdf.geometry.iloc[i].centroid
        texts.append(ax_map.text(centroid.x, centroid.y,
                                 gdf.loc[i, "region_country"],
                                 fontsize=7, color=WHITE_TEXT, weight="bold"))
    adjust_text(texts, ax=ax_map, arrowprops=dict(arrowstyle="-|>",
                color=LIGHT_TEXT, alpha=0.9, lw=1.2, mutation_scale=8))
plt.savefig("esda2_choropleth_hdi.png", dpi=300, bbox_inches="tight")
plt.show()

The Fisher-Jenks classification reveals both persistence and change in South America’s development geography. Using the same 2013 breaks for both maps, 43 regions moved up at least one class between 2013 and 2019, 86 stayed in the same class, and 24 declined. The legend counts make the shifts visible: the lowest class shrank from n=6 to n=4, while the middle classes absorbed most of the movement. The Southern Cone and southern Brazil consistently occupy the highest class (red tones), while the Amazon basin, Guyana, and parts of Venezuela anchor the lowest class (blue tones). This visual clustering is precisely what spatial autocorrelation statistics will later quantify — high values are surrounded by high values, and low values are surrounded by low values.
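The idea of holding the class breaks fixed can be checked on a handful of numbers. The sketch below is only an illustration: the five SHDI values per year are hypothetical, the breaks are the 2013 ones printed above, and we assume each break acts as an inclusive upper bound, which `np.digitize(..., right=True)` reproduces.

```python
import numpy as np

# Hypothetical SHDI values for five regions in two years (illustrative only).
shdi_2013 = np.array([0.60, 0.68, 0.72, 0.78, 0.85])
shdi_2019 = np.array([0.63, 0.67, 0.75, 0.78, 0.78])

# Upper class bounds: the 2013 breaks reported above.
breaks = np.array([0.622, 0.693, 0.734, 0.789, 0.884])

# right=True puts x in class k when breaks[k-1] < x <= breaks[k],
# so every break is treated as an inclusive upper bound (an assumption here).
class_2013 = np.digitize(shdi_2013, breaks, right=True)
class_2019 = np.digitize(shdi_2019, breaks, right=True)

improved = int((class_2019 > class_2013).sum())
stayed = int((class_2019 == class_2013).sum())
declined = int((class_2019 < class_2013).sum())

print("2013 classes:", class_2013.tolist())
print("2019 classes:", class_2019.tolist())
print(f"improved={improved}, stayed={stayed}, declined={declined}")
```

Because both years are cut at the same thresholds, a region changes class only when its own value crosses a break. Recomputing the breaks for 2019 could move a region to a different class even if its value had not changed.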

6.2 Mapping HDI change

A map of SHDI change (2019 minus 2013) reveals the geographic distribution of gains and losses, using a diverging color scale centered at zero.

fig, ax = plt.subplots(1, 1, figsize=(10, 10))
abs_max = max(abs(gdf["shdi_change"].min()), abs(gdf["shdi_change"].max()))
gdf.plot(column="shdi_change", cmap="RdYlGn", ax=ax, legend=False,
edgecolor=DARK_NAVY, linewidth=0.3, vmin=-abs_max, vmax=abs_max)
ax.set_title("Change in SHDI (2019 - 2013)", fontsize=14, pad=10)
ax.set_axis_off()
# Label biggest gainers and losers
change_top = gdf["shdi_change"].nlargest(3).index.tolist()
change_bot = gdf["shdi_change"].nsmallest(3).index.tolist()
texts = []
for i in change_top + change_bot:
    centroid = gdf.geometry.iloc[i].centroid
    texts.append(ax.text(centroid.x, centroid.y, gdf.loc[i, "region"],
                         fontsize=7, color=WHITE_TEXT, weight="bold"))
adjust_text(texts, ax=ax, arrowprops=dict(arrowstyle="-|>",
            color=LIGHT_TEXT, alpha=0.9, lw=1.2,
            mutation_scale=8))
sm = plt.cm.ScalarMappable(cmap="RdYlGn",
norm=plt.Normalize(vmin=-abs_max, vmax=abs_max))
cbar = fig.colorbar(sm, ax=ax, orientation="horizontal",
fraction=0.03, pad=0.02, aspect=40)
cbar.set_label("SHDI change (2019 - 2013)")
plt.savefig("esda2_choropleth_change.png", dpi=300, bbox_inches="tight")
plt.show()

The change map reveals that development losses are geographically concentrated, not randomly scattered. The labels pinpoint the extremes: Federal Dist. (VEN), Carabobo (VEN), and Aragua (VEN) show the deepest red (declines of up to -0.067 points), while Vichada (COL), Meta (COL), and Brokopondo-Sipaliwini (SUR) show the brightest green (improvements of up to +0.045). The geographic concentration of gains and losses suggests that spatial proximity plays a role in development trajectories — a hypothesis that we formalize in the next sections.
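The symmetric limits (`vmin=-abs_max`, `vmax=abs_max`) are what pin the neutral color to zero change. A minimal sketch of the arithmetic, using the two extremes quoted above as inputs; the helper `to_unit_interval` is a hypothetical stand-in for what a linear colormap normalization does:

```python
import numpy as np

# Extremes quoted in the text above, used here as illustrative inputs.
change_min, change_max = -0.067, 0.045

# The larger absolute extreme sets both ends of the color scale.
abs_max = max(abs(change_min), abs(change_max))

def to_unit_interval(value, low, high):
    """Linear rescaling of [low, high] onto [0, 1] (hypothetical helper)."""
    return (value - low) / (high - low)

values = np.array([change_min, 0.0, change_max])
symmetric = to_unit_interval(values, -abs_max, abs_max)
data_range = to_unit_interval(values, change_min, change_max)

print("symmetric limits :", symmetric.round(3))
print("data-range limits:", data_range.round(3))
```

With symmetric limits, zero lands at 0.5, the midpoint of the diverging colormap. With limits taken from the data range, zero would land near 0.6, so small gains would be drawn in the neutral color.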

7. Spatial weights

7.1 What is a spatial weights matrix?

To test for spatial clustering formally, we first need to define what “neighbor” means. A spatial weights matrix $W$ is an $n \times n$ matrix where each entry $w_{ij}$ encodes the spatial relationship between regions $i$ and $j$. If two regions are neighbors, $w_{ij} > 0$; if not, $w_{ij} = 0$.

The most common approach for polygon data is contiguity-based weights:

Queen contiguity: Two regions are neighbors if they share any boundary point (even a single corner). Named after the queen in chess, which can move in any direction.
Rook contiguity: Two regions are neighbors only if they share an edge (not just a corner). More restrictive than Queen.

We use Queen contiguity because it captures the broadest definition of adjacency, which is appropriate for irregular administrative boundaries.
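The difference between the two rules is easiest to see on a regular lattice, where corners touch at a single point. The following sketch builds both neighbor sets for a hypothetical 3 x 3 grid in plain Python; `grid_neighbors` is an illustrative helper, not part of PySAL.

```python
def grid_neighbors(rows, cols, rule="queen"):
    """Return {cell: sorted neighbor ids} for a rows x cols lattice."""
    rook_steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]       # shared edges
    corner_steps = [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # shared corners
    steps = rook_steps + corner_steps if rule == "queen" else rook_steps
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            found = []
            for dr, dc in steps:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    found.append(rr * cols + cc)
            neighbors[r * cols + c] = sorted(found)
    return neighbors

queen = grid_neighbors(3, 3, "queen")
rook = grid_neighbors(3, 3, "rook")

print("center cell: queen", len(queen[4]), "rook", len(rook[4]))
print("corner cell: queen", len(queen[0]), "rook", len(rook[0]))
```

On real administrative boundaries, shared corners are much rarer than on a grid, so Queen and Rook weights usually differ for only a few pairs of regions.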

7.2 Building Queen contiguity weights

PySAL’s Queen.from_dataframe() builds the weights matrix directly from a GeoDataFrame. After construction, we row-standardize the matrix so that each region’s neighbor weights sum to 1. This makes the spatial lag (the weighted average of neighbors' values) directly interpretable as the mean neighbor value.

from libpysal.weights import Queen
W = Queen.from_dataframe(gdf)
W.transform = "r" # Row-standardize
print(f"Number of regions: {W.n}")
print(f"Min neighbors: {W.min_neighbors}")
print(f"Max neighbors: {W.max_neighbors}")
print(f"Mean neighbors: {W.mean_neighbors:.2f}")
print(f"Islands: {W.islands}")

Number of regions: 153
Min neighbors: 0
Max neighbors: 11
Mean neighbors: 4.93
Islands: [87, 145]

The Queen contiguity matrix connects 153 regions with an average of 4.93 neighbors each (minimum 0, maximum 11). Two regions have no neighbors (islands): San Andres (COL) (index 87) and Nueva Esparta (VEN) (index 145). Both are island territories separated from the mainland by water. PySAL does not drop these isolates: they stay in the matrix with all-zero weight rows, so their spatial lag is zero by construction and any cluster label they receive later should be read with caution. For every other region, row-standardization ensures that the spatial lag is the simple average of its neighbors' values, regardless of how many neighbors it has.
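To see what row-standardization does to the spatial lag, and what happens to a region with no neighbors, consider a hypothetical four-region example in plain NumPy. The matrix and values below are made up for illustration.

```python
import numpy as np

# Hypothetical binary contiguity for 4 regions; region 3 is an island.
W_binary = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
], dtype=float)
values = np.array([0.60, 0.70, 0.80, 0.90])

row_sums = W_binary.sum(axis=1, keepdims=True)
# Divide each row by its sum; rows without neighbors are left as zeros.
W_row = np.divide(W_binary, row_sums,
                  out=np.zeros_like(W_binary), where=row_sums > 0)

spatial_lag = W_row @ values
print("row sums   :", W_row.sum(axis=1))
print("spatial lag:", spatial_lag.round(3))
```

The lag of region 0 is the plain average of regions 1 and 2 (0.75), while the island's lag is 0 because there is nothing to average.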

7.3 Visualizing the connectivity structure

The plot_spatial_weights() function from splot overlays the weights network on the map, drawing lines between each region’s centroid and its neighbors' centroids.

fig, ax = plt.subplots(figsize=(10, 10))
gdf.plot(ax=ax, facecolor="none", edgecolor=GRID_LINE, linewidth=0.5)
plot_spatial_weights(W, gdf, ax=ax)
ax.set_title("Queen contiguity weights", fontsize=14, pad=10)
ax.set_axis_off()
plt.savefig("esda2_spatial_weights.png", dpi=300, bbox_inches="tight")
plt.show()

The network visualization shows the connectivity structure underlying all spatial statistics in this tutorial. Denser networks appear in areas with many small regions (e.g., southern Brazil, northern Argentina), while sparser connections appear in areas with large administrative units (e.g., the Amazon basin). The two island territories (San Andres and Nueva Esparta) appear as isolated dots with no connecting lines. This network is the foundation for computing spatial lags — the weighted average of neighbors' values — which is the building block of Moran’s I.

8. Global spatial autocorrelation

8.1 Moran’s I: concept and intuition

Moran’s I is the most widely used measure of global spatial autocorrelation. It answers a simple question: do similar values tend to cluster together more than expected by chance? Think of it like temperature on a weather map — if it is hot in one city, nearby cities are likely hot too. Moran’s I measures how strongly this “neighbor similarity” holds for development levels across South American regions.

The statistic is defined as:

$$I = \frac{n}{\sum_{i} \sum_{j} w_{ij}} \cdot \frac{\sum_{i} \sum_{j} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^2}$$

where $n$ is the number of regions, $w_{ij}$ are the spatial weights, $x_i$ is the value at region $i$, and $\bar{x}$ is the overall mean. In plain language: Moran’s I compares the product of deviations from the mean for each pair of neighbors. If high-value regions tend to be next to high-value regions (and low next to low), these products are positive, and $I$ is positive.

$I \approx +1$: strong positive spatial autocorrelation (clustering of similar values)
$I \approx 0$: no spatial pattern (random arrangement)
$I \approx -1$: strong negative spatial autocorrelation (checkerboard pattern)

The expected value under spatial randomness is $E(I) = -1/(n-1)$, which approaches zero for large $n$.
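The formula can be evaluated directly with a few lines of NumPy. This is a minimal sketch on a hypothetical chain of six regions, not the `esda` implementation; `morans_i` and `chain_weights` are illustrative helpers.

```python
import numpy as np

def morans_i(x, W):
    """Global Moran's I for values x and a dense weights matrix W."""
    dev = np.asarray(x, dtype=float) - np.mean(x)
    n = dev.size
    s0 = W.sum()                       # sum of all weights
    return (n / s0) * (dev @ W @ dev) / (dev @ dev)

def chain_weights(n):
    """Binary weights for n regions arranged in a row (hypothetical layout)."""
    W = np.zeros((n, n))
    for i in range(n - 1):
        W[i, i + 1] = W[i + 1, i] = 1.0
    return W

W = chain_weights(6)
clustered = [1, 1, 1, 0, 0, 0]      # similar values sit next to each other
alternating = [1, 0, 1, 0, 1, 0]    # checkerboard-like pattern

i_clustered = morans_i(clustered, W)
i_alternating = morans_i(alternating, W)
expected_153 = -1 / (153 - 1)       # E(I) for the 153 regions used here

print(f"clustered:   I = {i_clustered:.3f}")
print(f"alternating: I = {i_alternating:.3f}")
print(f"E(I) for n = 153: {expected_153:.4f}")
```

The clustered arrangement gives a positive value and the alternating one a negative value, matching the interpretation in the list above.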

8.2 Moran’s I for HDI (2013 and 2019)

We compute Moran’s I with 999 random permutations to generate a reference distribution and assess statistical significance. A permutation test works by randomly shuffling all the SHDI values across the map 999 times — like dealing cards to random seats. If the real Moran’s I is more extreme than almost all the shuffled values, we can be confident the spatial pattern is real, not coincidence.

from esda.moran import Moran
moran_2013 = Moran(gdf["shdi2013"], W, permutations=999)
moran_2019 = Moran(gdf["shdi2019"], W, permutations=999)
print(f"SHDI 2013: I = {moran_2013.I:.4f}, p-value = {moran_2013.p_sim:.4f}, "
f"z-score = {moran_2013.z_sim:.4f}")
print(f"SHDI 2019: I = {moran_2019.I:.4f}, p-value = {moran_2019.p_sim:.4f}, "
f"z-score = {moran_2019.z_sim:.4f}")
print(f"Expected I (random): {moran_2013.EI:.4f}")

SHDI 2013: I = 0.5680, p-value = 0.0010, z-score = 10.7661
SHDI 2019: I = 0.6320, p-value = 0.0010, z-score = 11.9890
Expected I (random): -0.0066

Moran’s I for SHDI is strongly positive and highly significant in both years. In 2013, $I = 0.5680$ (p = 0.001, z = 10.77), and in 2019, $I = 0.6320$ (p = 0.001, z = 11.99). Both values are far above the expected value under spatial randomness ($E(I) = -0.0066$), confirming that regions with similar development levels are spatially clustered. Notably, spatial autocorrelation strengthened from 2013 to 2019 ($I$ increased from 0.568 to 0.632), suggesting that development clusters became more pronounced over the period — the spatial divide deepened.
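The permutation logic can be sketched in plain NumPy. The example below uses a hypothetical chain of 20 regions and a one-sided pseudo p-value. The `esda` implementation may differ in details such as how the two tails are handled, so treat this as an illustration of the idea only.

```python
import numpy as np

def morans_i(x, W):
    """Global Moran's I for values x and a dense weights matrix W."""
    dev = np.asarray(x, dtype=float) - np.mean(x)
    return (dev.size / W.sum()) * (dev @ W @ dev) / (dev @ dev)

def pseudo_p_value(x, W, permutations=999, seed=12345):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, W)
    # Shuffle the values across locations and recompute I each time.
    simulated = np.array([morans_i(rng.permutation(x), W)
                          for _ in range(permutations)])
    at_least_as_large = int(np.sum(simulated >= observed))
    return observed, (at_least_as_large + 1) / (permutations + 1)

# Hypothetical chain of 20 regions with values rising steadily along it.
n = 20
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
x = np.arange(n, dtype=float)

observed, p_value = pseudo_p_value(x, W)
print(f"I = {observed:.4f}, pseudo p-value = {p_value:.4f}")
```

With 999 permutations, the smallest pseudo p-value this formula can return is 1/1000 = 0.001, reached when no shuffled map matches the observed statistic. A reported p-value of 0.0010 therefore means "as small as the test can resolve" rather than an exact probability.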

8.3 Moran scatter plot

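The Moran scatter plot visualizes the spatial relationship by plotting each region’s standardized value ($z_i$) against the spatial lag of its neighbors ($\sum_{j} w_{ij} z_j$). With row-standardized weights, the slope of the regression line through the scatter equals Moran’s I when every region has at least one neighbor; regions without neighbors have a zero lag, so with the two islands in this dataset the slope is a close approximation. The four quadrants identify the type of spatial association for each region: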

HH (top-right): High values surrounded by high neighbors
LL (bottom-left): Low values surrounded by low neighbors
LH (top-left): Low values surrounded by high neighbors (spatial outlier)
HL (bottom-right): High values surrounded by low neighbors (spatial outlier)

from scipy import stats as scipy_stats
from libpysal.weights import lag_spatial  # needed for the spatial lag below
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
for ax, moran_obj, year in [
    (axes[0], moran_2013, "2013"),
    (axes[1], moran_2019, "2019"),
]:
    # Standardize values and compute spatial lag
    y = gdf[f"shdi{year}"].values
    z = (y - y.mean()) / y.std()
    wz = lag_spatial(W, z)
    ax.scatter(z, wz, color=STEEL_BLUE, s=35, alpha=0.7,
               edgecolors=GRID_LINE, linewidths=0.3, zorder=3)
    # Regression line (slope = Moran's I)
    slope, intercept, _, _, _ = scipy_stats.linregress(z, wz)
    x_range = np.array([z.min(), z.max()])
    ax.plot(x_range, intercept + slope * x_range, color=WARM_ORANGE,
            linewidth=1.5, zorder=2)
    # Quadrant dividers at origin
    ax.axhline(0, color=LIGHT_TEXT, linewidth=0.8, alpha=0.5, zorder=1)
    ax.axvline(0, color=LIGHT_TEXT, linewidth=0.8, alpha=0.5, zorder=1)
    # Quadrant labels
    xlim, ylim = ax.get_xlim(), ax.get_ylim()
    pad_x = (xlim[1] - xlim[0]) * 0.05
    pad_y = (ylim[1] - ylim[0]) * 0.05
    ax.text(xlim[1] - pad_x, ylim[1] - pad_y, "HH", fontsize=13,
            ha="right", va="top", color=LIGHT_TEXT, alpha=0.5)
    ax.text(xlim[0] + pad_x, ylim[1] - pad_y, "LH", fontsize=13,
            ha="left", va="top", color=LIGHT_TEXT, alpha=0.5)
    ax.text(xlim[0] + pad_x, ylim[0] + pad_y, "LL", fontsize=13,
            ha="left", va="bottom", color=LIGHT_TEXT, alpha=0.5)
    ax.text(xlim[1] - pad_x, ylim[0] + pad_y, "HL", fontsize=13,
            ha="right", va="bottom", color=LIGHT_TEXT, alpha=0.5)
    ax.set_xlabel(f"SHDI {year} (standardized)")
    ax.set_ylabel(f"Spatial lag of SHDI {year}")
    ax.set_title(f"({'a' if year == '2013' else 'b'}) Moran scatter plot "
                 f"— {year} (I = {moran_obj.I:.4f})")
plt.tight_layout()
plt.savefig("esda2_moran_global.png", dpi=300, bbox_inches="tight")
plt.show()

Both Moran scatter plots show a clear positive slope, with the majority of regions falling in the HH and LL quadrants (positive spatial autocorrelation). The steeper slope in the 2019 panel visually confirms the increase in Moran’s I from 0.5680 to 0.6320. Regions in the HH quadrant (top-right) represent the Southern Cone prosperity cluster, while regions in the LL quadrant (bottom-left) represent the Amazon/Guyana deprivation cluster. The relatively few points in the LH and HL quadrants are spatial outliers — regions whose development level diverges sharply from their neighbors.
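The link between the regression slope and Moran’s I can be verified numerically. The sketch below uses a hypothetical chain of five regions with row-standardized weights and no islands, so that every row of the matrix sums to 1. The values are made up for illustration.

```python
import numpy as np

def row_standardized_chain(n):
    """Row-standardized weights for n regions in a row (hypothetical layout)."""
    W = np.zeros((n, n))
    for i in range(n - 1):
        W[i, i + 1] = W[i + 1, i] = 1.0
    return W / W.sum(axis=1, keepdims=True)

x = np.array([0.55, 0.60, 0.66, 0.80, 0.84])   # illustrative SHDI values
W = row_standardized_chain(len(x))

z = (x - x.mean()) / x.std()    # standardized values
wz = W @ z                      # spatial lag of the standardized values

# OLS slope of the lag on the value (the line drawn in the scatter plot).
slope = np.polyfit(z, wz, 1)[0]

# Moran's I from its definition; n / S0 = 1 because every row sums to 1.
moran_i = (len(z) / W.sum()) * (z @ wz) / (z @ z)

print(f"slope = {slope:.6f}, Moran's I = {moran_i:.6f}")
```

The two numbers coincide because $z$ has mean zero, which reduces the OLS slope to $\sum_i z_i (Wz)_i / \sum_i z_i^2$.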

9. Local spatial autocorrelation (LISA)

9.1 From global to local: why LISA matters

Global Moran’s I gives us one number for the entire map, confirming that spatial clustering exists. But it does not tell us where the clusters are located. Local Indicators of Spatial Association (LISA) decompose the global statistic into a contribution from each individual region (Anselin, 1995).

The local Moran statistic for region $i$ is:

$$I_i = z_i \sum_{j} w_{ij} z_j$$

where $z_i = (x_i - \bar{x}) / s$ is the standardized value at region $i$ and $\sum_{j} w_{ij} z_j$ is its spatial lag (the weighted average of neighbors' standardized values). In plain language: each region’s local statistic is the product of its own deviation from the mean and the average deviation of its neighbors. In the code, $x_i$ corresponds to gdf["shdi2019"] and $w_{ij}$ to the row-standardized Queen weights W.

Each region receives a local Moran’s I statistic and is classified into one of four types based on its quadrant in the Moran scatter plot:

HH (High-High): A high-value region surrounded by high-value neighbors — a “hot spot” or prosperity cluster
LL (Low-Low): A low-value region surrounded by low-value neighbors — a “cold spot” or deprivation trap
HL (High-Low): A high-value region surrounded by low-value neighbors — a positive spatial outlier
LH (Low-High): A low-value region surrounded by high-value neighbors — a negative spatial outlier

Statistical significance is assessed via permutation tests. Only regions with p-values below a chosen threshold (here, $p < 0.10$) are classified as belonging to a cluster.
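A small numerical sketch makes the local statistic and the quadrant labels concrete. It uses a hypothetical ring of six regions with made-up values and skips the permutation step, so it shows the classification only, not the significance filter.

```python
import numpy as np

# Hypothetical ring of 6 regions: each region's two neighbors get weight 0.5.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

x = np.array([0.85, 0.88, 0.82, 0.55, 0.58, 0.80])   # illustrative values
z = (x - x.mean()) / x.std()     # standardized with the population std
lag = W @ z                      # average standardized value of neighbors
local_i = z * lag                # local Moran statistic I_i

# Quadrant from the signs of the own value and the lag.
quadrant = np.where((z > 0) & (lag > 0), "HH",
           np.where((z <= 0) & (lag > 0), "LH",
           np.where((z <= 0) & (lag <= 0), "LL", "HL")))

global_i = (z @ lag) / (z @ z)   # global Moran's I for row-standardized W

for region in range(n):
    print(region, quadrant[region], round(float(local_i[region]), 3))
print("mean of local I:", round(float(local_i.mean()), 4),
      "| global I:", round(float(global_i), 4))
```

In this setting (row-standardized weights, no islands, population standard deviation) the local statistics average to the global Moran’s I, which is the sense in which LISA decomposes the global statistic. Outliers (HL, LH) always have a negative $I_i$ because the own value and the lag have opposite signs.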

9.2 LISA for HDI 2019

We compute the local Moran’s I for SHDI in 2019 and visualize the results as a Moran scatter plot with significant regions colored by quadrant (left panel) and a cluster map (right panel).

from esda.moran import Moran_Local  # local Moran statistic (LISA)
localMoran_2019 = Moran_Local(gdf["shdi2019"], W, permutations=999, seed=12345)
wlag_2019 = lag_spatial(W, gdf["shdi2019"].values)
sig_2019 = localMoran_2019.p_sim < 0.10
q_labels = {1: "HH", 2: "LH", 3: "LL", 4: "HL"}
for q_val, q_name in q_labels.items():
    count = ((localMoran_2019.q == q_val) & sig_2019).sum()
    print(f" {q_name}: {count}")
print(f" Not significant: {(~sig_2019).sum()}")

 HH: 30
LH: 1
LL: 37
HL: 5
Not significant: 80

LISA_COLORS = {1: "#d7191c", 2: "#89cff0", 3: "#2c7bb6", 4: "#fdae61"}
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(14, 6))
# (a) LISA scatter plot with colored quadrants
ax = axes[0]
slope, intercept, _, _, _ = scipy_stats.linregress(gdf["shdi2019"].values, wlag_2019)
# Non-significant points (grey)
ns_mask = ~sig_2019
ax.scatter(gdf.loc[ns_mask, "shdi2019"], wlag_2019[ns_mask],
color="#bababa", s=30, alpha=0.4, edgecolors=GRID_LINE,
linewidths=0.3, label="ns", zorder=2)
# Significant points colored by quadrant
for q_val, q_name in q_labels.items():
    mask = (localMoran_2019.q == q_val) & sig_2019
    if mask.any():
        ax.scatter(gdf.loc[mask, "shdi2019"], wlag_2019[mask],
                   color=LISA_COLORS[q_val], s=40, alpha=0.8,
                   edgecolors=GRID_LINE, linewidths=0.3,
                   label=q_name, zorder=3)
# Regression line
x_range = np.array([gdf["shdi2019"].min(), gdf["shdi2019"].max()])
ax.plot(x_range, intercept + slope * x_range, color=WARM_ORANGE,
linewidth=1.2, zorder=1)
# Crosshairs at mean
ax.axhline(wlag_2019.mean(), color=GRID_LINE, linewidth=0.8, linestyle="--", zorder=0)
ax.axvline(gdf["shdi2019"].mean(), color=GRID_LINE, linewidth=0.8, linestyle="--", zorder=0)
ax.set_xlabel("SHDI 2019")
ax.set_ylabel("Spatial lag of SHDI 2019")
ax.set_title(f"(a) Moran scatter plot (I = {moran_2019.I:.4f})")
# (b) LISA cluster map
lisa_cluster(localMoran_2019, gdf, p=0.10,
legend_kwds={"bbox_to_anchor": (0.02, 0.90)}, ax=axes[1])
axes[1].set_facecolor(DARK_NAVY)
axes[1].set_title("(b) LISA clusters (p < 0.10)")
# Label extreme LISA regions on both panels
label_idx = []
hh_mask = (localMoran_2019.q == 1) & sig_2019
if hh_mask.any():
    label_idx += gdf.loc[hh_mask, "shdi2019"].nlargest(3).index.tolist()
ll_mask = (localMoran_2019.q == 3) & sig_2019
if ll_mask.any():
    label_idx += gdf.loc[ll_mask, "shdi2019"].nsmallest(3).index.tolist()
hl_mask = (localMoran_2019.q == 4) & sig_2019
if hl_mask.any():
    label_idx.append(gdf.loc[hl_mask, "shdi2019"].idxmax())
lh_mask = (localMoran_2019.q == 2) & sig_2019
if lh_mask.any():
    label_idx.append(gdf.loc[lh_mask, "shdi2019"].idxmin())
# Scatter labels
texts = [axes[0].text(gdf.loc[i, "shdi2019"], wlag_2019[i], gdf.loc[i, "region"],
fontsize=7, color=LIGHT_TEXT) for i in label_idx]
adjust_text(texts, ax=axes[0], arrowprops=dict(arrowstyle="-", color=LIGHT_TEXT,
alpha=0.5, lw=0.5))
# Map labels
texts = [axes[1].text(gdf.geometry.iloc[i].centroid.x, gdf.geometry.iloc[i].centroid.y,
gdf.loc[i, "region_country"], fontsize=7, color=WHITE_TEXT, weight="bold")
for i in label_idx]
adjust_text(texts, ax=axes[1], arrowprops=dict(arrowstyle="-|>", color=LIGHT_TEXT,
alpha=0.9, lw=1.2, mutation_scale=8))
plt.tight_layout()
plt.savefig("esda2_lisa_2019.png", dpi=300, bbox_inches="tight")
plt.show()

At the 10% significance level, the 2019 LISA analysis identifies 30 HH regions, 37 LL regions, 5 HL outliers, 1 LH outlier, and 80 non-significant regions. The labels highlight the extremes of each cluster type. The three highest HH regions, R. Metropolitana (CHL, SHDI = 0.883), C. Buenos Aires (ARG, 0.882), and Antofagasta (CHL, 0.875), anchor the Southern Cone prosperity core. The three lowest LL regions, Potaro-Siparuni (GUY, 0.558), Barima-Waini (GUY, 0.592), and Upper Takutu-Essequibo (GUY, 0.601), anchor the deprivation cluster in northern South America. San Andres (COL) (0.789) is labeled as an HL outlier, but it is one of the two islands with no contiguity neighbors (Section 7.2), so its spatial lag is zero by construction and the label reflects its own high value rather than a measured contrast with neighbors. Potosi (BOL) (0.631) is the lone LH outlier: a lagging region surrounded by better-performing neighbors.

9.3 LISA for HDI 2013

Repeating the analysis for 2013 allows us to compare how clusters have evolved over time.

localMoran_2013 = Moran_Local(gdf["shdi2013"], W, permutations=999, seed=12345)
wlag_2013 = lag_spatial(W, gdf["shdi2013"].values)
sig_2013 = localMoran_2013.p_sim < 0.10
for q_val, q_name in q_labels.items():
    count = ((localMoran_2013.q == q_val) & sig_2013).sum()
    print(f" {q_name}: {count}")
print(f" Not significant: {(~sig_2013).sum()}")

 HH: 31
LH: 0
LL: 29
HL: 5
Not significant: 88

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(14, 6))
# (a) LISA scatter plot with colored quadrants
ax = axes[0]
slope, intercept, _, _, _ = scipy_stats.linregress(gdf["shdi2013"].values, wlag_2013)
ns_mask = ~sig_2013
ax.scatter(gdf.loc[ns_mask, "shdi2013"], wlag_2013[ns_mask],
color="#bababa", s=30, alpha=0.4, edgecolors=GRID_LINE,
linewidths=0.3, label="ns", zorder=2)
for q_val, q_name in q_labels.items():
    mask = (localMoran_2013.q == q_val) & sig_2013
    if mask.any():
        ax.scatter(gdf.loc[mask, "shdi2013"], wlag_2013[mask],
                   color=LISA_COLORS[q_val], s=40, alpha=0.8,
                   edgecolors=GRID_LINE, linewidths=0.3,
                   label=q_name, zorder=3)
x_range = np.array([gdf["shdi2013"].min(), gdf["shdi2013"].max()])
ax.plot(x_range, intercept + slope * x_range, color=WARM_ORANGE,
linewidth=1.2, zorder=1)
ax.axhline(wlag_2013.mean(), color=GRID_LINE, linewidth=0.8, linestyle="--", zorder=0)
ax.axvline(gdf["shdi2013"].mean(), color=GRID_LINE, linewidth=0.8, linestyle="--", zorder=0)
ax.set_xlabel("SHDI 2013")
ax.set_ylabel("Spatial lag of SHDI 2013")
ax.set_title(f"(a) Moran scatter plot (I = {moran_2013.I:.4f})")
# (b) LISA cluster map
lisa_cluster(localMoran_2013, gdf, p=0.10,
legend_kwds={"bbox_to_anchor": (0.02, 0.90)}, ax=axes[1])
axes[1].set_facecolor(DARK_NAVY)
axes[1].set_title("(b) LISA clusters (p < 0.10)")
# Label extreme LISA regions (3 HH, 3 LL, 1 HL; no LH in 2013)
label_idx = []
hh_mask = (localMoran_2013.q == 1) & sig_2013
if hh_mask.any():
    label_idx += gdf.loc[hh_mask, "shdi2013"].nlargest(3).index.tolist()
ll_mask = (localMoran_2013.q == 3) & sig_2013
if ll_mask.any():
    label_idx += gdf.loc[ll_mask, "shdi2013"].nsmallest(3).index.tolist()
hl_mask = (localMoran_2013.q == 4) & sig_2013
if hl_mask.any():
    label_idx.append(gdf.loc[hl_mask, "shdi2013"].idxmax())
lh_mask = (localMoran_2013.q == 2) & sig_2013
if lh_mask.any():
    label_idx.append(gdf.loc[lh_mask, "shdi2013"].idxmin())
texts = [axes[0].text(gdf.loc[i, "shdi2013"], wlag_2013[i], gdf.loc[i, "region"],
fontsize=7, color=LIGHT_TEXT) for i in label_idx]
adjust_text(texts, ax=axes[0], arrowprops=dict(arrowstyle="-", color=LIGHT_TEXT,
alpha=0.5, lw=0.5))
texts = [axes[1].text(gdf.geometry.iloc[i].centroid.x, gdf.geometry.iloc[i].centroid.y,
gdf.loc[i, "region_country"], fontsize=7, color=WHITE_TEXT, weight="bold")
for i in label_idx]
adjust_text(texts, ax=axes[1], arrowprops=dict(arrowstyle="-|>", color=LIGHT_TEXT,
alpha=0.9, lw=1.2, mutation_scale=8))
plt.tight_layout()
plt.savefig("esda2_lisa_2013.png", dpi=300, bbox_inches="tight")
plt.show()

The 2013 LISA analysis identifies 31 HH regions, 29 LL regions, 5 HL outliers, 0 LH outliers, and 88 non-significant regions. The same three HH leaders appear: C. Buenos Aires (ARG, 0.878), R. Metropolitana (CHL, 0.857), and Antofagasta (CHL, 0.852). The same three LL anchors persist: Potaro-Siparuni (GUY, 0.554), Barima-Waini (GUY, 0.577), and Upper Takutu-Essequibo (GUY, 0.585). The HL outlier labeled in 2013 is Nueva Esparta (VEN) (0.797), an island state with no contiguity neighbors (Section 7.2), so its spatial lag is zero by construction and the label reflects its own high value rather than a contrast with mainland neighbors. Comparing with 2019, the most striking change is the expansion of the LL cluster from 29 to 37 regions, while the HH cluster remained roughly stable (31 to 30). This asymmetric evolution is consistent with the income decline concentrated in Venezuela, which pulled more regions into the deprivation cluster.

9.4 Comparing LISA clusters across time

A transition table reveals how regions moved between LISA categories from 2013 to 2019.

sig_2013 = localMoran_2013.p_sim < 0.10
sig_2019 = localMoran_2019.p_sim < 0.10
q_labels = {1: "HH", 2: "LH", 3: "LL", 4: "HL"}
labels_2013 = ["ns" if not sig_2013[i] else q_labels[localMoran_2013.q[i]]
for i in range(len(gdf))]
labels_2019 = ["ns" if not sig_2019[i] else q_labels[localMoran_2019.q[i]]
for i in range(len(gdf))]
transition_df = pd.crosstab(
pd.Series(labels_2013, name="2013"),
pd.Series(labels_2019, name="2019")
)
print(transition_df.to_string())

2019  HH  HL  LH  LL  ns
2013
HH    27   0   0   0   4
HL     0   2   0   2   1
LL     0   2   0  18   9
ns     3   1   1  17  66

The transition table reveals strong cluster persistence. Of the 31 regions in the HH cluster in 2013, 27 remained HH in 2019 (87% persistence), while only 4 became non-significant. Of the 29 LL regions in 2013, 18 remained LL (62% persistence). The most notable transition is from non-significant to LL: 17 regions that were not part of any significant cluster in 2013 joined the low-development cluster by 2019. This expansion of the LL cluster, combined with the high persistence of HH, paints a picture of entrenched spatial inequality — prosperity clusters are stable, and deprivation clusters are growing.
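The persistence rates quoted above are row shares of the transition table. As a quick check, the sketch below re-enters the printed counts by hand (rows are 2013 categories, columns are 2019 categories) and recomputes them with NumPy.

```python
import numpy as np

states_2013 = ["HH", "HL", "LL", "ns"]
states_2019 = ["HH", "HL", "LH", "LL", "ns"]

# Counts copied from the transition table above (rows: 2013, columns: 2019).
counts = np.array([
    [27, 0, 0,  0,  4],
    [ 0, 2, 0,  2,  1],
    [ 0, 2, 0, 18,  9],
    [ 3, 1, 1, 17, 66],
])

row_totals = counts.sum(axis=1)          # cluster sizes in 2013
col_totals = counts.sum(axis=0)          # cluster sizes in 2019
shares = counts / row_totals[:, None]    # row-wise transition shares

for i, state in enumerate(states_2013):
    j = states_2019.index(state)
    print(f"{state}: {counts[i, j]} of {row_totals[i]} stayed "
          f"({shares[i, j]:.0%})")
print("2019 totals:", dict(zip(states_2019, col_totals.tolist())))
```

The row and column totals reproduce the cluster counts reported in Sections 9.2 and 9.3, which is a useful consistency check on the table.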

10. Space-time dynamics

10.1 Directional Moran scatter plot

The LISA transition table tracks changes in statistical significance, but regions can also move within the Moran scatter plot even without crossing significance thresholds. A directional Moran scatter plot shows the movement vector for each region from its 2013 position to its 2019 position in the (standardized value, spatial lag) space. The arrows reveal the direction and magnitude of change in both a region’s own development and its neighbors' development.

To make the two periods comparable, we standardize both years using the pooled mean and standard deviation (across both periods combined), following the same logic as the Pooled PCA tutorial.
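Why pooled parameters matter can be seen with three hypothetical regions that all lose the same amount between the two years. The values below are invented for illustration.

```python
import numpy as np

# Hypothetical SHDI values: every region loses 0.05 between the two years.
y_2013 = np.array([0.60, 0.70, 0.80])
y_2019 = y_2013 - 0.05

# Year-by-year standardization: each year gets its own mean and std.
z13_own = (y_2013 - y_2013.mean()) / y_2013.std()
z19_own = (y_2019 - y_2019.mean()) / y_2019.std()

# Pooled standardization: one mean and std computed across both years.
pooled = np.concatenate([y_2013, y_2019])
z13_pool = (y_2013 - pooled.mean()) / pooled.std()
z19_pool = (y_2019 - pooled.mean()) / pooled.std()

print("own-year shift:", (z19_own - z13_own).round(3))
print("pooled shift  :", (z19_pool - z13_pool).round(3))
```

Standardizing each year separately erases the common decline, so no region would appear to move. The pooled version keeps the shift, so the arrows in the directional plot reflect changes in level and not only changes in relative position.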

from libpysal.weights import lag_spatial
# Standardize using pooled parameters
mean_all = np.mean(np.concatenate([gdf["shdi2013"].values, gdf["shdi2019"].values]))
std_all = np.std(np.concatenate([gdf["shdi2013"].values, gdf["shdi2019"].values]))
z_2013 = (gdf["shdi2013"].values - mean_all) / std_all
z_2019 = (gdf["shdi2019"].values - mean_all) / std_all
# Spatial lags
wz_2013 = lag_spatial(W, z_2013)
wz_2019 = lag_spatial(W, z_2019)
fig, ax = plt.subplots(figsize=(9, 8))
for i in range(len(gdf)):
    ax.annotate("", xy=(z_2019[i], wz_2019[i]),
                xytext=(z_2013[i], wz_2013[i]),
                arrowprops=dict(arrowstyle="->", color=STEEL_BLUE,
                                alpha=0.5, lw=0.8))
ax.scatter(z_2013, wz_2013, color=WARM_ORANGE, s=20, alpha=0.6,
label="2013", zorder=4)
ax.scatter(z_2019, wz_2019, color=TEAL, s=20, alpha=0.6,
label="2019", zorder=4)
ax.axhline(0, color=GRID_LINE, linewidth=1)
ax.axvline(0, color=GRID_LINE, linewidth=1)
ax.set_xlabel("SHDI (standardized)")
ax.set_ylabel("Spatial lag of SHDI")
ax.set_title("Directional Moran scatter plot: movements from 2013 to 2019")
ax.legend()
plt.savefig("esda2_directional_moran.png", dpi=300, bbox_inches="tight")
plt.show()

# Classify quadrant transitions
q_2013 = np.where((z_2013 >= 0) & (wz_2013 >= 0), "HH",
np.where((z_2013 < 0) & (wz_2013 >= 0), "LH",
np.where((z_2013 < 0) & (wz_2013 < 0), "LL", "HL")))
q_2019 = np.where((z_2019 >= 0) & (wz_2019 >= 0), "HH",
np.where((z_2019 < 0) & (wz_2019 >= 0), "LH",
np.where((z_2019 < 0) & (wz_2019 < 0), "LL", "HL")))
transition_moran = pd.crosstab(
pd.Series(q_2013, name="2013"),
pd.Series(q_2019, name="2019")
)
print(transition_moran.to_string())
stayed = (q_2013 == q_2019).sum()
moved = (q_2013 != q_2019).sum()
print(f"\nStayed in same quadrant: {stayed} ({stayed/len(gdf)*100:.1f}%)")
print(f"Moved to different quadrant: {moved} ({moved/len(gdf)*100:.1f}%)")

2019  HH  HL  LH  LL
2013
HH    41   1   2  10
HL     9   6   0   5
LH     0   0   2   3
LL     7  10  11  46

Stayed in same quadrant: 95 (62.1%)
Moved to different quadrant: 58 (37.9%)

The directional Moran scatter plot reveals the space-time dynamics of South American development. 95 regions (62.1%) remained in the same Moran scatter plot quadrant between 2013 and 2019, while 58 (37.9%) crossed quadrant boundaries. The most stable quadrants are HH (41 of 54 stayed, 76%) and LL (46 of 74 stayed, 62%), confirming that both prosperity and deprivation clusters are persistent. The largest off-diagonal flows are LL to LH (11 regions), HH to LL (10), LL to HL (10), and HL to HH (9). The LL-to-LH and HL-to-HH moves suggest some upward mobility at the boundary of the prosperity cluster. However, the 10 HH-to-LL transitions highlight that the Venezuelan crisis pulled previously well-performing regions into the low-development quadrant: a dramatic downward trajectory that affected both the regions themselves and their neighbors.

10.2 Country focus: Venezuela vs Bolivia

Venezuela and Bolivia offer a stark contrast in subnational development trajectories. In 2013, most of Venezuela’s regions sat in the right half of the Moran scatter plot: 13 of 24 regions were in the HH quadrant and 5 more in HL, reflecting relatively high development levels. Bolivia’s 9 regions, by contrast, were concentrated in the left half (8 in LL, 1 in LH). By 2019, these two countries had moved in opposite directions. We isolate them in the directional Moran scatter plot to compare their movement vectors.

# Filter Venezuela and Bolivia regions
ven_mask = gdf["country"] == "Venezuela"
bol_mask = gdf["country"] == "Bolivia"
# Shared axis limits (from the full dataset, for comparability)
all_z = np.concatenate([z_2013, z_2019])
all_wz = np.concatenate([wz_2013, wz_2019])
pad = 0.3
shared_xlim = (all_z.min() - pad, all_z.max() + pad)
shared_ylim = (all_wz.min() - pad, all_wz.max() + pad)
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(16, 7))
for ax, mask, title in [
    (axes[0], bol_mask, "(a) Bolivia"),
    (axes[1], ven_mask, "(b) Venezuela"),
]:
    # Background: all regions (grey, faded)
    for i in range(len(gdf)):
        ax.annotate("", xy=(z_2019[i], wz_2019[i]),
                    xytext=(z_2013[i], wz_2013[i]),
                    arrowprops=dict(arrowstyle="->", color=GRID_LINE,
                                    alpha=0.15, lw=0.5))
    ax.scatter(z_2013, wz_2013, color=GRID_LINE, s=10, alpha=0.15, zorder=2)
    ax.scatter(z_2019, wz_2019, color=GRID_LINE, s=10, alpha=0.15, zorder=2)
    # Highlighted country
    for i in gdf.index[mask]:
        ax.annotate("", xy=(z_2019[i], wz_2019[i]),
                    xytext=(z_2013[i], wz_2013[i]),
                    arrowprops=dict(arrowstyle="->", color=STEEL_BLUE,
                                    alpha=0.7, lw=1.0))
    ax.scatter(z_2013[mask], wz_2013[mask], color=WARM_ORANGE, s=30,
               alpha=0.8, edgecolors=GRID_LINE, linewidths=0.3,
               label="2013", zorder=5)
    ax.scatter(z_2019[mask], wz_2019[mask], color=TEAL, s=30,
               alpha=0.8, edgecolors=GRID_LINE, linewidths=0.3,
               label="2019", zorder=5)
    # Labels at 2019 positions
    texts = []
    for i in gdf.index[mask]:
        texts.append(ax.text(z_2019[i], wz_2019[i], gdf.loc[i, "region"],
                             fontsize=7, color=LIGHT_TEXT))
    adjust_text(texts, ax=ax, arrowprops=dict(arrowstyle="-", color=LIGHT_TEXT,
                                               alpha=0.5, lw=0.5))
    # Quadrant lines and labels
    ax.axhline(0, color=GRID_LINE, linewidth=1, zorder=1)
    ax.axvline(0, color=GRID_LINE, linewidth=1, zorder=1)
    ax.set_xlim(shared_xlim)
    ax.set_ylim(shared_ylim)
    ox = (shared_xlim[1] - shared_xlim[0]) * 0.05
    oy = (shared_ylim[1] - shared_ylim[0]) * 0.05
    for lbl, ha, va, x, y in [
        ("HH", "right", "top", shared_xlim[1] - ox, shared_ylim[1] - oy),
        ("LH", "left", "top", shared_xlim[0] + ox, shared_ylim[1] - oy),
        ("LL", "left", "bottom", shared_xlim[0] + ox, shared_ylim[0] + oy),
        ("HL", "right", "bottom", shared_xlim[1] - ox, shared_ylim[0] + oy),
    ]:
        ax.text(x, y, lbl, fontsize=14, ha=ha, va=va,
                color=LIGHT_TEXT, alpha=0.6)
    ax.set_xlabel("SHDI (standardized)")
    ax.set_ylabel("Spatial lag of SHDI")
    ax.set_title(title)
    ax.legend(fontsize=8)
plt.tight_layout()
plt.savefig("esda2_directional_ven_bol.png", dpi=300, bbox_inches="tight")
plt.show()

# Summary statistics for Venezuela and Bolivia
for country, mask in [("Venezuela", ven_mask), ("Bolivia", bol_mask)]:
    n = mask.sum()
    mean_change = gdf.loc[mask, "shdi_change"].mean()
    min_change = gdf.loc[mask, "shdi_change"].min()
    max_change = gdf.loc[mask, "shdi_change"].max()
    # Quadrant transitions
    q13 = q_2013[mask]
    q19 = q_2019[mask]
    stayed = (q13 == q19).sum()
    moved = (q13 != q19).sum()
    print(f"\n{country} ({n} regions):")
    print(f" Mean SHDI change: {mean_change:+.4f}")
    print(f" Range: [{min_change:+.4f}, {max_change:+.4f}]")
    print(f" Quadrant stability: {stayed} stayed, {moved} moved")
    print(f" 2013 quadrants: {', '.join(f'{q}={c}' for q, c in zip(*np.unique(q13, return_counts=True)))}")
    print(f" 2019 quadrants: {', '.join(f'{q}={c}' for q, c in zip(*np.unique(q19, return_counts=True)))}")

Venezuela (24 regions):
Mean SHDI change: -0.0653
Range: [-0.0670, -0.0640]
Quadrant stability: 3 stayed, 21 moved
2013 quadrants: HH=13, HL=5, LH=3, LL=3
2019 quadrants: HL=1, LH=2, LL=21
Bolivia (9 regions):
Mean SHDI change: +0.0333
Range: [+0.0300, +0.0350]
Quadrant stability: 7 stayed, 2 moved
2013 quadrants: LH=1, LL=8
2019 quadrants: HL=1, LH=2, LL=6

Panel (a) shows Bolivia’s modest but consistent rightward movement. All 9 regions started in the left half of the plot (8 in LL, 1 in LH), meaning every region had below-average SHDI in 2013, and all shifted rightward by 2019, reflecting genuine improvement in own-region development. The mean SHDI change was +0.033, with a remarkably tight range ([+0.030, +0.035]) indicating that the gains were broad-based across all Bolivian regions. Seven of 9 regions (78%) remained in the same quadrant, and 2 moved out of LL: one to LH and one to HL. The arrows are short and point consistently to the right, meaning Bolivia improved its own development levels without substantially changing the spatial lag (its neighbors' conditions remained similar). This pattern suggests steady, internally driven progress that has not yet been large enough to escape the low-development spatial cluster.

Panel (b) tells the opposite story. Venezuela’s 24 regions experienced the most dramatic downward shift in the entire dataset, with a mean SHDI change of -0.065. In 2013, most Venezuelan regions sat in the right half of the plot, with above-average SHDI: 13 in HH, 5 in HL, 3 in LH, and only 3 in LL. By 2019, the picture had completely inverted: 21 of 24 regions (88%) crossed quadrant boundaries, and 21 regions ended the period in the LL quadrant (the 3 exceptions are 1 in HL and 2 in LH). The arrows sweep uniformly downward and to the left, reflecting both the collapse of each region’s own development level and the negative spillover onto its neighbors' spatial lags. The narrow range of change ([-0.067, -0.064]) reveals that the crisis was not localized to a few regions. It was a near-uniform national collapse that dragged nearly every Venezuelan region, regardless of its 2013 starting point, into the low-development quadrant.

The juxtaposition is instructive. Bolivia’s arrows are short, rightward, and clustered — a country making incremental gains within a stable spatial structure. Venezuela’s arrows are long, southwest-pointing, and tightly bundled — a country experiencing systemic collapse that erased decades of development advantage in just six years. The contrast highlights how economic crises can propagate spatially: Venezuela’s decline did not just reduce its own regions' development, it also pulled down the spatial lags of neighboring Colombian and Brazilian border regions, contributing to the expansion of the LL cluster documented in Section 9.
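The quadrant labels used throughout this comparison follow a simple rule: the first letter records whether a region's own standardized value is above (H) or below (L) the mean, and the second letter records the same for its spatial lag. The sketch below illustrates that rule and the stayed/moved count on made-up numbers. The function name `moran_quadrants` and the toy values are assumptions for illustration, not part of the tutorial's script.

```python
import numpy as np

def moran_quadrants(z, wz):
    """Label each observation HH, LH, LL, or HL from its standardized
    value z (x-axis) and its spatial lag wz (y-axis)."""
    z, wz = np.asarray(z), np.asarray(wz)
    own = np.where(z >= 0, "H", "L")    # own value above/below the mean
    lag = np.where(wz >= 0, "H", "L")   # neighbours' average above/below the mean
    return np.char.add(own, lag)

# Hypothetical toy values for four regions in two periods (not the tutorial data)
z_a, wz_a = [0.8, 0.5, -0.3, -1.2], [0.6, -0.2, 0.4, -0.9]
z_b, wz_b = [-0.4, -0.1, -0.6, -1.0], [-0.5, -0.7, 0.2, -0.8]

q_a = moran_quadrants(z_a, wz_a)
q_b = moran_quadrants(z_b, wz_b)
stayed = int((q_a == q_b).sum())
moved = int((q_a != q_b).sum())
print(list(q_a), "->", list(q_b), f"| {stayed} stayed, {moved} moved")
```

In this toy case the first two regions fall from the right half of the plot into LL, which is the same kind of transition that dominates the Venezuelan panel.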

11. Discussion

Spatial autocorrelation in South American human development is strong and persistent. Global Moran’s I increased from 0.568 in 2013 to 0.632 in 2019 (both p = 0.001), indicating that the spatial clustering of development levels strengthened over the period. This means the development gap between prosperous and lagging regions is not only large but spatially structured — high-development regions form a contiguous band across the Southern Cone, while low-development regions form an equally contiguous band across the Amazon basin and northern South America.
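To make the statistic behind these numbers concrete, here is a minimal from-scratch sketch of global Moran's I, $I = \frac{n}{S_0} \frac{z^\top W z}{z^\top z}$, where $z$ holds deviations from the mean and $S_0$ is the sum of all weights. It uses a tiny hypothetical map of four regions along a line with binary weights, which is an assumption for illustration; the tutorial itself relies on `esda.Moran` with Queen weights and permutation inference.

```python
import numpy as np

def morans_i(y, W):
    """Global Moran's I for values y and a spatial weights matrix W."""
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    z = y - y.mean()                       # deviations from the mean
    return (len(y) / W.sum()) * (z @ W @ z) / (z @ z)

# Four hypothetical regions on a line (0-1-2-3); neighbours share a border
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

clustered = [1.0, 2.0, 3.0, 4.0]     # similar values sit next to each other
alternating = [1.0, 4.0, 1.0, 4.0]   # neighbours are always dissimilar

i_clustered = morans_i(clustered, W)
i_alternating = morans_i(alternating, W)
print(f"clustered: {i_clustered:.3f}, alternating: {i_alternating:.3f}")
```

Positive values indicate that similar values are neighbours, negative values indicate a checkerboard pattern, so a rise from 0.568 to 0.632 means neighbouring regions became more alike.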

The LISA analysis pinpoints these clusters with precision. In 2019, 30 regions form a significant HH cluster (high development surrounded by high-development neighbors) and 37 regions form a significant LL cluster (low development surrounded by low-development neighbors). The LL cluster expanded from 29 to 37 regions between 2013 and 2019, driven primarily by Venezuela’s economic crisis and its spillover effects on neighboring regions. The HH cluster remained stable (31 to 30), with 87% persistence — a sign that prosperity corridors in the Southern Cone are structurally entrenched.

The space-time analysis reveals that 62% of regions stayed in the same Moran scatter plot quadrant, but the 38% that moved tell an important story. The most concerning transitions are the 10 regions that moved from HH to LL and the 17 previously non-significant regions that joined the LL LISA cluster. These movements are concentrated in Venezuela and its neighbors, illustrating how economic shocks can propagate spatially.

The Venezuela–Bolivia comparison crystallizes the two forces shaping South America’s spatial development landscape. Venezuela’s 24 regions collapsed nearly uniformly (mean SHDI change of -0.065, with 88% crossing quadrant boundaries), transforming a country that was largely in the HH quadrant in 2013 into one almost entirely in the LL quadrant by 2019. Bolivia’s 9 regions, starting from a much lower base, improved steadily (+0.033) with 78% quadrant stability. These divergent trajectories illustrate that spatial clusters are not static: they can expand rapidly through crisis-driven contagion (Venezuela pulling its neighbors downward) or contract slowly through sustained internal improvement (Bolivia gradually lifting its regions rightward in the Moran scatter plot). Venezuela’s decline was spatially contagious, dragging down the spatial lags of neighboring Colombian and Brazilian border regions, while Bolivia’s improvement remained spatially contained. This contrast is consistent with an asymmetry in which negative shocks propagate faster and farther across borders than positive ones, although a two-country comparison cannot establish that on its own.

For policy, these findings suggest that spatially targeted interventions may be more effective than uniform national programs. The persistent LL clusters represent development traps where a region’s own conditions are reinforced by the equally poor conditions of its neighbors. Breaking these traps may require coordinated cross-regional or cross-border programs that address the spatial dimension of underdevelopment. Bolivia’s experience suggests that broad-based national improvement can lift all regions, but escaping the low-development spatial cluster may require the additional step of improving neighbors' conditions simultaneously — a challenge that calls for cross-border cooperation.

12. Summary and next steps

Key takeaways:

Method insight: ESDA reveals spatial patterns invisible in aspatial analysis. The same dataset that shows a modest aggregate improvement (+0.005 SHDI) conceals a deepening spatial divide — Moran’s I increased from 0.568 to 0.632, meaning spatial clustering strengthened between 2013 and 2019.
Data insight: 30 HH and 37 LL regions form statistically significant clusters at the 10% level. The LL cluster expanded by 8 regions (from 29 to 37), while the HH cluster remained stable. Cluster persistence is high: 87% for HH and 62% for LL, indicating entrenched spatial inequality.
Country insight: Venezuela and Bolivia illustrate contrasting development dynamics. Venezuela’s 24 regions collapsed nearly uniformly (mean -0.065), with 88% crossing quadrant boundaries from the upper to the lower portion of the Moran scatter plot. Bolivia’s 9 regions improved steadily (+0.033) with 78% quadrant stability, showing broad-based gains that have not yet been large enough to escape the LL spatial cluster.
Limitation: Queen contiguity assumes shared borders, which excludes island territories (San Andres, Nueva Esparta) and may not capture cross-water economic linkages. With only two time periods (2013 and 2019), we cannot distinguish permanent structural clusters from temporary effects of the Venezuelan crisis. The p = 0.10 significance threshold is relatively permissive.
Next step: Extend the analysis with spatial regression models (spatial lag and spatial error models) to test whether a region’s development is directly influenced by its neighbors' development, or whether the clustering is driven by shared underlying factors. Bivariate LISA could reveal whether income clusters coincide with education clusters. Adding more time periods (2000–2019) from the full Global Data Lab series would enable Spatial Markov chain analysis of cluster transition probabilities.

13. Exercises

Income clusters. Repeat the LISA analysis for the income index (incindex2019) instead of SHDI. Are income clusters in the same locations as HDI clusters? How many regions belong to both an income LL and an HDI LL cluster?
Alternative weights. Build k-nearest neighbors weights (KNN from libpysal.weights) with $k = 5$ and Rook contiguity (Rook from libpysal.weights) instead of Queen contiguity. How does Moran’s I change under each specification? Does the KNN approach resolve the island problem?
Bivariate Moran. Use Moran_BV from esda to compute the bivariate Moran’s I between education and income indices. Are regions with high education surrounded by regions with high income, or are the two dimensions spatially independent?
Spatial autocorrelation of change. Compute Moran’s I for shdi_change instead of the level variables. Is the change in SHDI between 2013 and 2019 itself spatially clustered? Compare the result with the change choropleth from Section 6.2. Hint: Moran(gdf["shdi_change"], W, permutations=999).
Component-level Moran’s I. Compute Moran’s I for the health, education, and income indices separately in both 2013 and 2019. Which component shows the strongest spatial autocorrelation? Does the income index — which declined in 46% of regions — show a different spatial pattern than health or education?
Multiple testing sensitivity. Re-run the 2019 LISA analysis at $p < 0.05$ instead of $p < 0.10$. How many HH and LL regions survive the stricter threshold? Research the Bonferroni correction ($0.05 / 153 \approx 0.0003$) and the False Discovery Rate (FDR) procedure — how would these affect the cluster counts?
Neighbor count distribution. Plot a histogram of the number of neighbors per region from the Queen weights matrix (use W.cardinalities). What is the shape of the distribution? Which regions have the most and fewest neighbors, and why?
Is the Moran’s I increase significant? Moran’s I rose from 0.568 to 0.632 between 2013 and 2019. But does this difference pass a significance test? Try a permutation approach: pool the 2013 and 2019 SHDI values, randomly reassign them to the two periods 999 times, and compute the difference in Moran’s I each time. Where does the observed difference (0.064) fall in the resulting permutation distribution?
Moran’s I excluding Venezuela. Recompute Moran’s I for 2013 and 2019 after dropping Venezuela’s 24 regions (rebuild the Queen weights on the subset GeoDataFrame). Does the increase in spatial autocorrelation survive? If not, the “deepening spatial divide” may be driven by a single country’s crisis rather than a continent-wide trend.
LISA significance map. Create a choropleth map coloring each region by its LISA p-value (localMoran_2019.p_sim) using a sequential colormap. How many regions have $p < 0.01$ vs $p < 0.05$ vs $p < 0.10$? Are the deeply significant regions ($p < 0.01$) concentrated in the same locations as the cluster map from Section 9.2?
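As a starting point for the multiple-testing exercise, the following sketch contrasts an unadjusted threshold with the Bonferroni correction and the Benjamini-Hochberg FDR procedure. The ten p-values are invented for illustration and the function names are hypothetical; in the exercise you would pass the LISA pseudo p-values instead.

```python
import numpy as np

def bonferroni(p, alpha=0.05):
    """Reject only tests whose p-value is below alpha divided by the number of tests."""
    p = np.asarray(p)
    return p < alpha / len(p)

def benjamini_hochberg(p, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    p = np.asarray(p)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m   # rank-specific cut-offs
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])          # largest rank meeting its cut-off
        reject[order[:k + 1]] = True               # reject that test and all smaller p-values
    return reject

# Invented p-values for ten hypothetical regions
p_vals = np.array([0.001, 0.004, 0.012, 0.03, 0.04, 0.20, 0.45, 0.60, 0.75, 0.90])

n_raw = int((p_vals < 0.05).sum())
n_bonf = int(bonferroni(p_vals).sum())
n_bh = int(benjamini_hochberg(p_vals).sum())
print(f"unadjusted: {n_raw}, Bonferroni: {n_bonf}, Benjamini-Hochberg: {n_bh}")
```

One practical caveat to keep in mind: a permutation pseudo p-value computed as (count + 1) / (permutations + 1) cannot fall below 0.001 with 999 permutations, so a Bonferroni cut-off near 0.0003 can only be met if the number of permutations is increased.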

14. References

Acknowledgements

AI tools (Claude Code, Gemini, NotebookLM) were used to make the contents of this post more accessible to students. Nevertheless, the content in this post may still have errors. Caution is needed when applying the contents of this post to true research projects.

Multiscale Geographically Weighted Regression: Spatially Varying Economic Convergence in Indonesia

Sun, 22 Mar 2026 00:00:00 +0000

1. Overview

When we ask “do poorer regions catch up to richer ones?”, the standard approach is to run a single regression across all regions and report one coefficient. But what if the answer depends on where you look? A negative coefficient in Sumatra does not mean the same process is at work in Papua. A global regression forces every district onto the same line — and in doing so, it may hide the most interesting part of the story.

Multiscale Geographically Weighted Regression (MGWR) addresses this by estimating a separate set of coefficients at every location, weighted by proximity. Its key innovation over standard GWR is that each variable is allowed to operate at its own spatial scale. The intercept (representing baseline growth conditions) might vary smoothly across large regions, while the convergence coefficient might shift sharply between neighboring districts. MGWR discovers these scales from the data rather than imposing a single bandwidth on all variables.

This tutorial applies MGWR to 514 Indonesian districts to answer: does economic catching-up happen at the same pace everywhere in Indonesia, or does geography shape how fast poorer districts close the gap? We progress from a global regression baseline through MGWR estimation and coefficient mapping, revealing that the global R² of 0.214 jumps to 0.762 once we allow the relationship to vary across space.

Learning objectives:

Understand why a single regression coefficient may hide important spatial variation
Estimate location-specific relationships with spatially varying coefficients
Apply MGWR to allow each variable to operate at its own spatial scale
Map and interpret spatially varying coefficients across Indonesia
Compare global OLS vs MGWR model fit and diagnostics

Key concepts at a glance

The post leans on a small vocabulary repeatedly, and the rest of the tutorial assumes you can move between these terms quickly. Each concept below has three parts: a definition, an example from this post, and an analogy. If a later section mentions “bandwidth” or “spatial heterogeneity” and the term feels slippery, this is the section to re-read.

1. Local regression: $\hat\beta(s)$ varies by location. One regression per location $s$, weighted by spatial proximity. Coefficients become functions of geographic position rather than fixed numbers.

Example

In this post the convergence coefficient $\hat\beta$ on ln_gdppc2010 varies across the 514 Indonesian districts — from -1.74 (strong catching-up) to +0.42 (divergence).

Analogy

Drawing a different best-fit line at each map dot, not one global line for the whole country.

2. Bandwidth (kernel): $h$. The number of nearest neighbours each local regression uses. Smaller $h$ = more localized, noisier estimates; larger $h$ = smoother but flatter.

Example

This post selects an optimal bandwidth of 44 districts (out of 514) for both regressors. Each local regression at a given district uses its 44 nearest neighbours.

Analogy

The radius of the circle of friends a local model listens to before deciding.

3. Spatial heterogeneity: $\beta_i \neq \beta_j$. Coefficients differ across space. The relationship between predictors and outcome is not constant geographically.

Example

In this post catching-up is strong in 149 of 514 districts (29% with significant negative β) but insignificant or positive in the other 365 districts. Convergence is not a single Indonesia-wide story.

Analogy

Different family recipes in different villages — not the same dish everywhere.

4. GWR vs MGWR: one $h$ vs one $h$ per regressor. GWR uses a single bandwidth for all coefficients. MGWR allows each coefficient to have its own bandwidth, capturing the fact that different processes operate at different spatial scales.

Example

In this post both ln_gdppc2010 and the intercept happen to share bandwidth = 44, but in general MGWR could have e.g. bandwidth 30 for one variable and 200 for another. The constraint relaxation is the methodological advance.

Analogy

One volume knob for everyone vs each instrument with its own knob.

5. Local R²: $R^2_i$. The R² of the local regression at district $i$. Maps to a colour scale to show where the model fits well and where it struggles.

Example

This post maps local R² across Indonesia. Fits are strong in dense Java districts and weaker in sparse, remote eastern islands where the 44 nearest neighbours span huge geographic distances.

Analogy

“How well-played is the song in this village”.

6. AICc model selection: lower AICc = better. The corrected Akaike Information Criterion penalizes model complexity. It is the standard yardstick for comparing MGWR with OLS.

Example

In this post global OLS has AICc = 1341.25 while MGWR has AICc = 838.41. The difference of more than 500 strongly favours the spatially varying model.

Analogy

The picky food critic comparing the two restaurants and giving a definitive verdict.

7. β-convergence: $g_i = \alpha + \beta \ln Y_{i,0} + \varepsilon_i$. The classic growth-economics test: poor regions catching up with rich ones leads to a negative β coefficient on initial income.

Example

This post’s global β = -0.1948 (mild catching-up overall). MGWR reveals β ranges from -1.74 (strong local convergence) to +0.42 (local divergence). The story is heterogeneous and the global average hides this.

Analogy

Poor districts catching up with rich ones. A negative slope means the gap shrinks; a positive slope means the gap widens.

8. Effective number of parameters: trace of the hat matrix. MGWR has more flexibility than OLS but less than fitting one regression per district. The “effective” parameter count quantifies this middle ground.

Example

This post’s MGWR uses 52.076 effective parameters — far more than OLS’s 2 but far less than 514×2 = 1,028 (one regression per district). MGWR finds the right level of model complexity automatically.

Analogy

A soft count of how many independent knobs the model really has.
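The "trace of the hat matrix" idea in concept 8 can be checked numerically. The sketch below is an illustration on synthetic data, not the mgwr implementation: it assumes hypothetical one-dimensional locations and a Gaussian kernel, and shows that the trace equals the parameter count for global OLS, grows when the fit becomes local, and falls back toward the OLS value when the bandwidth is very wide.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
s = np.linspace(0, 1, n)                                 # hypothetical 1-D locations
X = np.column_stack([np.ones(n), rng.normal(size=n)])    # intercept + one regressor

# Global OLS: hat matrix H = X (X'X)^-1 X', whose trace equals the parameter count
H = X @ np.linalg.inv(X.T @ X) @ X.T
enp_ols = np.trace(H)

def local_hat_matrix(X, s, bandwidth):
    """Row i of S maps y to the fitted value at location i from a weighted fit there."""
    S = np.empty((len(s), len(s)))
    for i in range(len(s)):
        w = np.exp(-0.5 * ((s - s[i]) / bandwidth) ** 2)  # Gaussian kernel (assumed)
        XtW = X.T * w
        S[i] = X[i] @ np.linalg.solve(XtW @ X, XtW)
    return S

enp_local = np.trace(local_hat_matrix(X, s, bandwidth=0.1))    # quite local
enp_wide = np.trace(local_hat_matrix(X, s, bandwidth=100.0))   # nearly global
print(round(enp_ols, 3), round(enp_local, 3), round(enp_wide, 3))
```

The same logic explains why 52.076 sits between 2 and 1,028 in this post: a bandwidth of 44 out of 514 districts is local enough to spend many effective parameters, but neighbouring fits still share most of their data.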

2. The modeling pipeline

The analysis follows a natural progression: start with a simple global model, visualize the spatial patterns it cannot capture, then let MGWR reveal the local structure.

graph LR
A["<b>Step 1</b><br/>Load &<br/>Explore"] --> B["<b>Step 2</b><br/>Map<br/>Variables"]
B --> C["<b>Step 3</b><br/>Global<br/>OLS"]
C --> D["<b>Step 4</b><br/>MGWR<br/>Estimation"]
D --> E["<b>Step 5</b><br/>Map<br/>Coefficients"]
E --> F["<b>Step 6</b><br/>Significance<br/>& Compare"]
style A fill:#141413,stroke:#6a9bcc,color:#fff
style B fill:#d97757,stroke:#141413,color:#fff
style C fill:#6a9bcc,stroke:#141413,color:#fff
style D fill:#00d4c8,stroke:#141413,color:#fff
style E fill:#00d4c8,stroke:#141413,color:#fff
style F fill:#1a3a8a,stroke:#141413,color:#fff

3. Setup and imports

The analysis uses mgwr for multiscale regression, GeoPandas for spatial data, and mapclassify for choropleth classification.

import numpy as np
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
import mapclassify
from scipy import stats
from mgwr.gwr import MGWR
from mgwr.sel_bw import Sel_BW
import warnings
warnings.filterwarnings("ignore")
# Site color palette
STEEL_BLUE = "#6a9bcc"
WARM_ORANGE = "#d97757"
NEAR_BLACK = "#141413"
TEAL = "#00d4c8"

Dark theme figure styling

DARK_NAVY = "#0f1729"
GRID_LINE = "#1f2b5e"
LIGHT_TEXT = "#c8d0e0"
WHITE_TEXT = "#e8ecf2"
plt.rcParams.update({
"figure.facecolor": DARK_NAVY,
"axes.facecolor": DARK_NAVY,
"axes.edgecolor": DARK_NAVY,
"axes.linewidth": 0,
"axes.labelcolor": LIGHT_TEXT,
"axes.titlecolor": WHITE_TEXT,
"axes.spines.top": False,
"axes.spines.right": False,
"axes.spines.left": False,
"axes.spines.bottom": False,
"axes.grid": True,
"grid.color": GRID_LINE,
"grid.linewidth": 0.6,
"grid.alpha": 0.8,
"xtick.color": LIGHT_TEXT,
"ytick.color": LIGHT_TEXT,
"xtick.major.size": 0,
"ytick.major.size": 0,
"text.color": WHITE_TEXT,
"font.size": 12,
"legend.frameon": False,
"legend.fontsize": 11,
"legend.labelcolor": LIGHT_TEXT,
"figure.edgecolor": DARK_NAVY,
"savefig.facecolor": DARK_NAVY,
"savefig.edgecolor": DARK_NAVY,
})

4. Data loading and exploration

The dataset covers 514 Indonesian districts with GDP per capita in 2010 and the subsequent growth rate through 2018. Indonesia is an ideal setting for studying spatial heterogeneity: it spans over 17,000 islands across 5,000 km of ocean, with enormous variation in economic structure, geography, and institutional capacity.

The core idea behind convergence is straightforward: if poorer districts tend to grow faster than richer ones, the income gap narrows over time. In a regression framework, this means we expect a negative relationship between initial income (log GDP per capita in 2010) and subsequent growth. The question is whether that negative relationship holds uniformly across the archipelago — or whether it is stronger in some places and weaker (or even reversed) in others.

CSV_URL = ("https://github.com/quarcs-lab/data-quarcs/raw/refs/heads/"
"master/indonesia514/dataBeta.csv")
GEO_URL = ("https://github.com/quarcs-lab/data-quarcs/raw/refs/heads/"
"master/indonesia514/mapIdonesia514-opt.geojson")
df = pd.read_csv(CSV_URL)
geo = gpd.read_file(GEO_URL)
gdf = geo.merge(df, on="districtID", how="left")
print(f"Loaded: {gdf.shape[0]} districts, {gdf.shape[1]} columns")
print(gdf[["ln_gdppc2010", "g"]].describe().round(4).to_string())

Loaded: 514 districts, 16 columns
       ln_gdppc2010         g
count      514.0000  514.0000
mean         9.8371    0.3860
std          0.7603    0.3205
min          7.1657   -2.0452
25%          9.3983    0.2583
50%          9.7626    0.3453
75%         10.1739    0.4158
max         13.4438    2.0563

The 514 districts span a wide range of initial income: log GDP per capita ranges from 7.17 (the poorest district, roughly \$1,300 per capita) to 13.44 (the richest, roughly \$690,000 — likely a resource-extraction enclave). Growth rates also vary enormously, from -2.05 (severe contraction) to +2.06 (rapid expansion), with a mean of 0.39. This high variance in both variables suggests that a single regression line will struggle to capture the full picture.
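The level figures quoted above come from exponentiating the logged values in the summary table. A short check, using only the minimum, median, and maximum printed above (the currency unit is whatever the source data uses, which this sketch does not assume):

```python
import math

# Logged GDP per capita values copied from the describe() output above
logged = {"min": 7.1657, "median": 9.7626, "max": 13.4438}

# exp() undoes the natural log and recovers the level
levels = {label: math.exp(value) for label, value in logged.items()}
for label, level in levels.items():
    print(f"{label}: ln = {logged[label]:.4f} -> level = {level:,.0f}")

# A difference in logs is the log of a ratio, so this is the richest/poorest ratio
ratio = math.exp(logged["max"] - logged["min"])
print(f"richest / poorest ratio: {ratio:,.0f}")
```

Because the x-axis of every later plot is in logs, equal horizontal distances correspond to equal proportional gaps in income rather than equal absolute gaps.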

5. Exploratory maps

Before fitting any model, we map the two key variables to see whether spatial patterns are visible to the naked eye. If initial income and growth are geographically clustered, that is already a hint that spatial models will outperform global ones.

fig, axes = plt.subplots(2, 1, figsize=(14, 14))
for ax, col, title in [
    (axes[0], "ln_gdppc2010", "(a) Log GDP per capita, 2010"),
    (axes[1], "g", "(b) GDP growth rate, 2010–2018"),
]:
    fj = mapclassify.FisherJenks(gdf[col].dropna().values, k=5)
    classified = mapclassify.UserDefined(gdf[col].values, bins=fj.bins.tolist())
    cmap = plt.cm.coolwarm
    norm = plt.Normalize(vmin=0, vmax=4)
    colors = [cmap(norm(c)) for c in classified.yb]
    gdf.plot(ax=ax, color=colors, edgecolor=GRID_LINE, linewidth=0.2)
    ax.set_title(title, fontsize=14, pad=10)
    ax.set_axis_off()
plt.tight_layout()
plt.savefig("mgwr_map_xy.png", dpi=300, bbox_inches="tight")
plt.show()

The maps reveal clear spatial structure. Initial income (panel a) is highest in Jakarta and resource-rich districts in Kalimantan and Papua (warm red), while the lowest-income districts cluster in eastern Nusa Tenggara and parts of Maluku (cool blue). Growth rates (panel b) show a different pattern: some of the poorest districts in Papua and Sulawesi experienced rapid growth (suggesting catching-up), while several high-income resource districts saw contraction. The fact that these patterns are geographically organized — not randomly scattered — motivates the use of spatially varying models.

6. Global regression baseline

The simplest test for economic convergence fits a single regression line through all 514 districts. If the slope is negative, poorer districts (low initial income) tend to grow faster than richer ones.

$$g_i = \alpha + \beta \cdot \ln(y_{i,2010}) + \varepsilon_i$$

where $g_i$ is the growth rate, $\ln(y_{i,2010})$ is log initial income, and $\beta < 0$ indicates convergence. In the code, $g_i$ corresponds to the column g and $\ln(y_{i,2010})$ to ln_gdppc2010.

slope, intercept, r_value, p_value, std_err = stats.linregress(
gdf["ln_gdppc2010"], gdf["g"]
)
print(f"Slope (convergence coefficient): {slope:.4f}")
print(f"R-squared: {r_value**2:.4f}")
print(f"p-value: {p_value:.6f}")

Slope (convergence coefficient): -0.1948
R-squared: 0.2135
p-value: 0.000000

fig, ax = plt.subplots(figsize=(10, 7))
ax.scatter(gdf["ln_gdppc2010"], gdf["g"],
color=STEEL_BLUE, edgecolors=GRID_LINE, s=35, alpha=0.6, zorder=3)
x_range = np.linspace(gdf["ln_gdppc2010"].min(), gdf["ln_gdppc2010"].max(), 100)
ax.plot(x_range, intercept + slope * x_range, color=WARM_ORANGE,
linewidth=2, zorder=2)
ax.set_xlabel("Log GDP per capita (2010)")
ax.set_ylabel("GDP growth rate (2010–2018)")
ax.set_title("Global convergence regression")
plt.savefig("mgwr_scatter_global.png", dpi=300, bbox_inches="tight")
plt.show()

The global regression confirms that convergence exists on average: the slope is $-0.195$ (p < 0.001), meaning a 1-unit increase in log initial income is associated with a growth rate that is 0.195 lower, measured in the units of g. However, the R² of only 0.214 means this single line explains just 21% of the variation in growth rates. The scatter plot shows enormous dispersion around the regression line: many districts with similar initial income experienced vastly different growth trajectories. This low explanatory power is the motivation for MGWR. Perhaps the relationship is not weak everywhere, but rather strong in some regions and absent in others, and a single coefficient is simply averaging over this heterogeneity.
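For readers who want to see where the slope and R² reported by `stats.linregress` come from, the sketch below reproduces both by hand on synthetic data. The data-generating numbers (a true slope of -0.195, noise of 0.28) are assumptions chosen to resemble the tutorial's setting; they are not the Indonesian data.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(9.8, 0.76, size=514)                     # synthetic stand-in for ln_gdppc2010
g = 2.3 - 0.195 * x + rng.normal(0, 0.28, size=514)     # synthetic growth with a known slope

# Simple regression: slope = cov(x, g) / var(x), intercept passes through the means
slope = np.cov(x, g, ddof=1)[0, 1] / np.var(x, ddof=1)
intercept = g.mean() - slope * x.mean()

# With one regressor, R-squared is the squared correlation
r2 = np.corrcoef(x, g)[0, 1] ** 2

# Cross-checks against a library fit and the residual-based definition of R-squared
slope_np, intercept_np = np.polyfit(x, g, 1)
resid = g - (intercept + slope * x)
r2_alt = 1 - resid.var() / g.var()

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}, R2 = {r2:.4f}")
```

The identity R² = r² is why a modest correlation between initial income and growth translates directly into the low explanatory power discussed above.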

7. From global to local: why MGWR?

7.1 The limitation of a single coefficient

The global regression tells us that $\beta = -0.195$ on average across Indonesia. But consider two districts with the same initial income — one in Java, where infrastructure and market access are strong, and one in Papua, where remoteness and institutional challenges dominate. There is no reason to expect the same convergence dynamic in both places. A single coefficient forces them onto the same line.

Geographically Weighted Regression (GWR) addresses this by estimating a separate regression at each location, using a kernel function — a distance-decay weighting scheme (typically Gaussian or bisquare) that gives more weight to nearby observations and less to distant ones. The result is a set of location-specific coefficients — each district gets its own slope and intercept:

$$g_i = \alpha(u_i, v_i) + \beta(u_i, v_i) \cdot \ln(y_{i,2010}) + \varepsilon_i$$

where $(u_i, v_i)$ are the geographic coordinates of district $i$, and both $\alpha$ and $\beta$ are now functions of location rather than fixed constants. In the code, $(u_i, v_i)$ correspond to COORD_X and COORD_Y. The bandwidth parameter $h$ controls how many neighbors contribute to each local regression — a small bandwidth means only very close districts matter (highly local), while a large bandwidth approaches the global model.
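The mechanics of a single local regression can be sketched in a few lines. The example below is a simplified illustration, not the mgwr code: it assumes an adaptive bisquare kernel whose bandwidth is the distance to the k-th nearest observation, planar coordinates, and synthetic data in which the true slope is -1 in the western half of a unit square and 0 in the eastern half. The helper names `adaptive_bisquare` and `local_fit` are hypothetical.

```python
import numpy as np

def adaptive_bisquare(dist, k):
    """Weights for one focal point: the bandwidth is the distance to the
    k-th nearest observation, and weights fall to zero at that distance."""
    h = np.sort(dist)[k - 1]
    u = dist / h
    return np.where(u < 1, (1 - u**2) ** 2, 0.0)

def local_fit(coords, x, y, i, k):
    """Weighted least squares centred on observation i: returns (intercept, slope)."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    w = adaptive_bisquare(d, k)
    X = np.column_stack([np.ones_like(x), x])
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)

# Synthetic example: slope is -1 in the west (u < 0.5) and 0 in the east
rng = np.random.default_rng(1)
n = 400
coords = rng.uniform(0, 1, size=(n, 2))
x = rng.normal(size=n)
true_slope = np.where(coords[:, 0] < 0.5, -1.0, 0.0)
y = true_slope * x + rng.normal(scale=0.1, size=n)

# Pick the observations closest to one western and one eastern location
west = int(np.argmin(np.linalg.norm(coords - [0.1, 0.5], axis=1)))
east = int(np.argmin(np.linalg.norm(coords - [0.9, 0.5], axis=1)))
b_west = local_fit(coords, x, y, west, k=40)
b_east = local_fit(coords, x, y, east, k=40)
pooled_slope = np.polyfit(x, y, 1)[0]

print(f"west slope: {b_west[1]:.2f}, east slope: {b_east[1]:.2f}, "
      f"pooled slope: {pooled_slope:.2f}")
```

The pooled slope lands between the two regimes and describes neither of them well, which is the situation the local model is designed to expose.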

However, standard GWR uses a single bandwidth for all variables, which means the intercept and the convergence coefficient are forced to vary at the same spatial scale.

MGWR removes this constraint. It allows each variable to find its own optimal bandwidth through an iterative back-fitting procedure — a process that cycles through each variable, optimizing its bandwidth while holding the others fixed, until all bandwidths converge. If baseline growth conditions vary smoothly across large regions (large bandwidth), while the convergence speed varies sharply between neighboring districts (small bandwidth), MGWR will discover this from the data. This makes MGWR a more flexible and realistic model for processes that operate at multiple spatial scales. The key assumption is that spatial relationships are locally stationary within each kernel window — the relationship between income and growth is approximately constant among the nearest $h$ districts, even if it differs across the full map.
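The back-fitting cycle described above can be sketched as follows. This is a deliberately simplified version under stated assumptions: the bandwidths are fixed in advance rather than re-optimized at every step as in the real procedure, distances are planar, the kernel is an adaptive bisquare, and the function names `kernel_matrix` and `backfit` are hypothetical. The test data are noiseless with spatially constant coefficients, so the loop should recover them; this is a sanity check, not a demonstration of spatial variation.

```python
import numpy as np

def kernel_matrix(coords, k):
    """Adaptive bisquare weights; row i holds the weights used at location i,
    with the bandwidth set to the distance of its k-th nearest observation."""
    D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    h = np.sort(D, axis=1)[:, [k - 1]]
    U = D / h
    return np.where(U < 1, (1 - U**2) ** 2, 0.0)

def backfit(coords, X, y, bandwidths, max_iter=100, tol=1e-8):
    """Simplified back-fitting with FIXED bandwidths (no bandwidth search)."""
    n, p = X.shape
    kernels = [kernel_matrix(coords, k) for k in bandwidths]
    betas = np.zeros((n, p))
    for _ in range(max_iter):
        previous = betas.copy()
        for j in range(p):
            # Partial residual: remove the current fit of every other variable
            fitted_others = (betas * X).sum(axis=1) - betas[:, j] * X[:, j]
            partial = y - fitted_others
            # Local regression of the partial residual on variable j alone
            xj, Wj = X[:, j], kernels[j]
            betas[:, j] = (Wj @ (xj * partial)) / (Wj @ (xj * xj))
        if np.max(np.abs(betas - previous)) < tol:
            break
    return betas

rng = np.random.default_rng(3)
n = 150
coords = rng.uniform(0, 1, size=(n, 2))
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])
y = 1.0 - 0.5 * x1            # noiseless, spatially constant coefficients

betas = backfit(coords, X, y, bandwidths=[60, 25])   # one bandwidth per column
print(betas.mean(axis=0))
```

Note that each column gets its own kernel matrix, which is the structural difference from single-bandwidth GWR; in the mgwr package the bandwidth search happens inside this loop.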

7.2 MGWR estimation

The mgwr package requires variables to be standardized (zero mean, unit variance) before multiscale bandwidth selection. This ensures that the bandwidths are comparable across variables measured in different units. The spherical=True flag tells the algorithm to compute great-circle distances rather than Euclidean distances, which is essential when working with geographic coordinates spanning a large area like Indonesia.

# Prepare variables
y = gdf["g"].values.reshape((-1, 1))
X = gdf[["ln_gdppc2010"]].values
coords = list(zip(gdf["COORD_X"], gdf["COORD_Y"]))
# Standardize (required for MGWR)
Zy = (y - y.mean(axis=0)) / y.std(axis=0)
ZX = (X - X.mean(axis=0)) / X.std(axis=0)
# Bandwidth selection and model fitting
mgwr_selector = Sel_BW(coords, Zy, ZX, multi=True, spherical=True)
mgwr_bw = mgwr_selector.search()
mgwr_results = MGWR(coords, Zy, ZX, mgwr_selector, spherical=True).fit()
mgwr_results.summary()

===========================================================================
Model type Gaussian
Number of observations: 514
Number of covariates: 2
Global Regression Results
---------------------------------------------------------------------------
R2: 0.214
Adj. R2: 0.212
Multi-Scale Geographically Weighted Regression (MGWR) Results
---------------------------------------------------------------------------
Spatial kernel: Adaptive bisquare
MGWR bandwidths
---------------------------------------------------------------------------
Variable Bandwidth ENP_j Adj t-val(95%) Adj alpha(95%)
X0 44.000 26.805 3.127 0.002
X1 44.000 25.271 3.109 0.002
Diagnostic information
---------------------------------------------------------------------------
Residual sum of squares: 122.081
Effective number of parameters (trace(S)): 52.076
Sigma estimate: 0.514
R2 0.762
Adjusted R2 0.736
AICc: 838.405
===========================================================================

The MGWR results are striking. R² jumps from 0.214 (global) to 0.762 (MGWR) — the spatially varying model explains more than three times as much variation as the global regression. Both the intercept and the convergence coefficient receive a bandwidth of 44, meaning each local regression draws on the 44 nearest districts. This is a relatively local scale (44 out of 514 districts, or about 8.6% of the sample), confirming that the convergence relationship varies substantially across the archipelago. The effective number of parameters is 52.1, reflecting the cost of estimating location-specific coefficients instead of two global ones.
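The great-circle distances requested with spherical=True matter at Indonesia's scale. As an illustration, the sketch below uses the haversine formula, one standard way to compute great-circle distance, on approximate coordinates for Banda Aceh and Jayapura. The coordinates and the mean Earth radius of 6,371 km are assumptions for illustration, and the function is not taken from the mgwr source.

```python
import math

def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance in kilometres between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# Approximate coordinates (assumed): Banda Aceh in the far west, Jayapura in the far east
west_to_east = haversine_km(95.32, 5.55, 140.70, -2.53)
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)   # one degree of longitude at the equator

# Treating degrees as planar units gives a number with no physical meaning
naive_degrees = math.hypot(140.70 - 95.32, -2.53 - 5.55)

print(f"great-circle: {west_to_east:,.0f} km, one degree at equator: {one_degree:.1f} km, "
      f"naive: {naive_degrees:.1f} 'degrees'")
```

With an adaptive bandwidth the kernel only needs the ranking of neighbours, but the weights within the window still depend on relative distances, so a consistent distance metric is worth having.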

7.3 Mapping MGWR coefficients

The power of MGWR lies in the coefficient maps. Instead of a single number for the whole country, we can now visualize how the convergence relationship changes from district to district. Because MGWR is estimated on standardized variables, the mapped coefficients are in standard-deviation units: a coefficient of $-1.0$ means that a one-standard-deviation increase in log initial income is associated with a one-standard-deviation decrease in growth at that location.

gdf["mgwr_intercept"] = mgwr_results.params[:, 0]
gdf["mgwr_slope"] = mgwr_results.params[:, 1]
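Because the mapped coefficients are in standard-deviation units, it can help to see how a standardized slope converts back to the original units: multiply by sd(y) / sd(x). The sketch below verifies the identity on synthetic data and then applies it to the global numbers reported earlier (R² = 0.2135, standard deviations 0.3205 and 0.7603). The synthetic data and variable names are assumptions for illustration.

```python
import math
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(9.8, 0.76, size=514)                    # synthetic log initial income
g = 2.3 - 0.195 * x + rng.normal(0, 0.28, size=514)    # synthetic growth

# Standardize both variables, as done before MGWR estimation
zx = (x - x.mean()) / x.std()
zg = (g - g.mean()) / g.std()

slope_std = np.polyfit(zx, zg, 1)[0]       # slope in standard-deviation units
slope_raw = np.polyfit(x, g, 1)[0]         # slope in original units
converted = slope_std * g.std() / x.std()  # rescale the standardized slope

# Same conversion with the tutorial's global figures: the standardized slope of a
# one-regressor model is the correlation, here -sqrt(R2) because the slope is negative
global_check = -math.sqrt(0.2135) * 0.3205 / 0.7603

print(f"raw: {slope_raw:.4f}, converted: {converted:.4f}, global check: {global_check:.4f}")
```

The same rescaling should apply to each local slope, since the standardization uses the global means and standard deviations; recovering local intercepts in original units additionally requires adding back mean(y) and subtracting the rescaled slope times mean(x).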

Intercept map — the intercept captures baseline growth conditions after accounting for initial income. Positive values indicate districts that grew faster than expected given their income level; negative values indicate underperformance.

fig, ax = plt.subplots(figsize=(14, 8))
# Fisher-Jenks classification with Patch legend (see script.py for details)
gdf.plot(ax=ax, column="mgwr_intercept", scheme="FisherJenks", k=5,
cmap="coolwarm", edgecolor=GRID_LINE, linewidth=0.2, legend=True)
ax.set_title(f"MGWR intercept (bandwidth = {int(mgwr_bw[0])})")
ax.set_axis_off()
plt.savefig("mgwr_mgwr_intercept.png", dpi=300, bbox_inches="tight")
plt.show()

The intercept map reveals a clear east–west gradient. Districts in western Indonesia (Sumatra and Java) tend to have negative intercepts — they grew less than the convergence model would predict based on their initial income alone. Districts in eastern Indonesia (Papua, Maluku, Nusa Tenggara) show positive intercepts, indicating growth that exceeded what initial income would predict. This pattern may reflect the role of resource extraction, infrastructure investment, and fiscal transfers that disproportionately boosted growth in less-developed eastern regions during the 2010–2018 period.

Convergence coefficient map — the slope captures how strongly initial income predicts subsequent growth at each location. Large negative values indicate rapid catching-up; values near zero or positive indicate no convergence or divergence.

fig, ax = plt.subplots(figsize=(14, 8))
gdf.plot(ax=ax, column="mgwr_slope", scheme="FisherJenks", k=5,
         cmap="coolwarm", edgecolor=GRID_LINE, linewidth=0.2, legend=True)
ax.set_title(f"MGWR convergence coefficient (bandwidth = {int(mgwr_bw[1])})")
ax.set_axis_off()
plt.savefig("mgwr_mgwr_slope.png", dpi=300, bbox_inches="tight")
plt.show()

The convergence coefficient map is the central finding of this analysis. The global regression reported a single $\beta = -0.195$, but MGWR reveals that this average hides enormous spatial variation. The strongest catching-up (deepest blue, coefficients as negative as $-1.74$) concentrates in western Sumatra and parts of Kalimantan — districts where poorer areas grew much faster than richer neighbors. In contrast, most of Java, eastern Indonesia, and the Maluku islands show coefficients near zero (light pink), indicating that the convergence relationship is essentially absent in these areas. A handful of districts show weakly positive coefficients (up to 0.42), suggesting localized divergence where richer districts pulled further ahead. The coefficient ranges from $-1.74$ to $+0.42$, with a median of $-0.085$ and a standard deviation of 0.553 — far from the single value of $-0.195$ reported by the global model.

7.4 Statistical significance

Not all local coefficients are statistically distinguishable from zero. MGWR provides t-values corrected for multiple testing, which we use to classify each district’s convergence coefficient as significantly negative (catching-up), not significant, or significantly positive (diverging).

mgwr_filtered_t = mgwr_results.filter_tvals()
t_sig = mgwr_filtered_t[:, 1] # Slope t-values
sig_cats = np.where(t_sig < 0, "Negative (catching-up)",
                    np.where(t_sig > 0, "Positive (diverging)", "Not significant"))
print(f"Negative (catching-up): {(sig_cats == 'Negative (catching-up)').sum()}")
print(f"Not significant: {(sig_cats == 'Not significant').sum()}")
print(f"Positive (diverging): {(sig_cats == 'Positive (diverging)').sum()}")

Negative (catching-up): 149
Not significant: 365
Positive (diverging): 0

fig, ax = plt.subplots(figsize=(14, 8))
cat_colors = {
    "Negative (catching-up)": "#2c7bb6",
    "Not significant": GRID_LINE,
    "Positive (diverging)": "#d7191c",
}
colors_sig = [cat_colors[c] for c in sig_cats]
gdf.plot(ax=ax, color=colors_sig, edgecolor=GRID_LINE, linewidth=0.2)
ax.set_title("MGWR convergence coefficient: statistical significance")
ax.set_axis_off()
plt.savefig("mgwr_mgwr_significance.png", dpi=300, bbox_inches="tight")
plt.show()

Of 514 districts, 149 (29%) show statistically significant convergence at the corrected 5% level — concentrated in Sumatra, western Kalimantan, and Sulawesi. The remaining 365 districts (71%) have convergence coefficients that are not distinguishable from zero after correcting for multiple comparisons. No district shows significant divergence. This means that while the global regression detects convergence on average, it is actually driven by a minority of districts — primarily in western Indonesia — while the majority of the archipelago shows no significant relationship between initial income and growth.
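The classification in this section works because the filtered t-values are zero wherever the corrected test is not passed, so the sign alone identifies the category. The sketch below mimics that logic in plain Python. The function names and the critical value `t_crit = 3.0` are assumptions for illustration; the library derives its own corrected threshold.

```python
def zero_out_insignificant(tvals, t_crit):
    """Set t-values whose magnitude is below the critical value to zero."""
    return [t if abs(t) > t_crit else 0.0 for t in tvals]

def classify(filtered):
    """Label each filtered t-value by its sign."""
    labels = []
    for t in filtered:
        if t < 0:
            labels.append("Negative (catching-up)")
        elif t > 0:
            labels.append("Positive (diverging)")
        else:
            labels.append("Not significant")
    return labels

# Hypothetical slope t-values for five districts
raw_t = [-4.2, -1.1, 0.4, 2.5, -3.3]
t_crit = 3.0  # assumed corrected critical value, for illustration only

labels = classify(zero_out_insignificant(raw_t, t_crit))
share = labels.count("Negative (catching-up)") / len(labels)
print(labels)
print(f"Share catching-up: {share:.0%}")
```

With the counts reported above, the same share calculation gives 149 / 514, or about 29%.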

8. Model comparison

The table below summarizes how much explanatory power the spatially varying model adds over the global baseline.

print(f"{'Metric':<25} {'Global OLS':>12} {'MGWR':>12}")
print(f"{'R²':<25} {0.2135:>12.4f} {0.7625:>12.4f}")
print(f"{'Adj. R²':<25} {0.2120:>12.4f} {0.7357:>12.4f}")
print(f"{'AICc':<25} {1341.25:>12.2f} {838.41:>12.2f}")
print(f"{'Bandwidth (intercept)':<25} {'all (514)':>12} {'44':>12}")
print(f"{'Bandwidth (slope)':<25} {'all (514)':>12} {'44':>12}")

Metric Global OLS MGWR
R² 0.2135 0.7625
Adj. R² 0.2120 0.7357
AICc 1341.25 838.41
Bandwidth (intercept) all (514) 44
Bandwidth (slope) all (514) 44

MGWR more than triples the explained variance ($R^2$ rises from 0.214 to 0.762) and reduces the AICc from 1341 to 838. Because AICc penalizes model complexity, a drop of this size indicates that the improvement in fit is not merely due to additional flexibility. The bandwidth of 44 for both variables means each local regression uses the nearest 44 districts (about 8.6% of the sample), so the convergence process is highly localized. The adjusted $R^2$ of 0.736 accounts for the additional complexity (52 effective parameters vs 2 in OLS) and still shows a large improvement, which suggests that the spatial variation in coefficients is genuine rather than an artifact of overfitting.
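The arithmetic behind these comparisons can be checked with a few lines of Python. This is a sketch under two assumptions: the conventional adjusted $R^2$ formula applies, and the effective number of parameters for MGWR is exactly 52 and plays the role of the regressor count. Neither is confirmed by the tutorial's script.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared with p regressors (excluding the intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

n = 514  # districts

# Global OLS: one regressor (log initial income)
adj_ols = adjusted_r2(0.2135, n, 1)

# MGWR: assumption -- treat the 52 effective parameters as p
adj_mgwr = adjusted_r2(0.7625, n, 52)

aicc_drop = 1341.25 - 838.41
bandwidth_share = 44 / n

print(f"Adj. R2 (OLS):  {adj_ols:.4f}")
print(f"Adj. R2 (MGWR): {adj_mgwr:.4f}")
print(f"AICc reduction: {aicc_drop:.2f}")
print(f"Bandwidth share: {bandwidth_share:.1%}")
```

Under these assumptions the computed values agree with the table to four decimals, which is a useful sanity check when reporting results from a fitted model.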

9. Discussion

Economic catching-up in Indonesia is not uniform — it is concentrated in western Sumatra and parts of Kalimantan, while most of the archipelago shows no significant convergence. The global regression’s $\beta = -0.195$ suggests a moderate convergence tendency, but MGWR reveals that this average is driven by a subset of 149 districts (29%) with strong catching-up dynamics. The remaining 365 districts have convergence coefficients indistinguishable from zero.

The intercept map adds another dimension: eastern Indonesian districts tend to have positive intercepts (above-expected growth), while western districts have negative intercepts (below-expected growth). This east–west gradient likely reflects the impact of fiscal transfers, resource booms, and infrastructure programs that targeted less-developed regions during the 2010–2018 period. Combined with the convergence coefficient map, the picture is nuanced: eastern Indonesia grew faster than expected (high intercept), but not because of convergence dynamics (near-zero slope) — rather, because of other factors captured by the intercept.

For policy, these findings challenge the assumption that national-level convergence statistics reflect what is happening locally. A policymaker looking at $\beta = -0.195$ might conclude that Indonesia’s development strategy is successfully closing regional gaps. MGWR reveals that catching-up is geographically selective, and the majority of districts are not on a convergence path at all. Spatially targeted interventions — rather than uniform national programs — may be needed to address this uneven landscape.

10. Summary and next steps

Key takeaways:

Method insight: MGWR reveals spatial heterogeneity invisible to global regression. R² improves from 0.214 to 0.762 by allowing location-specific coefficients. Both variables operate at a bandwidth of 44 districts (~8.6% of the sample), indicating highly localized economic dynamics. Variable standardization is essential before MGWR estimation.
Data insight: Only 149 of 514 Indonesian districts (29%) show statistically significant convergence, concentrated in Sumatra and Kalimantan. The convergence coefficient ranges from $-1.74$ to $+0.42$, far from the global average of $-0.195$. Eastern Indonesia grows faster than expected (positive intercepts) but not through convergence — the catching-up mechanism is absent there.
Limitation: The bivariate model (one independent variable) is intentionally simple for pedagogical purposes. Real convergence analysis would include controls for human capital, infrastructure, institutional quality, and sectoral composition. The bandwidth of 44 applies to both variables in this case, but with additional covariates, MGWR’s ability to assign different bandwidths per variable would be more visible.
Next step: Extend the model with additional covariates (education, investment, fiscal transfers) to disentangle the sources of spatial heterogeneity. Apply MGWR to panel data with multiple time periods. Compare MGWR results with the spatial clusters identified in the ESDA tutorial to see whether convergence hotspots align with LISA clusters.

11. Exercises

Add a second variable. Include an education indicator (e.g., years of schooling) as a second independent variable and re-run MGWR. Do the two covariates receive different bandwidths? What does that tell you about the spatial scale at which education affects growth?
Map the t-values. Instead of mapping the raw coefficients, map the local t-statistics from mgwr_results.tvalues[:, 1]. How does this map compare to the significance map based on corrected t-values?
Compare with ESDA. Run a Moran’s I test on the MGWR residuals. Is there remaining spatial autocorrelation? If not, MGWR has successfully captured the spatial structure. If yes, what might be missing?
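As a starting point for the third exercise, the statistic itself can be sketched in plain Python. The residuals and the contiguity matrix below are hypothetical. In practice a library such as PySAL's esda would typically be used, together with a permutation-based significance test that this sketch omits.

```python
def morans_i(values, weights):
    """Global Moran's I for values and a square spatial weights matrix."""
    n = len(values)
    mean_v = sum(values) / n
    z = [v - mean_v for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in z)

# Hypothetical residuals for four districts arranged along a line
residuals = [1.0, 2.0, 3.0, 4.0]

# Binary contiguity: each district neighbors the adjacent ones
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(f"Moran's I: {morans_i(residuals, w):.3f}")
```

Positive values indicate that similar residuals cluster together, while values near zero are consistent with no remaining spatial structure.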

12. References

Acknowledgements

Exploratory Spatial Data Analysis (ESDA)

Fri, 01 Mar 2024 00:00:00 +0000

Exploratory Spatial Data Analysis (ESDA) of Regional Development

This interactive application enables users to explore municipal development indicators across Bolivia. In particular, it offers:

🗺️ Geographical data visualizations
📈 Distribution and comparative analysis tools
💾 Downloadable datasets
🧮 Access to a cloud-based computational notebook on Google Colab

⚠️ This application is open source and still a work in progress. Source code is available at: github.com/cmg777/streamlit_esda101

📚 Data Sources and Credits

Primary data source: Municipal Atlas of the SDGs in Bolivia 2020.
Additional indicators for multiple years were sourced from the GeoQuery project.
Administrative boundaries from the GeoBoundaries database.
Streamlit web app and computational notebook by Carlos Mendez.
Erick Gonzales and Pedro Leoni also collaborated in the organization of the data and the creation of the initial geospatial database.

Citation:
Mendez, C. (2025, March 24). Regional Development Indicators of Bolivia: A Dashboard for Exploratory Analysis (Version 0.0.2) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.15074864

🌐 Context and Motivation

Adopted in 2015, the 2030 Agenda for Sustainable Development established 17 Sustainable Development Goals. While global metrics offer useful benchmarks, they often overlook subnational disparities—particularly in heterogeneous countries such as Bolivia.

🇧🇴 Bolivia ranks 79/166 on the 2020 SDG Index (score: 69.3)
🏘️ The Municipal Atlas of the SDGs in Bolivia 2020 reveals intra-national disparities comparable to global inter-country variation

📊 Development Index: Índice Municipal de Desarrollo Sostenible (IMDS)

The Municipal Sustainable Development Index (IMDS) summarizes municipal performance using 62 indicators across 15 Sustainable Development Goals. Goals 12 and 14 are not covered because systematic and reliable information on them was not available at the municipal level.

🎯 Methodological Criteria

✅ Relevance to local Sustainable Development Goal targets
📥 Data availability from official or trusted sources
🌐 Full municipal coverage (339 municipalities)
🕒 Data mostly from 2012–2019
🧮 Low redundancy between indicators

🗃️ Indicators by Sustainable Development Goal

🧱 Goal 1: No Poverty

Energy poverty rate (2012, INE)
Multidimensional Poverty Index (2013, UDAPE)
Unmet Basic Needs (2012, INE)
Access to basic services: water, sanitation, electricity (2012, INE)

🌾 Goal 2: Zero Hunger

Chronic malnutrition in children under five (2016, Ministry of Health)
Obesity prevalence in women (2016, Ministry of Health)
Average agricultural unit size (2013, Agricultural Census)
Tractor density per 1,000 farms (2013, Agricultural Census)

🏥 Goal 3: Good Health and Well-being

Infant and under-five mortality rates (2016, Ministry of Health)
Institutional birth coverage (2016, Ministry of Health)
Incidence of Chagas, HIV, malaria, tuberculosis, dengue (2016, Ministry of Health)
Adolescent fertility rate (2016, Ministry of Health)

📚 Goal 4: Quality Education

Secondary school dropout rates, by gender (2016, Ministry of Education)
Adult literacy rate (2012, INE)
Share of population with higher education (2012, INE)
Share of qualified teachers, initial and secondary levels (2016, Ministry of Education)

⚖️ Goal 5: Gender Equality

Gender parity in education, labor participation, and poverty (2012–2016, INE and UDAPE)
Note: Data on gender-based violence not available at municipal level

💧 Goal 6: Clean Water and Sanitation

Access to potable water (2012, INE)
Access to sanitation services (2012, INE)
Proportion of treated wastewater (2015, Ministry of Environment)

⚡ Goal 7: Affordable and Clean Energy

Electricity coverage (2012, INE)
Per capita electricity consumption (2015, Ministry of Energy)
Use of clean cooking energy (2015, Ministry of Hydrocarbons)
CO₂ emissions per capita, energy-related (2015, international satellite data)

💼 Goal 8: Decent Work and Economic Growth

Share of non-functioning electricity meters (proxy for informality/unemployment) (2015, Ministry of Energy)
Labor force participation rate (2012, INE)
Youth not in education, employment, or training (NEET rate) (2015, Ministry of Labor)

🏗️ Goal 9: Industry, Innovation, and Infrastructure

Internet access in households (2012, INE)
Mobile signal coverage (2015, telecommunications data)
Availability of urban infrastructure (2015, Ministry of Public Works)

⚖️ Goal 10: Reduced Inequality

Proxy measures: municipal differences in poverty and participation rates (2012–2016, INE and UDAPE)

🏘️ Goal 11: Sustainable Cities and Communities

Urban housing adequacy (2012, INE)
Access to collective transportation (2015, Ministry of Transport)

🌍 Goal 13: Climate Action

Natural disaster resilience index (2015, Ministry of Environment)
CO₂ emissions and forest degradation (2015, satellite data)

🌳 Goal 15: Life on Land

Deforestation rates (2015, satellite data)
Biodiversity loss indicators (2015, Ministry of Environment)

🕊️ Goal 16: Peace, Justice, and Strong Institutions

Birth registration coverage (2012, INE)
Crime and homicide rates (2015, Ministry of Government)
Corruption perceptions (2015, civil society organizations)

🤝 Goal 17: Partnerships for the Goals

Municipal fiscal capacity (2015, Ministry of Economy)
Public investment per capita (2015, Ministry of Economy)

⚠️ Limitations and Future Work

No disaggregated data for Indigenous Territories (TIOC)
Many indicators based on 2012 Census; updates pending
Limited information for Goals 12 and 14 at municipal level
No indicators for educational quality (due to lack of standardized testing)
Gender violence data unavailable at municipal scale

🔗 Access

Original website: atlas.sdsnbolivia.org
Original Publication: sdsnbolivia.org/Atlas
Source Code of the Web App: github.com/cmg777/streamlit_esda101
Computational Notebook: Google Colab

Monitoring subnational human development

Sun, 24 Sep 2023 00:00:00 +0000

A geocomputational notebook to monitor subnational human development

Exploratory data analysis
Exploratory spatial data analysis
- Spatial mapping
- Spatial dependence
- Spatial inequality

Convergence clubs

Sun, 03 Sep 2023 00:00:00 +0000

About the book

Testing for economic convergence across countries has been a central issue in the literature of economic growth and development. This book introduces a modern framework to study the cross-country convergence dynamics of labor productivity and its proximate sources: capital accumulation and aggregate efficiency. In particular, recent convergence dynamics of developed as well as developing countries are evaluated through the lens of a non-linear dynamic factor model and a clustering algorithm for panel data. This framework allows us to examine key economic phenomena such as technological heterogeneity and multiple equilibria. Overall, the book provides a succinct review of the recent club convergence literature, a comparative view of developed and developing countries, and a tutorial on how to implement the club convergence framework in the statistical software Stata. These three features will help graduate students and researchers catch up with the latest developments and methodological implementations of the club convergence literature.

About the author: https://carlos-mendez.org
Read the book online: Only for Nagoya University students
Buy the ebook
Buy the book

Introduction and overview
Measuring labor productivity and its proximate sources
A modern framework to study convergence
Convergence clubs in labor productivity
Convergence clubs in capital accumulation
Convergence clubs in aggregate efficiency
Concluding remarks and new research directions

Tutorials	Download datasets
Video Tutorial	Download full dataset
Convergence clubs analysis using Stata	Download dataset definitions; See dataset definitions
Convergence clubs analysis using R	Download R dataset of developed countries
Explore the data using Python in Deepnote	Download R dataset of developing countries
Explore the data using Python in Google Colab
Explore the data using R in R Studio Cloud

Tutorial: Convergence test and identification of clubs using Stata

Du (2017) introduced a Stata package to perform the econometric convergence analysis and club clustering algorithm of Phillips and Sul (2007). Although the package is well documented and easy to use, it does not include commands to create figures or export tables of results. In what follows, the basic use of the package is described with some additional pieces of code to automate the creation of figures and export of results.

The code below installs the convergence clubs package and its dependencies. Note that Stata 12.1 or higher is needed to run the convergence clubs package. In addition, exporting the results to Excel relies on the putexcel command, which needs Stata 14.2 or higher. Finally, this installation only needs to be done once.

*-------------------------------------------------------
***************** Install packages*********************
*-------------------------------------------------------
* Install the convergence clubs package
findit st0503_1
net install st0503_1, from(http://www.stata-journal.com/software/sj19-1)
* Install package dependencies
ssc install moremata
*-------------------------------------------------------

After installing the package, we need to define some global (macro) parameters such as the name of the dataset (for example, hiYes_log_lp), the main variable to be studied (for example, log_lp), the label of that variable (for example, Labor Productivity), the type of cross-sectional unit (for example, country), and the type of temporal unit (for example, year). Users of this code should carefully check these five parameters, as the next steps depend on them to work correctly.

*-------------------------------------------------------
clear all
macro drop _all
set more off
*-------------------------------------------------------
***************** Define five global parameters*********
*-------------------------------------------------------
* (1) Indicate name of the dataset (Example: hiYes_log_lp.dta)
global dataSet hiYes_log_lp
* (2) Indicate name of the variable to be studied (Example: log_lp)
global xVar log_lp
* (3) Write label of the variable (Example: Labor Productivity)
global xVarLabel Labor Productivity
* (4) Indicate cross-sectional unit ID (Example: country)
global csUnitName country
* (5) Indicate temporal unit ID (Example: year)
global timeUnit year
*-------------------------------------------------------

To have a record of the written commands and results (excluding the display of figures), let us start a log file. The name of this file is automatically captured from the previously defined parameters.

*-------------------------------------------------------
***************** Start log file************************
*-------------------------------------------------------
log using "${dataSet}_clubs.txt", text replace
*-------------------------------------------------------

Next, from the current working directory, we load the dataset, which is in .dta format, and set the structure of the data. Again, nothing in this code needs to be modified as long as the global parameters are correctly defined.

*-------------------------------------------------------
***************** Load and set panel data ***********
*-------------------------------------------------------
** Load data
use "${dataSet}.dta"
* Keep necessary variables
keep id ${csUnitName} ${timeUnit} ${xVar}
* Set panel data
xtset id ${timeUnit}
*-------------------------------------------------------

The next piece of code is the most important one of the entire package. It runs the log-t convergence test, the clustering and merge algorithms, and lists the final results in a table. If we are using a log file, all code and results are recorded in the ${dataSet}_clubs.txt file. In addition, by using the putexcel command, we can export the results in table form to Excel.

*-------------------------------------------------------
***************** Apply PS convergence test ***********
*-------------------------------------------------------
* (1) Run log-t regression
putexcel set "${dataSet}_test.xlsx", sheet(logtTest) replace
logtreg ${xVar}, kq(0.333)
ereturn list
matrix result0 = e(res)
putexcel A1 = matrix(result0), names nformat("#.##") overwritefmt
* (2) Run clustering algorithm
putexcel set "${dataSet}_test.xlsx", sheet(initialClusters) modify
psecta ${xVar}, name(${csUnitName}) kq(0.333) gen(club_${xVar})
matrix b=e(bm)
matrix t=e(tm)
matrix result1=(b \ t)
matlist result1, border(rows) rowtitle("log(t)") format(%9.3f) left(4)
putexcel A1 = matrix(result1), names nformat("#.##") overwritefmt
* (3) Run merge algorithm
putexcel set "${dataSet}_test.xlsx", sheet(mergingClusters) modify
scheckmerge ${xVar}, kq(0.333) club(club_${xVar})
matrix b=e(bm)
matrix t=e(tm)
matrix result2=(b \ t)
matlist result2, border(rows) rowtitle("log(t)") format(%9.3f) left(4)
putexcel A1 = matrix(result2), names nformat("#.##") overwritefmt
* (4) List final clusters
putexcel set "${dataSet}_test.xlsx", sheet(finalClusters) modify
imergeclub ${xVar}, name(${csUnitName}) kq(0.333) club(club_${xVar}) gen(finalclub_${xVar})
matrix b=e(bm)
matrix t=e(tm)
matrix result3=(b \ t)
matlist result3, border(rows) rowtitle("log(t)") format(%9.3f) left(4)
putexcel A1 = matrix(result3), names nformat("#.##") overwritefmt
*-------------------------------------------------------
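For intuition about what the log-t regression estimates, the Python sketch below computes the slope of the Phillips and Sul (2007) regression on a toy panel. It is an illustration and not the Stata package's implementation: it assumes the slowly varying function is log(t), uses plain OLS, and omits the HAC standard errors on which the actual one-sided test (rejecting convergence when the t-statistic is below -1.65) is based. The panel and all function names are hypothetical.

```python
import math

def relative_transition(panel):
    """h_it = X_it divided by the cross-sectional mean of X at time t."""
    n_units, n_periods = len(panel), len(panel[0])
    means = [sum(unit[t] for unit in panel) / n_units for t in range(n_periods)]
    return [[unit[t] / means[t] for t in range(n_periods)] for unit in panel]

def cross_sectional_variance(h):
    """H_t = (1/N) * sum_i (h_it - 1)^2 for every period t."""
    n_units, n_periods = len(h), len(h[0])
    return [sum((unit[t] - 1) ** 2 for unit in h) / n_units for t in range(n_periods)]

def ols_slope(x, y):
    """Slope of a bivariate OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def logt_slope(panel, kq=0.333):
    """Slope b in log(H_1/H_t) - 2*log(log(t)) = a + b*log(t) + u_t."""
    H = cross_sectional_variance(relative_transition(panel))
    T = len(H)
    start = max(int(kq * T), 2)  # drop the first kq share; log(log(t)) needs t >= 2
    ts = range(start, T + 1)     # periods are indexed 1..T
    x = [math.log(t) for t in ts]
    y = [math.log(H[0] / H[t - 1]) - 2 * math.log(math.log(t)) for t in ts]
    return ols_slope(x, y)

# Hypothetical panel: three units whose gaps around a common level shrink over time
panel = [[10 + c / t for t in range(1, 13)] for c in (-3.0, 0.0, 3.0)]
print(f"log-t slope: {logt_slope(panel):.3f}")
```

The `kq` argument plays the same role as the `kq(0.333)` option in the Stata commands above: it sets the share of initial periods discarded before the regression is run.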

To plot the dynamics of the cross-sectional units and their respective convergence clubs, we first need to re-scale the data based on the cross-sectional average of each year. The code below performs that task. The result of this code is an extended panel dataset (in both .dta and .csv formats) that includes the list of countries, club membership, and the absolute and relative values of the variable under study.

*-------------------------------------------------------
***************** Generate relative variables**********
*-------------------------------------------------------
** Generate relative variable (useful for plotting)
save "temporary1.dta",replace
use "temporary1.dta"
collapse ${xVar}, by(${timeUnit})
gen id=999999
append using "temporary1.dta"
sort id ${timeUnit}
gen ${xVar}_av = ${xVar} if id==999999
bysort ${timeUnit} (${xVar}_av): replace ${xVar}_av = ${xVar}_av[1]
gen re_${xVar} = 1*(${xVar}/${xVar}_av)
label var re_${xVar} "Relative ${xVar} (Average=1)"
drop ${xVar}_av
sort id ${timeUnit}
drop if id == 999999
rm "temporary1.dta"
* order variables
order ${csUnitName}, before(${timeUnit})
order id, before(${csUnitName})
* Export data to csv
export delimited using "${dataSet}_clubs.csv", replace
save "${dataSet}_clubs.dta", replace
*-------------------------------------------------------
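The re-scaling step can also be read as a simple group operation: divide each observation by the cross-sectional average of its year. The Python sketch below shows that logic on a hypothetical long-format panel. The function name and the toy values are assumptions, and the field names merely mirror the example parameters used above.

```python
from collections import defaultdict

def add_relative_values(rows, value_key="log_lp", time_key="year"):
    """Return copies of rows with re_<value_key> = value / yearly mean."""
    totals, counts = defaultdict(float), defaultdict(int)
    for row in rows:
        totals[row[time_key]] += row[value_key]
        counts[row[time_key]] += 1
    out = []
    for row in rows:
        avg = totals[row[time_key]] / counts[row[time_key]]
        out.append({**row, f"re_{value_key}": row[value_key] / avg})
    return out

# Hypothetical long-format panel: two countries, two years
rows = [
    {"country": "A", "year": 2000, "log_lp": 8.0},
    {"country": "B", "year": 2000, "log_lp": 12.0},
    {"country": "A", "year": 2001, "log_lp": 9.0},
    {"country": "B", "year": 2001, "log_lp": 11.0},
]
relative = add_relative_values(rows)
for row in relative:
    print(row["country"], row["year"], round(row["re_log_lp"], 3))
```

By construction, the relative values average to one within each year, which is why the plots that follow are centered on 1.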

Given the extended dataset, the code below plots multiple figures and exports them in .pdf and .gph formats. There are three types of plots. First, the relative transition paths of all countries are plotted. This plot is useful as it provides a first graphical overview of the dataset. Second, relative transition paths are plotted based on the club classification: one plot is created for each club, along with a combined plot that compares all clubs using a common y-axis. Third, a plot based on within-club averages is also created. Note that the colors and design of the figures are based on the plotplainblind scheme. See Bischof (2017) for further information about the graphical scheme. This scheme can be installed by typing the following in the Stata console: net install gr0070, from(http://www.stata-journal.com/software/sj17-3). Activate the scheme by typing: set scheme plotplainblind.

*-------------------------------------------------------
***************** Plot the clubs *********************
*-------------------------------------------------------
** All lines
xtline re_${xVar}, overlay legend(off) scale(1.6) ytitle("${xVarLabel}", size(small)) yscale(lstyle(none)) ylabel(, noticks labcolor(gs10)) xscale(lstyle(none)) xlabel(, noticks labcolor(gs10)) xtitle("") name(allLines, replace)
graph save "${dataSet}_allLines.gph", replace
graph export "${dataSet}_allLines.pdf", replace
** Identified clubs
summarize finalclub_${xVar}
return list
scalar numberOfClubs = r(max)
forval i=1/`=numberOfClubs' {
xtline re_${xVar} if finalclub_${xVar} == `i', overlay title("Club `i'", size(small)) legend(off) scale(1.5) yscale(lstyle(none)) ytitle("${xVarLabel}", size(small)) ylabel(, noticks labcolor(gs10)) xtitle("") xscale(lstyle(none)) xlabel(, noticks labcolor(gs10)) name(club`i', replace)
local graphs `graphs' club`i'
}
graph combine `graphs', ycommon
graph save "${dataSet}_clubsLines.gph", replace
graph export "${dataSet}_clubsLines.pdf", replace
** Within-club averages
collapse (mean) re_${xVar}, by(finalclub_${xVar} ${timeUnit})
xtset finalclub_${xVar} ${timeUnit}
rename finalclub_${xVar} Club
xtline re_${xVar}, overlay scale(1.6) ytitle("${xVarLabel}", size(small)) yscale(lstyle(none)) ylabel(, noticks labcolor(gs10)) xscale(lstyle(none)) xlabel(, noticks labcolor(gs10)) xtitle("") name(clubsAverages, replace)
graph save "${dataSet}_clubsAverages.gph", replace
graph export "${dataSet}_clubsAverages.pdf", replace
clear
use "${dataSet}_clubs.dta"
*-------------------------------------------------------
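The third plot relies on averaging the relative values within each club and year, which is what the collapse step does. A minimal Python sketch of that aggregation follows. The rows, the field names `club` and `re_value`, and the function name are hypothetical.

```python
from collections import defaultdict

def club_averages(rows):
    """Mean relative value for each (club, year) pair."""
    sums, counts = defaultdict(float), defaultdict(int)
    for row in rows:
        key = (row["club"], row["year"])
        sums[key] += row["re_value"]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}

# Hypothetical relative values for four countries in two clubs
rows = [
    {"club": 1, "year": 2000, "re_value": 1.2},
    {"club": 1, "year": 2000, "re_value": 1.4},
    {"club": 2, "year": 2000, "re_value": 0.6},
    {"club": 2, "year": 2000, "re_value": 0.8},
]
averages = club_averages(rows)
for (club, year), value in averages.items():
    print(f"Club {club}, {year}: {value:.2f}")
```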

The code below exports the list of countries and their club membership to a .csv file. This list can be used as a handy reference in the appendix section of a publication.

*-------------------------------------------------------
***************** Export list of clubs ****************
*-------------------------------------------------------
summarize ${timeUnit}
scalar finalYear = r(max)
keep if ${timeUnit} == `=finalYear'
keep id ${csUnitName} finalclub_${xVar}
sort finalclub_${xVar} ${csUnitName}
export delimited using "${dataSet}_clubsList.csv", replace
*-------------------------------------------------------

Finally, the code below closes the log file.

*-------------------------------------------------------
***************** Close log file*************
*-------------------------------------------------------
log close
*-------------------------------------------------------

Spatial inequality dynamics

Sun, 27 Aug 2023 00:00:00 +0000

Monitoring regional sustainable development

Sat, 26 Aug 2023 00:00:00 +0000

A geocomputational notebook to monitor regional development in Bolivia

Carlos Mendez (Nagoya University), Erick Gonzales (United Nations), Lykke Andersen (SDSN Bolivia)

Exploratory data analysis
Exploratory spatial data analysis
- Spatial dependence
- Spatial inequality
- Spatial heterogeneity

https://shorturl.at/evEFS

Suggested citation:
Mendez, C., Gonzales, E., & Andersen, L. (2023). A geocomputational notebook to monitor regional development in Bolivia. Zenodo. https://doi.org/10.5281/zenodo.828685

Github repository: https://github.com/quarcs-lab/project2021o-notebook

The Solow growth model and its convergence prediction

Sat, 29 Jul 2023 00:00:00 +0000

📊 The Augmented Solow Model: An Overview with Python, R, and Stata

How do countries grow richer, and why do some grow faster than others? Today, we’re diving into a computational exploration of economic growth using the augmented Solow model, an enhanced version of Solow’s foundational 1956 model that includes insights from Mankiw, Romer, and Weil (1992). This model helps explain why some countries grow richer than others and whether poor countries are indeed catching up to the wealthier ones. Let’s unpack the model, the equations, and what the data says.

🔍 The Classic Solow Model: A Quick Recap

The Solow model is one of the cornerstones of economic growth theory. It explains how countries grow by focusing on three main ingredients:

Physical Capital (★): Think of it as the machines, factories, and tools that help us produce more.
Labor (👨‍🌾): The workforce that puts the capital to use.
Technology (or Productivity): The magic that makes capital and labor more effective.

The original Solow model tells us that growth can occur through accumulating physical capital, increasing the workforce, and through technological progress. However, over time, capital experiences diminishing returns — the more you invest, the less extra output you get, unless technology improves.
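Diminishing returns and the resulting steady state can be illustrated with a short simulation. The sketch assumes a Cobb-Douglas production function in per-effective-worker terms and a discrete-time approximation of capital accumulation. The parameter values are illustrative and are not estimates from the post.

```python
def steady_state_capital(s, n, g, delta, alpha):
    """Steady-state capital per effective worker under Cobb-Douglas output."""
    return (s / (n + g + delta)) ** (1 / (1 - alpha))

def simulate_capital(k0, s, n, g, delta, alpha, periods):
    """Iterate k_{t+1} = k_t + s*k_t^alpha - (n+g+delta)*k_t."""
    path = [k0]
    for _ in range(periods):
        k = path[-1]
        path.append(k + s * k ** alpha - (n + g + delta) * k)
    return path

# Illustrative parameter values (assumptions, not estimates)
params = dict(s=0.2, n=0.01, g=0.02, delta=0.05, alpha=1 / 3)
k_star = steady_state_capital(**params)
path = simulate_capital(k0=1.0, periods=300, **params)

# Diminishing returns: each extra unit of capital adds less output
marginal_gains = [(k + 1) ** params["alpha"] - k ** params["alpha"] for k in (1, 5, 10)]
print(f"steady state k*: {k_star:.3f}, k after 300 periods: {path[-1]:.3f}")
print("output gain from one more unit of capital:", [round(m, 3) for m in marginal_gains])
```

Capital accumulation slows as the economy approaches the steady state, which is why sustained growth in this framework depends on technological progress.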

🧠 Why Augment the Model?

In 1992, Mankiw, Romer, and Weil suggested adding human capital to the mix. Human capital, like education and health, can significantly enhance productivity. By adding this to the model, we get a richer understanding of growth disparities between nations.
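For reference, the augmented production function and the implied steady-state income per worker in Mankiw, Romer, and Weil (1992) are commonly written as follows. The notation is an assumption here, since this post does not define its symbols: $H$ is human capital, $s_k$ and $s_h$ are the shares of income invested in physical and human capital, and $n$, $g$, and $\delta$ are population growth, technology growth, and depreciation.

```latex
Y_t = K_t^{\alpha} H_t^{\beta} (A_t L_t)^{1-\alpha-\beta}, \qquad \alpha + \beta < 1

\ln y^{*}_t = \ln A_0 + g t
  + \frac{\alpha}{1-\alpha-\beta} \ln s_k
  + \frac{\beta}{1-\alpha-\beta} \ln s_h
  - \frac{\alpha+\beta}{1-\alpha-\beta} \ln(n + g + \delta)
```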

This shows that growth is not just about physical investments and labor but also about how well the workforce is trained and educated. Human capital plays a pivotal role in enhancing productivity, which can accelerate growth, particularly in poorer countries.

📈 Convergence: Are Poorer Countries Catching Up?

A critical prediction of the Solow model is convergence — the idea that poorer countries should grow faster than richer countries, eventually catching up in terms of per capita income.

However, data shows conditional convergence rather than unconditional convergence. This means countries tend to converge to their own steady-state levels of income, which are defined by their individual characteristics like savings rate, population growth, and human capital levels.
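The difference between the two notions of convergence can be sketched with simulated economies. The example assumes a basic Solow economy without human capital, a discrete-time approximation, and illustrative parameter values. The function names are hypothetical.

```python
def simulate_income(k0, s, periods, alpha=1 / 3, dep=0.08):
    """Income path y_t = k_t^alpha with k_{t+1} = k_t + s*y_t - dep*k_t.

    dep bundles population growth, technology growth, and depreciation.
    """
    k, incomes = k0, []
    for _ in range(periods):
        incomes.append(k ** alpha)
        k = k + s * k ** alpha - dep * k
    return incomes

def growth(path, start, end):
    """Average growth rate per period between two dates."""
    return (path[end] / path[start]) ** (1 / (end - start)) - 1

# Same parameters, different starting points: the poorer economy grows faster
poor = simulate_income(k0=0.5, s=0.2, periods=400)
rich = simulate_income(k0=3.0, s=0.2, periods=400)

# Different saving rates: each economy approaches its own steady state
low_saver = simulate_income(k0=0.5, s=0.1, periods=400)

print(f"early growth, poor vs rich: {growth(poor, 0, 20):.4f} vs {growth(rich, 0, 20):.4f}")
print(f"long-run income, s=0.2 vs s=0.1: {poor[-1]:.3f} vs {low_saver[-1]:.3f}")
```

Economies that share the same parameters end up at the same income level regardless of where they start, whereas the low-saving economy settles at a permanently lower level. Convergence in the second case is conditional on the determinants of the steady state.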

🗃️ Data Analysis & Key Insights

The dataset used in this analysis includes cross-country data on economic indicators like GDP, investment rates, and education levels.

Data Samples:

Non-oil Sample (98 countries): Countries not heavily reliant on oil production.
Intermediate Sample (75 countries): Excludes very small countries and those with data issues.
OECD Sample (22 countries): Focuses on countries with higher data quality.

The Python notebook processes these datasets to estimate the parameters for savings, population growth, and human capital, helping us understand the role of these factors in determining income levels and growth rates across countries.

🔗 Further Resources

Video review: For a foundational overview of the Solow growth model, check out this introductory video
Stata Replication Code: To replicate the key tables and figures from Mankiw, Romer, and Weil, access the GitHub Gist here.
Primer on the Solow Model: For those new to the basics, this primer is a great place to start.

🖥️ Python Notebook Insights

The computational notebook provides step-by-step Python-based analysis, from loading the dataset to estimating parameters and visualizing growth trends. By transforming variables like GDP, savings, and education into their logarithmic forms, the model reveals the underlying dynamics of growth and the relative importance of each factor.
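To see why the variables enter in logs, the sketch below generates noise-free steady-state incomes from the textbook (non-augmented) Solow model and recovers the capital share from a log-linear regression. The country values are hypothetical. Setting the sum of technology growth and depreciation to 0.05 follows the assumption made by Mankiw, Romer, and Weil. This is not the notebook's code.

```python
import math

def ols(x, y):
    """Intercept and slope of a bivariate OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

# Hypothetical countries: (saving rate s, population growth n)
alpha_true, g_plus_delta = 1 / 3, 0.05
countries = [(0.10, 0.030), (0.15, 0.025), (0.20, 0.020), (0.25, 0.015), (0.30, 0.010)]

# Regressor: ln(s) - ln(n + g + delta); outcome: log steady-state income
x = [math.log(s) - math.log(n + g_plus_delta) for s, n in countries]
y = [2.0 + alpha_true / (1 - alpha_true) * xi for xi in x]

intercept, slope = ols(x, y)
alpha_implied = slope / (1 + slope)  # because slope = alpha / (1 - alpha)
print(f"slope: {slope:.3f}, implied alpha: {alpha_implied:.3f}")
```

With real data the fit is not exact, and the augmented model adds a human capital term, but the mapping from regression coefficients to structural parameters follows the same logic.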

📝 Summary

The augmented Solow model enriches our understanding of economic growth by adding human capital into the equation. This addition helps explain why some countries grow faster than others and supports the concept of conditional convergence — the idea that countries grow towards their own unique steady states based on their savings rates, population growth, and education.

Learn by R coding using this Google Colab notebook.

Learn by Python coding using this Google Colab notebook.

Learn by Stata coding using this Stata script.

Introduction to spatial data science

Mon, 01 Apr 2019 00:00:00 +0000

Introduction to spatial data science with Python