Do Institutions Cause Prosperity?

An IV tutorial: instrumenting modern institutions with settler mortality

0.944 2SLS effect of institutions

+81% larger than naive OLS

64 ex-colonies, one instrument

Carlos Mendez

Nagoya University (GSID)

July 8, 2026

The Tension

Act I

Richer countries have better institutions — but a correlation cannot tell us which way the arrow points

Stronger property-rights institutions track far higher income across countries. The gradient is real and huge.

But maybe rich countries simply afford better courts — or geography drives both. The slope is correlation; it cannot prove cause.

Across every specification, the causal effect lives near 0.9 — well above the OLS slope of 0.5

Coefficient on institutions ($\hat\beta$) across six specifications, 95% CIs. Orange = naive OLS; steel = IV with settler mortality; teal = an alternative instrument.

Where we’re going

Why OLS is biased here — reverse causality, omitted variables, measurement error
The three conditions every instrument must satisfy
Settler mortality as the instrument: first stage and reduced form
2SLS as a single ratio — and what the 0.944 number really is

The Investigation

Act II

Institutions are endogenous, so OLS does not estimate the causal effect

Reverse causality — rich countries can afford better institutions
Omitted variables — geography, culture, human capital drive both
Measurement error — the institutions index is a noisy proxy, attenuating OLS

A regressor is endogenous when it correlates with the error term; OLS is then inconsistent, so the bias persists even with infinite data.

The structural model: the error is correlated with the regressor — that is the whole problem

\[Y_i = \alpha + \beta X_i + U_i, \qquad \mathrm{Cov}(X_i, U_i) \neq 0\]

The outcome $Y_i$ (log GDP) depends on the endogenous regressor $X_i$ (institutions) plus an error $U_i$ that gathers every unobserved driver of income.

The target is $\beta$, the true causal coefficient. The non-zero $\mathrm{Cov}(X_i, U_i)$ is exactly why OLS misses it.
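A minimal simulation can make the bias concrete. The sketch below uses synthetic data only (not the AJR sample); the coefficient values and the helper name `ols_slope` are illustrative assumptions. It shows that the OLS slope converges to $\beta + \mathrm{Cov}(X, U)/\mathrm{Var}(X)$ rather than to $\beta$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta = 1.0                               # hypothetical true causal coefficient

u = rng.normal(size=n)                   # unobserved drivers of Y
x = 0.5 * u + rng.normal(size=n)         # X is built to correlate with U
y = 2.0 + beta * x + u                   # structural model Y = alpha + beta*X + U


def ols_slope(x, y):
    """Slope from regressing y on x with an intercept (hypothetical helper)."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]


b_ols = ols_slope(x, y)

# Population values implied by the simulation design:
# Cov(X, U) = 0.5 and Var(X) = 0.5**2 + 1 = 1.25, so the bias is 0.4.
predicted = beta + 0.5 / 1.25

print(f"true beta = {beta:.3f}, OLS = {b_ols:.3f}, predicted limit = {predicted:.3f}")
```

Even with 200,000 observations the OLS slope stays near 1.4 rather than 1.0: more data tightens the estimate around the wrong number.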

A valid instrument must clear three bars: relevance, exclusion, exogeneity

What an instrument $Z$ must do

Relevance — $Z$ moves $X$ (testable: first-stage $F$)
Exclusion — $Z$ affects $Y$ only through $X$
Exogeneity — $Z$ is uncorrelated with $U$

AJR’s instrument

$Z =$ log settler mortality circa 1700
Deadly colonies became extractive; safe colonies got European-style institutions
1700 mortality cannot react to 1995 GDP

The lab: AJR’s base sample of 64 ex-colonies, one instrument, a 60-fold income range

Outcome — log GDP per capita 1995 (logpgp95), spanning roughly $450 to $27,400
Endogenous regressor — protection from expropriation (avexpr), 0–10
Instrument — log settler mortality (logem4), nearly six log points of spread

The baseco==1 subset of the wider ~163-country world: 64 ex-colonies with valid mortality data.

Relevance holds: a one-log-point rise in mortality lowers the institutions index by 0.607 points (first-stage F = 16.85)

First-stage scatter of institutions (avexpr) on log settler mortality (logem4), 64 ex-colonies. Slope $-0.607$, $F = 16.85$, $R^2 = 0.27$.
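The first-stage check can be sketched in a few lines of NumPy. The data below are synthetic stand-ins for `logem4` and `avexpr` (the generating coefficients are assumptions, chosen only to resemble the slide), and the F statistic uses classical standard errors, so it will not reproduce the reported 16.85 if that figure is heteroskedasticity-robust. With a single instrument, the first-stage F is simply the squared t statistic on the instrument.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64

# Synthetic stand-ins, NOT the AJR data
z = rng.normal(4.6, 1.2, size=n)                  # plays the role of logem4
x = 9.3 - 0.6 * z + rng.normal(0, 1.2, size=n)    # plays the role of avexpr

# First stage: regress X on a constant and Z
Z = np.column_stack([np.ones(n), z])
coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
resid = x - Z @ coef

# Classical (homoskedastic) variance of the coefficients
sigma2 = resid @ resid / (n - 2)
cov = sigma2 * np.linalg.inv(Z.T @ Z)
se_slope = np.sqrt(cov[1, 1])

t_slope = coef[1] / se_slope
F = t_slope ** 2                                   # one instrument: F = t^2
r2 = 1 - (resid @ resid) / np.sum((x - x.mean()) ** 2)

print(f"slope = {coef[1]:.3f}, F = {F:.2f}, R^2 = {r2:.2f}")
```

A common rule of thumb (Staiger and Stock) treats a first-stage F below about 10 as a sign of a weak instrument.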

The reduced form confirms it: across the full mortality range, the deadliest colonies are about 30 times poorer today

Reduced-form scatter of log GDP (logpgp95) on log settler mortality (logem4). The slope ($\approx -0.573$) is the total effect of the instrument on the outcome.
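One way to read the 30-fold figure, assuming it compares the two ends of the mortality range (roughly six log points, per the sample description) at the reduced-form slope:

\[\exp(0.573 \times 6) = \exp(3.44) \approx 31\]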

2SLS is just one division: the reduced-form slope over the first-stage slope

\[\hat\beta_{2SLS} = \frac{\widehat{\mathrm{Cov}}(Y, Z)}{\widehat{\mathrm{Cov}}(X, Z)} = \frac{\hat\beta_{RF}}{\hat\beta_{FS}} = \frac{-0.573}{-0.607} = 0.944\]

The numerator is the total effect of the instrument on the outcome; the denominator rescales by how much the instrument moves institutions.

The whole IV machinery, in one ratio: $-0.573 / -0.607 = 0.944$.
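The ratio can be checked numerically. The sketch below first reproduces the slide arithmetic, then uses synthetic data (the coefficients and the helper `slope` are illustrative assumptions) to confirm that three routes give the same number in the one-instrument case: reduced form over first stage, the covariance ratio, and a literal two-stage regression.

```python
import numpy as np

# Slide arithmetic: reduced-form slope over first-stage slope
rf, fs = -0.573, -0.607
beta_ratio = rf / fs
print(f"ratio from the slide's slopes: {beta_ratio:.3f}")

# Synthetic check of the identity (NOT the AJR data)
rng = np.random.default_rng(2)
n = 5_000
z = rng.normal(size=n)                         # instrument
u = rng.normal(size=n)                         # structural error
x = -0.6 * z + 0.8 * u + rng.normal(size=n)    # endogenous regressor
y = 1.0 + 0.9 * x + u                          # hypothetical true beta = 0.9


def slope(a, b):
    """Slope from regressing b on a with an intercept (hypothetical helper)."""
    c = np.cov(a, b)
    return c[0, 1] / c[0, 0]


wald = slope(z, y) / slope(z, x)                          # RF / FS
cov_ratio = np.cov(y, z)[0, 1] / np.cov(x, z)[0, 1]       # Cov(Y,Z) / Cov(X,Z)

# Literal two stages: fit X on Z, then regress Y on the fitted values
x_hat = x.mean() + slope(z, x) * (z - z.mean())
two_stage = slope(x_hat, y)

print(f"RF/FS = {wald:.4f}, cov ratio = {cov_ratio:.4f}, two-stage = {two_stage:.4f}")
```

The manual second stage recovers the point estimate only; its naive standard errors are wrong because they ignore that `x_hat` was estimated, which is one reason to rely on a dedicated IV routine for inference.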

Two libraries, one formula: pyfixest gives the estimate, linearmodels the diagnostics

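import pandas as pd
import pyfixest as pf
from linearmodels.iv import IV2SLS
# base: DataFrame holding the baseco == 1 sample (logpgp95, avexpr, logem4)
# structural 2SLS via pyfixest's "exog | endog ~ instrument" syntax
m_iv = pf.feols("logpgp95 ~ 1 | avexpr ~ logem4", data=base, vcov="HC1")
# linearmodels takes the constant explicitly as the exogenous block
X_exog = pd.DataFrame({"const": 1.0}, index=base.index)
# weak-IV F, Wu-Hausman, Hansen J: the tests pyfixest does not report
res = IV2SLS(base["logpgp95"], X_exog, base[["avexpr"]], base[["logem4"]]).fit(cov_type="robust")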

Naive OLS sees only half the story: a slope of 0.522

0.522

OLS estimate of $\hat\beta$ on institutions, base sample (SE 0.050) — the benchmark IV will overturn

The strongest objection — and the answer

Objection. Maybe the tropical disease environment that killed settlers still depresses productivity today — a direct arrow from mortality to GDP that breaks the exclusion restriction.

Response. Adding modern health controls pulls $\hat\beta$ down only to 0.55–0.69 — still above OLS — and the overidentification tests (Hansen J, $p$ = 0.18–0.79) do not reject joint exogeneity. The threat is real but bounded; it does not erase the effect.

The Resolution

Act III

Instrumenting institutions recovers a causal effect of 0.944 — Wu-Hausman confirms OLS was biased

0.944

2SLS $\hat\beta$ on institutions (SE 0.176, 95% CI [0.60, 1.29]); Wu-Hausman $F = 24.22$, $p < 0.0001$
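The interval follows from the usual normal approximation, assuming the reported CI uses the 1.96 critical value:

\[\hat\beta_{2SLS} \pm 1.96 \times \mathrm{SE} = 0.944 \pm 1.96 \times 0.176 = 0.944 \pm 0.345 \;\Rightarrow\; [0.60,\ 1.29]\]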

The causal effect is 81% larger than OLS: measurement error, not reverse causality or omitted variables, dominated the bias

Estimator	$\hat\beta$	SE	95% CI
OLS (base sample)	0.522	0.050	—
2SLS (settler mortality)	0.944	0.176	[0.60, 1.29]

IV > OLS by 81% implies attenuation from measurement error outweighed reverse causality and omitted variables.
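The attenuation mechanism can be illustrated with a short simulation. This is a sketch on synthetic data under classical measurement error (noise independent of everything else); the variances, coefficients, and the helper `slope` are assumptions for illustration, not estimates from the AJR sample. OLS on the noisy regressor shrinks toward zero by the reliability ratio, while an instrument that is unrelated to the noise recovers the true coefficient.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
beta = 1.0                                    # hypothetical true effect

z = rng.normal(size=n)                        # instrument
x_true = 0.7 * z + rng.normal(size=n)         # true institutions, Var = 1.49
x_obs = x_true + rng.normal(0, 1.0, size=n)   # noisy index, noise Var = 1
y = beta * x_true + rng.normal(size=n)        # outcome depends on the TRUE regressor


def slope(a, b):
    """Slope from regressing b on a with an intercept (hypothetical helper)."""
    c = np.cov(a, b)
    return c[0, 1] / c[0, 0]


b_ols = slope(x_obs, y)                       # attenuated toward zero
b_iv = slope(z, y) / slope(z, x_obs)          # ratio estimator, immune to the noise

# Reliability ratio Var(X*) / (Var(X*) + Var(noise)) implied by the design
reliability = 1.49 / (1.49 + 1.0)

# The slide's own gap between the two estimators
gap_slide = 0.944 / 0.522 - 1

print(f"OLS = {b_ols:.3f} (predicted {beta * reliability:.3f}), IV = {b_iv:.3f}")
print(f"slide gap: IV exceeds OLS by {gap_slide:.0%}")
```

Under the strong assumption that classical measurement error were the only source of bias, the slide's two estimates would imply a reliability ratio of about $0.522 / 0.944 \approx 0.55$ for the institutions index.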

The 0.944 is a LATE for compliers, and it leans on assumptions only partly testable

What it is

A Local Average Treatment Effect (Imbens-Angrist)
The effect for complier countries near the colonization margin
Not a population average for every country

What still carries weight

Exclusion is untestable in principle
Albouy (2012): ~36% of mortality data imputed or shared
Hansen J cannot detect "coordinated witnesses": instruments that are all invalid in the same way

Let the disease environment of 1700, not the regression of 1995, identify the effect of institutions.