Downloads
Each dataset is available as a labeled Stata .dta and its source file.
Download all data (ZIP) · stata_codebook.do
| Dataset | Grain | Rows × columns | Stata | Source |
|---|---|---|---|---|
| raw_data | county-year | 31,843 × 22 | raw_data.dta | raw_data.csv |
| data_prepared | county-year | 28,644 × 17 | data_prepared.dta | data_prepared.csv |
Run stata_codebook.do in Stata once to attach long-form per-variable notes to the .dta files.
Load directly in code
Every file loads straight from GitHub (raw URLs). Swap the file name to load any dataset.
Stata
* Stata 14+ : `use` reads an https URL directly
global BASE "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/r_did2/data/"
use "${BASE}raw_data.dta", clear
describe
notes
Python
!pip install -q pyreadstat
import pandas as pd
BASE = "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/r_did2/data/"
df = pd.read_stata(BASE + "raw_data.dta")
# load every dataset at once
files = ["raw_data", "data_prepared"]
data = {f: pd.read_stata(BASE + f + ".dta") for f in files}
# pyreadstat (richest metadata) reads LOCAL files -> download first
import pyreadstat, urllib.request
urllib.request.urlretrieve(BASE + "raw_data.dta", "raw_data.dta")
df, meta = pyreadstat.read_dta("raw_data.dta")
Copy and paste this snippet into a Google Colab notebook: https://colab.research.google.com/notebooks/empty.ipynb
R
# R : haven::read_dta auto-downloads an https URL
library(haven)
BASE <- "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/r_did2/data/"
df <- read_dta(paste0(BASE, "raw_data.dta"))
Overview & sources
Companion data for a hands-on R tutorial that asks whether the Affordable Care Act's staggered Medicaid expansion reduced adult mortality, and uses the question to show how population weighting changes the target parameter when the units (U.S. counties) differ in size by orders of magnitude. Following Baker, Callaway, Cunningham, Goodman-Bacon and Sant'Anna's (2025) Practitioner's Guide, the post runs an eight-stage DiD pipeline — 2×2 cell means, three equivalent TWFE specifications, covariate-adjusted OR/IPW/DRDID, a 2×T event study, the Callaway–Sant'Anna staggered ATT(g,t) design, and a Rambachan–Roth HonestDiD sensitivity analysis — computing every estimate both unweighted and weighted by 2013 adult population. The headline 2×2 ATT(2014) flips sign with weighting, from +0.122 deaths per 100,000 unweighted to −2.563 weighted, while no 95% confidence interval at any stage comfortably excludes zero. The two estimands answer different questions — the typical treated county versus the typical treated adult.
raw_data is the merged CDC county mortality file (deaths per 100,000 adults aged 20–64) joined to state-level ACA Medicaid-expansion timing, as downloaded — one row per county × year, 2009–2019, before any cleaning. data_prepared is the balanced analysis sample built from it: drop the five pre-2014 expanders (DC, DE, MA, NY, VT), require full mortality coverage 2009–2019 and full covariate coverage in 2013–2014, and add the modeling columns (covariate shares, fixed 2013 population weight, treatment-year and post indicators).
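The sample restriction described above can be sketched in pandas on a made-up toy panel. This is an illustration, not the post's own R script. The column names county_code, year and crude_rate_20_64 come from the data dictionary. state_abb exists only in the prepared file, so the sketch assumes it has already been derived. The helper name balance_panel and all toy values are hypothetical, and the covariate-coverage filter is left out for brevity.

```python
import pandas as pd

PRE_2014_EXPANDERS = {"DC", "DE", "MA", "NY", "VT"}  # listed in the text above


def balance_panel(df, years):
    """Hypothetical helper: drop early expanders, then keep only counties
    with a non-missing outcome in every year of `years`."""
    out = df[~df["state_abb"].isin(PRE_2014_EXPANDERS)]
    out = out[out["year"].isin(years) & out["crude_rate_20_64"].notna()]
    # count the distinct observed years per county and keep complete ones
    n_years = out.groupby("county_code")["year"].transform("nunique")
    return out[n_years == len(years)].reset_index(drop=True)


# Toy panel with 3 years instead of 11:
# county 2 has a missing outcome in 2014, county 3 is in an early-expander state.
toy = pd.DataFrame({
    "county_code": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "state_abb": ["AL"] * 3 + ["TX"] * 3 + ["NY"] * 3,
    "year": [2013, 2014, 2015] * 3,
    "crude_rate_20_64": [400.0, 410.0, 405.0, 500.0, None, 520.0,
                         300.0, 310.0, 305.0],
})
balanced = balance_panel(toy, years=[2013, 2014, 2015])
print(balanced["county_code"].unique().tolist())  # only county 1 survives
```

On the real file the same idea would be applied with years 2009 to 2019, which is what makes the prepared sample an exact multiple of 11 rows (2,604 counties × 11 years = 28,644).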
Data sources
| Source | Provides | Reference / URL |
|---|---|---|
| CDC WONDER (mortality) | County-level death counts and crude mortality rates for adults aged 20-64, plus population denominators | U.S. Centers for Disease Control and Prevention, CDC WONDER. https://wonder.cdc.gov/ |
| ACA Medicaid expansion timing | State Medicaid-expansion adoption status, year, and month (the staggered treatment timing) | KFF, Status of State Medicaid Expansion Decisions. https://www.kff.org/medicaid/issue-brief/status-of-state-medicaid-expansion-decisions-interactive-map/ |
| County socioeconomic covariates | Unemployment, poverty, and median household income by county-year (baseline covariates) | U.S. Census Bureau / Bureau of Labor Statistics county-level series (as merged in the source file). |
| Replicated study & method references | Empirical example, estimators, and concepts | Baker et al. (2025, arXiv:2503.13323); Callaway & Sant'Anna (2021); Sant'Anna & Zhao (2020); Rambachan & Roth (2023); Imbens & Rubin (2015). |
Cite this data
Please cite this dataset as follows.
APA
Mendez, C. (2026). Difference-in-Differences for Regional Data: Did Medicaid Expansion Reduce Mortality? [Data set]. https://carlos-mendez.org/post/r_did2/
Baker, A., Callaway, B., Cunningham, S., Goodman-Bacon, A., & Sant'Anna, P. H. C. (2025). Difference-in-Differences Designs: A Practitioner's Guide. arXiv:2503.13323.
Callaway, B., & Sant'Anna, P. H. C. (2021). Difference-in-Differences with multiple time periods. Journal of Econometrics, 225(2), 200-230.
Sant'Anna, P. H. C., & Zhao, J. (2020). Doubly robust difference-in-differences estimators. Journal of Econometrics, 219(1), 101-122.
Rambachan, A., & Roth, J. (2023). A More Credible Approach to Parallel Trends. Review of Economic Studies, 90(5), 2555-2591.
BibTeX
@misc{mendez2026rdid2,
author = {Mendez, Carlos},
title = {Difference-in-Differences for Regional Data: Did Medicaid Expansion Reduce Mortality?},
year = {2026},
howpublished = {\url{https://carlos-mendez.org/post/r_did2/}},
note = {Data set}
}
@article{baker2025did,
author = {Baker, Andrew and Callaway, Brantly and Cunningham, Scott and Goodman-Bacon, Andrew and Sant'Anna, Pedro H. C.},
title = {Difference-in-Differences Designs: A Practitioner's Guide},
journal = {arXiv preprint arXiv:2503.13323}, year = {2025}
}
@article{callaway2021did,
author = {Callaway, Brantly and Sant'Anna, Pedro H. C.},
title = {Difference-in-Differences with multiple time periods},
journal = {Journal of Econometrics}, volume = {225}, number = {2},
pages = {200--230}, year = {2021}
}
@article{santanna2020drdid,
author = {Sant'Anna, Pedro H. C. and Zhao, Jun},
title = {Doubly robust difference-in-differences estimators},
journal = {Journal of Econometrics}, volume = {219}, number = {1},
pages = {101--122}, year = {2020}
}
@article{rambachan2023honest,
author = {Rambachan, Ashesh and Roth, Jonathan},
title = {A More Credible Approach to Parallel Trends},
journal = {Review of Economic Studies}, volume = {90}, number = {5},
pages = {2555--2591}, year = {2023}
}
Variable explorer (all 30 variables)
| Variable | Type | Label | Definition | Units | In files | Source |
|---|---|---|---|---|---|---|
| Description | identifier | Expansion description (free text) | Free-text note on the state's expansion implementation (date, retroactivity, etc.). | string | raw_data | KFF / ACA timing |
| Post | dummy | Post-2014 period dummy | 1 for years 2014 and later, else 0 (the post period in the 2x2 design). | 0/1 | data_prepared | Derived |
| Treat_2014 | dummy | 2014-cohort treatment dummy | 1 if the county's state expanded Medicaid in 2014, else 0. | 0/1 | data_prepared | Derived |
| county | identifier | County name (with state abbrev.) | County name followed by its two-letter state abbreviation, e.g. "Autauga County, AL". | string | raw_data, data_prepared | CDC WONDER |
| county_code | identifier | County FIPS code | Five-digit federal (FIPS) county identifier; the unit id for the panel. | FIPS | raw_data, data_prepared | CDC WONDER |
| crude_rate_20_64 | continuous | Crude mortality rate, adults 20-64 | Deaths per 100,000 adults aged 20-64; the DiD outcome variable. | per 100,000 | raw_data, data_prepared | CDC WONDER |
| deaths | continuous | Deaths, adults 20-64 | Count of deaths among adults aged 20-64 in the county-year. | count | raw_data | CDC WONDER |
| expansion_status | identifier | ACA Medicaid expansion status (text) | Whether the state had adopted and implemented Medicaid expansion. | category | raw_data | KFF / ACA timing |
| labor_force | continuous | Civilian labor force | County civilian labor force count (denominator of the raw unemployment rate). | persons | raw_data | BLS (merged) |
| maca | identifier | Month of ACA Medicaid expansion | Calendar month (1-12) in which the state implemented expansion; missing for never-expanders. | month (1-12) | raw_data | KFF / ACA timing |
| median_income | continuous | Median household income | County median household income; in the raw file expressed in US$, rescaled to thousands of US$ in the prepared data. | US$ (raw) / US$ 000s (prepared) | raw_data, data_prepared | Census (merged) |
| perc_female | continuous | Female share, adults 20-64 (%) | Percent of the county's 20-64 population that is female (baseline covariate). | % | data_prepared | Derived (CDC) |
| perc_hispanic | continuous | Hispanic share, adults 20-64 (%) | Percent of the county's 20-64 population that is Hispanic (baseline covariate). | % | data_prepared | Derived (CDC) |
| perc_white | continuous | White share, adults 20-64 (%) | Percent of the county's 20-64 population that is white (baseline covariate). | % | data_prepared | Derived (CDC) |
| population_20_64 | continuous | Population aged 20-64 | County adult population aged 20-64 (denominator of the crude mortality rate). | persons | raw_data, data_prepared | CDC WONDER |
| population_20_64_female | continuous | Population 20-64, female | Female adults aged 20-64 (numerator for perc_female). | persons | raw_data | CDC WONDER |
| population_20_64_hispanic | continuous | Population 20-64, Hispanic | Adults aged 20-64 identifying as Hispanic (numerator for perc_hispanic). | persons | raw_data | CDC WONDER |
| population_20_64_white | continuous | Population 20-64, white | White adults aged 20-64 (numerator for perc_white). | persons | raw_data | CDC WONDER |
| population_total | continuous | Total county population | Total resident population of the county-year (all ages). | persons | raw_data | CDC WONDER |
| poverty_rate | continuous | Poverty rate (%) | Share of the county population below the federal poverty line. | % | raw_data, data_prepared | Census (merged) |
| set_wt | continuous | Fixed 2013 adult population weight | Each county's 2013 population aged 20-64, held constant across all 11 years (the population weight). | persons | data_prepared | Derived (CDC) |
| state | identifier | State name | Full name of the U.S. state the county belongs to. | string | raw_data | CDC WONDER |
| state_abb | identifier | State abbreviation | Two-letter U.S. state postal abbreviation. | code | data_prepared | Derived |
| stfips | identifier | State FIPS code | Numeric federal (FIPS) identifier for the state (1-56). | FIPS | raw_data | CDC WONDER |
| treat_year | identifier | Treatment year (did convention) | Year the county's state expanded Medicaid (2014/2015/2016/2019), or 0 for never-treated counties. | year / 0 | data_prepared | Derived |
| unemp_rate | continuous | Unemployment rate | County unemployment rate (baseline covariate); a fraction in the raw file, rescaled to percent in the prepared data. | fraction (raw) / % (prepared) | raw_data, data_prepared | Derived (BLS) |
| unemployed | continuous | Number unemployed | County count of unemployed persons in the labor force. | persons | raw_data | BLS (merged) |
| yaca | year | Year of ACA Medicaid expansion | Calendar year the state implemented Medicaid expansion; missing (NA) for never-expanders. | year | raw_data, data_prepared | KFF / ACA timing |
| year | year | Calendar year | Annual time index of the observation. | year | raw_data, data_prepared | CDC WONDER |
| year_code | year | Year code (CDC) | CDC WONDER's year code; numerically equal to the calendar year here. | year | raw_data | CDC WONDER |
Construction & formulas
The outcome is the CDC crude mortality rate crude_rate_20_64 (deaths per 100,000
adults aged 20–64). Every estimate is computed twice: unweighted (each county
counts equally — the ATT for the typical treated county) and
population-weighted by the fixed 2013 adult population set_wt
(each adult counts equally — the ATT for the typical treated adult).
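The contrast between the two estimands can be made concrete with a small pure-Python sketch. It computes a 2×2 difference-in-differences from cell means twice, once with every county weighted equally and once weighted by a fixed population weight. The four toy counties and the function names weighted_mean and did_2x2 are hypothetical; only the weight name set_wt is taken from the data dictionary.

```python
def weighted_mean(values, weights):
    """Weighted average of `values` with non-negative `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def did_2x2(rows, weight_key=None):
    """2x2 DiD from cell means: treated change minus control change.

    rows: dicts with 'treated' (0/1), 'pre', 'post' and optionally a weight.
    weight_key=None gives every county a weight of 1 (unweighted).
    """
    def mean_change(group):
        sel = [r for r in rows if r["treated"] == group]
        weights = [r[weight_key] if weight_key else 1.0 for r in sel]
        changes = [r["post"] - r["pre"] for r in sel]
        return weighted_mean(changes, weights)

    return mean_change(1) - mean_change(0)


# Hypothetical counties: a small treated county whose rate rises,
# a large treated county whose rate falls, and two flat controls.
toy = [
    {"treated": 1, "pre": 400.0, "post": 410.0, "set_wt": 1_000},
    {"treated": 1, "pre": 300.0, "post": 298.0, "set_wt": 99_000},
    {"treated": 0, "pre": 450.0, "post": 450.0, "set_wt": 50_000},
    {"treated": 0, "pre": 350.0, "post": 350.0, "set_wt": 50_000},
]
att_county = did_2x2(toy)                      # typical treated county
att_adult = did_2x2(toy, weight_key="set_wt")  # typical treated adult
print(att_county, att_adult)
```

The toy numbers are chosen so that the unweighted estimate is positive and the weighted one is negative, mirroring in miniature the sign flip reported for ATT(2014). Because the weight is fixed over time, weighting the changes is the same as differencing the weighted cell means.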
- 2×2 cell-means DiD, ATT(2014): (Ȳ_T,post − Ȳ_T,pre) − (Ȳ_C,post − Ȳ_C,pre), i.e. the treated group's 2013→2014 change minus the control group's change.
- TWFE 2×2: Y_it = β₀ + β₁·1{D=1} + β₂·1{t=2014} + β^(2×2)·(D×Post) + ε_it; on a balanced 2×2 panel the levels, two-way-FE, and long-difference forms recover the same β^(2×2).
- Normalized difference (covariate balance): (X̄_T − X̄_C) / √[(S²_T + S²_C)/2]; |value| > 0.25 flags imbalance (Imbens & Rubin 2015).
- Doubly-robust DiD (DRDID): (1/n) Σ (ŵ_{D=1} − ŵ_{D=0})(ΔY_i − μ̂_{Δ,D=0}(X_i)), consistent if either the outcome model or the propensity model is correct (Sant'Anna & Zhao 2020).
- Group-time ATT (Callaway–Sant'Anna): ATT(g,t) = E[Y_it(g) − Y_it(∞) | G_i = g], aggregated by cohort and by event time.
- HonestDiD relative magnitudes: bound the post-period parallel-trends violation at a multiple M̄ of the worst observed pre-period violation; the breakdown value is the smallest M̄ that overturns the sign.
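The normalized-difference formula in the list above translates directly into a few lines of standard-library Python. This is a sketch on invented poverty rates. It assumes S² denotes the sample variance (n − 1 denominator), which the formula itself does not spell out, and the function name normalized_difference is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def normalized_difference(x_treated, x_control):
    """(mean_T - mean_C) / sqrt((S2_T + S2_C) / 2).

    Assumption: S2 is the sample variance (n - 1 denominator).
    """
    pooled_sd = sqrt((variance(x_treated) + variance(x_control)) / 2)
    return (mean(x_treated) - mean(x_control)) / pooled_sd


# Hypothetical poverty rates (%) for treated and control counties.
treated = [12.0, 14.0, 16.0, 18.0]
control = [15.0, 17.0, 19.0, 21.0]
nd = normalized_difference(treated, control)
flag = "imbalanced" if abs(nd) > 0.25 else "balanced"
print(round(nd, 3), flag)
```

Unlike a t-statistic, this measure does not grow with the sample size, which is why a fixed threshold such as 0.25 can be applied to a panel with thousands of counties.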
Constructed columns in data_prepared (built by the post's R script from
raw_data): perc_white = population_20_64_white / population_20_64 · 100
(and likewise perc_hispanic, perc_female); unemp_rate rescaled
to percent (×100); median_income rescaled to thousands of US$ (÷1000);
set_wt = each county's 2013 population_20_64, held constant across years;
treat_year = yaca if it falls in 2014–2019, else 0 (the did
never-treated convention); Treat_2014 = 1 if yaca == 2014;
Post = 1 if year ≥ 2014.
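The constructed columns listed above can be sketched in pandas as follows. The post builds them in R, so this is only an illustrative translation under assumptions: the toy values are invented, and the intermediate names wt_2013 and in_window are hypothetical. perc_hispanic and perc_female would follow the same pattern as perc_white.

```python
import pandas as pd

# Toy stand-in for raw_data: two counties, two years (values are made up).
raw = pd.DataFrame({
    "county_code": [1001, 1001, 1003, 1003],
    "year": [2013, 2014, 2013, 2014],
    "population_20_64": [30_000, 31_000, 20_000, 19_500],
    "population_20_64_white": [24_000, 24_800, 15_000, 14_700],
    "unemp_rate": [0.062, 0.058, 0.071, 0.066],          # fraction in raw file
    "median_income": [52_000, 53_500, 41_000, 42_250],    # US$ in raw file
    "yaca": [2014.0, 2014.0, float("nan"), float("nan")], # NaN = never-expander
})

prep = raw.copy()
prep["perc_white"] = 100 * prep["population_20_64_white"] / prep["population_20_64"]
prep["unemp_rate"] = prep["unemp_rate"] * 100        # fraction -> percent
prep["median_income"] = prep["median_income"] / 1000  # US$ -> thousands of US$

# set_wt: broadcast each county's 2013 adult population to all of its years
wt_2013 = prep.loc[prep["year"] == 2013].set_index("county_code")["population_20_64"]
prep["set_wt"] = prep["county_code"].map(wt_2013)

# treat_year: yaca when it falls in 2014-2019, else 0 (never-treated coding)
in_window = prep["yaca"].between(2014, 2019)  # NaN compares as False
prep["treat_year"] = prep["yaca"].where(in_window, 0).astype(int)
prep["Treat_2014"] = (prep["yaca"] == 2014).astype(int)
prep["Post"] = (prep["year"] >= 2014).astype(int)

print(prep[["county_code", "year", "perc_white", "set_wt",
            "treat_year", "Treat_2014", "Post"]])
```

Holding set_wt at its 2013 value means a county's weight cannot move with post-expansion population growth.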
The datasets
Each dataset has a full variable dictionary followed by a summary-statistics table with data coverage. raw_data comes first, then data_prepared.
Variable dictionary (raw_data)
| Variable | Type | Label | Definition | Construction | Units | Source | Coverage |
|---|---|---|---|---|---|---|---|
| state | identifier | State name | Full name of the U.S. state the county belongs to. | From the CDC mortality file. | string | CDC WONDER | raw file |
| stfips | identifier | State FIPS code | Numeric federal (FIPS) identifier for the state (1-56). | From the CDC mortality file. | FIPS | CDC WONDER | raw file |
| county | identifier | County name (with state abbrev.) | County name followed by its two-letter state abbreviation, e.g. "Autauga County, AL". | From the CDC mortality file; the trailing two characters give state_abb in the prepared data. | string | CDC WONDER | both files |
| county_code | identifier | County FIPS code | Five-digit federal (FIPS) county identifier; the unit id for the panel. | From the CDC mortality file. | FIPS | CDC WONDER | both files |
| year | year | Calendar year | Annual time index of the observation. | From the CDC mortality file (2009-2019). | year | CDC WONDER | both files |
| year_code | year | Year code (CDC) | CDC WONDER's year code; numerically equal to the calendar year here. | From the CDC mortality file. | year | CDC WONDER | raw file |
| deaths | continuous | Deaths, adults 20-64 | Count of deaths among adults aged 20-64 in the county-year. | From the CDC mortality file (numerator of the crude rate). | count | CDC WONDER | raw file |
| population_20_64 | continuous | Population aged 20-64 | County adult population aged 20-64 (denominator of the crude mortality rate). | From the CDC mortality file. | persons | CDC WONDER | both files |
| crude_rate_20_64 | continuous | Crude mortality rate, adults 20-64 | Deaths per 100,000 adults aged 20-64; the DiD outcome variable. | 100,000 x deaths / population_20_64 (as supplied by CDC WONDER). | per 100,000 | CDC WONDER | both files |
| population_total | continuous | Total county population | Total resident population of the county-year (all ages). | From the CDC mortality file. | persons | CDC WONDER | raw file |
| population_20_64_hispanic | continuous | Population 20-64, Hispanic | Adults aged 20-64 identifying as Hispanic (numerator for perc_hispanic). | From the CDC mortality file. | persons | CDC WONDER | raw file |
| population_20_64_female | continuous | Population 20-64, female | Female adults aged 20-64 (numerator for perc_female). | From the CDC mortality file. | persons | CDC WONDER | raw file |
| population_20_64_white | continuous | Population 20-64, white | White adults aged 20-64 (numerator for perc_white). | From the CDC mortality file. | persons | CDC WONDER | raw file |
| unemployed | continuous | Number unemployed | County count of unemployed persons in the labor force. | From the county labor-market series in the merged file. | persons | BLS (merged) | raw file |
| labor_force | continuous | Civilian labor force | County civilian labor force count (denominator of the raw unemployment rate). | From the county labor-market series in the merged file. | persons | BLS (merged) | raw file |
| unemp_rate | continuous | Unemployment rate | County unemployment rate (baseline covariate). | Stored as a fraction in the raw file; rescaled to percent (x100) in the prepared data. | fraction (raw) / % (prepared) | Derived (BLS) | both files |
| poverty_rate | continuous | Poverty rate (%) | Share of the county population below the federal poverty line. | From the county socioeconomic series in the merged file (already in percent). | % | Census (merged) | both files |
| median_income | continuous | Median household income | County median household income; in the raw file expressed in US$, rescaled to thousands of US$ in the prepared data. | From the county socioeconomic series; the prepared data divides by 1,000. | US$ (raw) / US$ 000s (prepared) | Census (merged) | both files |
| expansion_status | identifier | ACA Medicaid expansion status (text) | Whether the state had adopted and implemented Medicaid expansion. | From the state ACA-expansion timing source. | category | KFF / ACA timing | raw file |
| Description | identifier | Expansion description (free text) | Free-text note on the state's expansion implementation (date, retroactivity, etc.). | From the state ACA-expansion timing source. | string | KFF / ACA timing | raw file |
| yaca | year | Year of ACA Medicaid expansion | Calendar year the state implemented Medicaid expansion; missing (NA) for never-expanders. | Parsed from the state ACA-expansion timing source; arrives as a string with "NA" sentinels and is coerced to numeric. | year | KFF / ACA timing | both files |
| maca | identifier | Month of ACA Medicaid expansion | Calendar month (1-12) in which the state implemented expansion; missing for never-expanders. | Parsed from the state ACA-expansion timing source. | month (1-12) | KFF / ACA timing | raw file |
Summary statistics (raw_data)
| Variable | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
|---|---|---|---|---|---|---|---|---|
| state | 100% | 31,843 | 51 | – | – | – | – | – |
| stfips | 100% | 31,843 | 51 | – | – | – | – | – |
| county | 100% | 31,843 | 3,064 | – | – | – | – | – |
| county_code | 100% | 31,843 | 3,064 | – | – | – | – | – |
| year | 100% | 31,843 | 11 | 2009 | 2014.0 | 2014 | 2019 | 3.16 |
| year_code | 100% | 31,843 | 11 | 2009 | 2014.0 | 2014 | 2019 | 3.16 |
| deaths | 100% | 31,783 | 1,985 | 0 | 229.8 | 79.00 | 16,188 | 603.6 |
| population_20_64 | 100% | 31,783 | 24,240 | 47.00 | 65,477 | 16,906 | 6,338,759 | 206,172 |
| crude_rate_20_64 | 100% | 31,783 | 30,001 | 0 | 454.1 | 435.9 | 1,883.8 | 158.7 |
| population_total | 100% | 31,783 | 26,782 | 71.00 | 109,908 | 29,358 | 10,170,292 | 336,957 |
| population_20_64_hispanic | 100% | 31,783 | 9,396 | 0 | 11,009 | 671.0 | 3,016,128 | 75,656 |
| population_20_64_female | 100% | 31,783 | 19,955 | 20.00 | 32,949 | 8,370.0 | 3,183,635 | 104,084 |
| population_20_64_white | 100% | 31,783 | 23,587 | 17.00 | 51,224 | 14,695 | 4,558,532 | 149,425 |
| unemployed | 100% | 31,758 | 7,939 | 4.00 | 3,515.2 | 873.0 | 621,950 | 12,818 |
| labor_force | 100% | 31,758 | 23,031 | 43.00 | 54,367 | 13,724 | 5,148,584 | 169,077 |
| unemp_rate | 100% | 31,758 | 31,497 | 0.011 | 0.067 | 0.061 | 0.294 | 0.031 |
| poverty_rate | 100% | 31,777 | 448 | 2.60 | 16.44 | 15.50 | 56.70 | 6.43 |
| median_income | 100% | 31,777 | 22,080 | 18,860 | 47,863 | 45,641 | 151,806 | 13,223 |
| expansion_status | 100% | 31,843 | 2 | – | – | – | – | – |
| Description | 69% | 22,054 | 18 | – | – | – | – | – |
| yaca | 69% | 22,054 | 7 | 2014 | 2016.2 | 2014 | 2023 | 3.10 |
| maca | 69% | 22,054 | 8 | – | – | – | – | – |
Variable dictionary (data_prepared)
| Variable | Type | Label | Definition | Construction | Units | Source | Coverage |
|---|---|---|---|---|---|---|---|
| state_abb | identifier | State abbreviation | Two-letter U.S. state postal abbreviation. | Last two characters of the county string (str_sub). | code | Derived | prepared file |
| county | identifier | County name (with state abbrev.) | County name followed by its two-letter state abbreviation, e.g. "Autauga County, AL". | From the CDC mortality file; the trailing two characters give state_abb in the prepared data. | string | CDC WONDER | both files |
| county_code | identifier | County FIPS code | Five-digit federal (FIPS) county identifier; the unit id for the panel. | From the CDC mortality file. | FIPS | CDC WONDER | both files |
| year | year | Calendar year | Annual time index of the observation. | From the CDC mortality file (2009-2019). | year | CDC WONDER | both files |
| population_20_64 | continuous | Population aged 20-64 | County adult population aged 20-64 (denominator of the crude mortality rate). | From the CDC mortality file. | persons | CDC WONDER | both files |
| yaca | year | Year of ACA Medicaid expansion | Calendar year the state implemented Medicaid expansion; missing (NA) for never-expanders. | Parsed from the state ACA-expansion timing source; arrives as a string with "NA" sentinels and is coerced to numeric. | year | KFF / ACA timing | both files |
| crude_rate_20_64 | continuous | Crude mortality rate, adults 20-64 | Deaths per 100,000 adults aged 20-64; the DiD outcome variable. | 100,000 x deaths / population_20_64 (as supplied by CDC WONDER). | per 100,000 | CDC WONDER | both files |
| perc_female | continuous | Female share, adults 20-64 (%) | Percent of the county's 20-64 population that is female (baseline covariate). | 100 x population_20_64_female / population_20_64. | % | Derived (CDC) | prepared file |
| perc_white | continuous | White share, adults 20-64 (%) | Percent of the county's 20-64 population that is white (baseline covariate). | 100 x population_20_64_white / population_20_64. | % | Derived (CDC) | prepared file |
| perc_hispanic | continuous | Hispanic share, adults 20-64 (%) | Percent of the county's 20-64 population that is Hispanic (baseline covariate). | 100 x population_20_64_hispanic / population_20_64. | % | Derived (CDC) | prepared file |
| unemp_rate | continuous | Unemployment rate (%) | County unemployment rate (baseline covariate). | Raw fractional rate rescaled to percent (x100) in the prepared data. | fraction (raw) / % (prepared) | Derived (BLS) | both files |
| poverty_rate | continuous | Poverty rate (%) | Share of the county population below the federal poverty line. | From the county socioeconomic series in the merged file (already in percent). | % | Census (merged) | both files |
| median_income | continuous | Median household income | County median household income; in the raw file expressed in US$, rescaled to thousands of US$ in the prepared data. | From the county socioeconomic series; the prepared data divides by 1,000. | US$ (raw) / US$ 000s (prepared) | Census (merged) | both files |
| set_wt | continuous | Fixed 2013 adult population weight | Each county's 2013 population aged 20-64, held constant across all 11 years (the population weight). | population_20_64 in 2013, broadcast to every year of the county so weighting does not conflate population growth with mortality change. | persons | Derived (CDC) | prepared file |
| treat_year | identifier | Treatment year (did convention) | Year the county's state expanded Medicaid (2014/2015/2016/2019), or 0 for never-treated counties. | yaca if 2014 <= yaca <= 2019, else 0 (the did package's never-treated coding). | year / 0 | Derived | prepared file |
| Treat_2014 | dummy | 2014-cohort treatment dummy | 1 if the county's state expanded Medicaid in 2014, else 0. | 1 if yaca == 2014, else 0. | 0/1 | Derived | prepared file |
| Post | dummy | Post-2014 period dummy | 1 for years 2014 and later, else 0 (the post period in the 2x2 design). | 1 if year >= 2014, else 0. | 0/1 | Derived | prepared file |
Summary statistics (data_prepared)
| Variable | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
|---|---|---|---|---|---|---|---|---|
| state_abb | 100% | 28,644 | 46 | – | – | – | – | – |
| county | 100% | 28,644 | 2,604 | – | – | – | – | – |
| county_code | 100% | 28,644 | 2,604 | – | – | – | – | – |
| year | 100% | 28,644 | 11 | 2009 | 2014.0 | 2014 | 2019 | 3.16 |
| population_20_64 | 100% | 28,644 | 22,256 | 1,793.0 | 65,737 | 18,232 | 6,338,759 | 207,493 |
| yaca | 68% | 19,459 | 7 | 2014 | 2016.2 | 2014 | 2023 | 3.10 |
| crude_rate_20_64 | 100% | 28,644 | 27,553 | 72.33 | 458.3 | 441.6 | 1,560.7 | 153.7 |
| perc_female | 100% | 28,644 | 28,483 | 24.15 | 49.37 | 50.02 | 60.28 | 3.05 |
| perc_white | 100% | 28,644 | 28,599 | 10.10 | 84.95 | 91.72 | 99.70 | 16.49 |
| perc_hispanic | 100% | 28,644 | 28,532 | 0.150 | 8.39 | 3.65 | 96.43 | 12.86 |
| unemp_rate | 100% | 28,644 | 28,484 | 1.07 | 6.83 | 6.23 | 29.41 | 3.08 |
| poverty_rate | 100% | 28,644 | 438 | 2.60 | 16.66 | 15.90 | 50.40 | 6.45 |
| median_income | 100% | 28,644 | 20,550 | 20.99 | 47.62 | 45.24 | 151.8 | 13.23 |
| set_wt | 100% | 28,644 | 2,534 | 1,891.0 | 65,530 | 18,408 | 6,221,536 | 206,632 |
| treat_year | 100% | 28,644 | 5 | – | – | – | – | – |
| Treat_2014 | 100% | 28,644 | 2 | 0 | 0.376 | 0 | 1.00 | 0.484 |
| Post | 100% | 28,644 | 2 | 0 | 0.545 | 1.00 | 1.00 | 0.498 |
Known limitations & caveats
- Pedagogical, not definitive. The authors of the replicated guide flag this case as illustrative: "The results are pedagogical in spirit and do not represent the best possible estimates of Medicaid's effect on adult mortality."
- Underpowered. None of the six 2x2 covariate-adjusted 95% confidence intervals excludes zero; the data cannot settle the policy question.
- Weighting changes the estimand. Unweighted (ATT for the typical treated county) and population-weighted (ATT for the typical treated adult) answer different causal questions; the 2x2 ATT(2014) flips sign between them (+0.122 vs -2.563).
- Crude, not age-adjusted. The outcome is the CDC crude death rate for ages 20-64, not an age-adjusted rate, so compositional differences across cohorts are not removed.
- Small cohorts are noisy. The 2015, 2016, and 2019 expansion cohorts are small (171 / 93 / 140 counties); cohort-specific estimates carry wide confidence intervals.
- Tutorial bootstrap. The Callaway-Sant'Anna bootstrap uses 2,000 iterations for speed (the reference scripts use 25,000), affecting the third significant figure of each CI.