Interactive data dictionary

Staggered Synthetic Difference-in-Differences: Gender Quotas and Women in Parliament

The quota_example panel from the Stata sdid package — 119 countries, 1990–2015, with staggered quota adoption.

1 dataset · 7 variables · 119 countries · years 1990–2015

Downloads

Each dataset is available as a labeled Stata .dta and its source file.

Download all data (ZIP) · stata_codebook.do

| Dataset | Grain | Rows × variables | Stata file | Source file |
|---|---|---|---|---|
| quota_example | country-year | 3,094 × 7 | quota_example.dta | quota_example.dta |

Run stata_codebook.do in Stata once to attach long-form per-variable notes to the .dta files.

Load directly in code

Every file loads straight from GitHub (raw URLs). Swap the file name to load any dataset.

Stata

* Stata 14+ : `use` reads an https URL directly
global BASE "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/stata_sdid_staggered/data/"
use "${BASE}quota_example.dta", clear
describe
notes

Python

!pip install -q pyreadstat
import pandas as pd
BASE = "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/stata_sdid_staggered/data/"
df = pd.read_stata(BASE + "quota_example.dta")

# load every dataset at once
files = ["quota_example"]
data = {f: pd.read_stata(BASE + f + ".dta") for f in files}

# pyreadstat (richest metadata) reads LOCAL files -> download first
import pyreadstat, urllib.request
urllib.request.urlretrieve(BASE + "quota_example.dta", "quota_example.dta")
df, meta = pyreadstat.read_dta("quota_example.dta")

Copy and paste this snippet into a Google Colab notebook: https://colab.research.google.com/notebooks/empty.ipynb

R

# R : haven::read_dta auto-downloads an https URL
library(haven)
BASE <- "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/stata_sdid_staggered/data/"
df <- read_dta(paste0(BASE, "quota_example.dta"))

Overview & sources

Companion data for a Stata tutorial that extends synthetic difference-in-differences (SDID) to staggered adoption, where units adopt treatment at different times. The single file is quota_example.dta, the balanced panel distributed with the sdid package (Bhalotra, Clarke, Gomes & Venkataramani, 2023): 119 countries observed annually from 1990 to 2015 (3,094 observations). The outcome is the share of seats held by women in the national parliament; the treatment is the adoption of a reserved-seat gender quota (absorbing — once adopted it stays on); the covariate is log GDP per capita. Treatment is staggered: 9 countries adopt a quota across 7 cohorts (2000, 2002, 2003, 2005, 2010, 2012, 2013) and 110 countries remain never-treated, forming the donor pool. The post estimates a separate, clean SDID per cohort against the never-treated controls, aggregates the cohort effects into an overall ATT of +8.0 percentage points, and complements it with the sdid_event event study and bootstrap, jackknife, and placebo inference.

One file, balanced panel. quota_example is an annual country panel (one row per country × year): 119 countries × 26 years = 3,094 rows, with no gaps in the outcome or treatment. Because country is stored as a string, Stata's xtset needs a numeric panel identifier: encode it first (for example, encode country, generate(id), where id is an arbitrary name) and then run xtset id year. The treatment quota is absorbing and switches on for only ~3% of country-years; quotaYear records each adopting country's cohort (missing for the 110 never-treated countries); lngdp has 104 missing values, which matter only when it is used as a covariate.
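The panel arithmetic above can be checked without downloading the data. The sketch below rebuilds the treatment paths from the adoption cohorts listed in this dictionary (two countries each in 2002 and 2003, one in every other cohort). It assumes that quota equals 1 from the adoption year onward, and it uses placeholder labels rather than the real country names.

```python
# Rebuild absorbing treatment paths from the adoption cohorts in this dictionary.
YEARS = range(1990, 2016)   # 26 annual observations, 1990-2015
N_COUNTRIES = 119

# cohort year -> number of adopting countries (from the quotaYear entry)
COHORT_SIZES = {2000: 1, 2002: 2, 2003: 2, 2005: 1, 2010: 1, 2012: 1, 2013: 1}

# Placeholder labels (hypothetical); the real country names live in the data file.
adoption_year = {}
for cohort, size in COHORT_SIZES.items():
    for k in range(size):
        adoption_year[f"adopter_{cohort}_{k + 1}"] = cohort

# Assumption: quota = 1 in every year from adoption onward, 0 before.
quota_path = {
    name: [int(year >= first) for year in YEARS]
    for name, first in adoption_year.items()
}

# Absorbing means a path never drops from 1 back to 0.
is_absorbing = all(
    later >= earlier
    for path in quota_path.values()
    for earlier, later in zip(path, path[1:])
)

n_rows = N_COUNTRIES * len(YEARS)                         # all country-years
treated_rows = sum(sum(path) for path in quota_path.values())
treated_share = treated_rows / n_rows
ever_treated_rows = len(quota_path) * len(YEARS)          # rows with a cohort year

print(n_rows, treated_rows, round(treated_share, 3), ever_treated_rows)
```

Under these assumptions the sketch gives 3,094 rows, 94 treated country-years (about 3%), and 234 rows belonging to the 9 ever-treated countries, which is the count of non-missing quotaYear values.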

Data sources

| Source | Provides | Reference / URL |
|---|---|---|
| quota_example (sdid package) | The analysis panel: women-in-parliament outcome, gender-quota treatment, log GDP, quota-adoption year | Bhalotra, S., Clarke, D., Gomes, J. F., & Venkataramani, A. (2023). Maternal Mortality and Women's Political Power. Journal of the European Economic Association. https://doi.org/10.1093/jeea/jvad043 |
| sdid (Stata package) | The estimator and the distributed example dataset (webuse quota_example) | Clarke, D., Pailañir, D., Athey, S., & Imbens, G. (2024). On Synthetic Difference-in-Differences and Related Estimation Methods in Stata. The Stata Journal, 24(4). ssc install sdid. |
| Method references | Estimators and concepts | Arkhangelsky, Athey, Hirshberg, Imbens & Wager (2021) for SDID; Goodman-Bacon (2021); de Chaisemartin & D'Haultfœuille (2020); Ciccia, Clarke & Pailañir (2024) for sdid_event. |

Cite this data

Please cite this dataset as follows.

APA

Mendez, C. (2026). Staggered Synthetic Difference-in-Differences (SDID) in Stata: Gender Quotas and Women in Parliament [Data set]. https://carlos-mendez.org/post/stata_sdid_staggered/

Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., & Wager, S. (2021). Synthetic Difference-in-Differences. American Economic Review, 111(12), 4088–4118. https://doi.org/10.1257/aer.20190159

Clarke, D., Pailañir, D., Athey, S., & Imbens, G. (2024). On Synthetic Difference-in-Differences and Related Estimation Methods in Stata. The Stata Journal, 24(4). https://doi.org/10.1177/1536867X241297184

Bhalotra, S., Clarke, D., Gomes, J. F., & Venkataramani, A. (2023). Maternal Mortality and Women's Political Power. Journal of the European Economic Association. https://doi.org/10.1093/jeea/jvad043 (source of the quota_example data).

BibTeX

@misc{mendez2026statasdidstaggered,
  author       = {Mendez, Carlos},
  title        = {Staggered Synthetic Difference-in-Differences (SDID) in Stata: Gender Quotas and Women in Parliament},
  year         = {2026},
  howpublished = {\url{https://carlos-mendez.org/post/stata_sdid_staggered/}},
  note         = {Data set}
}

@article{arkhangelsky2021sdid,
  author  = {Arkhangelsky, Dmitry and Athey, Susan and Hirshberg, David A. and Imbens, Guido W. and Wager, Stefan},
  title   = {Synthetic Difference-in-Differences},
  journal = {American Economic Review},
  volume  = {111}, number = {12}, pages = {4088--4118}, year = {2021},
  doi     = {10.1257/aer.20190159}
}
@article{clarke2024sdid,
  author  = {Clarke, Damian and Paila{\~n}ir, Daniel and Athey, Susan and Imbens, Guido},
  title   = {On Synthetic Difference-in-Differences and Related Estimation Methods in Stata},
  journal = {The Stata Journal},
  volume  = {24}, number = {4}, year = {2024},
  doi     = {10.1177/1536867X241297184}
}
@article{bhalotra2023maternal,
  author  = {Bhalotra, Sonia and Clarke, Damian and Gomes, Joseph F. and Venkataramani, Atheendar},
  title   = {Maternal Mortality and Women's Political Power},
  journal = {Journal of the European Economic Association},
  year    = {2023},
  doi     = {10.1093/jeea/jvad043}
}

Variable explorer (all 7 variables)

| Variable | Type | Distribution | Label | Definition | Units | In files | Source |
|---|---|---|---|---|---|---|---|
| country | identifier | | Country | Country name; the panel unit (i). | string | quota_example | quota_example (Bhalotra et al. 2023) |
| lngdp | continuous | min 5.87, median 9.21, max 11.6 | Log GDP per capita | Natural log of GDP per capita; the covariate (X). | log GDP | quota_example | quota_example (Bhalotra et al. 2023) |
| lnmmrt | continuous | min 1.1, median 4.25, max 7.24 | Maternal mortality | Natural log of the maternal mortality ratio (ships with the dataset; not used in the post's quota analysis). | log ratio | quota_example | quota_example (Bhalotra et al. 2023) |
| quota | dummy | share coded 1 = 0.030 | Parliamentary gender quota (=1) | Treatment indicator: 1 once a country has a reserved-seat gender quota, 0 before / never. | 0/1 | quota_example | quota_example (Bhalotra et al. 2023) |
| quotaYear | year | | Year quota adopted (cohort) | First year a country is treated (its adoption cohort); missing for the 110 never-treated countries. | year | quota_example | quota_example (Bhalotra et al. 2023) |
| womparl | continuous | min 0, median 12, max 63.8 | Women in parliament | Percentage of seats held by women in the national (lower) parliament; the outcome. | % of seats | quota_example | quota_example (Bhalotra et al. 2023) |
| year | year | | Year | Calendar year; the panel time index (t). | year | quota_example | quota_example (Bhalotra et al. 2023) |

Cross-file variable index

Which file each variable appears in (● = present).

| Variable | quota_example |
|---|---|
| country | ● |
| lngdp | ● |
| lnmmrt | ● |
| quota | ● |
| quotaYear | ● |
| womparl | ● |
| year | ● |

Construction & formulas

The estimand is the average treatment effect on the treated (ATT) — the effect of adopting a quota on the women-in-parliament share, in the countries that adopted one, averaged over their post-adoption years:

τ = [ 1 / (N_tr · T_post) ] · Σ_(i: W_i=1) Σ_(t>T_pre) [ Y_it(1) − Y_it(0) ]

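SDID (Arkhangelsky et al., 2021) is a weighted two-way fixed-effects regression. It chooses the ATT τ, a constant μ, unit fixed effects α_i, and time fixed effects β_t to minimize a weighted sum of squared residuals, weighting each observation by a unit weight ω_i times a time weight λ_t:

(τ̂, μ̂, α̂, β̂) = argmin_(τ, μ, α, β) Σ_i Σ_t ( Y_it − μ − α_i − β_t − W_it · τ )² · ω_i · λ_t

Here W_it = 1 for a treated unit (W_i = 1) in a post-adoption period (t > T_pre) and 0 otherwise.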
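For a single adoption date, the weighted regression described above can be written as a weighted double difference: the treated units' change from a λ-weighted pre-period average to their post-period average, minus the same change for an ω-weighted combination of controls. The sketch below illustrates only that last step. The weights are supplied by hand as an assumption, whereas the sdid package estimates them from the pre-treatment data, and the function name and toy numbers are hypothetical.

```python
def sdid_double_difference(y_treated, y_controls, omega, lam, n_pre):
    """Weighted double difference for one adoption date.

    y_treated  : list of outcome paths, one list per treated unit
    y_controls : list of outcome paths, one list per control unit
    omega      : unit weights over the controls (assumed given, sum to 1)
    lam        : time weights over the pre-periods (assumed given, sum to 1)
    n_pre      : number of pre-treatment periods in each path
    """
    def pre_average(path):
        # lambda-weighted average of the pre-treatment outcomes
        return sum(l * y for l, y in zip(lam, path[:n_pre]))

    def post_average(path):
        # simple average of the post-treatment outcomes
        post = path[n_pre:]
        return sum(post) / len(post)

    def change(path):
        return post_average(path) - pre_average(path)

    treated_change = sum(change(p) for p in y_treated) / len(y_treated)
    control_change = sum(w * change(p) for w, p in zip(omega, y_controls))
    return treated_change - control_change


# Toy data (made up): 3 pre-periods, 2 post-periods.
treated = [[15, 16, 17, 23, 24]]
controls = [[10, 11, 12, 13, 14], [20, 20, 20, 20, 20]]

tau = sdid_double_difference(
    treated, controls,
    omega=[0.5, 0.5],      # hand-picked unit weights, not estimated
    lam=[0.0, 0.5, 0.5],   # hand-picked time weights, not estimated
    n_pre=3,
)
print(tau)
```

With uniform weights (ω_i = 1/N_co and λ_t = 1/T_pre) the same function reduces to the ordinary difference-in-differences estimate.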

Staggered extension. Run single-cohort SDID once per adoption cohort a, using only that cohort's treated units and the never-treated controls, to obtain τ_a. Then aggregate with non-negative weights equal to each cohort's share of treated unit-periods: ATT = Σ_a w_a · τ_a, with w_a = (N_tr^a · T_post^a) / Σ_b (N_tr^b · T_post^b). Because each cohort is compared only to never-treated controls, an already-treated unit is never used as a control for a later adopter, which is the contamination that breaks naive TWFE under staggered timing.
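The aggregation step can be sketched in a few lines. The cohort sizes below come from this data dictionary, and T_post is assumed to run from the adoption year through 2015 inclusive. The cohort effects in tau_by_cohort are hypothetical placeholders for illustration, not estimates from the post.

```python
# Aggregate cohort-specific SDID estimates into an overall ATT.
LAST_YEAR = 2015

# cohort year -> number of adopting countries (from this data dictionary)
COHORT_SIZES = {2000: 1, 2002: 2, 2003: 2, 2005: 1, 2010: 1, 2012: 1, 2013: 1}

# HYPOTHETICAL cohort effects (percentage points); replace with real estimates.
tau_by_cohort = {2000: 10.0, 2002: 6.0, 2003: 9.0, 2005: 7.0,
                 2010: 12.0, 2012: 5.0, 2013: 4.0}


def treated_unit_periods(cohort, n_treated):
    """N_tr^a * T_post^a, counting the adoption year as the first treated year."""
    t_post = LAST_YEAR - cohort + 1
    return n_treated * t_post


cells = {a: treated_unit_periods(a, n) for a, n in COHORT_SIZES.items()}
total = sum(cells.values())
weights = {a: c / total for a, c in cells.items()}   # non-negative, sum to 1

att = sum(weights[a] * tau_by_cohort[a] for a in weights)

for a in sorted(weights):
    print(a, cells[a], round(weights[a], 3))
print("ATT:", round(att, 2))
```

Early and larger cohorts dominate the average because they contribute more treated unit-periods: under these assumptions the two 2002 adopters carry 28 of the 94 treated country-years, while the single 2013 adopter carries 3.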

The datasets

This page documents a single dataset, quota_example. Its entry below gives the full variable dictionary plus a statistics table with distribution summaries and data coverage.

quota_example · country-year · 3,094 × 7 · 1990–2015 · 119 countries (9 ever-treated, 110 never-treated)

Panel key: country × year. Purpose: estimate the effect of gender quotas on women in parliament via staggered SDID.

Variable dictionary

| Variable | Type | Label | Definition | Construction | Units | Source | Coverage |
|---|---|---|---|---|---|---|---|
| womparl | continuous | Women in parliament | Percentage of seats held by women in the national (lower) parliament; the outcome. | Distributed with the quota_example dataset; observed annually per country. | % of seats | quota_example (Bhalotra et al. 2023) | all 3,094 country-years |
| lnmmrt | continuous | Maternal mortality | Natural log of the maternal mortality ratio (ships with the dataset; not used in the post's quota analysis). | Distributed with the quota_example dataset. | log ratio | quota_example (Bhalotra et al. 2023) | 3,068 country-years (26 missing) |
| country | identifier | Country | Country name; the panel unit (i). | 119 countries; 9 ever adopt a quota, 110 never treated (the donor pool). | string | quota_example (Bhalotra et al. 2023) | 119 countries |
| year | year | Year | Calendar year; the panel time index (t). | Annual, 1990–2015 (26 years), balanced across all countries. | year | quota_example (Bhalotra et al. 2023) | 1990–2015 |
| quota | dummy | Parliamentary gender quota (=1) | Treatment indicator: 1 once a country has a reserved-seat gender quota, 0 before / never. | Absorbing: switches to 1 in the adoption year and stays on; 1 for ~3% of country-years. | 0/1 | quota_example (Bhalotra et al. 2023) | all 3,094 country-years |
| lngdp | continuous | Log GDP per capita | Natural log of GDP per capita; the covariate (X). | Distributed with the quota_example dataset; used in the optimized/projected covariate specifications. | log GDP | quota_example (Bhalotra et al. 2023) | 2,990 country-years (104 missing) |
| quotaYear | year | Year quota adopted (cohort) | First year a country is treated (its adoption cohort); missing for the 110 never-treated countries. | Cohorts: 2000, 2002, 2003, 2005, 2010, 2012, 2013 (two countries each in 2002 and 2003, one in the rest). | year | quota_example (Bhalotra et al. 2023) | 234 country-years (all 26 years of the 9 ever-treated countries); missing for the 110 never-treated |

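Distribution & statistics

| Variable | Distribution | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
|---|---|---|---|---|---|---|---|---|---|
| womparl | min 0, median 12, max 63.8 | 100% | 3,094 | 449 | 0 | 14.97 | 12.00 | 63.80 | 10.97 |
| lnmmrt | min 1.1, median 4.25, max 7.24 | 99% | 3,068 | 680 | 1.10 | 4.19 | 4.25 | 7.24 | 1.59 |
| country | | 100% | 3,094 | 119 | | | | | |
| year | | 100% | 3,094 | 26 | 1990 | 2002.5 | 2002 | 2015 | 7.50 |
| quota | share coded 1 = 0.030 | 100% | 3,094 | 2 | 0 | 0.030 | 0 | 1.00 | 0.172 |
| lngdp | min 5.87, median 9.21, max 11.6 | 97% | 2,990 | 2,956 | 5.87 | 9.15 | 9.21 | 11.62 | 1.14 |
| quotaYear | | 8% | 234 | 7 | 2000 | 2005.6 | 2003 | 2013 | 4.56 |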
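The Coverage column follows directly from the N column: it is the non-missing count divided by the 3,094 country-years in the panel. A minimal sketch of that arithmetic, using the N values reported above (the variable names in the code are illustrative):

```python
# Recompute coverage and missing counts from the reported N values.
TOTAL_ROWS = 119 * 26   # 3,094 country-years in the balanced panel

# Non-missing observations per variable, copied from the statistics table.
non_missing = {
    "womparl": 3094,
    "lnmmrt": 3068,
    "country": 3094,
    "year": 3094,
    "quota": 3094,
    "lngdp": 2990,
    "quotaYear": 234,
}

missing = {name: TOTAL_ROWS - n for name, n in non_missing.items()}
coverage_pct = {name: round(100 * n / TOTAL_ROWS) for name, n in non_missing.items()}

for name in non_missing:
    print(f"{name:<10} N={non_missing[name]:>5}  "
          f"missing={missing[name]:>5}  coverage={coverage_pct[name]}%")
```

The 2,860 missing quotaYear values are exactly the 110 never-treated countries × 26 years, so the low 8% coverage reflects the design of the variable rather than a data-quality problem.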

Known limitations & caveats