Partial Identification — Interactive Lab

A pedagogical companion to Introduction to Partial Identification: Bounding Causal Effects Under Unmeasured Confounding

Why bounds, not a single number?

A job-training program may help workers find jobs, or it may not — and an unmeasured confounder (prior work experience) makes a single estimate impossible to trust. Partial identification answers the next-best question: what range of values is the true effect consistent with, given only the data and minimal assumptions? The honest answer in this post is a wide interval [-0.298, +0.702] — Manski's no-assumption bounds — that contains the true ATE of 0.27 but is too wide to settle the question.

The animation below shows the three layers of certainty. A naive point estimate pretends the confounder is gone. The Manski bounds never move, because they reflect fundamental uncertainty. The entropy bounds shrink as you add an information-theoretic assumption, all the way to a point if you are willing to assume everything. The other tabs let you turn those dials yourself.

Three layers of certainty about the ATE

The orange dot appears only when the assumptions are strong enough to collapse the bound to a single point, which is a fantasy in this post because the confounder is real. Notice that the steel-blue Manski bar never budges: more assumptions, not more data, are what narrow the bounds.

Manski (steel) stays fixed. Entropy (teal) shrinks as you tighten the assumption. The point estimate (orange) requires the strongest assumption: no confounding at all.
Tab 2

Bounds Showdown

The real numbers from the post. Toggle Manski, Autobound, Tian-Pearl, and entropy. Hover for width, lower/upper, and whether the bound covers the true ATE.

Tab 3

Confounding Simulator

Tune confounder prevalence, treatment effect, and confounder strength. Watch the naive estimate drift while the Manski bounds bracket the truth every time.

Tab 4

Sample-size Sensitivity

The headline misconception: bigger N does not narrow identification bounds. See the Manski width stay flat at 1.0 from N = 100 to N = 5,000.

Glossary (open a card if a term is unfamiliar)

Point identification
Under strong assumptions (e.g. random assignment, no unmeasured confounders), the data pin the causal effect to a single number.
Partial identification
When assumptions fail, the data alone deliver a range [L, U] consistent with everything you can observe. The narrower, the more informative.
Manski bounds
No-assumption bounds. Plug observed conditional probabilities into the law of total probability and use the worst case for unobservables. Width = 1 for binary outcomes by construction.
Autobound (LP)
Linear-programming bounds that find the tightest interval consistent with every probability constraint. For binary confounded scenarios, equal to Manski — confirming Manski is already sharp.
Entropy bounds
Add the constraint H(U | X, Y) ≤ θ. A smaller θ lets the hidden confounder redistribute only a little probability mass, which tightens the bounds.
Tian-Pearl bounds
Sharp closed-form bounds for counterfactual quantities like PNS (probability of necessity and sufficiency). Use the joint structure more aggressively than Manski.
PNS
P(Y(1)=1 ∩ Y(0)=0). The probability that a worker would succeed with treatment and would not succeed without it, so that treatment is both necessary and sufficient for success. This is individual-level causation.
Bound width
U − L. For the ATE on a binary outcome, a width of 1.0 always straddles zero, so it cannot sign the effect; a width of 0.2 might be enough to act on. Narrower bounds are more decision-relevant.
Coverage
Across many simulated datasets, does the bound contain the true parameter at the advertised rate? In this post all three methods reach 100% coverage over 100 sims.
ATE
Average Treatment Effect. E[Y(1)] − E[Y(0)]. True ATE in this post's DGP = 0.27.
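
The "width = 1 by construction" claim in the Manski entry can be checked with a short derivation. The notation below (binary treatment X, binary outcome Y, bounds L and U) is chosen here for illustration and may differ from the post's symbols. Each unobserved counterfactual probability is only known to lie in [0, 1], so the worst cases are:

```latex
\begin{align*}
E[Y(1)] &= P(Y=1, X=1) + P(Y(1)=1 \mid X=0)\,P(X=0)
  \in \bigl[\,P(Y=1,X=1),\; P(Y=1,X=1) + P(X=0)\,\bigr], \\
E[Y(0)] &= P(Y=1, X=0) + P(Y(0)=1 \mid X=1)\,P(X=1)
  \in \bigl[\,P(Y=1,X=0),\; P(Y=1,X=0) + P(X=1)\,\bigr], \\
L &= P(Y=1,X=1) - P(Y=1,X=0) - P(X=1), \\
U &= P(Y=1,X=1) + P(X=0) - P(Y=1,X=0), \\
U - L &= P(X=0) + P(X=1) = 1.
\end{align*}
```

Because L ≤ 0 ≤ U always holds in these expressions, the no-assumption interval for a binary outcome can never sign the effect on its own.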

Bounds Showdown — the post's real numbers

These are the exact bounds computed in the post on the same N = 1,000 confounded dataset (true ATE = 0.27). Toggle the outcome (ATE vs. PNS) to switch between the average effect and the probability of necessity-and-sufficiency. Hover a bar for the upper/lower endpoints, width, and whether the bound contains the true ATE.

Outcomes

true ATE: 0.27 (from known DGP)
naive estimate: upward biased by confounding
sample size N: 1,000 simulated workers
coverage @ 100 sims: 100% for all three methods

What to look for

  • Manski and Autobound coincide on the ATE. The blue and orange bars overlap exactly at [-0.298, 0.702] — linear programming confirms Manski is already sharp. No clever LP trick can do better without extra assumptions.
  • Entropy buys a 32% width reduction. Adding H(U | X, Y) ≤ 0.1 narrows the ATE band to [-0.228, 0.454] (width 0.68). It still crosses zero — the data, even with the entropy assumption, cannot confidently sign the effect.
  • Switch to PNS. Tian-Pearl gives [0, 0.702] — the lower bound pinning at zero means we cannot rule out that training is never individually necessary-and-sufficient. Interestingly, the entropy constraint is less effective for counterfactual PNS than for ATE.
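
The width and coverage figures quoted in these bullets can be re-derived from the reported endpoints. The sketch below uses only numbers stated on this page; the helper names `width` and `covers` are made up for illustration.

```python
# Intervals and true ATE exactly as reported on this page.
TRUE_ATE = 0.27
BOUNDS = {
    "manski": (-0.298, 0.702),
    "entropy": (-0.228, 0.454),  # with H(U | X, Y) <= 0.1
}


def width(interval):
    """Upper endpoint minus lower endpoint."""
    lower, upper = interval
    return upper - lower


def covers(interval, value):
    """True if value lies inside the closed interval."""
    lower, upper = interval
    return lower <= value <= upper


manski_w = width(BOUNDS["manski"])
entropy_w = width(BOUNDS["entropy"])
reduction = 1 - entropy_w / manski_w  # relative narrowing from the entropy assumption

for name, interval in BOUNDS.items():
    print(
        f"{name}: width={width(interval):.3f} "
        f"covers_truth={covers(interval, TRUE_ATE)} "
        f"crosses_zero={covers(interval, 0.0)}"
    )
print(f"width reduction: {reduction:.1%}")
```

Both intervals contain 0.27 and both contain zero, which is why neither can sign the effect.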

Confounding Simulator — turn the dials yourself

Generate fresh binary data with confounding. The Manski bounds are computed in closed form from the observed contingency table. Slide the confounder strength up and watch the naive estimate drift, while the Manski bounds keep bracketing the truth.
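
A minimal sketch of that closed-form computation, assuming the data arrive as two equal-length lists of 0/1 values. The function name `manski_ate_bounds` and the toy data are illustrative and are not taken from the post's code.

```python
def manski_ate_bounds(x, y):
    """No-assumption bounds on the ATE for binary treatment x and binary outcome y.

    Unobserved counterfactual probabilities are set to 0 or 1, whichever is
    worse for the bound being computed.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")

    # Cells of the observed 2x2 contingency table, as joint probabilities.
    p_y1_x1 = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1) / n
    p_y1_x0 = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1) / n
    p_x1 = sum(x) / n
    p_x0 = 1 - p_x1

    lower = p_y1_x1 - p_y1_x0 - p_x1  # untreated would all fail if treated, treated would all succeed if untreated
    upper = p_y1_x1 + p_x0 - p_y1_x0  # the opposite extreme
    return lower, upper


# Toy data: 4 treated workers (3 employed), 6 untreated (1 employed).
x = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
lower, upper = manski_ate_bounds(x, y)
print(f"[{lower:+.3f}, {upper:+.3f}] width={upper - lower:.3f}")
```

On the toy table the interval is about [-0.2, 0.8], so the width is 1 even with ten observations.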

Sample size N: more data sharpens observed probabilities but does not shrink identification bounds.
Confounder prevalence: share of "experienced" workers in the population.
Treatment effect: direct effect of training on the probability of a job.
Confounder strength β_U: effect of prior experience on getting a job. Larger ⇒ larger naive bias.
true ATE: from your slider settings
naive ATE: difference in observed means
bias (naive − true): grows with β_U
Manski width: stays ≈ 1.0

What to look for

  • Raise the confounder strength β_U. The orange "naive" line drifts away from the teal "true" line. The Manski bound never loses the truth.
  • Drop β_U to 0. The naive estimate snaps to the truth because there is no confounding. The Manski bound still has width ≈ 1 because the data cannot prove that no confounder exists.
  • Change N from 100 to 5,000. The bars barely move. Identification bounds, unlike confidence intervals, do not shrink with sample size.
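
The behaviour described in these bullets can be reproduced offline with a small simulation. The data-generating process below is a hypothetical stand-in, not the post's exact DGP: the treatment-assignment probabilities (0.2 and 0.8) and the baseline job probability (0.1) are assumptions picked so that every probability stays inside [0, 1].

```python
import random


def simulate(n, p_u, tau, beta_u, seed=0):
    """Draw n workers from a hypothetical confounded DGP (assumed, not the post's)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = 1 if rng.random() < p_u else 0            # unmeasured prior experience
        x = 1 if rng.random() < 0.2 + 0.6 * u else 0  # experienced workers train more often
        y = 1 if rng.random() < 0.1 + tau * x + beta_u * u else 0
        rows.append((x, y))                           # u is dropped: the analyst never sees it
    return rows


def naive_ate(rows):
    """Difference in observed employment rates, treated minus untreated."""
    treated = [y for x, y in rows if x == 1]
    control = [y for x, y in rows if x == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def manski_bounds(rows):
    """Closed-form no-assumption ATE bounds from the observed 2x2 table."""
    n = len(rows)
    p_y1_x1 = sum(1 for x, y in rows if x == 1 and y == 1) / n
    p_y1_x0 = sum(1 for x, y in rows if x == 0 and y == 1) / n
    p_x1 = sum(x for x, _ in rows) / n
    return p_y1_x1 - p_y1_x0 - p_x1, p_y1_x1 + (1 - p_x1) - p_y1_x0


TAU = 0.27  # true ATE in this DGP, because the outcome probability is additive in x
for beta_u in (0.0, 0.2, 0.4):
    rows = simulate(n=20_000, p_u=0.5, tau=TAU, beta_u=beta_u)
    naive = naive_ate(rows)
    lo, hi = manski_bounds(rows)
    print(
        f"beta_u={beta_u:.1f} naive={naive:+.3f} bias={naive - TAU:+.3f} "
        f"manski=[{lo:+.3f}, {hi:+.3f}] covers={lo <= TAU <= hi}"
    )
```

Under these assumed settings the population bias works out to 0.6 × β_U, so the naive estimate should sit near the truth at β_U = 0 and drift upward as β_U grows, while the Manski interval keeps containing 0.27.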

Sample-size Sensitivity — what more data buys you

A common misconception: collecting more data narrows partial-identification bounds. This is generally not true. Identification bounds reflect what is fundamentally unknowable without observing the confounder. Below: Manski width vs. N from the post's experiment (30 reps per grid point). The Manski band sits flat at 1.0 across a 50-fold range of N (100 to 5,000); entropy bounds also stabilise around 0.68.

Manski width @ N = 100: no assumptions
Manski width @ N = 5,000: 50× more data, same width
Entropy width @ N = 100: θ = 0.1
Entropy width @ N = 5,000: variance shrinks, mean does not

What to look for

  • Both bands are flat in N. The mean Manski width is exactly 1.0 from N = 100 to N = 5,000. Mean entropy width is ≈ 0.68. Only the variance around those means shrinks with N.
  • Identification ≠ statistical estimation. Confidence intervals shrink at rate 1/√N. Identification bounds do not move at all with N — they encode "what the data could ever tell us, even with infinite samples."
  • To narrow bounds you need structure, not size. Options listed in the post: an instrument (BinaryIV), monotone treatment response, or direct measurement of U. The Tab 3 simulator shows the data-size lever; nothing you do with it narrows the bound.
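
A rough re-creation of the flat-width pattern, using 30 replications per sample size as in the post's experiment. The data-generating process, the seed, and the helper names (`draw`, `width_by_n`) are assumptions made for this sketch; only the Manski bounds are computed, since the entropy bounds need an optimiser.

```python
import random
import statistics


def draw(n, rng):
    """n observed (x, y) pairs from a hypothetical confounded DGP; u is never returned."""
    rows = []
    for _ in range(n):
        u = 1 if rng.random() < 0.5 else 0
        x = 1 if rng.random() < 0.2 + 0.6 * u else 0
        y = 1 if rng.random() < 0.1 + 0.27 * x + 0.3 * u else 0
        rows.append((x, y))
    return rows


def manski_bounds(rows):
    """Closed-form no-assumption ATE bounds from the observed 2x2 table."""
    n = len(rows)
    p_y1_x1 = sum(1 for x, y in rows if x == 1 and y == 1) / n
    p_y1_x0 = sum(1 for x, y in rows if x == 0 and y == 1) / n
    p_x1 = sum(x for x, _ in rows) / n
    return p_y1_x1 - p_y1_x0 - p_x1, p_y1_x1 + (1 - p_x1) - p_y1_x0


def width_by_n(sizes, reps=30, seed=1):
    """For each n, return (mean width, std. dev. of the lower endpoint) over reps draws."""
    rng = random.Random(seed)
    summary = {}
    for n in sizes:
        bounds = [manski_bounds(draw(n, rng)) for _ in range(reps)]
        widths = [hi - lo for lo, hi in bounds]
        lowers = [lo for lo, _ in bounds]
        summary[n] = (statistics.mean(widths), statistics.stdev(lowers))
    return summary


result = width_by_n([100, 1_000, 5_000])
for n, (mean_width, sd_lower) in result.items():
    print(f"N={n:>5} mean width={mean_width:.3f} sd(lower endpoint)={sd_lower:.4f}")
```

The mean width stays at 1 for every N, while the spread of the lower endpoint shrinks: more data pins down where the interval sits, not how wide it is.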

Why does this surprise people?

Most empirical work uses 95% confidence intervals, which behave like [β̂ − 1.96 SE, β̂ + 1.96 SE] and shrink as N grows. Identification bounds answer a different question: what is the set of effects consistent with the data and the assumed structure, even in the limit of infinite data? When the assumed structure is weak (Manski), that set is wide and stays wide. The path to narrower bounds runs through assumptions — not through bigger samples.
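
To put numbers on the contrast, the sketch below computes the width of a normal-approximation 95% confidence interval for a single proportion at the two ends of the N grid. The choice p = 0.5 (the widest case) and the function name `ci_width` are assumptions made for illustration.

```python
import math


def ci_width(p, n, z=1.96):
    """Width of the normal-approximation CI [p - z*SE, p + z*SE] for a proportion."""
    standard_error = math.sqrt(p * (1 - p) / n)
    return 2 * z * standard_error


MANSKI_WIDTH = 1.0  # no-assumption ATE bound width for a binary outcome, at any n

for n in (100, 5_000):
    print(f"N={n:>5} CI width={ci_width(0.5, n):.3f} Manski width={MANSKI_WIDTH:.3f}")

shrink_factor = ci_width(0.5, 100) / ci_width(0.5, 5_000)
print(f"CI shrinks by a factor of {shrink_factor:.2f} (the square root of 50)")
```

Going from N = 100 to N = 5,000 cuts the confidence interval by a factor of about 7, while the identification width does not change.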