RDD Interactive Lab — Tutoring & Exit Exams

A pedagogical companion to Regression Discontinuity Design (RDD) in Stata: Evaluating a Tutoring Program

What is regression discontinuity?

A school district gave every student a standardised entrance exam. Anyone who scored 70 or below was automatically enrolled in a free tutoring program; anyone above was not. The post asks the central question: did the tutoring program improve exit exam scores? Because assignment is a sharp, rule-based jump at score 70, students just below the cutoff are nearly identical to those just above — and the discontinuity at the cutoff identifies the causal effect.

This app lets you turn the dials yourself. In four tabs you will: watch the discontinuity appear as the treatment effect grows; simulate the entire RD pipeline with your own sample size, noise, polynomial order and bandwidth; sweep the bandwidth from h = 5 to h = 20 and watch the LATE move by less than 1 point; and explore the post's robustness forest (bandwidths, kernels, parametric vs nonparametric, and the placebo cutoffs).

The jump at the cutoff

The animation below shows two regression lines — blue for tutored students (left of the cutoff), orange for non-tutored students (right). As the treatment effect τ grows, the blue line rises and the gap at the cutoff opens up. RDD reads off τ from exactly that gap. When τ = 0 the two lines align: no program effect. When τ ≈ 8.6 (the post's rdrobust estimate) the gap is meaningfully large.

Tab 2

RD Simulator

Draw new data with your chosen sample size, noise, polynomial order and bandwidth. Watch τ̂ move and see the local-linear fits on each side.

Tab 3

Bandwidth Lab

Sweep the bandwidth on the post's actual numbers. The estimate barely moves between h = 5 and h = 20 — that's the robustness story.

Tab 4

Robustness Forest

The post's headline robustness check, interactively. Toggle parametric vs nonparametric, bandwidths, kernels and placebo cutoffs.

Three key takeaways from the post

1. Tutoring works: τ̂ ≈ 9–11 points
Parametric OLS gives +10.80 (95% CI 9.22 to 12.38); nonparametric rdrobust gives 8.58 (95% CI 4.54 to 12.14). Both are highly significant (p < 0.001) and represent a 13–16% improvement over the mean exit score of 66.2.
2. The design is credible
100% compliance at the cutoff (perfectly sharp). The McCrary density test (p = 0.58) finds no evidence of manipulation. Placebo cutoffs at 50, 55, 60, 65, 75, 80, 85, 90 all fail to reject the null, so the discontinuity is unique to the true cutoff at 70.
3. The estimate is robust
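Across bandwidths 5 → 20, τ̂ ranges from -8.20 to -9.16, a spread of less than 1 point. Triangular, Uniform and Epanechnikov kernels all yield estimates between -8.20 and -8.58. These rdrobust figures are negative because rdrobust reports the jump from the treated (left) side to the untreated (right) side; in magnitude they match the positive effect reported in takeaway 1. Linear, slope-flexible and quadratic parametric specs all land between 9.2 and 10.8.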
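The percentages and the spread quoted in the takeaways can be checked with a few lines of arithmetic. This is only a sketch that re-uses the figures printed above; the function name `pct_of_mean` is a hypothetical helper, not something from the post.

```python
# Figures quoted in the takeaways above.
mean_exit = 66.2
estimates = {"parametric OLS": 10.80, "rdrobust": 8.58}

def pct_of_mean(effect, mean):
    """Express a point effect as a percentage of the mean outcome."""
    return 100.0 * effect / mean

pct = {name: pct_of_mean(tau, mean_exit) for name, tau in estimates.items()}
for name, value in pct.items():
    print(f"{name}: {value:.1f}% of the mean exit score")

# Spread of the bandwidth sweep (both endpoints in rdrobust's sign convention).
bw_spread = abs(-9.16 - (-8.20))
print(f"bandwidth spread: {bw_spread:.2f} points")
```

The two percentages come out at roughly 13% and 16%, which is where the "13–16% improvement" range comes from.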
Running variable
The continuous score that determines treatment. Here it is entrance_exam (range 28.8–99.8). Must be continuous-ish for nearby units to be comparable.
Cutoff (c = 70)
The threshold value. Below or equal: tutoring. Above: no tutoring. Set by program rule, not by the analyst.
Sharp vs fuzzy
Sharp: 100% of students at or below the cutoff get treatment, 0% above (this study). Fuzzy: the probability of treatment jumps at the cutoff, but not from 0 to 1; estimation would then require IV.
LATE — τ
The Local Average Treatment Effect — the jump in E[Y | X = x] at x = c. The effect for students at the cutoff. Cannot be extrapolated to students at score 30 or 95.
Bandwidth — h
The window around the cutoff used for local-polynomial estimation. Tighter h = less bias, more variance. rdrobust picks h = 9.98 here via MSE optimisation.
Continuity assumption
Potential outcomes must be smooth through the cutoff. Equivalently, absent the program, students just below 70 would score similarly on the exit exam to students just above 70. The McCrary test gives indirect evidence.
McCrary density test
A formal test for manipulation. Asks: is the density of the running variable continuous at the cutoff? Here T = -0.55, p = 0.58 — no evidence of bunching.
Placebo cutoffs
Run the RDD estimator at non-true cutoff values. If the design is credible, only the true cutoff should show a significant jump. Here only c = 70 has p < 0.001; all others have p > 0.05.
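The glossary entries above can be collected into one set of formulas. The notation below (D for treatment, K for the kernel, the minus and plus subscripts for the left and right sides) is an assumption made for this sketch, not taken from the post.

```latex
% Sharp assignment rule: tutoring if the entrance score is at or below c = 70.
\[ D_i = \mathbf{1}\{X_i \le c\}, \qquad c = 70 \]

% Estimand: treated-side limit minus untreated-side limit (positive if tutoring helps).
\[ \tau = \lim_{x \uparrow c} \mathbb{E}[Y_i \mid X_i = x] \;-\; \lim_{x \downarrow c} \mathbb{E}[Y_i \mid X_i = x] \]

% Local-linear fit on each side of the cutoff, with kernel K and bandwidth h.
\[ (\hat\alpha_-, \hat\beta_-) = \arg\min_{\alpha,\beta} \sum_{i:\, X_i \le c} K\!\left(\tfrac{X_i - c}{h}\right) \bigl(Y_i - \alpha - \beta (X_i - c)\bigr)^2 \]
\[ (\hat\alpha_+, \hat\beta_+) = \arg\min_{\alpha,\beta} \sum_{i:\, X_i > c} K\!\left(\tfrac{X_i - c}{h}\right) \bigl(Y_i - \alpha - \beta (X_i - c)\bigr)^2 \]

% Estimated jump at the cutoff.
\[ \hat\tau = \hat\alpha_- - \hat\alpha_+ \]
```

Written this way τ is positive when tutoring helps. rdrobust reports the right-side limit minus the left-side limit, which is the same quantity with the opposite sign.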

RD Simulator — turn the dials yourself

This panel draws a fresh dataset every time you move a slider. The true treatment effect is fixed at τ = 8.6 (the post's nonparametric estimate). The simulator fits a local polynomial on each side of the cutoff using only observations inside the bandwidth window. Try making the noise large or the bandwidth tiny — watch τ̂ wobble.
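A minimal sketch of what one draw of such a simulator might look like, using only the Python standard library. The data-generating process (uniform scores on 30 to 100, a linear baseline, Gaussian noise) is an assumption made for illustration, and the fit uses a uniform kernel with p = 1 rather than rdrobust's triangular default, so it is not the app's or the post's actual code.

```python
import math
import random

CUTOFF = 70.0     # program threshold from the text
TRUE_TAU = 8.6    # true effect used by the simulator

def simulate(n, sigma, rng):
    """Draw (score, exit score) pairs under a sharp rule: treated if score <= 70."""
    data = []
    for _ in range(n):
        x = rng.uniform(30.0, 100.0)              # assumed score range
        effect = TRUE_TAU if x <= CUTOFF else 0.0
        # Assumed linear baseline (60 at the cutoff, slope 0.5) plus Gaussian noise.
        y = 60.0 + 0.5 * (x - CUTOFF) + effect + rng.gauss(0.0, sigma)
        data.append((x, y))
    return data

def fit_side(points):
    """OLS of y on (x - cutoff). Returns the intercept and its HC0 robust variance."""
    n = len(points)
    u = [x - CUTOFF for x, _ in points]
    y = [v for _, v in points]
    su = sum(u)
    suu = sum(v * v for v in u)
    det = n * suu - su * su
    w_a = [(suu - su * v) / det for v in u]   # weights that produce the intercept
    w_b = [(n * v - su) / det for v in u]     # weights that produce the slope
    a = sum(w * v for w, v in zip(w_a, y))
    b = sum(w * v for w, v in zip(w_b, y))
    resid = [yi - a - b * ui for ui, yi in zip(u, y)]
    var_a = sum((w * e) ** 2 for w, e in zip(w_a, resid))
    return a, var_a

def rd_estimate(data, h):
    """Local-linear sharp RD with a uniform kernel of half-width h."""
    left = [(x, y) for x, y in data if CUTOFF - h <= x <= CUTOFF]    # tutored
    right = [(x, y) for x, y in data if CUTOFF < x <= CUTOFF + h]    # not tutored
    a_left, v_left = fit_side(left)
    a_right, v_right = fit_side(right)
    tau_hat = a_left - a_right          # treated minus untreated, so positive
    se = math.sqrt(v_left + v_right)    # the two sides use disjoint observations
    return tau_hat, se, len(left) + len(right)

rng = random.Random(2024)
data = simulate(n=800, sigma=5.0, rng=rng)
tau_hat, se, n_in = rd_estimate(data, h=10.0)
ci = (tau_hat - 1.96 * se, tau_hat + 1.96 * se)
print(f"tau_hat = {tau_hat:.2f}, SE = {se:.2f}, obs in bandwidth = {n_in} of {len(data)}")
print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f}")
```

Changing `sigma` or `h` in the last few lines mimics moving the sliders: a larger `sigma` or a smaller `h` should widen the interval.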

Sample size (n): more observations = tighter local-linear fits, smaller SE.
Noise (σ): idiosyncratic variation in exit scores. Larger σ = harder to see τ.
Bandwidth (h): window of X around c = 70 used for local-poly fitting. Outside h, points fade out.
Polynomial order (p): 1 = local linear (rdrobust default); 2 = local quadratic; 3 = local cubic.
τ̂ (estimated LATE)
true τ = 8.60
SE(τ̂)
heteroskedasticity-robust
obs in bandwidth
of 800 total
95% CI for τ̂
τ̂ ± 1.96 · SE

What to look for

  • Bias–variance trade-off. Slide the bandwidth from 3 to 25. Tight h → fewer points → bigger SE but smaller bias (closer to the cutoff = more comparable). Wide h → smaller SE but the fits start mixing in students far from the cutoff.
  • Polynomial order does not save you. A higher-order polynomial can over-fit local noise. Try p = 3 with a tight h — you will see the fitted lines bend wildly near the cutoff. rdrobust recommends p = 1 (local linear) for a reason.
  • Faded points are outside the bandwidth. The cutoff line is the velvet rope. The local-poly fit only sees the bright points inside h.
  • Reseed to feel the noise. τ̂ should hover near 8.6, the true effect built into the simulator. The wobble around 8.6 is the sampling variance.
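The bias-variance trade-off in the first bullet can be made concrete with a small Monte Carlo. The sketch below assumes a baseline with a mild cubic term (a hypothetical choice, so that a straight line fits well only near the cutoff) and compares a tight and a wide bandwidth over 200 reseeded datasets.

```python
import random
import statistics

CUTOFF = 70.0
TRUE_TAU = 8.6

def simulate(n, sigma, rng):
    """Draw (centred score, exit score) pairs; treated if score <= 70."""
    data = []
    for _ in range(n):
        x = rng.uniform(30.0, 100.0)          # assumed score range
        u = x - CUTOFF
        effect = TRUE_TAU if x <= CUTOFF else 0.0
        # Assumed baseline with a cubic term, so a straight line is only a
        # good approximation close to the cutoff.
        y = 60.0 + 0.5 * u + 0.0005 * u ** 3 + effect + rng.gauss(0.0, sigma)
        data.append((u, y))
    return data

def intercept(points):
    """Intercept of an OLS line through (u, y) points, i.e. the fit at u = 0."""
    n = len(points)
    mean_u = sum(u for u, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    suu = sum((u - mean_u) ** 2 for u, _ in points)
    suy = sum((u - mean_u) * (y - mean_y) for u, y in points)
    slope = suy / suu
    return mean_y - slope * mean_u

def rd_estimate(data, h):
    """Local-linear jump at the cutoff with a uniform kernel of half-width h."""
    left = [(u, y) for u, y in data if -h <= u <= 0.0]
    right = [(u, y) for u, y in data if 0.0 < u <= h]
    return intercept(left) - intercept(right)

rng = random.Random(7)
draws = {3.0: [], 25.0: []}
for _ in range(200):                          # 200 reseeds
    data = simulate(n=800, sigma=5.0, rng=rng)
    for h in draws:
        draws[h].append(rd_estimate(data, h))

summary = {h: (statistics.mean(v), statistics.stdev(v)) for h, v in draws.items()}
for h, (mean_tau, sd_tau) in summary.items():
    print(f"h = {h:>4}: mean tau_hat = {mean_tau:.2f}, sd = {sd_tau:.2f}")
```

Under these assumptions the tight bandwidth should centre near 8.6 with a large spread, while the wide bandwidth should show a smaller spread but drift away from 8.6 because the line no longer tracks the curved baseline.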

Bandwidth Lab — how sensitive is τ̂ to h?

The most common concern about RDD is: "What if the bandwidth I pick changes the answer?" The post answers that on the real data with a sweep from h = 5 to h = 20. The curve below is τ̂(h) on the actual tutoring dataset. Drag the orange dot to see how the point estimate and CI move as you change h.

Linearly interpolated from the post's 6 anchor bandwidths (5, 7, 10, 12, 15, 20).
τ̂ at this bandwidth
positive in post's convention
(= jump from treated to untreated side)
95% CI
lower bound to upper bound
SE(τ̂)
narrower h → larger SE
MSE-optimal h
9.98
what rdrobust picks automatically

What to look for

  • The curve is nearly flat. τ̂ moves by less than 1 point across a 4× range of bandwidths. That is the gold-standard robustness story: the answer does not depend on analyst discretion.
  • The teal CI band fans out at low h. Tight bandwidths use few observations — the standard error grows. Wide bandwidths trade a little bit of bias for a lot of precision.
  • The MSE-optimal point (h = 9.98) lives near the middle. rdrobust picks this h automatically to balance squared bias and variance — it is not the bandwidth that minimises SE.
  • All CIs exclude zero. Even at h = 5 (the noisiest end), the upper end of the CI is -3.62, well below zero. The effect is detected regardless of bandwidth.
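The sweep-and-interpolate idea behind the curve can be sketched as follows. The post's per-bandwidth estimates are not listed here, so this example estimates τ̂ on simulated data (an assumed linear data-generating process) at the six anchor bandwidths and then interpolates linearly between them; `interpolate` is a hypothetical helper.

```python
import random

CUTOFF = 70.0
TRUE_TAU = 8.6
ANCHORS = [5.0, 7.0, 10.0, 12.0, 15.0, 20.0]   # anchor bandwidths named in the text

def simulate(n, sigma, rng):
    """Draw (centred score, exit score) pairs; treated if score <= 70."""
    data = []
    for _ in range(n):
        x = rng.uniform(30.0, 100.0)           # assumed score range
        u = x - CUTOFF
        effect = TRUE_TAU if x <= CUTOFF else 0.0
        y = 60.0 + 0.5 * u + effect + rng.gauss(0.0, sigma)   # assumed baseline
        data.append((u, y))
    return data

def intercept(points):
    """Intercept of an OLS line through (u, y) points, i.e. the fit at u = 0."""
    n = len(points)
    mean_u = sum(u for u, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    suu = sum((u - mean_u) ** 2 for u, _ in points)
    suy = sum((u - mean_u) * (y - mean_y) for u, y in points)
    return mean_y - (suy / suu) * mean_u

def rd_estimate(data, h):
    """Local-linear jump at the cutoff with a uniform kernel of half-width h."""
    left = [(u, y) for u, y in data if -h <= u <= 0.0]
    right = [(u, y) for u, y in data if 0.0 < u <= h]
    return intercept(left) - intercept(right)

def interpolate(h, xs, ys):
    """Piecewise-linear interpolation of ys over sorted xs, clamped at the ends."""
    if h <= xs[0]:
        return ys[0]
    if h >= xs[-1]:
        return ys[-1]
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= h <= x1:
            return y0 + (y1 - y0) * (h - x0) / (x1 - x0)

rng = random.Random(11)
data = simulate(n=800, sigma=5.0, rng=rng)
sweep = [rd_estimate(data, h) for h in ANCHORS]
for h, tau_hat in zip(ANCHORS, sweep):
    print(f"h = {h:>4}: tau_hat = {tau_hat:.2f}")
print(f"interpolated at h = 9.98: {interpolate(9.98, ANCHORS, sweep):.2f}")
```

Estimates here use the treated-minus-untreated convention, so they are positive; multiplying by -1 gives rdrobust's sign.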

The post's robustness forest, interactively

These numbers come straight from the post's analysis.do Stata pipeline — parametric OLS, nonparametric rdrobust, bandwidth sweep, kernel comparison and the 9-cutoff placebo. Toggle outcomes and methods. Hover any point to see its SE, CI and observation count.

Outcomes

Methods (filter rows within each outcome)

What the forest tells us

  • Sign conventions. The parametric panel reports the coefficient on treat (positive: tutoring helps). The bandwidth, kernel and placebo panels use rdrobust's left-to-right convention (negative: jumping down from the treated to the untreated side). Both encode the same underlying causal claim.
  • Placebos straddle zero. Toggle off everything except the placebo cutoff panel: only the orange point at cutoff 70 has a CI that excludes zero. The eight placebo cutoffs (CIs in grey) all cross the dashed zero line.
  • The four headline estimates cluster. Toggle just the top panel: linear (10.80), slope-flexible linear (10.80), quadratic (9.22), and rdrobust (8.58) all land between roughly 8.5 and 11. The conclusion holds across specifications.
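The placebo logic in the second bullet can be sketched on simulated data. Everything about the data here is an assumption (uniform scores, linear baseline, a larger hypothetical sample of 2,000 so the illustration is stable), and the bandwidth is fixed at h = 5 so that no placebo window straddles the true cutoff. It is an illustration of the idea, not the post's analysis.do pipeline.

```python
import math
import random

TRUE_CUTOFF = 70.0
TRUE_TAU = 8.6
CUTOFFS = [50, 55, 60, 65, 70, 75, 80, 85, 90]   # true cutoff plus the eight placebos

def simulate(n, sigma, rng):
    """Draw (score, exit score) pairs with a single real jump at 70."""
    data = []
    for _ in range(n):
        x = rng.uniform(30.0, 100.0)             # assumed score range
        effect = TRUE_TAU if x <= TRUE_CUTOFF else 0.0
        y = 60.0 + 0.5 * (x - TRUE_CUTOFF) + effect + rng.gauss(0.0, sigma)
        data.append((x, y))
    return data

def fit_side(points, c):
    """OLS of y on (x - c). Returns the intercept and its HC0 robust variance."""
    n = len(points)
    u = [x - c for x, _ in points]
    y = [v for _, v in points]
    su = sum(u)
    suu = sum(v * v for v in u)
    det = n * suu - su * su
    w_a = [(suu - su * v) / det for v in u]      # intercept weights
    w_b = [(n * v - su) / det for v in u]        # slope weights
    a = sum(w * v for w, v in zip(w_a, y))
    b = sum(w * v for w, v in zip(w_b, y))
    resid = [yi - a - b * ui for ui, yi in zip(u, y)]
    return a, sum((w * e) ** 2 for w, e in zip(w_a, resid))

def rd_at(data, c, h):
    """Local-linear jump at a candidate cutoff c (left fit minus right fit)."""
    left = [(x, y) for x, y in data if c - h <= x <= c]
    right = [(x, y) for x, y in data if c < x <= c + h]
    a_left, v_left = fit_side(left, c)
    a_right, v_right = fit_side(right, c)
    return a_left - a_right, math.sqrt(v_left + v_right)

rng = random.Random(3)
data = simulate(n=2000, sigma=5.0, rng=rng)
results = {}
for c in CUTOFFS:
    tau_hat, se = rd_at(data, float(c), h=5.0)
    significant = abs(tau_hat / se) > 1.96       # two-sided test at the 5% level
    results[c] = (tau_hat, se, significant)
    print(f"cutoff {c}: tau_hat = {tau_hat:6.2f}, SE = {se:.2f}, significant = {significant}")
```

With eight placebo tests at the 5% level, an occasional false positive is possible by chance, so the thing to look for is that the jump at 70 stands well apart from the rest.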

Connecting back to the post

Three messages live in this single plot, spread across its four panels:

  • Top panel (parametric): see Sections 7 and 8 of the post. Four specifications, same conclusion. The treatment effect is roughly 9–11 points.
  • Middle two panels (BW + kernel): Section 9.1 of the post. Robustness to analyst discretion.
  • Bottom panel (placebos): Section 9.3. The discontinuity is unique to the actual program threshold.

Combined with the McCrary test (p = 0.58, see Section 9.2) and the perfectly sharp design (Section 5), the four robustness panels here are the formal back-up for the "tutoring works, the design is credible" conclusion in Section 10.