ImputationDiD methodology validation (PR-B): exact FE variance + unit-clustered Eq.8 + R parity by igerber · Pull Request #533 · igerber/diff-diff · GitHub

ImputationDiD methodology validation (PR-B): exact FE variance + unit-clustered Eq.8 + R parity#533

Merged: igerber merged 1 commit into main from feature/imputation-eq8-methodology on Jun 6, 2026
igerber (Owner) commented Jun 6, 2026

Summary

PR-B of the ImputationDiD methodology validation: the source-validation pass of the Borusyak, Jaravel & Spiess (2024, REStud 91(6)) audit (PR-A #529 added the paper review). Validating against R didimputation uncovered a ~27% downward bias in the analytical standard errors on the covariate-free path, which this PR fixes (a real correctness bug; point estimates were always correct).

Three code corrections in diff_diff/imputation.py (behavior change: SE / t / p / CI values change without covariates; point estimates unchanged):

  1. Untreated v_it weights (Theorem 3 variance). The covariate-free path used a balanced two-way closed form -(w_i/n0_i + w_t/n0_t - w/N0), wrong for the always-unbalanced Ω₀ in staggered designs → SEs ~27% too small. Replaced with the exact projection -A₀(A₀'A₀)⁻¹A₁'w (the covariate path's method), and kept all unit dummies in the design (the prior drop-first-unit/no-intercept design was one rank short → a further ~1.6% bias). SEs now match didimputation to ~1e-10. A singular Ω₀ routes to a dense-lstsq fallback (SciPy spsolve returns NaN + MatrixRankWarning without raising — promoted to an error so the fallback fires in production).
  2. Auxiliary model (Equation 8): observation-level mean → the paper's unit-clustered Σ_i(Σ_t v)(Σ_t v·τ̂)/Σ_i(Σ_t v)², NaN-safe.
  3. Untreated Step-1 residuals preserve NaN for missing FE (symmetric with the treated path) instead of a silent fillna(0.0).
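
To illustrate correction 1, the sketch below builds a toy staggered panel and compares the exact projection weights with the balanced two-way closed form. It is a minimal NumPy illustration, not the code in diff_diff/imputation.py: the panel, the `design` helper, and the overall-ATT weights `w` are assumptions made for the example.

```python
import numpy as np

# Toy staggered panel (assumed for illustration): 4 units x 4 periods.
# first_treat[i] is unit i's first treated period; np.inf = never treated.
n_units, n_periods = 4, 4
first_treat = np.array([2, 3, np.inf, np.inf])

unit, time = np.meshgrid(np.arange(n_units), np.arange(n_periods), indexing="ij")
unit, time = unit.ravel(), time.ravel()
treated = time >= first_treat[unit]  # Omega_1 mask; ~treated is Omega_0


def design(u, t):
    """All unit dummies plus time dummies for periods 1..T-1 (period 0 is the reference)."""
    unit_d = (u[:, None] == np.arange(n_units)).astype(float)
    time_d = (t[:, None] == np.arange(1, n_periods)).astype(float)
    return np.hstack([unit_d, time_d])


A0 = design(unit[~treated], time[~treated])  # untreated rows
A1 = design(unit[treated], time[treated])    # treated rows
w = np.full(treated.sum(), 1.0 / treated.sum())  # overall-ATT weights on Omega_1

# Exact projection: v0 = -A0 (A0'A0)^{-1} A1' w
v0_exact = -A0 @ np.linalg.solve(A0.T @ A0, A1.T @ w)

# Balanced closed form: -(w_i/n0_i + w_t/n0_t - w/N0), only valid for balanced Omega_0
u0, t0 = unit[~treated], time[~treated]
w_i = np.bincount(unit[treated], weights=w, minlength=n_units)
w_t = np.bincount(time[treated], weights=w, minlength=n_periods)
n0_i = np.bincount(u0, minlength=n_units)
n0_t = np.bincount(t0, minlength=n_periods)
v0_closed = -(w_i[u0] / n0_i[u0] + w_t[t0] / n0_t[t0] - w.sum() / len(u0))

# Residual of the defining condition A0'v0 + A1'w = 0 for each set of weights
resid_exact = A0.T @ v0_exact + A1.T @ w
resid_closed = A0.T @ v0_closed + A1.T @ w
```

On this panel the exact weights satisfy A₀'v₀ + A₁'w = 0, so the unit and period sums of v cancel. The closed-form weights do not, because Ω₀ is unbalanced once treated cells are removed.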

The multiplier bootstrap resamples the same Theorem-3 influence function, so bootstrap SEs may also shift.

Methodology references

  • Method: ImputationDiD. Source: Borusyak, Jaravel & Spiess (2024), Revisiting Event-Study Designs: Robust and Efficient Estimation, Review of Economic Studies 91(6), 3253–3285 (DOI). R reference: didimputation v0.5.0.
  • Deviations (documented in REGISTRY.md ## ImputationDiD): R didimputation implements Equation 8 only at the cohort×event-time partition (= diff-diff's default aux_partition="cohort_horizon"); diff-diff additionally offers coarser cohort/horizon partitions (no R analogue, hand-calc validated). Multiplier bootstrap + survey-design TSL variance are library extensions. Leave-one-out variance (Supp. App. A.9) is not implemented (tracked).
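
The difference between the observation-level mean and the unit-clustered Equation 8 form can be sketched with pandas. This is an illustration under stated assumptions, not the library's implementation: the column names (`group`, `unit`, `v`, `tau_hat`) are hypothetical, `group` stands in for whichever partition is chosen (for example cohort×event-time), and "NaN-safe" is interpreted here as dropping rows with a missing τ̂ and returning NaN when the denominator is zero.

```python
import numpy as np
import pandas as pd


def aux_means(df, group_col="group"):
    """Per-group auxiliary means: unit-clustered (Eq. 8 form) vs observation-level."""
    # Assumption: rows with non-finite v or tau_hat are excluded from both sums.
    d = df[np.isfinite(df["v"]) & np.isfinite(df["tau_hat"])].copy()
    d["v_tau"] = d["v"] * d["tau_hat"]

    # Inner sums over t within each (group, unit): sum_t v and sum_t v*tau_hat
    per_unit = d.groupby([group_col, "unit"], as_index=False)[["v", "v_tau"]].sum()
    per_unit["num"] = per_unit["v"] * per_unit["v_tau"]
    per_unit["den"] = per_unit["v"] ** 2

    # Outer sums over units i within each group
    g = per_unit.groupby(group_col)[["num", "den"]].sum()
    unit_clustered = (g["num"] / g["den"]).where(g["den"] > 0)

    # Observation-level mean sum(v*tau_hat) / sum(v), shown for comparison
    obs = d.groupby(group_col)[["v", "v_tau"]].sum()
    obs_level = obs["v_tau"] / obs["v"]

    return pd.DataFrame({"unit_clustered": unit_clustered, "obs_level": obs_level})


# Made-up data: unit "a" has two treated observations, "b" has one,
# and "c" has a missing tau_hat that is dropped.
df = pd.DataFrame({
    "group": ["g", "g", "g", "g"],
    "unit": ["a", "a", "b", "c"],
    "v": [0.25, 0.25, 0.25, 0.25],
    "tau_hat": [1.0, 3.0, 4.0, np.nan],
})
out = aux_means(df)
```

The two forms coincide when every unit in a group has the same Σ_t v; they diverge here because unit "a" contributes two observations and unit "b" one.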

Validation

  • New tests/test_methodology_imputation.py — paper-equation Verified Components (Theorem 1/2; Theorem 3 / eqs 6-8 + white-box unit-clustered Eq. 8 hand-calc + NaN-co-group edge + singular-Ω₀ dense-fallback regression; Proposition 5 K≥H̄ non-identification; Test 1 / eq 9 + Proposition 9) and TestImputationDiDParityR (overall + per-horizon ATT and SE vs didimputation, no silent skips).
  • R parity goldens: benchmarks/data/didimputation_golden.json (generator benchmarks/R/generate_didimputation_golden.R).
  • tests/test_imputation.py: tightened the coarser-partition conservatism test.
  • Full fast suite: 7585 passed, 0 failed (the SE change breaks nothing downstream). 6 fresh local AI-review rounds → converged clean.
  • METHODOLOGY_REVIEW.md row → Complete (Verified Components / Corrections Made / Deviations / R Comparison Results).
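
The singular-Ω₀ fallback pattern covered by the regression test can be sketched as follows. This is a hedged illustration rather than the code in diff_diff/imputation.py: `solve_gram` and its return convention are hypothetical, and `use_umfpack=False` is passed so the example exercises SciPy's default SuperLU path.

```python
import warnings

import numpy as np
from scipy import sparse
from scipy.sparse.linalg import MatrixRankWarning, spsolve


def solve_gram(A0, rhs):
    """Solve (A0'A0) x = rhs sparsely; fall back to dense least squares if singular."""
    gram = (A0.T @ A0).tocsc()
    try:
        with warnings.catch_warnings():
            # Promote the singular-matrix warning to an exception so a NaN
            # solution can never be returned silently.
            warnings.simplefilter("error", MatrixRankWarning)
            sol = spsolve(gram, rhs, use_umfpack=False)
        if not np.all(np.isfinite(sol)):
            raise np.linalg.LinAlgError("non-finite sparse solution")
        return sol, "sparse"
    except (MatrixRankWarning, np.linalg.LinAlgError):
        # Dense fallback: minimum-norm least-squares solution
        sol = np.linalg.lstsq(gram.toarray(), rhs, rcond=None)[0]
        return sol, "dense"


# Full-rank design: the sparse solve succeeds.
x_ok, how_ok = solve_gram(sparse.csr_matrix(np.eye(2)), np.array([1.0, 2.0]))

# Duplicated column: the Gram matrix is exactly singular, so the fallback fires.
A_sing = sparse.csr_matrix(np.ones((3, 2)))
x_fb, how_fb = solve_gram(A_sing, np.array([3.0, 3.0]))
```

Scoping the `"error"` filter inside `catch_warnings` keeps the promotion local, so it applies regardless of the warning filters the calling application has configured.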

Security / privacy

  • Confirm no secrets/PII in this PR: Yes (the source PDF + R install logs are not committed).

🤖 Generated with Claude Code

github-actions Bot commented Jun 6, 2026

igerber added the ready-for-ci label (Triggers CI test workflows) on Jun 6, 2026
…-clustered Eq.8 + R parity

Source-validation pass of the Borusyak, Jaravel & Spiess (2024, REStud 91(6))
audit (PR-A #529 added the paper review). Three code corrections in
diff_diff/imputation.py (behavior = SE values change; point estimates unchanged):

1. Untreated v_it weights (Theorem 3 conservative variance). The covariate-free
   path used the BALANCED two-way closed form -(w_i/n0_i + w_t/n0_t - w/N0), wrong
   for the always-unbalanced Omega_0 in staggered designs -> analytical SEs ~27%
   too small. Replaced with the exact projection -A0 (A0'A0)^-1 A1' w (the
   covariate path's method), and fixed that design to keep all unit dummies (the
   prior drop-first-unit/no-intercept design was one rank short -> a further ~1.6%
   bias). SEs now match R didimputation::did_imputation (observed ~1e-10; tests
   assert abs=1e-7). A singular Omega_0 routes to a dense-lstsq fallback (SciPy
   spsolve returns NaN + MatrixRankWarning without raising; promoted to an error
   so the fallback fires under production filters). Bootstrap SEs (which resample
   the same Theorem-3 influence function) may also shift.
2. Auxiliary model (Equation 8): observation-level mean sum(v*tau)/sum(v) -> the
   paper's unit-clustered sum_i(sum_t v)(sum_t v*tau)/sum_i(sum_t v)^2, NaN-safe.
3. Untreated Step-1 residuals preserve NaN for missing FE (symmetric with the
   treated path) instead of a silent fillna(0.0).

Validation:
- tests/test_methodology_imputation.py: paper-equation Verified Components
  (Theorem 1/2; Theorem 3/eqs 6-8 + white-box unit-clustered Eq.8 hand-calc +
  NaN-co-group edge + singular-Omega0 dense-fallback regression; Proposition 5
  K>=H_bar non-ID; Test 1/eq 9 + Proposition 9) and TestImputationDiDParityR
  (overall + per-horizon ATT and SE vs didimputation, no silent skips).
- benchmarks/R/generate_didimputation_golden.R + benchmarks/data/didimputation_*
  (didimputation v0.5.0 goldens).
- tests/test_imputation.py: tightened the coarser-partition conservatism test.
- Full fast suite: 7585 passed (the SE change breaks nothing downstream).

Docs/tracker: REGISTRY ## ImputationDiD (Eq.8 now exact unit-clustered + a
Deviation-from-R note; v_it observation-weights bullet updated to the exact
projection); paper review flipped to "implemented"; METHODOLOGY_REVIEW.md row ->
Complete (Verified Components / Corrections Made / Deviations / R Comparison
Results); CHANGELOG entry; TODO PR-B rows removed + follow-ups tracked (LOO
refinement, projection-factorization caching, covariate-path R parity).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
igerber force-pushed the feature/imputation-eq8-methodology branch from 0647207 to 9ccb59f on June 6, 2026 20:37
github-actions Bot commented Jun 6, 2026

igerber merged commit fbdcbb9 into main on Jun 6, 2026 (26 checks passed)
igerber deleted the feature/imputation-eq8-methodology branch on June 6, 2026 23:13
TDL77 pushed a commit to TDL77/diff-diff that referenced this pull request Jun 10, 2026
Bump version 3.5.1 -> 3.5.2 across __init__.py, pyproject.toml,
rust/Cargo.toml, llms-full.txt, and CITATION.cff (date-released
2026-06-08).

Reconcile the CHANGELOG: the Firpo & Possebom (2018)
confidence-sets-by-test-inversion feature (PR igerber#527) was filed under
[3.5.1] but merged AFTER the v3.5.1 tag was cut, so the tagged v3.5.1 did
not actually contain it. Move that entry into the new [3.5.2] section
alongside everything else that landed post-tag (CBWSDID balancing
igerber#534, SyntheticControl conformal inference igerber#530, the
placebo_effects -> variance_effects rename/deprecation igerber#532, and the
ImputationDiD validation + SE fixes igerber#533). The Firpo PR-A paper
review (igerber#524, docs-only) stays in [3.5.1] since it was in that tag.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>