ImputationDiD methodology validation (PR-B): exact FE variance + unit-clustered Eq.8 + R parity #533
Merged
…-clustered Eq.8 + R parity

Source-validation pass of the Borusyak, Jaravel & Spiess (2024, REStud 91(6)) audit (PR-A #529 added the paper review).

Three code corrections in diff_diff/imputation.py (behavior = SE values change; point estimates unchanged):

1. Untreated v_it weights (Theorem 3 conservative variance). The covariate-free path used the BALANCED two-way closed form -(w_i/n0_i + w_t/n0_t - w/N0), wrong for the always-unbalanced Omega_0 in staggered designs -> analytical SEs ~27% too small. Replaced with the exact projection -A0 (A0'A0)^-1 A1' w (the covariate path's method), and fixed that design to keep all unit dummies (the prior drop-first-unit/no-intercept design was one rank short -> a further ~1.6% bias). SEs now match R didimputation::did_imputation (observed ~1e-10; tests assert abs=1e-7). A singular Omega_0 routes to a dense-lstsq fallback (SciPy spsolve returns NaN + MatrixRankWarning without raising; promoted to an error so the fallback fires under production filters). Bootstrap SEs (which resample the same Theorem-3 influence function) may also shift.
2. Auxiliary model (Equation 8): observation-level mean sum(v*tau)/sum(v) -> the paper's unit-clustered sum_i(sum_t v)(sum_t v*tau)/sum_i(sum_t v)^2, NaN-safe.
3. Untreated Step-1 residuals preserve NaN for missing FE (symmetric with the treated path) instead of a silent fillna(0.0).

Validation:
- tests/test_methodology_imputation.py: paper-equation Verified Components (Theorem 1/2; Theorem 3/eqs 6-8 + white-box unit-clustered Eq.8 hand-calc + NaN-co-group edge + singular-Omega0 dense-fallback regression; Proposition 5 K>=H_bar non-ID; Test 1/eq 9 + Proposition 9) and TestImputationDiDParityR (overall + per-horizon ATT and SE vs didimputation, no silent skips).
- benchmarks/R/generate_didimputation_golden.R + benchmarks/data/didimputation_* (didimputation v0.5.0 goldens).
- tests/test_imputation.py: tightened the coarser-partition conservatism test.
- Full fast suite: 7585 passed (the SE change breaks nothing downstream).

Docs/tracker: REGISTRY ## ImputationDiD (Eq.8 now exact unit-clustered + a Deviation-from-R note; v_it observation-weights bullet updated to the exact projection); paper review flipped to "implemented"; METHODOLOGY_REVIEW.md row -> Complete (Verified Components / Corrections Made / Deviations / R Comparison Results); CHANGELOG entry; TODO PR-B rows removed + follow-ups tracked (LOO refinement, projection-factorization caching, covariate-path R parity).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
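The weight correction in item 1 can be illustrated with a small NumPy sketch. This is not the code in diff_diff/imputation.py: the toy panel, the `design` helper, and all variable names are hypothetical, and a dense solve stands in for whatever sparse factorization the library uses. It builds the untreated-observation weights via the exact projection on a 3-unit, 4-period staggered panel, evaluates the balanced closed form next to it, and checks that the exact weights reproduce the imputation estimator.

```python
import numpy as np

# Hypothetical toy panel: 3 units x 4 periods, staggered adoption.
# first_treat[i] is unit i's first treated period (np.inf = never treated).
n_units, n_periods = 3, 4
first_treat = np.array([np.inf, 2, 3])

units, periods = np.meshgrid(np.arange(n_units), np.arange(n_periods), indexing="ij")
units, periods = units.ravel(), periods.ravel()
treated = periods >= first_treat[units]


def design(u, t):
    """All unit dummies plus period dummies with the first period dropped."""
    unit_part = (u[:, None] == np.arange(n_units)).astype(float)
    period_part = (t[:, None] == np.arange(1, n_periods)).astype(float)
    return np.hstack([unit_part, period_part])


A0 = design(units[~treated], periods[~treated])  # untreated rows (Omega_0)
A1 = design(units[treated], periods[treated])    # treated rows (Omega_1)
w = np.full(treated.sum(), 1.0 / treated.sum())  # equal weights: overall ATT

# Exact projection weights on untreated observations: -A0 (A0'A0)^-1 A1' w.
v_exact = -A0 @ np.linalg.solve(A0.T @ A0, A1.T @ w)

# Balanced two-way closed form, evaluated only for comparison.
u0, t0 = units[~treated], periods[~treated]
w_i = np.bincount(units[treated], weights=w, minlength=n_units)
w_t = np.bincount(periods[treated], weights=w, minlength=n_periods)
n0_i = np.bincount(u0, minlength=n_units)
n0_t = np.bincount(t0, minlength=n_periods)
v_closed = -(w_i[u0] / n0_i[u0] + w_t[t0] / n0_t[t0] - w.sum() / len(u0))

# Sanity check on arbitrary outcomes: imputing Y(0) from the untreated-only
# FE fit gives the same estimate as applying the weights directly.
rng = np.random.default_rng(0)
y = rng.normal(size=units.size)
theta, *_ = np.linalg.lstsq(A0, y[~treated], rcond=None)
tau_imputed = w @ (y[treated] - A1 @ theta)
tau_weights = w @ y[treated] + v_exact @ y[~treated]

print("max |exact - closed form|:", np.abs(v_exact - v_closed).max())
print("imputed vs weighted estimate:", tau_imputed, tau_weights)
```

In this toy panel the never-treated unit's exact weights sum to zero, while its closed-form weights do not, which is one way to see that the balanced formula is not a projection onto an unbalanced Omega_0.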
0647207 to
9ccb59f
Compare
TDL77 pushed a commit to TDL77/diff-diff that referenced this pull request on Jun 10, 2026
Bump version 3.5.1 -> 3.5.2 across __init__.py, pyproject.toml, rust/Cargo.toml, llms-full.txt, and CITATION.cff (date-released 2026-06-08).

Reconcile the CHANGELOG: the Firpo & Possebom (2018) confidence-sets-by-test-inversion feature (PR igerber#527) was filed under [3.5.1] but merged AFTER the v3.5.1 tag was cut, so the tagged v3.5.1 did not actually contain it. Move that entry into the new [3.5.2] section alongside everything else that landed post-tag (CBWSDID balancing igerber#534, SyntheticControl conformal inference igerber#530, the placebo_effects -> variance_effects rename/deprecation igerber#532, and the ImputationDiD validation + SE fixes igerber#533). The Firpo PR-A paper review (igerber#524, docs-only) stays in [3.5.1] since it was in that tag.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Summary
PR-B of the ImputationDiD methodology validation: the source-validation pass of the Borusyak, Jaravel & Spiess (2024, REStud 91(6)) audit (PR-A #529 added the paper review). Validating against R didimputation uncovered and fixed a ~27% downward bias in the analytical standard errors without covariates (a real correctness bug; point estimates were always correct).

Three code corrections in diff_diff/imputation.py (behavior change: SE / t / p / CI values change without covariates; point estimates unchanged):

1. Untreated v_it weights (Theorem 3 variance). The covariate-free path used a balanced two-way closed form -(w_i/n0_i + w_t/n0_t - w/N0), wrong for the always-unbalanced Ω₀ in staggered designs → SEs ~27% too small. Replaced with the exact projection -A₀(A₀'A₀)⁻¹A₁'w (the covariate path's method), and kept all unit dummies in the design (the prior drop-first-unit/no-intercept design was one rank short → a further ~1.6% bias). SEs now match didimputation to ~1e-10. A singular Ω₀ routes to a dense-lstsq fallback (SciPy spsolve returns NaN + MatrixRankWarning without raising; the warning is promoted to an error so the fallback fires in production).
2. Auxiliary model (Equation 8): the observation-level mean Σ v·τ̂ / Σ v is replaced by the paper's unit-clustered Σ_i(Σ_t v)(Σ_t v·τ̂)/Σ_i(Σ_t v)², NaN-safe.
3. Untreated Step-1 residuals preserve NaN for missing FE (symmetric with the treated path) instead of a silent fillna(0.0).

The multiplier bootstrap resamples the same Theorem-3 influence function, so bootstrap SEs may also shift.
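To make the Equation 8 change concrete, here is a minimal pure-Python sketch of the two aggregations within one auxiliary-model group. The function names are hypothetical and do not come from the library. The NaN handling shown (dropping any observation whose weight or effect is NaN, and returning NaN when nothing usable remains) is an assumption about what "NaN-safe" means, not a description of the merged code.

```python
import math
from collections import defaultdict


def observation_mean_tau(v, tau_hat):
    """Observation-level weighted mean: sum(v * tau) / sum(v)."""
    pairs = [(a, b) for a, b in zip(v, tau_hat)
             if not (math.isnan(a) or math.isnan(b))]
    denom = sum(a for a, _ in pairs)
    return sum(a * b for a, b in pairs) / denom if denom != 0.0 else math.nan


def unit_clustered_tau(unit_ids, v, tau_hat):
    """Unit-clustered form: sum_i (S_i * T_i) / sum_i S_i**2, with
    S_i = sum_t v_it and T_i = sum_t v_it * tau_it inside the group."""
    s = defaultdict(float)  # S_i per unit
    t = defaultdict(float)  # T_i per unit
    for i, v_it, tau_it in zip(unit_ids, v, tau_hat):
        if math.isnan(v_it) or math.isnan(tau_it):
            continue  # assumption: skip NaN pairs rather than poison the sums
        s[i] += v_it
        t[i] += v_it * tau_it
    denom = sum(s_i ** 2 for s_i in s.values())
    if denom == 0.0:
        return math.nan  # nothing usable in this group
    return sum(s[i] * t[i] for i in s) / denom


# Hypothetical group: unit "a" has two treated observations, unit "b" has one.
unit_ids = ["a", "a", "b"]
v = [0.25, 0.25, 1.0]
tau_hat = [1.0, 3.0, 4.0]

clustered = unit_clustered_tau(unit_ids, v, tau_hat)
pooled = observation_mean_tau(v, tau_hat)
print("unit-clustered:", clustered, "observation-level:", pooled)
```

The two estimates coincide when every unit contributes the same total weight, and diverge otherwise, because the clustered form weights each unit's contribution by its summed weight a second time.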
Methodology references
- didimputation v0.5.0.
- Deviation from R (REGISTRY.md ## ImputationDiD): R didimputation implements Equation 8 only at the cohort×event-time partition (= diff-diff's default aux_partition="cohort_horizon"); diff-diff additionally offers coarser cohort / horizon partitions (no R analogue, hand-calc validated). Multiplier bootstrap + survey-design TSL variance are library extensions. Leave-one-out variance (Supp. App. A.9) is not implemented (tracked).

Validation
- tests/test_methodology_imputation.py: paper-equation Verified Components (Theorem 1/2; Theorem 3 / eqs 6-8 + white-box unit-clustered Eq. 8 hand-calc + NaN-co-group edge + singular-Ω₀ dense-fallback regression; Proposition 5 K ≥ H̄ non-identification; Test 1 / eq 9 + Proposition 9) and TestImputationDiDParityR (overall + per-horizon ATT and SE vs didimputation, no silent skips).
- benchmarks/data/didimputation_golden.json (generator benchmarks/R/generate_didimputation_golden.R).
- tests/test_imputation.py: tightened the coarser-partition conservatism test.
- METHODOLOGY_REVIEW.md row → Complete (Verified Components / Corrections Made / Deviations / R Comparison Results).

Security / privacy
🤖 Generated with Claude Code