Lift MultiPeriodDiD-absorb HC2/HC2-BM gate via auto-route by igerber · Pull Request #459 · igerber/diff-diff · GitHub
Skip to content

Lift MultiPeriodDiD-absorb HC2/HC2-BM gate via auto-route#459

Merged
igerber merged 4 commits into
mainfrom
mpd-absorb-hc2-auto-route
May 17, 2026
Merged

Lift MultiPeriodDiD-absorb HC2/HC2-BM gate via auto-route#459
igerber merged 4 commits into
mainfrom
mpd-absorb-hc2-auto-route

Conversation

@igerber

@igerber igerber commented May 17, 2026

Copy link
Copy Markdown
Owner

Summary

  • Lift MultiPeriodDiD(absorb=..., vcov_type in {"hc2","hc2_bm"}) NotImplementedError gate at estimators.py:1476 via 5-LoC auto-route to fixed_effects= internally — mirrors PR DiD-absorb HC2/HC2-BM: auto-route to fixed_effects internally #458's DiD-absorb pattern; same algebraic justification (FWL preserves coefficients/residuals but not the hat matrix; HC2/CR2 leverage corrections need the full FE-dummy hat).
  • Three-guard reorder so the auto-route sits BETWEEN the absorb+fixed_effects mutual-exclusion check (above) and the multi-absorb+survey-weights reject (below), matching the DiD ordering.
  • Survey-replicate absorb-refit branch at estimators.py:1693 is correctly short-circuited under the auto-route (standard compute_replicate_vcov path applies to the fixed full-dummy design; no per-replicate refit needed) — verified by new JK1 replicate-weights regression test.
  • New TestMPDAbsorbedFERParity class (10 tests): bit-equality auto-route invariants (single-absorb, multi-absorb, HC2-BM), R-parity vs sandwich::vcovHC and clubSandwich::vcovCR (1e-10), df-sensitive inference on both period_effects and avg_att, survey-weighted multi-absorb auto-route, JK1 replicate-weight regression, and mutual-exclusion preservation.
  • After this PR, TODO row 99 narrows to TWFE only (separate surgery — TWFE has no fixed_effects= equivalent path).

Methodology references (required if estimator / math changes)

  • Method name(s): Eicker-Huber-White HC2 leverage correction; Bell-McCaffrey (2002) / Imbens-Kolesar (2016) / Pustejovsky-Tipton (2018) Satterthwaite DOF; Frisch-Waugh-Lovell theorem
  • Paper / source link(s): clubSandwich R/CR-adjustments.R (unweighted CR2 algebra reference); PT2018 §3.3 singleton-cluster CR2 trick for one-way HC2-BM; sandwich vcovHC for HC2 anchor. See docs/methodology/REGISTRY.md § HC2 + Bell-McCaffrey scope-limitation block (now-updated MultiPeriodDiD status).
  • Any intentional deviations from the source: None. Under the auto-route the full-dummy design is bit-identical to lm(y ~ treated + period_X dummies + treated:period_X + factor(unit), data=d); HC2/HC2-BM SEs match R at ~1e-10 (smoke test ≤ 1e-15).

Validation

  • Tests added/updated: tests/test_estimators_vcov_type.py::TestMPDAbsorbedFERParity (10 new tests); existing test_multi_period_absorb_rejects_hc2_and_hc2_bm removed (its gate is now lifted, and the new class covers the same single-absorb shape with the new contract).
  • R goldens: new mpd_absorbed_fe_did scenario in benchmarks/R/generate_clubsandwich_golden.R + regenerated benchmarks/data/clubsandwich_cr2_golden.json — 5-cohort × 5-period event-study fixture (4 ever-treated + 1 never-treated cohort) with HC2 (sandwich) + HC2-BM singleton-cluster CR2 (clubSandwich) targets pinned on treated_period_4.
  • No regressions: 247 tests pass across test_estimators_vcov_type.py (67), test_estimators.py (180), test_linalg_hc2_bm.py.
  • Backtest / simulation / notebook evidence: N/A (analytical-sandwich methodology only; no tutorials updated).

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

igerber and others added 3 commits May 17, 2026 06:44
…ixed_effects=

Mirrors PR #458 (DiD-absorb auto-route) on MultiPeriodDiD: when absorb=
is paired with vcov_type in {hc2, hc2_bm}, the fit promotes the absorb
columns to fixed_effects= internally so the existing full-dummy-design
code path computes the algebraically correct vcov on the event-study
design (treated + period_X dummies + treated:period_X interactions +
factor(unit)). Verified at ~1e-15 vs lm() + sandwich::vcovHC(type="HC2")
and lm() + clubSandwich::vcovCR(cluster=1:n, type="CR2") on a new
5-cohort x 5-period mpd_absorbed_fe_did fixture.

Includes three-guard reorder so the auto-route sits BETWEEN the absorb +
fixed_effects mutual-exclusion check (above) and the multi-absorb +
survey-weights reject (below), matching the DiD ordering.

The survey-replicate absorb-refit branch at estimators.py:1689 is short-
circuited under the auto-route (the standard compute_replicate_vcov path
applies on the fixed full-dummy design; no per-replicate refit needed).

Tests: new TestMPDAbsorbedFERParity class (7 tests) mirrors PR #458's
TestDiDAbsorbedFERParity, pinning parity targets on per-period
interaction coefficients (treated:period_4) to avoid the treated x unit
collinearity baked into MPD's time-invariant ever-treated indicator.
Existing test_multi_period_absorb_rejects_hc2_and_hc2_bm deleted.

REGISTRY.md per-estimator status block updated (MPD moves REJECT ->
SUPPORTED; TWFE remains the only REJECT case). TODO row 99 narrowed to
TWFE-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex local review surfaced two findings on the MPD-absorb gate lift:

P3 (methodology accuracy): REGISTRY/CHANGELOG/test-class docstring
claimed the `treated` main-effect coefficient becomes NaN under
rank deficiency. Empirically false — in the shipped parity fixture
solve_ols drops a never-treated unit dummy (`unit_25`) and keeps
`treated` finite. The pivot order is data-dependent. Rewrite to
say one column from the collinear set is dropped under R-style
rank-deficiency handling; per-period interactions and avg_att are
identified and invariant to that choice.

P2 (test coverage): the new MPD test class missed two surfaces that
the DiD analogue covers:
  1. Survey-weighted multi-absorb auto-route bypass of the
     `len(absorb) > 1 + survey_weights` reject — exercised by new
     `test_absorb_hc2_bm_survey_multi_absorb_auto_routes` with
     parity assertion against the explicit `fixed_effects=` path
     on both `period_effects[target]` and `avg_att`.
  2. The MPD-specific `avg_att` (post-period-average) contrast did
     not have a direct HC2-vs-HC2-BM inference pin. Added
     `test_absorb_hc2_bm_avg_att_df_sensitive_inference` asserting
     same avg_se but different avg_p_value / wider avg_conf_int
     under HC2-BM (the contract guard the per-period
     df_sensitive test cannot reach).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex R2 returned ✅ with two P3s.

P3 (test coverage upgraded to actionable per
feedback_test_coverage_gap_treat_as_actionable.md): the survey test
added in R1 used aweight (generic survey-vcov path), but the
CHANGELOG/REGISTRY claim specifically that the auto-route
short-circuits the absorb-refit replicate-variance branch at
estimators.py:1693. Added test_absorb_hc2_bm_replicate_weights_auto_routes
using SurveyDesign(replicate_method="JK1", replicate_weights=[...])
that exercises the replicate path and pins SE parity vs the explicit
fixed_effects= form on both period_effects[target_period] and avg_att.
Passing at atol=1e-12 confirms the documented short-circuit works as
claimed.

P3 (doc accuracy): REGISTRY/CHANGELOG/test-class docstring described
the `treated` alias as "exactly collinear with the sum of treated-cohort
unit dummies". Under pd.get_dummies(drop_first=True) the exact alias
depends on the omitted FE reference category (and the intercept), not
just on the cohort-dummy sum. Rewrite to say `treated` lies in the span
of the intercept plus the post-auto-route unit FE dummies; which
specific nuisance column gets dropped is pivot-order and dummy-coding
dependent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Copy link
Copy Markdown

CI Codex review on PR #459 surfaced a P1 newly exposed by the auto-route:
when MPD(absorb=["unit","period"]) auto-routes to fixed_effects=["unit",
"period"], the existing fixed_effects= expansion loop adds `period_X`
dummies via `pd.get_dummies(prefix="period")` that collide on name with
the event-study period dummies MPD already builds for non-reference
periods. The duplicate `var_names` entries silently collapse in
`coef_dict = {name: coef for name, coef in zip(var_names, coefficients)}`,
overwriting the real event-study coefficients with the rank-deficient
NaN drops on the redundant FE block. Result: `len(coefficients) <
vcov.shape[0]` and `coefficients["period_X"] = NaN` even though
`period_effects[X]` (read by position) was correct.

Bug was pre-existing on MPD's `fixed_effects=[<time_col>]` path; the
auto-route just made it newly reachable via `absorb=`.

Fix: in MPD's fixed_effects expansion at estimators.py:1604, skip
entries where `fe == time` — MPD's design already absorbs the time
dimension via non-reference period dummies, so the FE-block dummies
would be perfectly redundant anyway (NaN'd by solve_ols, dropping
nothing useful while corrupting the result surface).

Empirical evidence:
- Pre-fix: `MPD(absorb=["unit","period"])` -> len(coefs)=34,
  vcov.shape=(38,38), coefs["period_2"]=NaN.
- Post-fix: same call -> len(coefs)=34, vcov.shape=(34,34),
  coefs["period_2"]=0.345 (finite, matches MPD's event-study fit).

Tests: new `test_absorb_hc2_result_surface_invariants_multi_absorb`
asserts `len(coefficients) == vcov.shape[0]`, no duplicate names, and
finite event-study `period_X` on BOTH the auto-route and the explicit
`fixed_effects=` paths (Codex P2: regression coverage for the result-
surface contract on the newly reachable path). 11/11 MPD tests pass;
249/249 in the broader sweep (test_estimators.py / test_linalg_hc2_bm.py
unchanged).

REGISTRY/CHANGELOG: documented the time-FE skip rule for both auto-route
and pre-existing `fixed_effects=[<time_col>]` invocations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Copy link
Copy Markdown

@igerber igerber added the ready-for-ci Triggers CI test workflows label May 17, 2026
@igerber igerber merged commit 6a3e50b into main May 17, 2026
33 of 34 checks passed
@igerber igerber deleted the mpd-absorb-hc2-auto-route branch May 17, 2026 12:48
HanomicsIMF pushed a commit to HanomicsIMF/diff-diff that referenced this pull request May 22, 2026
…vg_att inference

Closes Gate 6 of the six HC2/HC2-BM NotImplementedError gates:
MultiPeriodDiD(cluster=..., vcov_type="hc2_bm") at estimators.py:1657
previously raised NotImplementedError because _compute_cr2_bm returns
per-coefficient Satterthwaite DOF only — the post-period-average ATT
(`avg_att = (1/n_post) Sum_{t >= t_treat} beta_t`) is a compound
contrast that needed a cluster-aware contrast DOF helper.

New _compute_cr2_bm_contrast_dof in diff_diff/linalg.py generalizes the
per-coefficient loop in _compute_cr2_bm to arbitrary (k, m) contrast
matrices using the identical Pustejovsky-Tipton 2018 Section 4 algebra
(`q = X bread_inv c`, `omega_g = A_g X_g bread_inv c`,
`DOF = trace(B)^2 / trace(B^2)`). _compute_cr2_bm is refactored to
call the new helper via a private _cr2_bm_dof_inner with
`contrasts=eye(k)`; refactor regression at atol=1e-10 confirms the
per-coefficient DOFs are preserved (matmul ordering differs slightly
from the prior inline loop).

MultiPeriodDiD.fit() extends its existing avg_att DOF block (introduced
in PR igerber#459) to branch on effective_cluster_ids: one-way
_compute_bm_dof_from_contrasts when None, cluster-aware
_compute_cr2_bm_contrast_dof otherwise. Cluster IDs are per-observation
length n and are NOT subscripted by the rank-deficient column-drop
mask `_kept` (which indexes coefficients, not observations).

R parity verified at atol=1e-10 against clubSandwich's
Wald_test(constraints=matrix(c, 1), test="HTZ")$df_denom on a new
mpd_clustered_avg_att_dof fixture in
benchmarks/data/clubsandwich_cr2_golden.json. On a 1-row constraint
matrix, HTZ reduces to a Satterthwaite t-test and its df_denom IS the
BM Satterthwaite DOF. The pre-flight smoke test against this same R
target passed at atol=1e-13 before any source edits.

Tests:
- TestCR2BMContrastDOF (4 new tests): refactor regression vs library,
  R-parity for compound contrast, shape validation, cluster-count
  validation.
- test_multi_period_cluster_plus_hc2_bm_rejected flipped to
  test_multi_period_cluster_plus_hc2_bm_produces_finite_inference
  (end-to-end MPD wire-through with finite avg_att / period_effects
  inference assertions).

After this PR, 3 of 6 HC2/HC2-BM gates are lifted (DiD-absorb igerber#458,
MPD-absorb igerber#459, MPD-cluster-contrast-DOF this PR). Remaining: TWFE
absorb (Gate 1), weighted HC2-BM (Gates 4-5).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant