Comparing changes
base repository: igerber/diff-diff
base: v3.6.0
head repository: igerber/diff-diff
compare: v3.6.1
- 12 commits
- 60 files changed
- 3 contributors
Commits on Jun 29, 2026
commit 491b8d5

test(lpdid): add R-parity validation harness (Dube et al. 2025), Phase B2 (#583)

Pin the absorbing LPDiD estimator against the method authors' own R recipes (danielegirardi/lpdid) with an alexCardazzi/lpdid cross-check gate:

- benchmarks/R/generate_lpdid_golden.R: in-R panel (+ interior-gap unit) and 6 variants (variance-weighted, reweight, pmd, direct-covariate, pooled, RA-point); writes committed lpdid_test_panel.csv + lpdid_golden.json.
- tests/test_methodology_lpdid.py: skip-guarded parity (att/se to ~1e-12, cross-platform asserted at 1e-6/1e-7).
- benchmarks/python/coverage_lpdid_ra.py + lpdid_ra_coverage.json: ungated Monte-Carlo study validating the RA influence-function SE calibration (~0.95).

Resolves the two provisional REGISTRY deviation notes in the library's favour with no estimator change: the RA SE matches the Stata teffects convention (point-anchored, SE pinned + coverage-validated; no R-package analogue), and the pooled estimand matches the authors' fixed-composition recipe (correcting the prior "horizon-stacked" wording). no_composition documented as more paper-faithful than the R packages (B1-tested). Ticks the REGISTRY B2 checklist box.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
commit 48e1f4c

feat(staggered): materialize non-estimable (g,t) cells as NaN in CallawaySantAnna (#582)

* feat(staggered): materialize non-estimable (g,t) cells as NaN in CallawaySantAnna

Uniformly materialize a NaN entry (with a machine-readable skip_reason) for every non-estimable (g,t) group-time cell across all CS estimation paths (no-covariate regression, covariate regression, IPW/DR, repeated cross-section, survey-weighted) instead of omitting it. Previously only the covariate-singular case materialized NaN; the other paths dropped the cell silently from the grid.

Cells carry no influence-function entry, so they are excluded from every aggregation (simple/group/calendar/event-study), balance_e, and bootstrap -- all aggregate point estimates and SEs, plus event-study n_groups / by-group n_periods, are numerically unchanged and continue to match R did's aggte(). A fit where no cell is estimable still raises ValueError. to_dataframe("group_time") now includes the NaN rows and a skip_reason column. Documented per-cell surface deviation from R's att_gt (which omits the rows).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(staggered): uniform no-IF NaN cells + cover covariate paths (review #582)

- The covariate-regression non-finite cell now materializes via _nan_gt_entry with NO influence-function entry, matching the other paths and the documented REGISTRY/helper contract (previously it wrote a zero-IF entry and ran batch inference). Aggregates and SEs are unchanged -- the cell was finite-masked / IF-membership-filtered out either way; now the "NaN cells carry no IF entry" invariant holds uniformly across all paths.
- Extend the materialization test to cover covariate IPW/DR (panel + RCS) paths.
- Remove the now-implemented CallawaySantAnna NaN-cell row from TODO.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(staggered): treat non-finite ATT(g,t) as non-estimable in general/RCS paths (review #582 P1)

The general (IPW/DR) and RCS estimable builders treated `att_gt is not None` as estimable even when att_gt was non-finite (NaN/inf): they stored effect=att_gt (which could surface inf), kept the influence-function entry, and did not count the cell in the consolidated skip total. Both now branch on `att_gt is None or not np.isfinite(att_gt)` first and materialize via _nan_gt_entry(skip_reason="non_finite_regression") with NO IF entry, so the documented contract (non-estimable cells are NaN entries, carry no IF, excluded from the bootstrap) holds uniformly across every path. Aggregate estimates and SEs are unchanged (these cells were finite-masked / IF-filtered out either way). Adds a regression test mimicking the inf-ATT-with-IF leak.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(staggered): guard non-finite ATT in the no-covariate path too (review #582 P0)

Completes the uniform invariant: every CallawaySantAnna estimable-cell builder (no-covariate vectorized, covariate regression, general IPW/DR, RCS) now routes a non-finite ATT(g,t) to _nan_gt_entry(skip_reason="non_finite_regression") with NO influence-function entry, skipping batch inference and bootstrap membership. The no-covariate diff-in-means ATT is finite given n_t,n_c>0, but a non-finite outcome (inf survives the NaN-only valid mask) could otherwise store inf as the effect and produce t_stat=inf / p=0 / infinite CI via safe_inference_batch. Aggregate estimates and SEs are unchanged. Adds a regression test injecting an inf outcome through the no-covariate path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(staggered): omit all-non-estimable relative-time buckets from event study (review #582 P1)

_aggregate_event_study() appended an all-NaN row (effect=NaN, se=NaN, n_groups=0) for a relative-time bucket whose cells are all non-estimable, instead of omitting the bucket. With NaN-cell materialization this surfaced new all-NaN event-study rows where the bucket previously had no cells (and thus no row) -- an aggregate-surface change vs the prior omit behavior and R did::aggte(). The bucket is now dropped when finite-filtering leaves no cell (tracked via a kept-periods list so the result lists stay aligned), matching _aggregate_by_group, which already omits all-NaN groups. Adds a test asserting an all-non-estimable relative time is absent from event_study_effects.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(staggered): drop nonexistent "calendar" aggregation from CS NaN-cell notes (review #582 P3)

CallawaySantAnna's aggregation options are simple / event_study / group / all (there is no calendar aggregation). Remove "calendar" from the REGISTRY edge-case Note and the CHANGELOG entry listing which aggregations exclude materialized NaN cells.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(staggered): report real treated/control counts on non-estimable cells (review #582 P2)

The per-cell helper sentinel returns (and the covariate-reg empty-control batch site) hardcoded n_treated=n_control=0 for materialized NaN cells even after the observation masks had been computed, so group_time_effects / to_dataframe could show zero counts for a cell that actually had treated (or control) observations. The zero-control / zero-weight exits in _compute_att_gt_fast and _compute_att_gt_rc, and the covariate-reg empty-control batch site, now return the observed counts; missing-period exits (masks not yet built) keep 0. Display-only metadata -- estimates, SEs, and aggregation are unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(staggered): handle materialized NaN cells in ported CS tests

Three test_csdid_ported.py tests relied on non-estimable (g,t) cells being OMITTED from group_time_effects; with the new NaN-cell materialization those cells are present as NaN, so the tests' membership / golden-iteration guards no longer skip them. Update them to preserve their intent on the FINITE cells:

- test_some_units_treated_first_period: a first-period cohort (no base period) is now all-NaN (missing_period) rather than absent -> assert it is all-NaN.
- test_zero_pretreatment_outcomes: skip NaN pre-cells (the last cohort under not_yet_treated has no controls); finite pre-cells are still ~0.
- test_golden_fewer_periods: skip NaN cells (gapped panel where base g-1 is unobserved -> missing_period; R falls back to an available base) -> R-parity on the finite cells.

No source change; the cells are correctly non-estimable, only now visible.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
commit 81f4e84

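The invariant this commit describes (a non-estimable cell becomes a NaN entry with a skip_reason, carries no influence-function entry, and an all-NaN relative-time bucket is omitted from the event study) can be illustrated with a small sketch. This is not the library's code: `nan_gt_entry`, `build_cell`, and `event_study` are hypothetical stand-ins, the dictionary layout is an assumption, and the equal-weight average stands in for the real aggregation weights.

```python
import math

def nan_gt_entry(skip_reason, n_treated=0, n_control=0):
    """Hypothetical stand-in for a materialized non-estimable (g, t) cell."""
    return {"effect": math.nan, "skip_reason": skip_reason,
            "n_treated": n_treated, "n_control": n_control}

def build_cell(att_gt, inf_func, n_treated, n_control):
    """Return (entry, influence function or None) for one (g, t) cell."""
    # Branch on None OR non-finite first, so inf/NaN never surfaces as an effect
    # and the cell never keeps an influence-function entry.
    if att_gt is None or not math.isfinite(att_gt):
        return nan_gt_entry("non_finite_regression", n_treated, n_control), None
    entry = {"effect": att_gt, "skip_reason": None,
             "n_treated": n_treated, "n_control": n_control}
    return entry, inf_func

def event_study(cells):
    """Equal-weight average of finite cells by e = t - g (assumed weighting)."""
    buckets = {}
    for (g, t), entry in cells.items():
        buckets.setdefault(t - g, []).append(entry["effect"])
    out = {}
    for e in sorted(buckets):
        finite = [x for x in buckets[e] if math.isfinite(x)]
        if not finite:
            continue  # every cell in this bucket is non-estimable: omit the row
        out[e] = sum(finite) / len(finite)
    return out

# Toy grid: one inf cell, one missing cell, two estimable cells.
raw = {(2, 2): 1.0, (3, 3): math.inf, (2, 3): 3.0, (2, 4): None}
cells, inf_funcs = {}, {}
for gt, att in raw.items():
    entry, inf = build_cell(att, inf_func=[0.0], n_treated=5, n_control=7)
    cells[gt] = entry
    if inf is not None:
        inf_funcs[gt] = inf

es = event_study(cells)
```

In this toy grid every cell stays visible in `cells`, only the two finite cells keep an influence-function entry, and relative time 2 (whose only cell is non-estimable) is absent from `es`.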
Add LPDiD non-absorbing treatment (entry-effect estimands), Phase C1 (#584)

* feat(lpdid): non-absorbing treatment (entry-effect estimands), Phase C1

Implement Dube, Girardi, Jorda & Taylor (2025) Section 4.2 non-absorbing treatment for LPDiD via a new `non_absorbing` parameter:

- "first_entry" (Eq. 12): effect of entering treatment for the first time and staying treated; reuses the absorbing clean control, restricts only the treated set. Bit-identical to the absorbing path on absorbing panels.
- "effect_stabilization" (Eq. 13, `stabilization_window=L`): units whose treatment has been stable for >= L periods serve as clean controls, so estimation is feasible with few/no never-treated units.

Default `non_absorbing=None` is unchanged (absorbing path, still rejects non-absorbing input). Mode-aware clean-sample masks evaluate window conditions via cumulative treatment-change/level lookups with a documented "untreated before the first observed period" boundary convention; placebo horizons use the full pre-span window so pre-trends are uncontaminated; a per-horizon clean-treated indicator threads through the estimator / RA / reweight / pooled paths so re-entry events are classified correctly. Non-absorbing modes require a gap-free panel within each unit's observed span.

Pure-Python validation (tests/test_lpdid.py::TestLPDiDNonAbsorbing): absorbing reduction, single-cohort reduction, re-entry mechanism, boundary retention, negative-horizon placebos, non-negative weighting, stabilized-control admission, equal-weight recovery, and DGP recovery; absorbing tests + R-parity goldens unchanged.

Exit-event dynamics, R-package parity (PR-C2), and survey-design support are tracked follow-ups.

* fix(lpdid): non-absorbing pooled-pre uses deepest reach-back horizon

Codex P1: `_build_pooled_sample(kind="pre")` passed horizon=0 to the non-absorbing masks, so the effect_stabilization clean window only covered [t-L, t] instead of the pooled-pre reach-back to the most-negative horizon ([t - max(L, -h), t-1]). A unit with a prior treated spell at t-3 (clean at t-1) leaked into a [-3, -2] pooled-pre placebo and biased it. Pre windows now use min(horizons); the absorbing branch keeps horizon=0 (not-yet-treated at t already implies a clean pre-span, so its R-parity goldens are unchanged). Adds a deterministic regression test (spell entrants excluded from the pooled-pre sample; verified to fail at 0.286 before the fix).
commit 4f1a0a3

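The "cumulative treatment-change lookup" idea mentioned in this commit can be sketched for a single unit. This is an illustration under stated assumptions, not the library's implementation: `stable_mask` is a hypothetical name, the window is taken to be [t-L, t] as quoted in the fix note, and any unit with no treatment change in that window counts as stable regardless of its treatment level.

```python
import numpy as np

def stable_mask(d, L):
    """True at period t if the 0/1 path d has no treatment change in [t-L, t].

    Hypothetical helper. Periods before the first observed one are treated
    as untreated (the boundary convention named in the commit message).
    """
    d = np.asarray(d, dtype=int)
    # change[t] = 1 if treatment status differs from the previous period;
    # prepend=0 encodes "untreated before the first observed period".
    change = np.abs(np.diff(d, prepend=0))
    # cum[k] = number of changes in periods strictly before k.
    cum = np.concatenate(([0], np.cumsum(change)))
    t = np.arange(d.size)
    lo = np.maximum(t - L, 0)  # clip the window at the start of the panel
    n_changes = cum[t + 1] - cum[lo]  # changes inside [lo, t]
    return n_changes == 0

# One unit that enters treatment at t=2 and exits at t=6.
path = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
stable = stable_mask(path, L=2)
```

The prefix sum makes each window query a constant-time subtraction, which is presumably why a cumulative lookup is attractive when the same condition is evaluated for every unit, period, and horizon.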
commit 8b91688

feat(estimators): iterative alternating-projection demeaning for N-way absorbed FE (#586)

N>1 absorbed fixed effects used single-pass sequential demeaning, which is the exact (weighted) Frisch-Waugh-Lovell residualization only on balanced orthogonal-FE panels; on unbalanced panels it was a biased approximation (coefficients off by ~1e-2 in tested cases).

Add an N-way method-of-alternating-projections engine demean_by_groups() in utils.py; route the DiD/MultiPeriodDiD absorb= paths and the shared two-way within_transform() through it, fixing TWFE / SunAbraham / BaconDecomposition on unbalanced unweighted panels too. Lift the weighted-multi-absorb rejection (now supported via weighted MAP).

Single-absorb and balanced-panel results are byte-stable; the weighted within_transform output is bit-identical; R-parity goldens unchanged.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
commit 6126f9b

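To make the single-pass versus iterative distinction concrete, here is a minimal sketch of the method of alternating projections on a small unbalanced two-way panel. It is not the library's demean_by_groups(): the function name `demean_by_groups_sketch`, its signature, the stopping rule, and the toy data are all assumptions, and weights are assumed strictly positive within every group.

```python
import numpy as np

def demean_by_groups_sketch(x, group_ids, weights=None, tol=1e-10, max_iter=10_000):
    """Sweep (weighted) group-mean removal over each factor until nothing moves."""
    x = np.asarray(x, dtype=float).copy()
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    # Integer codes per factor, plus the total weight of each group.
    codes = [np.unique(g, return_inverse=True)[1] for g in group_ids]
    wsum = [np.bincount(c, weights=w) for c in codes]
    for _ in range(max_iter):
        x_old = x.copy()
        for c, ws in zip(codes, wsum):
            means = np.bincount(c, weights=w * x) / ws  # weighted group means
            x -= means[c]
        if np.max(np.abs(x - x_old)) < tol:  # a full sweep changed nothing
            break
    return x

# Unbalanced panel: units are observed in different sets of periods.
unit = np.array([0, 0, 0, 1, 1, 2, 2, 2, 3])
time = np.array([0, 1, 2, 0, 2, 1, 2, 3, 3])
y = np.random.default_rng(0).normal(size=unit.size)

# Exact residualization: regress y on the full set of unit and time dummies.
dummies = np.column_stack([unit[:, None] == np.arange(4),
                           time[:, None] == np.arange(4)]).astype(float)
exact = y - dummies @ np.linalg.lstsq(dummies, y, rcond=None)[0]

one_pass = demean_by_groups_sketch(y, [unit, time], max_iter=1)  # old behaviour
iterated = demean_by_groups_sketch(y, [unit, time])              # MAP
```

On this panel the iterated result agrees with the dummy-variable residual while the single sweep does not, because removing time means after unit means reintroduces nonzero unit means whenever the two projections do not commute.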
Commits on Jun 30, 2026
commit 21f0c30

Add LPDiD complex-survey-design support (Phase D1) (#590)

* Add LPDiD complex-survey-design support (Phase D1)

Adds a `survey_design=` argument to `LPDiD.fit()` (a `SurveyDesign` with probability weights + optional strata/PSU/FPC), matching the library-wide fit()-time convention. On the variance-weighted default path each horizon's long-difference regression is fit by WLS on the survey weights, and the SE is the stratified-PSU Taylor-linearization (Binder TSL) sandwich with `df = n_PSU - n_strata`, reusing `diff_diff/survey.py` (`compute_survey_vcov`). The design is re-resolved on each realized (post-clean-control) sample so weights/strata/PSU align with the regression rows; with no explicit PSU the unit is injected as the PSU. Fails closed to NaN on under-identified samples.

Rejects `survey_design` with `reweight=True` (the equally-weighted / regression-adjustment IF path), replicate-weight designs, and non-pweight types (deferred follow-ups). `LPDiDResults` gains `survey_metadata` / `n_strata` / `n_psu`, a `"survey_tsl"` vcov_type, and a Survey Design block in `summary()`. The non-survey path is byte-for-byte unchanged.

Validated against `survey::svyglm` on the stacked long difference (numeric golden parity is the D2 follow-up); 15 new pure-Python invariant tests (reduction/unit-clustering, FPC-shrinks-SE, stratification, lonely-PSU, NaN-consistency, weighting-moves-point, metadata, rejection paths).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(lpdid): report survey PSU count as headline G for only_event survey fits

CI-codex P2: under a survey design the effective variance cluster is the PSU (cluster_name reports the PSU column), but for only_event=True fits (pooled is None) headline_n_clusters fell back to the panel unit count -- so an explicit PSU design with n_psu != n_units could display the unit count mislabeled as G=<psu>. Per-row event-study n_clusters and inference were already computed on the realized survey design, so this was a metadata/labeling issue only, not a wrong SE/p-value.

Fix: when a survey design is active, seed headline_n_clusters from the panel-level effective PSU count (the pooled-post override still prefers the realized survey-sample count when available). Regression test added (only_event=True, explicit PSU, n_psu != n_units).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(lpdid): build survey sandwich on the kept-column design (rank-deficient contract)

CI-codex P1: `_estimate_survey_sample` computed the Binder TSL sandwich on the UNREDUCED design and recomputed `response - design @ coef`. When the rank handler drops a redundant direct-inclusion covariate / absorbed dummy / lag (setting that coef to NaN while `treatment_entry` stays identified), the NaN coef propagated through the residuals and the full-design X'WX bread singularized, collapsing an otherwise-identified treatment SE/t/p/CI to NaN -- violating the library's rank-deficient contract that the non-survey solve_ols path honors.

Fix: keep solve_ols's returned residuals (original-scale, computed on the identified reduced design) and build `compute_survey_vcov` on `design[:, kept]` where `kept = isfinite(coef)`, mapping treatment back to its kept-column index. If treatment itself is dropped, the effect is NaN and the SE stays NaN; `rank_deficient_action="error"` still raises from solve_ols. Regression test (duplicate + constant covariate, `silent`) asserts the treatment SE stays finite and equals the non-redundant reference fit, and that `error` raises.

CI-codex P2: type-check `survey_design` before `_survey_columns` accesses its attributes, so a non-SurveyDesign argument raises the intended TypeError rather than an incidental AttributeError (test added).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
commit c8928d2

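The kept-column fix in the last sub-commit can be illustrated with plain NumPy. This sketch makes several assumptions: the rank handler is simulated by hand, an unweighted HC0 sandwich stands in for the survey TSL sandwich, and none of the names below come from the library.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
treat = rng.integers(0, 2, size=n).astype(float)
x1 = rng.normal(size=n)
# The last column duplicates x1, so the full design is rank deficient.
design = np.column_stack([np.ones(n), treat, x1, x1])
y = 2.0 * treat + 0.5 * x1 + rng.normal(size=n)

# Simulate a rank handler: fit on the identified columns, report NaN for the dropped one.
coef = np.full(design.shape[1], np.nan)
identified = [0, 1, 2]
coef[identified] = np.linalg.lstsq(design[:, identified], y, rcond=None)[0]

# Naive approach: the NaN coefficient poisons every residual.
naive_resid = y - design @ coef

# Kept-column approach: residuals and sandwich on the reduced design only.
kept = np.isfinite(coef)
X = design[:, kept]
resid = y - X @ coef[kept]
bread = np.linalg.inv(X.T @ X)
scores = X * resid[:, None]
vcov = bread @ (scores.T @ scores) @ bread  # HC0 stand-in, not the TSL sandwich

# Map the treatment column from its full-design index to its kept-column index.
treat_full_idx = 1
treat_kept_idx = int(np.flatnonzero(kept).tolist().index(treat_full_idx))
se_treat = float(np.sqrt(vcov[treat_kept_idx, treat_kept_idx]))
```

The point of the index mapping is that dropping columns shifts positions: a treatment column that sits after a dropped column would land at a smaller index in the reduced design.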
docs(survey): waive zero-weight-PSU SE-invariance item; lock Lumley full-design convention (#589)

* docs(survey): waive zero-weight-PSU SE-invariance item; lock Lumley full-design convention

Re-examined the TODO row proposing the survey TSL finite-sample correction count only positive-weight PSUs so the SE is invariant to zero-weight (subpopulation / padded) rows. Investigation shows the premise conflicts with the library's documented, R-validated convention:

- `_compute_stratified_psu_meat`'s per-stratum correction (1 - f_h)*n_PSU_h/(n_PSU_h - 1) and PSU-mean centering intentionally keep genuine-subpopulation zero-weight PSUs. This is the full-design domain estimator of Lumley (2004 Section 3.4) / R survey::svyrecvar(subset()), already documented in REGISTRY section "Subpopulation Analysis".
- The ATT is exactly invariant; the survey SE is deliberately NOT invariant to genuine-subpopulation zeroing (it should differ from a naive physical subset -- that is the whole point of subpopulation()). R produces the matching SE (only df differs).
- Zero-weight rows that reuse an existing PSU label are already bit-invariant. The only invariance-violating shape -- appending synthetic new all-zero PSUs -- arises in no estimator path (domain padding goes through prep.py's zero-padded full-design cell variance, which retains the real PSU layout).

Forcing the meat to positive-weight-only counting would break the documented Lumley/R parity, so the item is waived (no estimator behavior change):

- TODO.md: move the row from Actionable Backlog to "Won't-fix / waived (decisions on the record)" with the Lumley/R justification.
- REGISTRY.md: add a Note in section "Subpopulation Analysis" making explicit that the TSL meat finite-sample correction counts zero-weight PSUs by design.
- tests/test_survey.py: add TestZeroWeightPsuConventionWaiver regression-lock (inert existing-PSU padding is bit-invariant; subpopulation zeroing keeps the full PSU structure so its SE differs from a naive subset). A future positive-weight-only change would collapse the two and trip the test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(survey): add direct _compute_stratified_psu_meat full-design unit test

Addresses the CI review's actionable P3 (Documentation/Tests): the SE-level test only asserts that subpopulation zeroing differs from a physical subset, which catches a full positive-weight-only rewrite but could miss a partial edit (e.g. changing only the finite-sample denominator while still centering over the zero PSU). Adds a direct unit test on _compute_stratified_psu_meat with crafted PSU scores including one all-zero-score PSU (a fully zeroed subpopulation PSU), asserting the exact full-design meat formula (n_PSU_h including the zero PSU) and that it is NOT the positive-weight-only meat. Any change to the centering OR the denominator now trips the lock.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(survey): model a true zero-weight PSU in the direct meat fixture

Addresses the re-review's actionable P3: the direct meat test represented its all-zero PSU via zero score rows only, with weights left all ones. A future denominator-only edit that reads resolved.weights to drop positive-weight PSUs would not have been caught. Set PSU 2's weights to 0 so the fixture models a true fully zero-weight subpopulation PSU. The current meat ignores weights (it operates on scores), so the expected value is unchanged; the change only hardens the regression lock.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
commit 6b052e6

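The difference between full-design and positive-weight-only counting can be worked through on a three-PSU example. The sketch below implements the per-stratum term quoted in the commit, (1 - f_h)*n_PSU_h/(n_PSU_h - 1) applied to centered PSU score totals, as a standalone function. It is not `_compute_stratified_psu_meat`: the name `stratified_psu_meat`, the `sampling_fraction` argument, and the handling of single-PSU strata are assumptions.

```python
import numpy as np

def stratified_psu_meat(scores, strata, psu, sampling_fraction=None):
    """Sum over strata of (1 - f_h) * n_h / (n_h - 1) * sum_i (z_hi - zbar_h)(z_hi - zbar_h)'."""
    scores = np.asarray(scores, dtype=float)
    strata = np.asarray(strata)
    psu = np.asarray(psu)
    k = scores.shape[1]
    meat = np.zeros((k, k))
    for h in np.unique(strata):
        in_h = strata == h
        labels = np.unique(psu[in_h])
        n_h = labels.size  # counts every PSU present, including all-zero ones
        if n_h < 2:
            continue  # single-PSU stratum contributes nothing in this sketch
        # z_hi: score total of PSU i in stratum h.
        totals = np.vstack([scores[in_h & (psu == p)].sum(axis=0) for p in labels])
        centered = totals - totals.mean(axis=0)
        f_h = 0.0 if sampling_fraction is None else sampling_fraction[h]
        meat += (1.0 - f_h) * n_h / (n_h - 1) * centered.T @ centered
    return meat

# One stratum, three PSUs; PSU 2 is a fully zeroed subpopulation PSU.
scores = np.array([[1.0], [3.0], [0.0]])
strata = np.array([0, 0, 0])
psu = np.array([0, 1, 2])

full_design = stratified_psu_meat(scores, strata, psu)
physical_subset = stratified_psu_meat(scores[:2], strata[:2], psu[:2])
```

With the zero PSU kept, the totals are centered at 4/3 and scaled by 3/2, giving 7.0; dropping it centers at 2 and scales by 2, giving 4.0. That gap is the non-invariance the commit says is intentional.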
commit 193f0ea

Commits on Jul 1, 2026
commit 6596d1b

Release 3.6.1. Changes since 3.6.0:

- LPDiD non-absorbing (reversible) treatment with entry-effect estimands (Dube, Girardi, Jorda & Taylor 2025) + complex-survey-design support (survey_design=), each R-parity validated.
- TROP non-absorbing (on/off) treatment support, opt-in local method (Athey, Imbens, Qu & Viviano 2025).
- Weighted multiple absorbed fixed effects (absorb=[a, b, ...]) via iterative alternating-projection demeaning.
- CallawaySantAnna materializes non-estimable (g,t) cells as NaN.
- Fix: BusinessReport appendix render failures now surfaced.
- R-parity validation backfill for the LPDiD absorbing/non-absorbing/survey paths; survey zero-weight-PSU SE-invariance item waived (Lumley full-design convention); SciPy lower-bound doc alignment.

Promotes the CHANGELOG [Unreleased] section to [3.6.1] - 2026-07-01 and syncs the version across __init__.py, pyproject.toml, rust/Cargo.toml, llms-full.txt, and CITATION.cff.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
commit 2bb4be9

You can try running this command locally to see the comparison on your machine:
git diff v3.6.0...v3.6.1
