fix(linalg): rank-guard structural (non-covariate) matrix inverses #576
0923411 to 013470b
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good: no unmitigated P0/P1 findings. The prior P1 zero-SE issue appears resolved.
Executive Summary
Methodology
Severity: P3
Severity: None
Code Quality
Severity: None
Performance
Severity: None
Maintainability
Severity: P3
Tech Debt
Severity: P3
Security
Severity: None
Documentation/Tests
Severity: P2
Tests were not run; this was a static PR review.
013470b to 555e44c
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good: no unmitigated P0/P1 findings.
Executive Summary
Methodology
Severity: P3 informational
Severity: None
Code Quality
Severity: P3
Performance
Severity: None
Maintainability
Severity: None
Tech Debt
Severity: P3 informational
Security
Severity: None
Documentation/Tests
Severity: P2
Tests were not run:
ContinuousDiD (ACRT-variance Psi'WPsi), TwoStageDiD (Stage-2 X_2'WX_2, analytical + multiplier-bootstrap surfaces), SpilloverDiD (Wave D A_22), and the Conley spatial-HAC variance (X'WX) inverted their internal design-Gram bread on a LinAlgError-only fallback: np.linalg.inv/solve raise only on an exactly-singular matrix, so a near-singular Gram returned a garbage inverse (~1e13) straight into the SE (ContinuousDiD's exact-singular fallback was a silent minimum-norm pinv; Conley raised ValueError).

Route all four through the shared _rank_guarded_inv (the generalized inverse already used for the covariate IF SEs): it truncates redundant directions on the equilibrated Gram -> a finite SE on the identified subspace (the well-conditioned near-collinear limit, not minimum-norm; NaN only at rank 0) and warns when a direction is dropped. conley imports _rank_guarded_inv lazily (linalg imports conley one-way, so a top-level import would be circular).

Behavior change: a rank-deficient Conley design no longer raises -- it rank-reduces with a warning, uniform with the other structural breads and the covariate convention. Well-conditioned designs are unchanged (the fast path is np.linalg.solve(A, I); R-parity preserved).

Excluded after premise verification (documented in TODO.md): HeterogeneousAdoptionDiD's ZtWX is a non-symmetric IV bread (symmetric-PSD _rank_guarded_inv inapplicable); ImputationDiD's vcov is already rank-guarded upstream via solve_ols (its only raw inverse is a Wald F-test statistic with a safe NaN fallback).

Tests: per-site near-singular -> finite rank-reduced SE + warning; conley direct-call singular -> finite (not raise) + rank-0 -> NaN + a column-drop == near-collinear-limit anchor; the two TwoStage bread-warning tests re-pointed from the np.linalg.solve seam to the _rank_guarded_inv seam (rank-reduce message).

REGISTRY Notes (ContinuousDiD/TwoStageDiD/SpilloverDiD/Conley); CHANGELOG; TODO row retired + had/imputation closed out.

No change to estimands, identifying assumptions, point estimates, or the well-conditioned SE path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
555e44c to ab2e1b6

Summary
Route four structural (non-covariate) Gram inverses through the shared, scale-invariant `_rank_guarded_inv` (`diff_diff/linalg.py`), the same generalized inverse already used for the covariate influence-function SEs (sibling of fix(staggered): scale-equilibrate CS / StaggeredTripleDiff covariate OR fits #570). Affected: `ContinuousDiD` ACRT-variance (`Psi'WPsi`), `TwoStageDiD` Stage-2 (`X_2'WX_2`, both the analytical `two_stage.py` and the multiplier-bootstrap `two_stage_bootstrap.py` surfaces), `SpilloverDiD` Wave D (`A_22`), and the Conley spatial-HAC variance (`X'WX`).
These sites inverted their internal design-Gram bread on a `LinAlgError`-only fallback. `np.linalg.inv`/`solve` raise only on an exactly singular matrix, so a near-singular internal Gram returned a garbage inverse (~1e13) straight into the SE. (`ContinuousDiD`'s exact-singular fallback was additionally a silent minimum-norm `pinv`; Conley raised `ValueError`.) These bases are internal: users cannot perturb them with `covariates=` (distinct from the already-fixed covariate path).
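The failure mode described above can be reproduced outside the library. The sketch below uses a made-up design matrix (not one of the library's internal bases) to show that `np.linalg.inv` returns a finite but enormous inverse for a near-singular Gram, and raises only when the matrix is exactly singular.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
# Hypothetical design: the third column is almost, but not exactly, a copy of the second.
X = np.column_stack([np.ones_like(x), x, x + 1e-6 * rng.normal(size=200)])
near_singular = X.T @ X

# No exception here: the LU factorization finds tiny but nonzero pivots.
near_inv = np.linalg.inv(near_singular)
print(f"cond = {np.linalg.cond(near_singular):.1e}, "
      f"max |inv| = {np.abs(near_inv).max():.1e}")

# Only an exactly singular matrix triggers LinAlgError.
exactly_singular = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.inv(exactly_singular)
    raised = False
except np.linalg.LinAlgError:
    raised = True
print("exactly singular raised:", raised)
```

A `try/except LinAlgError` guard therefore catches only the second case; the first passes through silently with huge entries.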
`_rank_guarded_inv` truncates redundant directions on the equilibrated Gram → a finite SE on the identified subspace (the well-conditioned near-collinear limit, not minimum-norm; NaN only at rank 0) and emits a `UserWarning` when a direction is dropped.
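For readers unfamiliar with the idea, here is a minimal sketch of a column-drop generalized inverse on an equilibrated Gram. It is an illustration under assumptions, not the library's `_rank_guarded_inv`: the function name `rank_guarded_inv_sketch`, the `rtol` default, the greedy Cholesky-style column selection, and the zero-fill for dropped rows and columns are all hypothetical choices made for this example.

```python
import warnings
import numpy as np


def rank_guarded_inv_sketch(gram, rtol=1e-10):
    """Column-drop generalized inverse of a symmetric PSD Gram (illustrative only)."""
    gram = np.asarray(gram, dtype=float)
    k = gram.shape[0]
    if not np.all(np.isfinite(gram)):
        return np.full((k, k), np.nan)

    # Equilibrate to unit diagonal so the rank decision ignores column scale.
    diag = np.diag(gram)
    scale = np.zeros(k)
    pos = diag > 0
    scale[pos] = 1.0 / np.sqrt(diag[pos])
    eq = gram * np.outer(scale, scale)

    # Greedy Cholesky-style sweep: keep column j only if its residual
    # (1 - R^2 on the columns kept so far) exceeds the tolerance.
    resid = eq.copy()
    keep = []
    for j in range(k):
        if resid[j, j] > rtol:
            keep.append(j)
            col = resid[:, j] / np.sqrt(resid[j, j])
            resid -= np.outer(col, col)

    if not keep:              # rank 0: nothing is identified
        return np.full((k, k), np.nan)
    if len(keep) == k:        # fast path for well-conditioned designs
        return np.linalg.solve(gram, np.eye(k))

    warnings.warn(
        f"Gram is rank-deficient: dropped {k - len(keep)} of {k} direction(s).",
        UserWarning,
        stacklevel=2,
    )
    idx = np.ix_(keep, keep)
    inv = np.zeros((k, k))
    # Invert the kept block in equilibrated units, then undo the scaling.
    inv[idx] = np.linalg.solve(eq[idx], np.eye(len(keep))) * np.outer(scale[keep], scale[keep])
    return inv
```

With an exactly duplicated column, this sketch warns, returns a finite matrix, and inverts only the block of retained columns, which is the inverse one would get by dropping the redundant column by hand.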
Behavior change: a rank-deficient Conley design no longer raises; it rank-reduces with a warning, uniform with the other structural breads and the covariate convention. Well-conditioned designs are unchanged (the fast path is `np.linalg.solve(A, I)`; R-parity preserved). `conley.py` imports `_rank_guarded_inv` lazily because `linalg` imports `conley` one-way (a top-level import would be circular).
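The lazy-import pattern can be shown with two throwaway modules. The module names `linalg_demo` and `conley_demo` and their function bodies are hypothetical stand-ins written to a temporary directory; only the import structure mirrors the one described above.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

LINALG_SRC = textwrap.dedent('''
    from conley_demo import conley_bread   # top-level import: linalg -> conley

    def rank_guarded_inv(a):
        return 1.0 / a                      # placeholder for the real inverse
''')

CONLEY_SRC = textwrap.dedent('''
    def conley_bread(a):
        # Function-level import: runs at call time, once linalg_demo is fully loaded.
        from linalg_demo import rank_guarded_inv
        return rank_guarded_inv(a)
''')

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "linalg_demo.py").write_text(LINALG_SRC)
    Path(tmp, "conley_demo.py").write_text(CONLEY_SRC)
    sys.path.insert(0, tmp)
    try:
        linalg_demo = importlib.import_module("linalg_demo")
        result = linalg_demo.conley_bread(4.0)
    finally:
        sys.path.remove(tmp)

print(result)
```

If `conley_demo` instead imported `rank_guarded_inv` at module top level, importing `linalg_demo` first would reach that line while `linalg_demo` is still only partially initialized, and the import would fail.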
Excluded after premise verification (documented in `TODO.md`): `HeterogeneousAdoptionDiD`'s `Zd'WX` is a non-symmetric IV bread, so the symmetric-PSD `_rank_guarded_inv` is methodologically inapplicable; `ImputationDiD`'s vcov is already rank-guarded upstream via `solve_ols` (its only raw inverse is a Wald F-test statistic with a safe `NaN` fallback, not a sandwich bread).
Methodology references (required if estimator / math changes)
`ContinuousDiD` (CGBS 2024), `TwoStageDiD` (Gardner 2022), `SpilloverDiD` (Butts 2021/2023), `ConleySpatialHAC` (Conley 1999): the influence-function / sandwich variance (SE) path only.
`docs/methodology/REGISTRY.md`: a new `- **Note (rank-guarded ...)**` in each of the four estimator sections, cross-referencing the CallawaySantAnna "rank-guarded IF standard errors" Note (the established column-drop generalized-inverse semantics).
Applies the column-drop generalized-inverse convention (already used for CS / TripleDifference / StaggeredTripleDifference covariate IF SEs) to the structural breads. The Conley raise→rank-reduce is the documented, intentional behavior change. No change to estimands, identifying assumptions, point estimates, or the well-conditioned SE path.
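As a quick illustration of why the well-conditioned sandwich path is unaffected by forming the bread inverse explicitly, the sketch below compares `bread_inv @ meat @ bread_inv` with a two-solve formulation on random data. The matrices are made up, and the exact form of the earlier two-solve code is an assumption; the identity relies only on the bread and meat being symmetric.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
bread = X.T @ X                      # symmetric positive definite Gram
S = rng.normal(size=(50, 3))
meat = S.T @ S                       # symmetric PSD stand-in for the HAC meat

# Explicit-inverse sandwich, using the fast path solve(A, I).
bread_inv = np.linalg.solve(bread, np.eye(3))
vcov_explicit = bread_inv @ meat @ bread_inv

# Two symmetric solves: A^{-1} M A^{-1} = solve(A, solve(A, M).T)
# because A and M are both symmetric.
vcov_two_solves = np.linalg.solve(bread, np.linalg.solve(bread, meat).T)

print(np.allclose(vcov_explicit, vcov_two_solves))
```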
Validation
`tests/test_conley_vcov.py` (direct-call: singular Gram → finite rank-reduced SE, not a raise; rank-0 → NaN; column-drop == near-collinear-limit anchor), `tests/test_continuous_did.py` + `tests/test_spillover.py` (rank-drop warning + finite SE via the `_rank_guarded_inv` seam), `tests/test_two_stage.py` (the two Stage-2 bread-warning tests re-pointed from the `np.linalg.solve` seam, which now collides with `_rank_guarded_inv`'s internal solve, to the `_rank_guarded_inv` seam, asserting the rank-reduce message).
Suites: `test_continuous_did` / `test_two_stage` / `test_spillover` / `test_conley_vcov` / `test_methodology_conley` / `test_methodology_two_stage`; `mypy` 0 new errors vs baseline; source edits `black`/`ruff`-clean. The Conley sandwich `bread_inv @ meat @ bread_inv` is algebraically identical to the prior two symmetric solves (verified). Well-conditioned no-regression is covered by the existing suites (the fast path is the unchanged `np.linalg.solve(A, I)`).
Security / privacy
🤖 Generated with Claude Code