perf(efficient-did): cache polynomial sieve basis across DR nuisance fits by igerber · Pull Request #556 · igerber/diff-diff · GitHub

perf(efficient-did): cache polynomial sieve basis across DR nuisance fits#556

Merged
igerber merged 1 commit into main from perf/efficient-did-sieve-basis-cache
Jun 27, 2026

@igerber igerber commented Jun 27, 2026

Summary

  • Cache the polynomial sieve basis across the three EfficientDiD doubly-robust (DR) nuisance fits (outcome regression, propensity ratio, inverse propensity). Previously each helper looped K=1..k_max and rebuilt _polynomial_sieve_basis(covariate_matrix, K) from scratch. Because all three receive the same fit-level covariate_matrix, the (n × n_basis) basis for every shared degree was recomputed once per helper, for every (g,t) cell.
  • Add a per-fit memoization (_sieve_basis_cached, keyed on (id(X), degree)) that the orchestrator threads into the three helpers, so each distinct degree's basis is built once and shared. basis_cache defaults to None (plain pass-through), so standalone callers are unchanged.
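
The memoization described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the function names drop the leading underscore to mark them as stand-ins, the basis builder is a hypothetical pure-Python substitute (intercept plus all monomials up to the given degree) for the real _polynomial_sieve_basis, and the cache is assumed to be a plain dict owned by the orchestrator.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_sieve_basis(X, degree):
    """Hypothetical stand-in: intercept plus all monomials up to `degree`.

    X is a list of rows; the real helper works on the fit-level
    covariate_matrix and its internals are not shown in this PR.
    """
    n_cols = len(X[0])
    exponents = [()]  # the empty product is the intercept column
    for d in range(1, degree + 1):
        exponents.extend(combinations_with_replacement(range(n_cols), d))
    return [[prod(row[j] for j in combo) for combo in exponents] for row in X]


def sieve_basis_cached(X, degree, basis_cache=None):
    """Return the basis for (X, degree), reusing a cached object if present.

    basis_cache=None is a plain pass-through, so callers that do not
    supply a cache get a freshly built basis every time.
    """
    if basis_cache is None:
        return polynomial_sieve_basis(X, degree)
    key = (id(X), degree)  # identity of X, not its contents
    if key not in basis_cache:
        basis_cache[key] = polynomial_sieve_basis(X, degree)
    return basis_cache[key]


# One cache per fit, shared by every helper that receives the same X.
X = [[1.0, 2.0], [3.0, 4.0]]
cache = {}
first = sieve_basis_cached(X, 2, cache)
second = sieve_basis_cached(X, 2, cache)  # cache hit: same object as `first`
fresh = sieve_basis_cached(X, 2)          # no cache: rebuilt from scratch
```

Keying on id(X) avoids hashing the matrix contents, but an id is only unique while the object is alive, which is one reason to keep the cache scoped to a single fit rather than module-wide.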

Methodology references (required if estimator / math changes)

  • Method name(s): EfficientDiD (Chen, Sant'Anna & Xie 2025) — covariate DR sieve path
  • Paper / source link(s): docs/methodology/REGISTRY.md §EfficientDiD (polynomial sieve nuisances)
  • Any intentional deviations from the source (and why): None. This is a pure performance memoization: _polynomial_sieve_basis(X, K) is a referentially transparent function of (X, degree), and the nuisance helpers only read its output (no in-place mutation), so reusing one object is bit-identical to rebuilding it. Sieve degree selection, weighting, normal equations, Ω* construction, ATT, EIF, and SE logic are all unchanged.
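
The "read only, no in-place mutation" condition is what makes sharing one basis object safe. The toy example below shows the hazard it rules out. Both helpers are hypothetical and are not functions from the codebase: one only reads the shared basis, the other rescales it in place and would silently change what every later consumer sees.

```python
import copy


def read_only_helper(basis):
    """Reads the shared basis without modifying it (column sums)."""
    return [sum(col) for col in zip(*basis)]


def mutating_helper(basis):
    """Counter-example: rescales the shared rows in place."""
    for row in basis:
        for j in range(len(row)):
            row[j] *= 2.0
    return [sum(col) for col in zip(*basis)]


shared = [[1.0, 2.0], [1.0, 3.0]]  # stands in for a cached basis
snapshot = copy.deepcopy(shared)   # independent copy for comparison

read_result = read_only_helper(shared)
unchanged_after_read = shared == snapshot      # still equal to the snapshot

mutating_helper(shared)
unchanged_after_mutation = shared == snapshot  # no longer equal
```

A snapshot comparison of this kind is one simple way to check that a consumer leaves a cached object untouched.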

Validation

  • Tests added/updated: tests/test_efficient_did.py — new TestSieveBasisCache (cache-hit returns the same object and equals a fresh build bit-for-bit; cache=None pass-through; reads do not mutate the cached basis; end-to-end fit builds each distinct degree exactly once across the three helpers).
  • Bit-identity evidence: captured overall ATT + all 18 group_time effect/se on a fixed-seed covariate DR fit before vs after the change — exact match (atol=0). Full tests/test_efficient_did.py (176) and tests/test_methodology_efficient_did.py (27 + slow covariate) suites pass. No methodology/behavior change, so no REGISTRY/CHANGELOG edit.
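
The "each distinct degree exactly once" check can be approximated with a build counter. The sketch below is not the TestSieveBasisCache code. Every name in it is hypothetical, the basis builder is a one-covariate stand-in, and it assumes all three helpers visit the same degrees K=1..k_max.

```python
from collections import Counter

build_counts = Counter()  # degree -> number of times the basis was built


def build_basis(X, degree):
    """Hypothetical stand-in for _polynomial_sieve_basis (one covariate)."""
    build_counts[degree] += 1
    return [[x ** k for k in range(degree + 1)] for x in X]


def cached_basis(X, degree, basis_cache=None):
    """Memoize on (id(X), degree); None means plain pass-through."""
    if basis_cache is None:
        return build_basis(X, degree)
    key = (id(X), degree)
    if key not in basis_cache:
        basis_cache[key] = build_basis(X, degree)
    return basis_cache[key]


def nuisance_helper(X, k_max, basis_cache=None):
    """Stand-in helper: visits K = 1..k_max and only reads each basis."""
    return [len(cached_basis(X, K, basis_cache)[0]) for K in range(1, k_max + 1)]


def builds_per_degree(basis_cache, k_max=3, n_helpers=3):
    """Run the stand-in helpers on one shared X and report build counts."""
    build_counts.clear()
    X = [0.5, 1.0, 2.0]
    for _ in range(n_helpers):  # e.g. outcome regression and the two propensity fits
        nuisance_helper(X, k_max, basis_cache)
    return dict(build_counts)


without_cache = builds_per_degree(None)  # every helper rebuilds every degree
with_cache = builds_per_degree({})       # each degree is built once, then reused
```

With three helpers and k_max=3, the uncached run builds each degree three times and the cached run builds each degree once.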

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

🤖 Generated with Claude Code

…fits

The EfficientDiD doubly-robust (covariate) path runs three sieve nuisance
estimators (outcome regression, propensity ratio, inverse propensity) that each
loop K=1..k_max and rebuild the full polynomial sieve basis at every degree. All
three receive the same fit-level covariate_matrix, so for any degree reached by
more than one helper the identical (n x n_basis) array was rebuilt from scratch
each time, across every (g,t) cell.

Add a per-fit memoization (_sieve_basis_cached, keyed (id(X), degree)) that the
orchestrator threads into the three helpers so each distinct degree's basis is
built once and shared. _polynomial_sieve_basis is a pure function of (X, degree)
and the helpers only read basis_all (no in-place mutation), so this is
bit-identical: verified by an exact (atol=0) match of overall ATT plus all 18
group_time effect/se on a fixed-seed covariate fit before vs after the change.
basis_cache defaults to None (plain pass-through), so standalone callers are
unchanged.

Resolves the EfficientDiD sieve-basis Performance row in TODO.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@igerber igerber added the ready-for-ci Triggers CI test workflows label Jun 27, 2026
@igerber igerber merged commit 02da0eb into main Jun 27, 2026
33 of 34 checks passed
@igerber igerber deleted the perf/efficient-did-sieve-basis-cache branch June 27, 2026 12:30