
docs: add CBWSDID (Ustyuzhanin 2026) paper review #531

Merged: igerber merged 1 commit into main from feature/cbwsdid-paper-review on Jun 6, 2026

Conversation

igerber (Owner) commented Jun 6, 2026

Summary

  • Add the in-repo scholarly paper review for CBWSDID, Covariate-Balanced Weighted Stacked Difference-in-Differences (Vadim Ustyuzhanin, HSE, 2026; arXiv:2604.02293v1), at docs/methodology/papers/ustyuzhanin-2026-review.md. This is the Step-1 methodology fidelity artifact (PR-A) for prospective CBWSDID support. Implementation packaging is an open PR-B decision and is deliberately not committed here. Because the estimator reduces to weighted stacked DID at b_sa = 1, it can be realized either as a new estimator class or as a covariate-balancing (b_sa) path on the existing StackedDiD. The latter is attractive because the refinement is control reweighting, which preserves the estimand under treatment-effect heterogeneity, rather than outcome-regression adjustment.
  • CBWSDID is a design-based extension of weighted stacked DID for conditionally (rather than unconditionally) parallel untreated trends: a within-sub-experiment matching/balancing design stage produces nonnegative control design weights b_sa that compose with the Wing, Freedman & Hollingsworth (2024) corrective stacked weights into a single weighted-least-squares stacked estimator. It reduces to weighted stacked DID at b_sa = 1 and extends to repeated 0→1 / 1→0 episodes under a finite-memory assumption.
  • The review transcribes the absorbing-adoption core (sub-experiment construction, Q_sa corrective weights, design weights b_sa, final stacked weights W_sa, the pooled estimator, and the W_sa-weighted two-way-FE event-study regression), Assumptions 1–4, the repeated-treatment extension with Assumptions R1–R6, the §5 inference (unit-clustered cluster-robust conditional on the estimated design weights, plus cluster-bootstrap options and the Abadie–Imbens (2008) nonsmooth-matching caveat), the simulation, and the Trounstine (2020) / Acemoglu et al. (2019) applications.
  • Source ambiguities are surfaced rather than silently resolved: the paper's internal κ_pre sign-convention inconsistency, the §4.1 reversal-window prose vs the formal episode-set definitions + Assumption R3, the FE/regression-path structure, the unit- vs observation-count Q-weight convention relative to the library's existing StackedDiD, and the single-author-preprint status (PR-B is contingent on a separate go/no-go). The paper has no numbered equations/theorems and no algorithm boxes, so all references are pinned to section numbers.
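
The nesting claim above (CBWSDID reduces to weighted stacked DID at b_sa = 1) can be illustrated with a minimal sketch. It assumes, purely for illustration, that the final weight is the product W_sa = Q_sa * b_sa and that the estimator is a weighted difference in mean outcome changes. The paper's exact definitions live in the review file. The function name `weighted_did` and the toy numbers are hypothetical and are not part of diff-diff.

```python
import numpy as np


def weighted_did(delta_y, treated, weights):
    """Weighted mean outcome change of treated rows minus that of control rows."""
    delta_y = np.asarray(delta_y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("weights must be nonnegative")
    treated_mean = np.average(delta_y[treated], weights=weights[treated])
    control_mean = np.average(delta_y[~treated], weights=weights[~treated])
    return treated_mean - control_mean


# Toy stacked rows: post-minus-pre outcome change per unit-by-sub-experiment row.
delta_y = np.array([2.0, 3.0, 1.0, 0.5, 1.5])
treated = np.array([True, True, False, False, False])
q = np.array([1.0, 1.0, 0.5, 2.0, 1.0])  # stand-in corrective weights Q_sa

# ASSUMPTION: final weights compose multiplicatively, W_sa = Q_sa * b_sa,
# and treated rows keep a design weight of 1.
b_ones = np.ones_like(q)
b_balanced = np.array([1.0, 1.0, 2.0, 0.0, 1.0])  # hypothetical balancing weights

baseline = weighted_did(delta_y, treated, q)           # weighted stacked DID
nested = weighted_did(delta_y, treated, q * b_ones)    # b_sa = 1 everywhere
balanced = weighted_did(delta_y, treated, q * b_balanced)
```

Under this assumption, `nested` equals `baseline` exactly, while `balanced` changes only the control-side mean. That is why the reweighting can be viewed as a path on an existing weighted estimator.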

Methodology references (required if estimator / math changes)

Validation

  • Tests added/updated: None (documentation-only).
  • Backtest / simulation / notebook evidence (if applicable): N/A — paper-review artifact only.

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

🤖 Generated with Claude Code

@github-actions

github-actions Bot commented Jun 6, 2026


igerber force-pushed the feature/cbwsdid-paper-review branch from 1f97089 to c89f9fd on June 6, 2026 19:40
@github-actions

github-actions Bot commented Jun 6, 2026


🔁 AI review rerun (requested by @igerber)

Head SHA: c89f9fd1d4b5bba0ec118e98a70d3098cbb53bec


Overall Assessment

⚠️ Needs changes. One unmitigated P1: the PR removes the SyntheticControl CWZ conformal inference public surface without a compatibility path or clear removal/deferment contract.

Executive Summary

  • CBWSDID review is docs-only and broadly matches the arXiv source at the abstract level: design weights within sub-experiments, Wing-style aggregation, repeated 0→1/1→0 extension, and an R package reference. (arxiv.org)
  • The PR is not just a CBWSDID paper-review change; it deletes diff_diff/conformal.py, conformal methods, diagnostics, docs, and tests.
  • P1: current callers of SyntheticControlResults.conformal_test() / CI helpers now fail with missing attributes instead of a documented migration/deprecation.
  • P2/P3 docs drift remains around DiagnosticReport reporting and the CBWSDID κ_pre convention.
  • No security or performance issues found.

Methodology

Finding: Removed CWZ conformal inference surface without an explicit methodology/removal contract
Severity: P1
Location: diff_diff/conformal.py deleted; diff_diff/synthetic_control_results.py:L1810-L1823; docs/methodology/papers/chernozhukov-wuthrich-zhu-2021-review.md:L100-L106
Impact: This removes an implemented SyntheticControl inference method family rather than changing its math. Existing users get AttributeError for conformal_test, conformal_confidence_intervals, conformal_average_effect, and accessors, with no deprecation stub or clear “feature deferred/removed” registry note.
Concrete fix: Either restore the conformal module/methods/tests/docs, or add an explicit removal/deferment path: changelog entry, registry/paper-review status update, and compatibility methods that raise a clear NotImplementedError with the supported alternative or planned replacement.

Finding: Mixed κ_pre convention remains in implementation-facing CBWSDID prose
Severity: P3
Location: docs/methodology/papers/ustyuzhanin-2026-review.md:L48, L126, clarified later at L89 and L258
Impact: The review correctly flags the paper’s notation inconsistency, but early checklist-style prose still uses {−κ_pre,…} while later recommending signed negative κ_pre. A future implementation could copy the wrong convention.
Concrete fix: Normalize implementation-facing formulas/checklists to {κ_pre,…,κ_post} with κ_pre < 0, and keep {−κ_pre,…} only inside the explicit source-ambiguity discussion.
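
A small sketch of the signed convention recommended in this finding: the event-time window is {κ_pre,…,κ_post} with κ_pre < 0. The helper name `event_time_window` is hypothetical, and the check that κ_post is nonnegative is an assumption made for illustration rather than something taken from the paper.

```python
def event_time_window(kappa_pre: int, kappa_post: int) -> list[int]:
    """Return the event times kappa_pre, ..., kappa_post under the signed convention."""
    if kappa_pre >= 0:
        # Signed convention: the pre-period bound is itself negative.
        raise ValueError("signed convention requires kappa_pre < 0")
    if kappa_post < 0:
        # ASSUMPTION: the post window includes event time 0.
        raise ValueError("kappa_post must be nonnegative")
    return list(range(kappa_pre, kappa_post + 1))


window = event_time_window(-3, 2)  # [-3, -2, -1, 0, 1, 2]
```

Rejecting a nonnegative `kappa_pre` at the boundary means code copied from the unsigned {−κ_pre,…} form fails loudly instead of silently building the wrong window.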

Code Quality

Finding: Public API removal produces raw missing-method failures
Severity: P1
Location: diff_diff/synthetic_control_results.py:L1810-L1823; deleted diff_diff/conformal.py
Impact: Consumers using the prior results API fail at attribute lookup, not with a controlled, documented error.
Concrete fix: Restore the methods or leave tombstone methods for one release that raise a clear NotImplementedError.
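
The tombstone option could look roughly like the sketch below. The class `SyntheticControlResultsSketch` is a hypothetical stand-in, not the library's real results class, and the message wording is an assumption. Only the three method names are taken from the finding above.

```python
class SyntheticControlResultsSketch:
    """Hypothetical stand-in for the results class; only the tombstones are shown."""

    _REMOVED_MSG = (
        "{name}() is no longer available: conformal inference was removed "
        "together with diff_diff/conformal.py. See the changelog for its "
        "removal/deferment status."
    )

    def _removed(self, name):
        # One shared raise keeps the message consistent across all tombstones.
        raise NotImplementedError(self._REMOVED_MSG.format(name=name))

    def conformal_test(self, *args, **kwargs):
        self._removed("conformal_test")

    def conformal_confidence_intervals(self, *args, **kwargs):
        self._removed("conformal_confidence_intervals")

    def conformal_average_effect(self, *args, **kwargs):
        self._removed("conformal_average_effect")
```

With stubs like these, existing callers get a controlled NotImplementedError that names the removed method, rather than an AttributeError at attribute lookup.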

Performance

No findings. The removed conformal code reduces runtime surface; no new performance-sensitive code was added.

Maintainability

Finding: Reporting methodology doc no longer matches DiagnosticReport behavior
Severity: P2
Location: docs/methodology/REPORTING.md:L266-L274; implementation still emits confidence_set at diff_diff/diagnostic_report.py:L2462-L2504
Impact: The edited reporting note omits the still-supported Firpo-Possebom confidence_set block, so future maintainers may think it is not part of native SCM diagnostics.
Concrete fix: Re-add confidence_set to the SyntheticControlResults native-diagnostics list while keeping conformal omitted if that rollback is intentional.

Tech Debt

Finding: Conformal follow-up TODO was removed together with the implementation
Severity: P3
Location: TODO.md:L88-L89; docs/methodology/papers/chernozhukov-wuthrich-zhu-2021-review.md:L129-L143
Impact: If CWZ conformal inference is now deferred rather than abandoned, the remaining work is no longer tracked.
Concrete fix: Add a replacement TODO saying CWZ conformal inference is deferred, or update the CWZ paper review to state it is no longer planned.

Security

No findings. No secrets, credentials, or new executable workflow risk observed.

Documentation/Tests

Finding: CBWSDID review claims this PR touches only the review file
Severity: P3
Location: docs/methodology/papers/ustyuzhanin-2026-review.md:L12
Impact: The statement is false for this diff, which changes code, tests, docs, registry, references, and TODO.
Concrete fix: Rephrase to “this review artifact is documentation-only; implementation is deferred,” removing the PR-scope claim.

Finding: Conformal tests were deleted with the feature
Severity: P3 if rollback is intentional; P1 if conformal is meant to remain
Location: tests/test_methodology_synthetic_control.py:L3521-L3540; tests/test_diagnostic_report.py:L2114-L2135
Impact: If conformal is restored, the methodology oracle and diagnostic tests must come back with it.
Concrete fix: Restore the deleted tests alongside the implementation, or document the intentional removal/deferment.

Path to Approval

  1. Resolve the conformal removal: restore the conformal implementation/tests/docs, or add explicit removal/deferment documentation plus compatibility tombstone methods with clear errors.
  2. Update docs/methodology/REPORTING.md so it still lists the existing confidence_set native diagnostic.
  3. Normalize the CBWSDID κ_pre implementation-facing notation to the signed convention.

Add the in-repo scholarly paper review for arXiv:2604.02293v1,
"Covariate-Balanced Weighted Stacked Difference-in-Differences"
(Vadim Ustyuzhanin, HSE, 2026), as the Step-1 methodology fidelity
artifact (PR-A) for a prospective CBWSDID estimator.

CBWSDID is a design-based extension of weighted stacked DID for
conditionally (rather than unconditionally) parallel untreated trends:
a within-sub-experiment matching/balancing design stage produces
nonnegative control design weights b_sa that compose with the Wing,
Freedman & Hollingsworth (2024) corrective stacked weights into a
single weighted least-squares stacked estimator. It nests weighted
stacked DID at b_sa = 1 and extends to repeated 0->1 / 1->0 episodes
under a finite-memory assumption.

The review transcribes the absorbing-adoption core (sub-experiment
construction, Q_sa corrective weights, design weights b_sa, final
stacked weights W_sa, the pooled estimator, and the W_sa-weighted
two-way-FE event-study regression), Assumptions 1-4, the repeated-
treatment extension with Assumptions R1-R6, the Section 5 inference
(unit-clustered cluster-robust conditional on the estimated design
weights, plus cluster-bootstrap options and the Abadie-Imbens (2008)
nonsmooth-matching caveat), the simulation and the Trounstine (2020)
and Acemoglu et al. (2019) applications, and flags the saturated-FE
regression delta versus the library's means-based StackedDiD as a PR-B
design question. The paper has no numbered equations/theorems and no
algorithm boxes, so all references are pinned to section numbers.

Docs-only and self-contained: no REGISTRY.md / doc-deps.yaml /
references.rst / code wiring (deferred to PR-B), matching the
new-estimator paper-review precedent (#497).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
igerber force-pushed the feature/cbwsdid-paper-review branch from c89f9fd to 0b396ac on June 6, 2026 19:58
@github-actions

github-actions Bot commented Jun 6, 2026


igerber added the ready-for-ci label (Triggers CI test workflows) Jun 6, 2026
@igerber igerber merged commit e84f61a into main Jun 6, 2026
11 of 12 checks passed
@igerber igerber deleted the feature/cbwsdid-paper-review branch June 6, 2026 20:33