{{ message }}
PowerAnalysis methodology review (PR-A): Bloom 1995 + Burlig 2020 source audits#506
Merged
Merged
Conversation
f70b9d6 to
aa64e26
Compare
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
…rce audits Step-1 fidelity artifact for the PowerAnalysis methodology validation (the paper-review PR of the 2-PR sequence; source validation + code reconciliation + tests are deferred to PR-B). No estimator logic, formula, or test changes. New source audits under docs/methodology/papers/ (sourced only from the papers): - bloom-1995-review.md: the MDE-multiplier framing — MDE = M*SE with M from the NORMAL distribution (Bloom builds 1.65 + 0.84 = 2.49 from z-quantiles, p.548-549), the cross-sectional impact-estimator SE sigma*sqrt((1-R^2)/(n*T*(1-T))) (Eq. 1/2), the T(1-T) allocation factor (optimal at 50/50), one- and two-sided multipliers; documents that Bloom explicitly excludes clustering/multi-site design effects (Note 1). - burlig-preonas-woerman-2020-review.md: the serial-correlation-robust (SCR) panel-DD variance (Eq. 2, verbatim — three covariance terms psi^B/psi^A/psi^X over m pre- and r post-periods, psi^X entering negatively), the McKenzie special case (Eq. 3), the increase-MDE condition (Eq. 4); Eq. 1 uses the t-distribution; pcpanel is the panel parity reference. The audits surfaced discrepancies between the authoritative PowerAnalysis surfaces and the source material. Per the agreed approach these are DOCUMENTED as under-review now (not yet fixed — reconciliation is deferred to PR-B and tracked in TODO.md): - REGISTRY.md ## PowerAnalysis: umbrella **Note:** enumerating the four discrepancies (t-vs-normal-z multiplier; SE R^2/cluster-m terms; missing T(1-T) allocation factor in the displayed sample-size formula; panel (1+(T-1)rho)/T is an equicorrelated/Moulton design effect, NOT Burlig SCR — an attribution overclaim). - REGISTRY.md R-equivalents table: annotate the PowerAnalysis row as under-review (analytical path is normal-based, so pwr.t.test is not the faithful parity target; panel parity ref is Stata pcpanel) — resolves the cross-reference inconsistency the audits introduced. - power.py: docstring notes on PowerAnalysis and simulate_power flagging the panel attribution and normal-vs-t approximation as under review; the class docstring panel-variance display corrected from a self-canceling factor to the implemented (1/N_treated + 1/N_control) * (1+(T-1)rho)/T (docstring-only; no logic change). - references.rst: clarify the analytical panel path uses an equicorrelated approximation, not Burlig's SCR formula. - TODO.md: tracker row (Methodology/Correctness) for the PR-B reconciliation. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
aa64e26 to
2470be6
Compare
igerber
added a commit
that referenced
this pull request
May 31, 2026
… Eq.2 equicorrelated; tracker -> Complete Reconciles diff_diff/power.py with the Bloom (1995) + Burlig, Preonas & Woerman (2020) source audits (paper reviews added in PR-A #506). Behavior change: the analytical panel-DiD variance was the Moulton (1+(T-1)rho)/T factor (wrong period-scaling, ~4x too small at rho=0/m=r=5, AND opposite rho-sign). Replaced with the within-unit equicorrelated special case of Burlig Eq. 2, sigma^2 (1/n_T+1/n_C)(1/m+1/r)(1-rho), so within-unit correlation now LOWERS the MDE. The MDE multiplier stays the normal-z Bloom multiplier (documented as a deliberate large-sample approximation to Burlig's t). - power.py: equicorrelated variance in _compute_variance + _compute_required_n; input validation for ALL designs (n_pre>=1, n_post>=1, rho in [-1/(T-1), 1)) enforced BEFORE the 2x2-vs-panel router, so invalid two-period shapes no longer fall through silently; the (1-rho) factor applies at T=2 too (Burlig footnote 11, the m=r=1 case), so rho is never silently ignored and rho=0 recovers Bloom's 2*sigma^2; docstrings rewritten; PR-A under-review notes removed. - REGISTRY ## PowerAnalysis equation block rewritten (z not t; unified equicorrelated SE with the 2x2 as the m=r=1 special case; cluster-m and inverted-R^2 terms removed; both reference surfaces; checklist ticked). - New tests/test_methodology_power.py (Bloom Table 1; 2x2 + panel closed forms; literal-equicorrelated Monte-Carlo; sample_size<->mde round-trip; input-guard + rho-at-T=2 + compute_* wrapper validation; base-R qnorm parity). - benchmarks/R/generate_power_golden.R + benchmarks/data/r_power_golden.json. - tests/test_power.py: inverted test_icc_effect + test_extreme_icc to Burlig's sign. - references.rst: + Frison & Pocock (1992), McKenzie (2012) lineage. - docs/tutorials/06_power_analysis.ipynb: corrected rho cells + summary. - METHODOLOGY_REVIEW.md row -> Complete; TODO row removed; CHANGELOG. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
igerber
added a commit
that referenced
this pull request
May 31, 2026
… Eq.2 equicorrelated; tracker -> Complete Reconciles diff_diff/power.py with the Bloom (1995) + Burlig, Preonas & Woerman (2020) source audits (paper reviews added in PR-A #506). Behavior change: the analytical panel-DiD variance was the Moulton (1+(T-1)rho)/T factor (wrong period-scaling, ~4x too small at rho=0/m=r=5, AND opposite rho-sign). Replaced with the within-unit equicorrelated special case of Burlig Eq. 2, sigma^2 (1/n_T+1/n_C)(1/m+1/r)(1-rho), so within-unit correlation now LOWERS the MDE. The MDE multiplier stays the normal-z Bloom multiplier (documented as a deliberate large-sample approximation to Burlig's t). - power.py: equicorrelated variance in _compute_variance + _compute_required_n; input validation for ALL designs (n_pre>=1, n_post>=1, rho in [-1/(T-1), 1)) enforced BEFORE the 2x2-vs-panel router, so invalid two-period shapes no longer fall through silently; the (1-rho) factor applies at T=2 too (Burlig footnote 11, the m=r=1 case), so rho is never silently ignored and rho=0 recovers Bloom's 2*sigma^2; docstrings rewritten; PR-A under-review notes removed. - REGISTRY ## PowerAnalysis equation block rewritten (z not t; unified equicorrelated SE with the 2x2 as the m=r=1 special case; cluster-m and inverted-R^2 terms removed; both reference surfaces; checklist ticked). - New tests/test_methodology_power.py (Bloom Table 1; 2x2 + panel closed forms; literal-equicorrelated Monte-Carlo; sample_size<->mde round-trip; input-guard + rho-at-T=2 + compute_* wrapper validation; base-R qnorm parity). - benchmarks/R/generate_power_golden.R + benchmarks/data/r_power_golden.json. - tests/test_power.py: inverted test_icc_effect + test_extreme_icc to Burlig's sign. - references.rst: + Frison & Pocock (1992), McKenzie (2012) lineage. - docs/tutorials/06_power_analysis.ipynb: corrected rho cells + summary. - METHODOLOGY_REVIEW.md row -> Complete; TODO row removed; CHANGELOG. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Merged
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Summary
docs/methodology/papers/, sourced only from the papers.PowerAnalysissurfaces and the source material; reconciliation is deferred to PR-B and tracked inTODO.md._compute_variancealready implements — docstring only.)Methodology references (required if estimator / math changes)
**Note:**+power.pydocstring notes + areferences.rstclarification, with reconciliation tracked inTODO.mdfor PR-B:1/√(1−R²)and cluster-sizemterms are not whatpower.pyimplements;T(1−T)allocation factor the code applies;(1+(T−1)ρ)/Tfactor is an equicorrelated/Moulton design effect, not Burlig's serial-correlation-robust (SCR) variance (Eq. 2) — an attribution overclaim pending PR-B re-attribution or implementation.Validation
pcpanelfor the panel path; a normal-based reference for the analytical path) are deferred to PR-B.Security / privacy
Generated with Claude Code