[codex] PostgreSQL compatibility harness foundation#50
Conversation
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (5)
scripts/pg_compat/run_compat.py (1)
275-275: ⚡ Quick winRemove unused
_diagnosticsvariable.The
partition_rowsfunction returns a tuple, but only the first element is used. The second element_diagnosticsis assigned but never referenced.♻️ Suggested cleanup
- accepted, _diagnostics = partition_rows(rows) + accepted, _ = partition_rows(rows)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/pg_compat/run_compat.py` at line 275, The partition_rows function call is unpacking both return values into accepted and _diagnostics, but _diagnostics is never used afterward. Remove the unused variable assignment by only capturing the first element from the partition_rows return value in the assignment statement.scripts/pg_compat/generate_report.py (1)
223-223: ⚡ Quick winConsider dynamic version numbers in section headers for consistency.
The title on line 196 uses
target_majordynamically, but the "PG18 Backlog" header and "PG17 to PG18 Release Delta" header hardcode version numbers. For consistency and future maintainability (e.g., extending to PG19), consider:# Line 223 f"## PG{target_major} Backlog", # Line 250 f"## PG{_major(previous['pg_version'])} to PG{target_major} Release Delta",Also applies to: 250-250
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/pg_compat/generate_report.py` at line 223, The section headers in the generate_report.py file hardcode version numbers (PG18, PG17 to PG18) instead of using dynamic variables for consistency. At line 223, replace the hardcoded "## PG18 Backlog" string with an f-string using the target_major variable to match the dynamic approach used in the title at line 196. At line 250, similarly replace the hardcoded "## PG17 to PG18 Release Delta" header with an f-string that constructs the version numbers dynamically using _major(previous['pg_version']) and target_major. This ensures the headers will adapt automatically if the script is extended to support future PostgreSQL versions.scripts/pg_compat/fetch_libpg_query.sh (1)
187-210: 💤 Low valuePython
os.setsid()could fail if already a session leader.Line 196 calls
os.setsid()to create a new session for proper signal handling. This will fail withEPERMif the calling process is already a session leader (rare in practice, but possible if the script is invoked through certain process managers).However, the
os.execvp()call on line 199 will still execute even ifsetsid()fails, so the subprocess will run. For this use case (interruptible git/curl operations in the compat harness), this edge case is acceptable.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/pg_compat/fetch_libpg_query.sh` around lines 187 - 210, The os.setsid() call in the Python snippet within the run_interruptible function could fail with EPERM if the process is already a session leader, causing the script to exit prematurely. Wrap the os.setsid() call in a try-except block to gracefully handle this edge case, catching OSError or other potential exceptions and continuing execution since the subprocess will still run properly even if session creation fails.scripts/pg_compat/baseline.py (1)
308-353: 💤 Low valueConsider adding
strict=Trueto zip for clarity.The
zip()call at line 324 lacks an explicitstrict=parameter. While the length validation at lines 314-318 ensures the lists have the same length, addingstrict=Truewould make the intent explicit and provide a built-in runtime check.Note: This requires Python 3.10+. If the project targets Python 3.9 or earlier, skip this suggestion.
♻️ Optional enhancement
- for index, (oracle_row, metadata_row) in enumerate( - zip(oracle_rows, metadata_rows) - ): + for index, (oracle_row, metadata_row) in enumerate( + zip(oracle_rows, metadata_rows, strict=True) + ):🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/pg_compat/baseline.py` around lines 308 - 353, The zip() call in the validate_witnesses function that iterates over oracle_rows and metadata_rows lacks an explicit strict parameter. Add strict=True to the zip() call to make the intent explicit and provide a built-in runtime check that ensures both sequences have equal length, which aligns with the length validation already performed at the start of the function. This change requires Python 3.10 or later support.tests/pg_compat/fixtures/pg17.proto (1)
2-2: Add clarifying comment or message stub for undefinedNodetype in fixture.The
Nodetype is referenced but not defined in this protobuf fixture. While the text-based extraction logic incompare_versions.pyhandles this without issue, the fixture lacks self-contained clarity. Either add a brief comment explaining thatNodeis intentionally omitted (e.g.,// Node defined separately), or include a minimal stub (message Node {}), similar to the pattern used in related tests.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/pg_compat/fixtures/pg17.proto` at line 2, The protobuf fixture file references the Node type in the target_list field but does not define it, reducing clarity and self-containment. In the pg17.proto fixture file where the repeated Node target_list field is defined, add either a brief clarifying comment (such as "// Node defined separately") above or near the target_list field definition, or include a minimal stub message definition for Node (e.g., "message Node {}") elsewhere in the file to match patterns used in related tests. This clarifies that the omission is intentional and aids anyone reviewing the fixture.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/ci.yml:
- Line 113: Pin all GitHub Actions references to commit SHAs instead of semantic
versions to comply with the repository's security policy. In
`.github/workflows/ci.yml` at lines 113 and 133, replace `actions/checkout@v4`
with `actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af172` (or the current
latest v4 commit SHA). At lines 116 and 136, replace `actions/cache@v4` with the
full commit SHA for the latest v4 release of that action. Verify each pinned SHA
corresponds to the intended version tag to prevent action mutations.
- Around line 106-124: The pg-compat and pg-compat-full jobs in
.github/workflows/ci.yml lack explicit permissions configuration and allow
credential persistence, creating unnecessary security risks. At
.github/workflows/ci.yml#L106-L124 (pg-compat job), add a permissions block with
contents: read at the job level and set persist-credentials: false on the
actions/checkout@v4 step. Apply the same changes at
.github/workflows/ci.yml#L126-L144 (pg-compat-full job) - add permissions: {
contents: read } at the job level and set persist-credentials: false on the
checkout action. Both jobs only require read access to the repository and do not
write artifacts or perform deployments.
---
Nitpick comments:
In `@scripts/pg_compat/baseline.py`:
- Around line 308-353: The zip() call in the validate_witnesses function that
iterates over oracle_rows and metadata_rows lacks an explicit strict parameter.
Add strict=True to the zip() call to make the intent explicit and provide a
built-in runtime check that ensures both sequences have equal length, which
aligns with the length validation already performed at the start of the
function. This change requires Python 3.10 or later support.
In `@scripts/pg_compat/fetch_libpg_query.sh`:
- Around line 187-210: The os.setsid() call in the Python snippet within the
run_interruptible function could fail with EPERM if the process is already a
session leader, causing the script to exit prematurely. Wrap the os.setsid()
call in a try-except block to gracefully handle this edge case, catching OSError
or other potential exceptions and continuing execution since the subprocess will
still run properly even if session creation fails.
In `@scripts/pg_compat/generate_report.py`:
- Line 223: The section headers in the generate_report.py file hardcode version
numbers (PG18, PG17 to PG18) instead of using dynamic variables for consistency.
At line 223, replace the hardcoded "## PG18 Backlog" string with an f-string
using the target_major variable to match the dynamic approach used in the title
at line 196. At line 250, similarly replace the hardcoded "## PG17 to PG18
Release Delta" header with an f-string that constructs the version numbers
dynamically using _major(previous['pg_version']) and target_major. This ensures
the headers will adapt automatically if the script is extended to support future
PostgreSQL versions.
In `@scripts/pg_compat/run_compat.py`:
- Line 275: The partition_rows function call is unpacking both return values
into accepted and _diagnostics, but _diagnostics is never used afterward. Remove
the unused variable assignment by only capturing the first element from the
partition_rows return value in the assignment statement.
In `@tests/pg_compat/fixtures/pg17.proto`:
- Line 2: The protobuf fixture file references the Node type in the target_list
field but does not define it, reducing clarity and self-containment. In the
pg17.proto fixture file where the repeated Node target_list field is defined,
add either a brief clarifying comment (such as "// Node defined separately")
above or near the target_list field definition, or include a minimal stub
message definition for Node (e.g., "message Node {}") elsewhere in the file to
match patterns used in related tests. This clarifies that the omission is
intentional and aids anyone reviewing the fixture.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 45be172d-156a-4e34-8517-09c2a7d56a32
📒 Files selected for processing (45)
.github/workflows/ci.yml.gitignoreMakefileREADME.mddocs/benchmarks/REPRODUCING.mddocs/compatibility/postgresql-18.mddocs/superpowers/plans/2026-06-11-postgresql-libpg-query-compatibility.mddocs/superpowers/specs/2026-06-11-postgresql-libpg-query-compatibility-design.mdinclude/sql_engine/local_txn.hscripts/pg_compat/baseline.pyscripts/pg_compat/common.pyscripts/pg_compat/compare_versions.pyscripts/pg_compat/extract_statements.pyscripts/pg_compat/fetch_libpg_query.shscripts/pg_compat/generate_report.pyscripts/pg_compat/run_compat.pytests/pg_compat/ci_cases.jsonltests/pg_compat/expected_results.jsonltests/pg_compat/fixtures/pg17-gram.ytests/pg_compat/fixtures/pg17-inventory.jsonltests/pg_compat/fixtures/pg17-kwlist.htests/pg_compat/fixtures/pg17-parsenodes.htests/pg_compat/fixtures/pg17.prototests/pg_compat/fixtures/pg18-gram.ytests/pg_compat/fixtures/pg18-inventory.jsonltests/pg_compat/fixtures/pg18-kwlist.htests/pg_compat/fixtures/pg18-parsenodes.htests/pg_compat/fixtures/pg18.prototests/pg_compat/runner_cases.sqltests/pg_compat/statement_type_cases.cpptests/pg_compat/structural_dispositions.jsontests/pg_compat/test_baseline.pytests/pg_compat/test_common.pytests/pg_compat/test_compare_versions.pytests/pg_compat/test_extract_statements.pytests/pg_compat/test_generate_report.pytests/pg_compat/test_run_compat.pytests/pg_compat/test_runner.pytests/pg_compat/upstream_pins.jsontests/pg_compat/witnesses.jsontests/pg_compat/witnesses.sqltools/pg_compat/pg_compat.cpptools/pg_compat/result.htools/pg_compat/statement_type_map.cpptools/pg_compat/statement_type_map.h
| pg-compat: | ||
| if: github.event_name != 'schedule' | ||
| runs-on: ubuntu-24.04 | ||
| timeout-minutes: 30 | ||
| env: | ||
| PG_COMPAT_CACHE: /tmp/parsersql-pg-compat | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
|
|
||
| - name: Cache PostgreSQL compatibility sources | ||
| uses: actions/cache@v4 | ||
| with: | ||
| path: /tmp/parsersql-pg-compat | ||
| key: pg-compat-${{ runner.os }}-${{ hashFiles('tests/pg_compat/upstream_pins.json') }} | ||
| restore-keys: | | ||
| pg-compat-${{ runner.os }}- | ||
|
|
||
| - name: Test PostgreSQL compatibility gate | ||
| run: make -f Makefile test-pg-compat |
There was a problem hiding this comment.
Consolidation: Restrict job permissions and disable credential persistence in both pg-compat jobs.
Both the pg-compat and pg-compat-full jobs omit explicit permissions configuration and allow credential persistence, creating overly broad access and credential leakage risks.
.github/workflows/ci.yml#L106-L124(pg-compat): addpermissions: { contents: read }and setpersist-credentials: falseon checkout.github/workflows/ci.yml#L126-L144(pg-compat-full): addpermissions: { contents: read }and setpersist-credentials: falseon checkout
Both jobs only need read access to the repository; they do not write artifacts or deploy.
🧰 Tools
🪛 zizmor (1.25.2)
[warning] 113-113: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
[warning] 106-124: overly broad permissions (excessive-permissions): default permissions used due to no permissions: block
(excessive-permissions)
[error] 113-113: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
[error] 116-116: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
📍 Affects 1 file
.github/workflows/ci.yml#L106-L124(this comment).github/workflows/ci.yml#L126-L144
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/ci.yml around lines 106 - 124, The pg-compat and
pg-compat-full jobs in .github/workflows/ci.yml lack explicit permissions
configuration and allow credential persistence, creating unnecessary security
risks. At .github/workflows/ci.yml#L106-L124 (pg-compat job), add a permissions
block with contents: read at the job level and set persist-credentials: false on
the actions/checkout@v4 step. Apply the same changes at
.github/workflows/ci.yml#L126-L144 (pg-compat-full job) - add permissions: {
contents: read } at the job level and set persist-credentials: false on the
checkout action. Both jobs only require read access to the repository and do not
write artifacts or perform deployments.
Source: Linters/SAST tools
There was a problem hiding this comment.
Consolidation: Pin all GitHub Actions to commit SHAs.
Four action references use unpinned semantic versions instead of commit SHAs, which violates the repository's security policy and allows unreviewed action mutations.
.github/workflows/ci.yml#L113:actions/checkout@v4→ pin to SHA.github/workflows/ci.yml#L116:actions/cache@v4→ pin to SHA.github/workflows/ci.yml#L133:actions/checkout@v4→ pin to SHA.github/workflows/ci.yml#L136:actions/cache@v4→ pin to SHA
Example: replace actions/checkout@v4 with actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af172 (or the current latest v4 commit SHA).
🧰 Tools
🪛 zizmor (1.25.2)
[warning] 113-113: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
[error] 113-113: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/ci.yml at line 113, Pin all GitHub Actions references to
commit SHAs instead of semantic versions to comply with the repository's
security policy. In `.github/workflows/ci.yml` at lines 113 and 133, replace
`actions/checkout@v4` with
`actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af172` (or the current
latest v4 commit SHA). At lines 116 and 136, replace `actions/cache@v4` with the
full commit SHA for the latest v4 release of that action. Verify each pinned SHA
corresponds to the intended version tag to prevent action mutations.
Source: Linters/SAST tools

Summary
libpg_query17-latest and 18-latest oracle sources.What is included
docs/superpowers/plans/2026-06-11-postgresql-libpg-query-compatibility.mdfeat/postgresql-compat-harnessValidation
make testwith local Homebrew dependency overrides: 1,299 tests run, 1,262 passed, 37 skipped.make test-pg-compat PG_COMPAT_CACHE=/tmp/parsersql-pg-compat-task9: 87 tests passed; checked 24,080 committed pg_compat CI cases.make pg-compat PG_COMPAT_CACHE=/tmp/parsersql-pg-compat-task9: skipped 13 known invalid/unsplittable PostgreSQL corpus files per oracle role; checked 24,080 committed pg_compat CI cases; committed PostgreSQL compatibility baseline matches.make pg-compat-refresh PG_COMPAT_CACHE=/tmp/parsersql-pg-compat-task9: regenerated PostgreSQL 18 baseline/report, 51,415 baseline rows, 24,080 CI cases, 167 structural features.python3YAML parse check for.github/workflows/ci.yml: jobs includepg-compatandpg-compat-full.git diff --check: passed after Task 12 docs.DEEP_SUPPORTED,CLASSIFIED_ONLY,PARTIAL,ERROR,TRAILING_INPUT, orTYPE_MISMATCH.Notes
tests/pg_compat/upstream_pins.json; CI does not silently advance18-latest.make test-pg-compat).make pg-compat).libpg_queryis not linked into production parser targets; it is confined to benchmarks and the pg_compat harness.DEEP_SUPPORTEDis the only full parser parity result;CLASSIFIED_ONLYis reported separately as top-level statement routing coverage.Summary by CodeRabbit
New Features
Build & CI
Documentation