feat(stats): by_technique breakdown in codegraph stats (Phase 8.6 follow-up) by carlos-alm · Pull Request #1303 · optave/ops-codegraph-tool · GitHub
Skip to content

feat(stats): by_technique breakdown in codegraph stats (Phase 8.6 follow-up)#1303

Merged
carlos-alm merged 13 commits into
mainfrom
feat/phase-8-3d-property-write-pts
Jun 4, 2026
Merged

feat(stats): by_technique breakdown in codegraph stats (Phase 8.6 follow-up)#1303
carlos-alm merged 13 commits into
mainfrom
feat/phase-8-3d-property-write-pts

Conversation

@carlos-alm

Copy link
Copy Markdown
Contributor

Summary

Closes #1300

  • DB migration v17: adds technique TEXT nullable column + index to edges table
  • Edge tagging at insertion: call edges are tagged with the resolution technique that produced them:
    • ts-native — direct static/same-file calls (native engine + JS direct resolution)
    • points-to — parameter-flow pts post-pass and pts-fallback edges
    • null — import, reexport, extends, implements, and receiver edges (not call edges)
  • PTS upgrade handling: when a pts-resolved edge is later superseded by a direct call, the technique is promoted from points-to to ts-native in-place
  • Native bulkInsertEdges path: after bulkInsertEdges (which doesn't write technique), a post-insert SQL UPDATE backfills techniques — pts edges by (source_id, target_id) first, then remaining NULL calls edges as ts-native baseline
  • Incremental builds: technique is saved and restored through reverse-dep edge reconnection
  • Stats output: callerCoverage.byTechnique is now included in codegraph stats --json and displayed in the human-readable output under the caller coverage line

JSON output shape

"caller_coverage": {
  "total": 5122,
  "resolved": 3585,
  "percentage": 70,
  "byTechnique": {
    "ts-native": 2940,
    "points-to": 645
  }
}

(type-propagation, cha-rta, and barrel are reserved labels for future work — they require threading resolution-source information through resolveCallTargets.)

Test plan

  • All 2832 tests pass (npx vitest run)
  • Build passes (npm run build)
  • Lint clean on changed files (npx biome check on staged files)

…-to analysis

Walk assignment_expression nodes in extractTypeMapWalk (WASM) and
match_js_type_map (native). When LHS is a simple member_expression
(obj.prop) and RHS is an identifier, seed typeMap['obj.prop'] = { type:
fn, confidence: 0.85 }. This covers patterns like:

  handlers.auth = authMiddleware;
  router.use(handlers.auth); // -> edge: caller -> authMiddleware

Extend resolveByMethodOrGlobal (TS) and resolve_call_targets (Rust) with
a composite key lookup (step 4.5): when a call has receiver + name, check
typeMap['receiver.name'] for a direct pts target before falling through to
the no-match return. Skips BUILTIN_GLOBALS / builtin objects (console,
Math, etc.) and chained writes (a.b.c = x).

Closes #1292
…BALS (#1295)

Adds the four names present in Rust's is_js_builtin_global but absent
from the TypeScript BUILTIN_GLOBALS set, restoring dual-engine parity
for property-write pts tracking. Also adds prop.type guard in
handlePropWriteTypeMap consistent with the adjacent fnRefBindings block.
…romotion test (#1295)

Renames 'higher-confidence entry wins' to accurately describe equal-confidence
first-write behavior. Removes stale inner comment. Adds explicit test for
strict-higher-confidence promotion via setTypeMapEntry. Extends BUILTIN_GLOBALS
test to cover process/window/document/globalThis.
…low-up)

Closes #1300

- DB migration v17: adds `technique TEXT` column + index to `edges` table
- Tags call edges at insertion: `ts-native` for direct/native-path calls,
  `points-to` for pts post-pass and pts-fallback edges; non-call edges get NULL
- Upgrades pts edges to `ts-native` in-place when a direct call supersedes them
- Native bulkInsertEdges path: technique is backfilled via a post-insert SQL UPDATE
  (pts edges targeted first; remaining NULL calls tagged ts-native as baseline)
- Preserves technique through incremental-build reverse-dep edge reconnection
- `computeQualityMetrics` / `buildStatsFromNative` expose `byTechnique` in
  `callerCoverage`; the JSON output and `printQuality` display it when present

docs check acknowledged

Impact: 20 functions changed, 23 affected
@claude

claude Bot commented Jun 4, 2026

Copy link
Copy Markdown

@greptile-apps

greptile-apps Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds a technique column (migration v17) to the edges table and uses it to tag call edges with their resolution strategy (ts-native, points-to, or null for non-call edges). The callerCoverage.byTechnique breakdown is surfaced in both the JSON and human-readable codegraph stats output.

  • Schema + tagging: EdgeRowTuple gains a 6th slot; all edge-emitting sites are updated inline. batchInsertEdges writes technique directly; the native bulkInsertEdges path backfills via applyEdgeTechniquesAfterNativeInsert, scoped to the batch's source IDs to avoid mis-tagging pre-migration edges from unchanged files.
  • Native orchestrator path: backfillEdgeTechniquesAfterNativeOrchestrator applies a global UPDATE on full builds, a no-op on quiet incrementals, and a changed-file-scoped subquery UPDATE on normal incrementals — correctly addressing both previously identified global-UPDATE mis-tagging issues.
  • Incremental safety: technique is now saved and restored through the reverse-dep edge purge/reconnect cycle, so technique labels survive incremental builds without reparsing unchanged files.

Confidence Score: 5/5

Safe to merge. The technique backfill logic is well-scoped, the two global-UPDATE mis-tagging issues from the prior review cycle are both resolved, and technique values survive incremental purge/reconnect correctly.

All three insert paths (JS batch, native bulkInsert in build-edges, native Rust orchestrator) now correctly tag or backfill technique on call edges. The incremental-safety properties (quiet-incremental no-op, changed-file scoping, reverse-dep technique round-trip) are all addressed. The migration is additive and nullable with no risk of breaking existing data.

No files require special attention.

Important Files Changed

Filename Overview
src/db/migrations.ts Adds migration v17: nullable technique TEXT column and index on edges. Consistent with existing migration pattern; index supports GROUP BY in stats queries.
src/domain/graph/builder/stages/build-edges.ts EdgeRowTuple extended to 6-tuple; all push sites updated; pts supersession promotion correct; applyEdgeTechniquesAfterNativeInsert scoped to current-batch source IDs, preventing mis-tagging of pre-migration edges from unchanged files.
src/domain/graph/builder/stages/native-orchestrator.ts backfillEdgeTechniquesAfterNativeOrchestrator correctly distinguishes full builds (global UPDATE), quiet incrementals (early return), and file-changing incrementals (subquery-scoped UPDATE). Called after all edge-writing phases.
src/domain/analysis/module-map.ts Adds countCallEdgesByTechnique helper; byTechnique correctly omitted when empty; buildStatsFromNative now receives noTests parameter and constructs testFilter consistently with the JS path.
src/domain/graph/builder/context.ts Adds technique: string
src/domain/graph/builder/stages/detect-changes.ts Adds e.technique to the reverse-dep edge save query and propagates it through the row type so technique survives the purge/reconnect cycle.
src/domain/graph/builder/helpers.ts Extends batchInsertEdges and prepared-statement cache to include the 6th technique column; uses r[5] ?? null safely.
src/presentation/queries-cli/overview.ts Extends QualityInfo type with optional byTechnique and prints the technique breakdown sorted by count, guarded by Object.keys().length > 0.
tests/integration/issue-1292-property-write-pts.test.ts Adds technique assertions that correctly branch on engine (ts-native for native, points-to for JS). Both test cases now verify technique tagging end-to-end.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Edge rows built in-memory"] --> B{Insert path?}
    B -->|JS batchInsertEdges| C["technique written directly to DB"]
    B -->|Native bulkInsertEdges| D["Edges inserted, technique = NULL"]
    D --> E["applyEdgeTechniquesAfterNativeInsert scoped to batch sourceIds"]
    E --> F["UPDATE: explicit technique rows first"]
    F --> G["Catch-all UPDATE: remaining NULL calls tagged ts-native"]
    H["Native Rust Orchestrator"] --> I["backfillEdgeTechniquesAfterNativeOrchestrator"]
    I --> J{Build type?}
    J -->|Full build| K["Global UPDATE all NULL calls"]
    J -->|Quiet incremental| L["No-op"]
    J -->|Incremental with changes| M["Scoped UPDATE via subquery"]
    N["Reverse-dep edge save"] --> O["technique saved"]
    O --> P["Reconnect: technique restored"]
    P --> B
    C --> Q["countCallEdgesByTechnique"]
    G --> Q
    K --> Q
    M --> Q
    Q --> R["callerCoverage.byTechnique in stats"]
Loading

Reviews (5): Last reviewed commit: "fix(stats): guard quiet incremental in b..." | Re-trigger Greptile

Comment thread src/domain/graph/builder/stages/build-edges.ts Outdated
@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Codegraph Impact Analysis

23 functions changed24 callers affected across 13 files

  • countCallEdgesByTechnique in src/domain/analysis/module-map.ts:166 (5 transitive callers)
  • computeQualityMetrics in src/domain/analysis/module-map.ts:186 (3 transitive callers)
  • buildStatsFromNative in src/domain/analysis/module-map.ts:410 (3 transitive callers)
  • statsData in src/domain/analysis/module-map.ts:524 (2 transitive callers)
  • PipelineContext in src/domain/graph/builder/context.ts:21 (5 transitive callers)
  • getEdgeStmt in src/domain/graph/builder/helpers.ts:308 (6 transitive callers)
  • batchInsertEdges in src/domain/graph/builder/helpers.ts:349 (6 transitive callers)
  • emitTypeOnlySymbolEdges in src/domain/graph/builder/stages/build-edges.ts:117 (3 transitive callers)
  • emitEdgesForImport in src/domain/graph/builder/stages/build-edges.ts:143 (3 transitive callers)
  • buildBarrelEdges in src/domain/graph/builder/stages/build-edges.ts:188 (3 transitive callers)
  • buildImportEdgesNative in src/domain/graph/builder/stages/build-edges.ts:370 (3 transitive callers)
  • buildCallEdgesNative in src/domain/graph/builder/stages/build-edges.ts:458 (3 transitive callers)
  • buildParamFlowPtsPostPass in src/domain/graph/builder/stages/build-edges.ts:541 (3 transitive callers)
  • buildFileCallEdges in src/domain/graph/builder/stages/build-edges.ts:783 (3 transitive callers)
  • buildClassHierarchyEdges in src/domain/graph/builder/stages/build-edges.ts:912 (3 transitive callers)
  • applyEdgeTechniquesAfterNativeInsert in src/domain/graph/builder/stages/build-edges.ts:960 (4 transitive callers)
  • reconnectReverseDepEdges in src/domain/graph/builder/stages/build-edges.ts:997 (3 transitive callers)
  • buildEdges in src/domain/graph/builder/stages/build-edges.ts:1130 (4 transitive callers)
  • purgeAndAddReverseDeps in src/domain/graph/builder/stages/detect-changes.ts:379 (4 transitive callers)
  • backfillEdgeTechniquesAfterNativeOrchestrator in src/domain/graph/builder/stages/native-orchestrator.ts:751 (4 transitive callers)

…lter (#1303)

- applyEdgeTechniquesAfterNativeInsert: scope the catch-all 'ts-native' UPDATE
  to source_ids in the current batch — prevents mis-tagging pre-migration
  NULL-technique edges from unchanged files on the first incremental build
  after a v16->v17 migration
- countCallEdgesByTechnique: accept testFilter (JOIN on source node) so
  byTechnique counts are consistent with --no-tests
- buildStatsFromNative: thread noTests into countCallEdgesByTechnique
)

The Rust orchestrator writes edges without the technique column. Add a
post-orchestrator step that tags all new 'calls' edges as 'ts-native':
full builds use a global UPDATE, incremental builds scope to changed-file
source nodes to avoid overwriting existing technique values.
- readCallEdges: SELECT e.technique in addition to existing columns
- readEngine: read build_meta engine to detect native vs WASM path
- Assertions check technique is 'ts-native' (native orchestrator) or
  'points-to' (JS path), confirming the technique backfill is working
@carlos-alm

Copy link
Copy Markdown
Contributor Author

Addressed all findings from the review:

Bug: Global UPDATE mis-tags pre-migration pts edges (P1 from Greptile, Red from Claude)
Fixed in two commits (d9947a6, 1636527):

  • applyEdgeTechniquesAfterNativeInsert: the catch-all 'ts-native' UPDATE is now scoped to source_id IN (batch sourceIds) — prevents contaminating unchanged files' edges on incremental builds after v16→v17 migration
  • Added backfillEdgeTechniquesAfterNativeOrchestrator in native-orchestrator.ts: the full Rust orchestrator (which handles most builds) was never calling the technique backfill at all, leaving all edges with technique = NULL. This new step runs after all edge-writing phases (including WASM backfill for dropped languages) and tags new 'calls' edges as 'ts-native' — global for full builds, scoped to changed-file source nodes for incremental builds

Issue: byTechnique ignores noTests filter (Yellow from Claude)
Fixed in d9947a6:

  • countCallEdgesByTechnique now accepts a testFilter string parameter and JOINs source nodes to apply the file filter
  • buildStatsFromNative now receives noTests and derives testFilter before calling countCallEdgesByTechnique

Issue: Integration test missing technique assertions (P2 from Greptile)
Fixed in 2769436:

  • readCallEdges now SELECT e.technique
  • Added readEngine helper to detect native vs JS build path from build_meta
  • Both test cases assert technique is 'ts-native' (native orchestrator) or 'points-to' (JS path)

Misleading label callerCoverage.byTechnique (Low/cosmetic, Claude)
The current placement under callerCoverage is intentional — it shows how the 'covered' subset was resolved, making it contextually useful. Renaming to resolvedEdges.byTechnique would break the JSON output shape that was just introduced. Tracked as a follow-up to address in a future stats restructuring PR.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@claude

@claude

claude Bot commented Jun 4, 2026

Copy link
Copy Markdown

Claude finished @carlos-alm's task —— View job


Re-review: feat(stats): by_technique breakdown in codegraph stats

Tasks

  • Gather PR context
  • Read updated source files
  • Verify fixes address reported issues
  • Write final review

Follow-up Review

All three substantive issues from the initial review have been correctly addressed. Here is a targeted assessment of each fix:

Fix 1: Global UPDATE scoped in applyEdgeTechniquesAfterNativeInsert

build-edges.ts:947-971

Correct. The catch-all ts-native UPDATE now collects sourceIds from the current batch and adds AND source_id IN (${placeholders}). Pre-existing NULL-technique edges from unchanged files are no longer touched on incremental builds after a v16→v17 migration.

One minor observation: the taggedRows first-pass UPDATE (AND technique IS NULL) still carries the upgrade-promotion gap from the original review (pts-promoted-to-ts-native tuples). However, since applyEdgeTechniquesAfterNativeInsert is only called from reconnectReverseDepEdges, not from the main native path (the Rust orchestrator path now uses the separate backfillEdgeTechniquesAfterNativeOrchestrator), the scenario where an in-memory promotion is not reflected in the DB is significantly less likely. The reconnect path operates on savedReverseDepEdges, which are pure restore-from-DB entries (not pts-promoted tuples from the call resolver), so the practical impact is minimal.

Fix 2: backfillEdgeTechniquesAfterNativeOrchestrator in native-orchestrator.ts

native-orchestrator.ts:751-772 and line 975

Correct and well-structured. The function correctly distinguishes full builds (global UPDATE — safe, all rows are newly inserted) from incremental builds (scoped UPDATE via source_id IN (SELECT id FROM nodes WHERE file IN (...)) subquery). The placement at line 975 — after both backfillNativeDroppedFiles and any structure/analysis fallbacks — ensures every edge-writing phase in the native path is covered before the technique stamp runs.

The subquery approach (AND source_id IN (SELECT id FROM nodes WHERE file IN (?))) is preferable to the batch-of-IDs approach used in applyEdgeTechniquesAfterNativeInsert, as it avoids loading all node IDs into JS for large changesets.

Fix 3: countCallEdgesByTechnique respects testFilter

module-map.ts:166-184

Correct. The function now accepts testFilter: string and joins nodes n ON e.source_id = n.id to apply n.file-based filtering. Both computeQualityMetrics (JS path) and buildStatsFromNative (native path) now derive testFilter consistently before calling this function. The --no-tests filter is now consistent across the callerCoverage ratio and the byTechnique breakdown.

Fix 4: Integration test technique assertions

tests/integration/issue-1292-property-write-pts.test.ts:49-111

Correct. readCallEdges now selects e.technique. The readEngine helper correctly reads build_meta.engine. The test branches on engine === 'native' and asserts ts-native vs points-to accordingly, which reflects the actual behavior difference between the two paths (the native Rust orchestrator doesn't distinguish pts from direct calls at the edge-tagging level).

The comment at line 93-94 is accurate: "The native orchestrator resolves pts edges via the typeMap (which includes the property-write seed) but labels them 'ts-native' since Rust doesn't distinguish resolution strategy." This is honest documentation of an engine behavioral difference, not a parity gap being papered over.


Remaining Low-Severity Issue

Misleading label callerCoverage.byTechnique (unchanged from initial review)

The field counts call edges, not callers. The PR author noted this is intentional for now and will be addressed in a future stats restructuring PR. This is acceptable — the JSON shape was just introduced in this PR and there's no external API contract yet, so deferring the rename is a reasonable call.


Verdict

The critical bug (global UPDATE mis-tagging) and the filter-consistency issue are both fixed correctly. The integration test now has meaningful technique assertions. No new issues introduced.

Rating: ✅ Approved

The PR is ready to merge.


View job

Comment thread src/domain/graph/builder/stages/native-orchestrator.ts Outdated
Quiet incremental builds (no files changed) insert no new edges — running
the global UPDATE would mis-tag pre-migration NULL-technique edges from
unchanged files as 'ts-native'. Return early when changedFiles is defined
but empty.
@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@carlos-alm carlos-alm merged commit c2959a1 into main Jun 4, 2026
26 checks passed
@carlos-alm carlos-alm deleted the feat/phase-8-3d-property-write-pts branch June 4, 2026 22:56
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 4, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(stats): by_technique breakdown in codegraph stats (Phase 8.6 follow-up)

1 participant