perf(block-impact): byte cap + node-level parallelism (42.5s→15.4s) by aspala · Pull Request #67 · num42/codeqa-action · GitHub

perf(block-impact): byte cap + node-level parallelism (42.5s→15.4s)#67

Merged
aspala merged 4 commits into main from perf/loo-byte-cap-and-multipath
Jun 10, 2026

Conversation

@aspala

@aspala aspala commented Jun 10, 2026

Member

What & Why

health-report --view actions without --base-ref ran for minutes on large repos. Telemetry (--telemetry, already built in) showed that block_impact accounts for 96% of the runtime and that per-node LOO (leave-one-out) is O(file_bytes), so a few large or generated files dominate.

Three levers, each verified with telemetry:

1. Byte cap for LOO (configurable)

Files over max_loo_file_bytes (default 32 KB, CLI --max-loo-file-bytes, YAML max_loo_file_bytes, nil = off) no longer get refactoring nodes, but they still flow into the codebase aggregate and the grade. Lockfiles, bundled assets, and similar files dominated the runtime and are never refactoring targets.
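
A minimal sketch of the cap's semantics, written in Python for illustration rather than taken from the project's Elixir code. Only `max_loo_file_bytes` and its 32 KB default come from the description above; the function name, the return shape, and reading "32 KB" as 32 * 1024 bytes are assumptions.

```python
from typing import Optional

# Assumption: "32 KB" is read as 32 * 1024 bytes.
DEFAULT_MAX_LOO_FILE_BYTES = 32 * 1024


def split_for_loo(files: dict, max_loo_file_bytes: Optional[int] = DEFAULT_MAX_LOO_FILE_BYTES):
    """Return (aggregate_paths, loo_paths).

    Every file feeds the codebase aggregate; only files at or under the
    cap are eligible for per-node leave-one-out. None disables the cap.
    """
    aggregate_paths = sorted(files)
    if max_loo_file_bytes is None:
        return aggregate_paths, aggregate_paths
    loo_paths = [p for p in aggregate_paths if len(files[p]) <= max_loo_file_bytes]
    return aggregate_paths, loo_paths


# Hypothetical sample: one small source file, one large lockfile.
files = {"lib/a.ex": b"x" * 1_000, "package-lock.json": b"x" * 500_000}
aggregate, loo = split_for_loo(files)
```

The lockfile stays in `aggregate` but is absent from `loo`, which mirrors "no refactoring nodes, still part of the grade".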

This also fixes a pass-through bug: the cap was filtered out by the Keyword.take in Analyzer.do_analyze_codebase before it reached BlockImpactAnalyzer. A regression test was added at the analyze_codebase layer.
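
The failure mode can be sketched in Python (an illustration, not the project's code): an option allow-list that lacks a newly added key drops it silently, and the consumer never sees the user's value. The helper names and the fallback to a default are assumptions; `workers` and `max_loo_file_bytes` are the option names mentioned in this PR.

```python
def take(opts: dict, keys: list) -> dict:
    """Keep only allow-listed keys, similar in spirit to Elixir's Keyword.take/2."""
    return {k: v for k, v in opts.items() if k in keys}


def block_impact_cap(opts: dict) -> int:
    # Hypothetical consumer: falls back to the default when the key is missing.
    return opts.get("max_loo_file_bytes", 32 * 1024)


user_opts = {"workers": 4, "max_loo_file_bytes": 8_192}

# Before the fix: the allow-list lacks the new key, so the user's cap is lost.
buggy = take(user_opts, ["workers"])

# After the fix: the key is on the allow-list and reaches the consumer.
fixed = take(user_opts, ["workers", "max_loo_file_bytes"])
```

Because the dropped key produces no error, a test at the outer layer (here, one that calls through `take`) is what catches the regression.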

2. Node-level parallelism

Previously only the file level was parallelized; the nodes of a file ran serially, so a few large files (hundreds of sub-nodes each) ground along single-threaded while other cores sat idle. Now there are three phases: prepare (tokenize/parse/index) → one shared pool over all nodes of all files → reconstruct.

This also uncovered and fixed an ID collision bug: file-local index paths collided in the shared pool (every file has a top-level node at index 0), which produced deterministically wrong cosine values at concurrency ≥ 2. IDs are now prefixed with the file path. Guard: node output is bit-identical between workers: 1 and workers: 8.
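
The three phases and the ID fix can be sketched as follows. This is a Python illustration under stated assumptions, not the Elixir implementation: the input shape, the `score` stand-in, and the thread pool are hypothetical, and Python threads only model the scheduling, not real CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical input: each file maps to a list of node payloads (index = position).
files = {
    "lib/a.ex": ["a0", "a1"],
    "lib/b.ex": ["b0"],
}


def score(payload: str) -> int:
    # Stand-in for the per-node leave-one-out computation.
    return sum(map(ord, payload))


def flatten(files):
    # Prepare: one work unit per node, keyed by (file path, node index).
    # Keying by node index alone would collide, since every file has a node 0.
    return [((path, i), payload)
            for path, nodes in files.items()
            for i, payload in enumerate(nodes)]


def run(files, workers: int):
    units = flatten(files)
    # Shared pool: the nodes of all files compete for the same workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(pool.map(lambda unit: (unit[0], score(unit[1])), units))
    results = dict(scored)
    # Reconstruct: rebuild each file's node list from the keyed results.
    return {path: [results[(path, i)] for i in range(len(nodes))]
            for path, nodes in files.items()}


serial = run(files, workers=1)
parallel = run(files, workers=8)

# The collision the fix removes: file-local indices alone are not unique.
local_ids = [key[1] for key, _payload in flatten(files)]
```

Comparing `serial` with `parallel` is the same kind of guard the PR describes: the result must not depend on the worker count.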

3. Multi-path args

health-report <path> [subpath ...] restricts the analysis to the given subpaths (. lib test skips priv/assets/config); Git context stays anchored at <path>.
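
A small Python sketch of how such arguments might be resolved. The function name and the assumption that subpaths are interpreted relative to `<path>` are hypothetical; the PR only states that the walk is restricted and that Git context stays at `<path>`.

```python
from pathlib import PurePosixPath


def resolve_targets(path: str, subpaths: list):
    """Return (git_root, walk_roots) for `health-report <path> [subpath ...]`.

    Git context stays at `path`; the walk covers only the subpaths, or
    `path` itself when no subpaths are given.
    """
    root = PurePosixPath(path)
    walk_roots = [str(root / sub) for sub in subpaths] or [str(root)]
    return str(root), walk_roots


git_root, walk_roots = resolve_targets(".", ["lib", "test"])
```

With `. lib test`, directories such as priv, assets, and config are never visited because they are not under any walk root.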

Measurement (assets sample, 33 files)

| Stage | Time |
|---|---|
| Original | 42.5s |
| + Byte cap | 25.8s |
| + Node parallelism | 15.4s (−64%) |
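
The reductions follow directly from the three timings. The short Python calculation below only restates that arithmetic; the dictionary keys are made-up labels.

```python
timings = {"original": 42.5, "byte_cap": 25.8, "node_parallel": 15.4}


def reduction_pct(before: float, after: float) -> float:
    """Percentage of runtime removed when going from `before` to `after`."""
    return (before - after) / before * 100


cap_only = reduction_pct(timings["original"], timings["byte_cap"])
parallel_only = reduction_pct(timings["byte_cap"], timings["node_parallel"])
combined = reduction_pct(timings["original"], timings["node_parallel"])
```

The byte cap removes roughly 39% of the original runtime, node-level parallelism removes roughly 40% of what remains, and together they give the reported 64%.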

Tests

937/0, credo --strict clean. New guards: byte-cap pass-through, parallel == serial bit-identity.

aspala added 2 commits June 10, 2026 22:51
LOO is O(file_bytes) per node; a few large/generated files (lockfiles,
bundled assets) dominate runtime. Add a byte cap (default 32KB, CLI
--max-loo-file-bytes, YAML max_loo_file_bytes, nil disables) — files over
the cap get no refactoring nodes but still flow into the codebase
aggregate and grade. Telemetry-measured: assets sample 42.5s -> 25.8s.

The cap was being filtered out in Analyzer.do_analyze_codebase's
Keyword.take before reaching BlockImpactAnalyzer; added it to the take
list and a regression test at the analyze_codebase layer.

Also lands multi-path args: `health-report <path> [subpath ...]` restricts
the walk to given subpaths (e.g. `. lib test` skips priv/assets/config)
while git context stays anchored at <path>.

Per-node leave-one-out previously parallelized only over files
(Task.async_stream), running a file's nodes serially. A few large files
(hundreds of sub-nodes each) then grind single-threaded while other cores
idle. Split the work into three phases: prepare (tokenize/parse/index,
per file), a single shared pool over every node of every file, and
reconstruct (rebuild each tree, per file). Now the hundreds of nodes of
one large file compete with all other nodes for the same worker pool.

The node tree is flattened into indexed work units and rebuilt from the
results keyed by a globally-unique index path (prefixed with the file
path — a file-local index collides across files, since every file has a
top-level node at index 0, which let parallel completion order pick a
different file's result for concurrency >= 2). Guard added: node results
are bit-identical between workers: 1 and workers: 8.

Telemetry-measured on the assets sample: 25.8s -> 15.4s (and 42.5s ->
15.4s combined with the byte cap).
@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

Score: C+ → C+ | Δ -1 pts | 0 blocks flagged across 8 files | 8 modified, 0 added

🟠 Code Health: C+ (63/100)

193 files · codeqa-action · 2026-06-10

Combined metric scores use cosine similarity: +1 = metric profile perfectly matches healthy pattern for this behavior, 0 = no signal, −1 = anti-pattern detected. Mapped to 0–100 using breakpoints (approx: ≥0.5→A, ≥0.2→B, ≥0.0→C, ≥−0.3→D, <−0.3→F); actual letter grades use the full 15-step scale.
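
As a rough illustration of the scoring described above, the Python sketch below computes a cosine similarity and applies only the approximate breakpoints quoted in this report. The function names are hypothetical, and the real tool uses a finer 15-step scale, so its letters differ from this coarse mapping (a row later in this report grades a cosine of −0.40 as D-, for example).

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity; 0.0 (no signal) when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def approx_grade(cos: float) -> str:
    """Coarse letter from the approximate breakpoints, not the full 15-step scale."""
    for threshold, letter in ((0.5, "A"), (0.2, "B"), (0.0, "C"), (-0.3, "D")):
        if cos >= threshold:
            return letter
    return "F"
```

For instance, `approx_grade(0.26)` returns `"B"`, and a profile pointing exactly opposite to the healthy pattern has a cosine of -1.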

Metric Changes

| Category | Base | Head | Δ |
|---|---|---|---|
| Readability | 88.55 | 97.81 | +9.25 |
| Complexity | 30.50 | 41.58 | +11.07 |
| Duplication | 0.57 | 0.59 | +0.02 |
| Structure | 6.22 | 9.23 | +3.01 |

[Bar chart: "Code Health Scores", score (0 to 100) per category]
Readability       ███████████████████░   94  🟢 A
Complexity        ██████░░░░░░░░░░░░░░   30  🔴 D-
Structure         █████████████████░░░   87  🟢 A-
Duplication       ██████████░░░░░░░░░░   48  🟠 C-
Naming            ███████████████████░   96  🟢 A
Magic Numbers     ████████████████████  100  🟢 A
Combined Metrics  █████████████░░░░░░░   65  🔴 D

@github-actions

github-actions Bot commented Jun 10, 2026

Contributor
🔍 Top Likely Issues (cosine similarity)

Most negative cosine = file's metric profile best matches this anti-pattern.

| Behavior | Cosine | Score |
|---|---|---|
| dependencies.low_coupling | -0.56 | -12.62 |
| file_structure.single_responsibility | -0.51 | -12.40 |
| file_structure.line_count_under_300 | -0.44 | -9.53 |
| code_smells.no_dead_code_after_return | -0.40 | -22.77 |
| scope_and_assignment.shadowed_by_inner_scope | -0.34 | -5.08 |
| file_structure.line_length_under_120 | -0.30 | -8.36 |
| variable_naming.loop_var_is_single_letter | -0.23 | 3.49 |
| type_and_value.no_implicit_null_initial | -0.21 | -14.34 |
| variable_naming.name_contains_and | -0.21 | -36.20 |
| variable_naming.name_contains_type_suffix | -0.20 | -1.61 |

🟢 Readability — A (94/100)

Codebase averages: flesch_adapted=97.81, fog_adapted=4.83, avg_tokens_per_line=9.55, avg_line_length=35.77

| Metric | Value | Score |
|---|---|---|
| readability.flesch_adapted | 97.81 | 100 |
| readability.fog_adapted | 4.83 | 100 |
| readability.avg_tokens_per_line | 9.55 | 72 |
| readability.avg_line_length | 35.77 | 100 |

🔴 Complexity — D- (30/100)

Codebase averages: difficulty=41.58, effort=243496.46, volume=4126.56, estimated_bugs=1.38

| Metric | Value | Score |
|---|---|---|
| halstead.difficulty | 41.58 | 41 |
| halstead.effort | 243496.46 | 0 |
| halstead.volume | 4126.56 | 45 |
| halstead.estimated_bugs | 1.38 | 46 |

🟢 Structure — A- (87/100)

Codebase averages: branching_density=0.14, mean_depth=3.86, avg_function_lines=8.30, max_depth=9.21, max_function_lines=20.03, variance=6.84, avg_param_count=1.16, max_param_count=2.07

| Metric | Value | Score |
|---|---|---|
| branching.branching_density | 0.14 | 76 |
| indentation.mean_depth | 3.86 | 88 |
| function_metrics.avg_function_lines | 8.30 | 89 |
| indentation.max_depth | 9.21 | 87 |
| function_metrics.max_function_lines | 20.03 | 90 |
| indentation.variance | 6.84 | 100 |
| function_metrics.avg_param_count | 1.16 | 100 |
| function_metrics.max_param_count | 2.07 | 100 |

🟠 Duplication — C- (48/100)

Codebase averages: redundancy=0.59, bigram_repetition_rate=0.54, trigram_repetition_rate=0.37

| Metric | Value | Score |
|---|---|---|
| compression.redundancy | 0.59 | 58 |
| ngram.bigram_repetition_rate | 0.54 | 38 |
| ngram.trigram_repetition_rate | 0.37 | 40 |

🟢 Naming — A (96/100)

Codebase averages: entropy=0.89, mean=6.64, variance=18.82, avg_sub_words_per_id=1.17

| Metric | Value | Score |
|---|---|---|
| casing_entropy.entropy | 0.89 | 100 |
| identifier_length_variance.mean | 6.64 | 100 |
| identifier_length_variance.variance | 18.82 | 85 |
| readability.avg_sub_words_per_id | 1.17 | 100 |

🟢 Magic Numbers — A (100/100)

Codebase averages: density=0.00

| Metric | Value | Score |
|---|---|---|
| magic_number_density.density | 0.00 | 100 |

🔴 Combined Metrics — D (65/100)

| Category | Score | Grade |
|---|---|---|
| Code Smells | 26 | 🔴 D- |
| Consistency | 81 | 🟡 B+ |
| Dependencies | 19 | 🔴 E+ |
| Documentation | 83 | 🟡 B+ |
| Error Handling | 91 | 🟢 A- |
| File Structure | 48 | 🟠 C- |
| Function Design | 81 | 🟡 B+ |
| Naming Conventions | 90 | 🟢 A- |
| Scope And Assignment | 28 | 🔴 D- |
| Testing | 83 | 🟡 B+ |
| Type And Value | 89 | 🟢 A- |
| Variable Naming | 74 | 🟡 B |

🔴 Code Smells — D- (26/100)

Cosine similarity scores for 1 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| no_dead_code_after_return | -0.40 | 26 | D- |

🟡 Consistency — B+ (81/100)

Cosine similarity scores for 1 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| consistent_function_style | 0.36 | 81 | B+ |

🔴 Dependencies — E+ (19/100)

Cosine similarity scores for 1 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| low_coupling | -0.56 | 19 | E+ |

🟡 Documentation — B+ (83/100)

Cosine similarity scores for 3 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| file_has_module_docstring | 0.30 | 77 | B |
| function_has_docstring | 0.45 | 86 | A- |
| docstring_is_nonempty | 0.45 | 87 | A- |

🟢 Error Handling — A- (91/100)

Cosine similarity scores for 3 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| error_message_is_descriptive | 0.45 | 87 | A- |
| does_not_swallow_errors | 0.60 | 92 | A- |
| returns_typed_error | 0.69 | 94 | A |

🟠 File Structure — C- (48/100)

Cosine similarity scores for 5 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| single_responsibility | -0.51 | 21 | E+ |
| line_count_under_300 | -0.44 | 24 | E+ |
| line_length_under_120 | -0.30 | 30 | D- |
| has_consistent_indentation | 0.26 | 74 | B |
| no_magic_numbers | 0.57 | 91 | A- |

🟡 Function Design — B+ (81/100)

Cosine similarity scores for 3 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| is_less_than_20_lines | 0.33 | 79 | B+ |
| no_magic_numbers | 0.38 | 82 | B+ |
| has_verb_in_name | 0.40 | 83 | B+ |

🟢 Naming Conventions — A- (90/100)

Cosine similarity scores for 1 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| function_name_is_not_single_word | 0.50 | 90 | A- |

🔴 Scope And Assignment — D- (28/100)

Cosine similarity scores for 1 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| shadowed_by_inner_scope | -0.34 | 28 | D- |

🟡 Testing — B+ (83/100)

Cosine similarity scores for 2 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| test_single_concept | 0.27 | 74 | B |
| test_name_describes_behavior | 0.53 | 91 | A- |

🟢 Type And Value — A- (89/100)

Cosine similarity scores for 1 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| hardcoded_url_or_path | 0.49 | 89 | A- |

🟡 Variable Naming — B (74/100)

Cosine similarity scores for 1 behaviors.

| Behavior | Cosine | Score | Grade |
|---|---|---|---|
| name_is_generic | 0.26 | 74 | B |

@github-actions

Contributor

kind: refactoring-tasks
path: /home/runner/work/codeqa-action/codeqa-action
timestamp: 2026-06-10T21:24:25.959377Z
overall_grade: C+
overall_score: 63
task_count: 0
critical: 0
high: 0
instructions: >-
  Address the tasks below in order of severity (critical first).
  After each fix, run the project's test suite and confirm it passes
  before moving on.

No critical or high-severity blocks need attention. ✅

At ~3000 nodes the node pool drove peak memory to ~54GB and a hard
slowdown cliff (240 files: 103s). Two causes, both fixed:

1. Each work unit carried its file's full node_ctx (content, tokens,
   cosines). Dispatching a unit to a worker copies the message, so the
   node_ctx was copied once PER NODE. Units now carry only the file key;
   the node_ctx lives in a per-file map captured once per Flow stage —
   O(stages) copies instead of O(nodes). Dropped the unused root_tokens
   from node_ctx while here.

2. The incremental aggregate and project languages were rebuilt per file
   over all file_results — O(files^2). Built once now and shared.
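
Cause 1 can be illustrated with a Python sketch that uses `pickle` as a stand-in for copying a message to a worker. This models the idea only; the project runs on the BEAM, and the data shapes and names below are hypothetical.

```python
import pickle

# Hypothetical per-file context: large, shared by all nodes of the file.
contexts = {
    "lib/a.ex": {"content": "x" * 10_000, "tokens": list(range(1_000))},
    "lib/b.ex": {"content": "y" * 10_000, "tokens": list(range(1_000))},
}
node_counts = {"lib/a.ex": 3, "lib/b.ex": 2}

# Before: every unit embeds its file's context, so serializing a unit for a
# worker copies the whole context once per node.
heavy_units = [
    {"file": path, "node": i, "ctx": contexts[path]}
    for path, n in node_counts.items()
    for i in range(n)
]
heavy_bytes = sum(len(pickle.dumps(u)) for u in heavy_units)

# After: units carry only the file key; the context map is shipped once.
light_units = [(path, i) for path, n in node_counts.items() for i in range(n)]
light_bytes = len(pickle.dumps(contexts)) + sum(len(pickle.dumps(u)) for u in light_units)


def process(unit, ctx_by_file):
    path, node = unit
    ctx = ctx_by_file[path]  # lookup instead of a per-unit copy
    return (path, node, len(ctx["tokens"]))


results = [process(u, contexts) for u in light_units]
```

The gap between `heavy_bytes` and `light_bytes` grows with the number of nodes per file, which matches the per-node copy cost described in item 1.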

Phase B switched from Task.async_stream to Flow.from_enumerable with
max_demand, bounding in-flight units per stage (backpressure).

Telemetry-measured at 240 files: 54GB -> 10.8GB peak, 103s -> 17s (6.1x).
Output stays bit-identical (parallel == serial guard).
@github-actions

Contributor

kind: refactoring-tasks
path: /home/runner/work/codeqa-action/codeqa-action
timestamp: 2026-06-10T21:40:41.240500Z
overall_grade: C+
overall_score: 63
task_count: 0
critical: 0
high: 0
instructions: >-
  Address the tasks below in order of severity (critical first).
  After each fix, run the project's test suite and confirm it passes
  before moving on.

No critical or high-severity blocks need attention. ✅

Phase B flat-mapped all prep units into one list before handing it to
Flow, materializing the entire unit set (~3000 maps at 240 files) up
front. Switched to Stream.flat_map so units are pulled lazily under
Flow's max_demand backpressure.
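
The difference between building the full unit list and pulling units lazily can be sketched in Python, with `itertools.chain.from_iterable` playing the role that Stream.flat_map plays in the commit. The helper names and sample data are hypothetical.

```python
from itertools import chain, islice

produced = []  # records which files have had their units generated


def units_for(path: str, node_count: int):
    produced.append(path)
    return [(path, i) for i in range(node_count)]


prep = [("lib/a.ex", 2), ("lib/b.ex", 2), ("lib/c.ex", 2)]

# Eager: every unit of every file exists before the consumer sees one.
eager_units = [u for path, n in prep for u in units_for(path, n)]
eager_touched = list(produced)

# Lazy: units are generated only as the consumer pulls them.
produced.clear()
lazy_units = chain.from_iterable(units_for(path, n) for path, n in prep)
first_two = list(islice(lazy_units, 2))
lazy_touched = list(produced)
```

After the consumer has pulled two units, only the first file has been expanded; with bounded demand downstream, that is what keeps the number of live units small.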

Telemetry-measured at 240 files: peak memory 10.8GB -> 7.1GB. Output
stays bit-identical (parallel == serial guard).
@aspala aspala merged commit f5d62fd into main Jun 10, 2026
8 checks passed
@aspala aspala deleted the perf/loo-byte-cap-and-multipath branch June 10, 2026 21:51