
aspala · 2026-06-10T21:22:50Z

What & Why

health-report --view actions without --base-ref took minutes on large repos. Telemetry (--telemetry, already built in) showed that block_impact accounts for 96% of the runtime and that the per-node LOO is O(file_bytes), so a few large or generated files dominate.

Three levers, each verified with telemetry:

1. Byte cap for LOO (configurable)

Files larger than max_loo_file_bytes (default 32 KB, CLI --max-loo-file-bytes, YAML max_loo_file_bytes, nil = off) no longer get refactoring nodes, but they still flow into the codebase aggregate and the grade. Lockfiles, bundled assets and the like dominated the runtime and are never refactoring targets.
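A minimal sketch of the cap as a file filter, to make the "no nodes, but still in the aggregate" split concrete. The module name `ByteCapSketch`, the function `split_by_cap/2` and the sample files are hypothetical and are not taken from the codebase; only the 32 KB default and the nil-disables rule come from the description above.

```elixir
defmodule ByteCapSketch do
  # Assumed default, mirroring the 32 KB mentioned in the description.
  @default_cap 32 * 1024

  # Splits {path, content} pairs into files eligible for per-node LOO
  # and files that are skipped. A nil cap disables the filter.
  def split_by_cap(files, cap \\ @default_cap)

  def split_by_cap(files, nil), do: {files, []}

  def split_by_cap(files, cap) when is_integer(cap) do
    Enum.split_with(files, fn {_path, content} -> byte_size(content) <= cap end)
  end
end

files = [
  {"lib/small.ex", String.duplicate("a", 100)},
  {"priv/bundle.js", String.duplicate("b", 40_000)}
]

{eligible, skipped} = ByteCapSketch.split_by_cap(files)
IO.inspect(Enum.map(eligible, &elem(&1, 0)), label: "LOO nodes for")
IO.inspect(Enum.map(skipped, &elem(&1, 0)), label: "aggregate only")
```

In this sketch both lists would still be passed on to whatever computes the codebase aggregate; only the first one would be expanded into refactoring nodes.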

This also fixes a pass-through bug: the cap was filtered out by the Keyword.take in Analyzer.do_analyze_codebase before it reached BlockImpactAnalyzer. Regression test at the analyze_codebase level.
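For readers unfamiliar with this failure mode, a small sketch of how a `Keyword.take/2` allow-list can silently drop an option. The option list and the `:workers` key are illustrative assumptions; the real allow-list in `Analyzer.do_analyze_codebase` is not shown in this PR text.

```elixir
opts = [workers: 8, max_loo_file_bytes: 32_768, telemetry: true]

# Before the fix (sketch): the allow-list does not name the cap, so it is dropped.
forwarded_before = Keyword.take(opts, [:workers])

# After the fix (sketch): the cap is in the allow-list and reaches the callee.
forwarded_after = Keyword.take(opts, [:workers, :max_loo_file_bytes])

IO.inspect(Keyword.get(forwarded_before, :max_loo_file_bytes), label: "before")
IO.inspect(Keyword.get(forwarded_after, :max_loo_file_bytes), label: "after")
```

Because nil is described above as "off", a dropped key would presumably look the same as a disabled cap, which is why a test at the analyze_codebase level is a sensible place to catch it.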

2. Node-level parallelism

Previously only the file level was parallelized; the nodes of a file ran serially, so a few large files (hundreds of sub-nodes) ground along single-threaded while other cores idled. Now there are three phases: prepare (tokenize/parse/index) → one shared pool over all nodes of all files → reconstruct.

This surfaced and fixed an ID collision bug: file-local index paths collided in the shared pool (every file has a top-level node at index 0), which deterministically produced wrong cosine values at concurrency ≥ 2. The ID is now prefixed with the file path. Guard: node output is bit-identical between workers: 1 and workers: 8.
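The collision can be reproduced in miniature with a plain map. This is only an illustration under assumed data: the file names and cosine values below are made up, and the real result structure in the analyzer may look different.

```elixir
# Two files, each with a top-level node at index path [0].
# The cosine values are invented for the example.
results = [
  {"lib/a.ex", [0], 0.91},
  {"lib/b.ex", [0], -0.42}
]

# File-local key: both entries share the key [0], so one overwrites the other.
by_local = Map.new(results, fn {_file, path, cosine} -> {path, cosine} end)

# Globally unique key: the file path is prefixed to the index path.
by_global = Map.new(results, fn {file, path, cosine} -> {{file, path}, cosine} end)

IO.inspect(map_size(by_local), label: "entries with file-local keys")
IO.inspect(map_size(by_global), label: "entries with file-prefixed keys")
```

With file-local keys only one of the two results survives; with the prefixed key both are kept and each tree can look up its own node.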

3. Multi-path args

health-report <path> [subpath ...] restricts the analysis to the given subpaths (. lib test skips priv/assets/config); the Git context stays anchored at <path>.

Measurement (assets sample, 33 files)

Tests

937/0, credo --strict clean. New guards: byte-cap pass-through, parallel == serial bit identity.

LOO is O(file_bytes) per node; a few large/generated files (lockfiles, bundled assets) dominate runtime. Add a byte cap (default 32KB, CLI --max-loo-file-bytes, YAML max_loo_file_bytes, nil disables) — files over the cap get no refactoring nodes but still flow into the codebase aggregate and grade. Telemetry-measured: assets sample 42.5s -> 25.8s. The cap was being filtered out in Analyzer.do_analyze_codebase's Keyword.take before reaching BlockImpactAnalyzer; added it to the take list and a regression test at the analyze_codebase layer. Also lands multi-path args: `health-report <path> [subpath ...]` restricts the walk to given subpaths (e.g. `. lib test` skips priv/assets/config) while git context stays anchored at <path>.

Per-node leave-one-out previously parallelized only over files (Task.async_stream), running a file's nodes serially. A few large files (hundreds of sub-nodes each) then grind single-threaded while other cores idle. Split the work into three phases: prepare (tokenize/parse/index, per file), a single shared pool over every node of every file, and reconstruct (rebuild each tree, per file). Now the hundreds of nodes of one large file compete with all other nodes for the same worker pool. The node tree is flattened into indexed work units and rebuilt from the results keyed by a globally-unique index path (prefixed with the file path — a file-local index collides across files, since every file has a top-level node at index 0, which let parallel completion order pick a different file's result for concurrency >= 2). Guard added: node results are bit-identical between workers: 1 and workers: 8. Telemetry-measured on the assets sample: 25.8s -> 15.4s (and 42.5s -> 15.4s combined with the byte cap).
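The three phases can be sketched with the standard library alone. Everything named here (`NodePoolSketch`, `prepare/1`, `run/2`, `reconstruct/2`, and the `score/1` stand-in for the real leave-one-out computation) is hypothetical and simplified to a flat node list per file; it is meant to show the shape of the pipeline and the parallel == serial guard, not the actual implementation.

```elixir
defmodule NodePoolSketch do
  # Phase A: flatten each file's nodes into units with a globally unique id.
  def prepare(files) do
    for {file, nodes} <- files, {node, idx} <- Enum.with_index(nodes) do
      %{id: {file, [idx]}, node: node}
    end
  end

  # Phase B: one shared pool across every unit of every file.
  # Completion order is deliberately unordered; results are keyed by id.
  def run(units, workers) do
    units
    |> Task.async_stream(fn unit -> {unit.id, score(unit.node)} end,
      max_concurrency: workers,
      ordered: false
    )
    |> Map.new(fn {:ok, {id, value}} -> {id, value} end)
  end

  # Phase C: rebuild per-file results from the id-keyed map.
  def reconstruct(files, results) do
    Map.new(files, fn {file, nodes} ->
      scores =
        for {_node, idx} <- Enum.with_index(nodes) do
          Map.fetch!(results, {file, [idx]})
        end

      {file, scores}
    end)
  end

  # Stand-in for the real per-node leave-one-out computation.
  defp score(node), do: String.length(node)
end

files = [{"lib/a.ex", ["def a", "def bb"]}, {"lib/b.ex", ["def ccc"]}]
units = NodePoolSketch.prepare(files)

serial = NodePoolSketch.reconstruct(files, NodePoolSketch.run(units, 1))
parallel = NodePoolSketch.reconstruct(files, NodePoolSketch.run(units, 8))
IO.inspect(serial == parallel, label: "parallel == serial")
```

Keying results by `{file, index_path}` is what makes the unordered pool safe here: reconstruction never depends on which task finished first.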

github-actions · 2026-06-10T21:24:47Z

Score: C+ → C+ | Δ -1 pts | 0 blocks flagged across 8 files | 8 modified, 0 added

🟠 Code Health: C+ (63/100)

193 files · codeqa-action · 2026-06-10

Combined metric scores use cosine similarity: +1 = metric profile perfectly matches healthy pattern for this behavior, 0 = no signal, −1 = anti-pattern detected. Mapped to 0–100 using breakpoints (approx: ≥0.5→A, ≥0.2→B, ≥0.0→C, ≥−0.3→D, <−0.3→F); actual letter grades use the full 15-step scale.

Metric Changes

Category	Base	Head	Δ
Readability	88.55	97.81	+9.25
Complexity	30.50	41.58	+11.07
Duplication	0.57	0.59	+0.02
Structure	6.22	9.23	+3.01

%%{init: {'theme': 'neutral'}}%%
xychart-beta
    title "Code Health Scores"
    x-axis ["Readability", "Complexity", "Structure", "Duplication", "Naming", "Magic Numbers", "Combined Metrics"]
    y-axis "Score" 0 --> 100
    bar [94, 30, 87, 48, 96, 100, 65]

Readability       ███████████████████░   94  🟢 A
Complexity        ██████░░░░░░░░░░░░░░   30  🔴 D-
Structure         █████████████████░░░   87  🟢 A-
Duplication       ██████████░░░░░░░░░░   48  🟠 C-
Naming            ███████████████████░   96  🟢 A
Magic Numbers     ████████████████████  100  🟢 A
Combined Metrics  █████████████░░░░░░░   65  🔴 D

github-actions · 2026-06-10T21:24:48Z

🔍 Top Likely Issues (cosine similarity)

Most negative cosine = file's metric profile best matches this anti-pattern.

Behavior	Cosine	Score
`dependencies.low_coupling`	-0.56	-12.62
`file_structure.single_responsibility`	-0.51	-12.40
`file_structure.line_count_under_300`	-0.44	-9.53
`code_smells.no_dead_code_after_return`	-0.40	-22.77
`scope_and_assignment.shadowed_by_inner_scope`	-0.34	-5.08
`file_structure.line_length_under_120`	-0.30	-8.36
`variable_naming.loop_var_is_single_letter`	-0.23	3.49
`type_and_value.no_implicit_null_initial`	-0.21	-14.34
`variable_naming.name_contains_and`	-0.21	-36.20
`variable_naming.name_contains_type_suffix`	-0.20	-1.61

🟢 Readability — A (94/100)

Codebase averages: flesch_adapted=97.81, fog_adapted=4.83, avg_tokens_per_line=9.55, avg_line_length=35.77

Metric	Value	Score
readability.flesch_adapted	97.81	100
readability.fog_adapted	4.83	100
readability.avg_tokens_per_line	9.55	72
readability.avg_line_length	35.77	100

🔴 Complexity — D- (30/100)

Codebase averages: difficulty=41.58, effort=243496.46, volume=4126.56, estimated_bugs=1.38

Metric	Value	Score
halstead.difficulty	41.58	41
halstead.effort	243496.46	0
halstead.volume	4126.56	45
halstead.estimated_bugs	1.38	46

🟢 Structure — A- (87/100)

Codebase averages: branching_density=0.14, mean_depth=3.86, avg_function_lines=8.30, max_depth=9.21, max_function_lines=20.03, variance=6.84, avg_param_count=1.16, max_param_count=2.07

Metric	Value	Score
branching.branching_density	0.14	76
indentation.mean_depth	3.86	88
function_metrics.avg_function_lines	8.30	89
indentation.max_depth	9.21	87
function_metrics.max_function_lines	20.03	90
indentation.variance	6.84	100
function_metrics.avg_param_count	1.16	100
function_metrics.max_param_count	2.07	100

🟠 Duplication — C- (48/100)

Codebase averages: redundancy=0.59, bigram_repetition_rate=0.54, trigram_repetition_rate=0.37

Metric	Value	Score
compression.redundancy	0.59	58
ngram.bigram_repetition_rate	0.54	38
ngram.trigram_repetition_rate	0.37	40

🟢 Naming — A (96/100)

Codebase averages: entropy=0.89, mean=6.64, variance=18.82, avg_sub_words_per_id=1.17

Metric	Value	Score
casing_entropy.entropy	0.89	100
identifier_length_variance.mean	6.64	100
identifier_length_variance.variance	18.82	85
readability.avg_sub_words_per_id	1.17	100

🟢 Magic Numbers — A (100/100)

Codebase averages: density=0.00

Metric	Value	Score
magic_number_density.density	0.00	100

🔴 Combined Metrics — D (65/100)

Category	Score	Grade
Code Smells	26	🔴 D-
Consistency	81	🟡 B+
Dependencies	19	🔴 E+
Documentation	83	🟡 B+
Error Handling	91	🟢 A-
File Structure	48	🟠 C-
Function Design	81	🟡 B+
Naming Conventions	90	🟢 A-
Scope And Assignment	28	🔴 D-
Testing	83	🟡 B+
Type And Value	89	🟢 A-
Variable Naming	74	🟡 B

🔴 Code Smells — D- (26/100)

Cosine similarity scores for 1 behaviors.

Behavior	Cosine	Score	Grade
no_dead_code_after_return	-0.40	26	D-

🟡 Consistency — B+ (81/100)

Cosine similarity scores for 1 behaviors.

Behavior	Cosine	Score	Grade
consistent_function_style	0.36	81	B+

🔴 Dependencies — E+ (19/100)

Cosine similarity scores for 1 behaviors.

Behavior	Cosine	Score	Grade
low_coupling	-0.56	19	E+

🟡 Documentation — B+ (83/100)

Cosine similarity scores for 3 behaviors.

Behavior	Cosine	Score	Grade
file_has_module_docstring	0.30	77	B
function_has_docstring	0.45	86	A-
docstring_is_nonempty	0.45	87	A-

🟢 Error Handling — A- (91/100)

Cosine similarity scores for 3 behaviors.

Behavior	Cosine	Score	Grade
error_message_is_descriptive	0.45	87	A-
does_not_swallow_errors	0.60	92	A-
returns_typed_error	0.69	94	A

🟠 File Structure — C- (48/100)

Cosine similarity scores for 5 behaviors.

Behavior	Cosine	Score	Grade
single_responsibility	-0.51	21	E+
line_count_under_300	-0.44	24	E+
line_length_under_120	-0.30	30	D-
has_consistent_indentation	0.26	74	B
no_magic_numbers	0.57	91	A-

🟡 Function Design — B+ (81/100)

Cosine similarity scores for 3 behaviors.

Behavior	Cosine	Score	Grade
is_less_than_20_lines	0.33	79	B+
no_magic_numbers	0.38	82	B+
has_verb_in_name	0.40	83	B+

🟢 Naming Conventions — A- (90/100)

Cosine similarity scores for 1 behaviors.

Behavior	Cosine	Score	Grade
function_name_is_not_single_word	0.50	90	A-

🔴 Scope And Assignment — D- (28/100)

Cosine similarity scores for 1 behaviors.

Behavior	Cosine	Score	Grade
shadowed_by_inner_scope	-0.34	28	D-

🟡 Testing — B+ (83/100)

Cosine similarity scores for 2 behaviors.

Behavior	Cosine	Score	Grade
test_single_concept	0.27	74	B
test_name_describes_behavior	0.53	91	A-

🟢 Type And Value — A- (89/100)

Cosine similarity scores for 1 behaviors.

Behavior	Cosine	Score	Grade
hardcoded_url_or_path	0.49	89	A-

🟡 Variable Naming — B (74/100)

Cosine similarity scores for 1 behaviors.

Behavior	Cosine	Score	Grade
name_is_generic	0.26	74	B

github-actions · 2026-06-10T21:24:49Z

kind: refactoring-tasks
path: /home/runner/work/codeqa-action/codeqa-action
timestamp: 2026-06-10T21:24:25.959377Z
overall_grade: C+
overall_score: 63
task_count: 0
critical: 0
high: 0
instructions: >-
Address the tasks below in order of severity (critical first).
After each fix, run the project's test suite and confirm it passes
before moving on.

No critical or high-severity blocks need attention. ✅

At ~3000 nodes the node pool drove peak memory to ~54GB and hit a hard slowdown cliff (240 files: 103s). Two causes, both fixed:

1. Each work unit carried its file's full node_ctx (content, tokens, cosines). Dispatching a unit to a worker copies the message, so the node_ctx was copied once per node. Units now carry only the file key; the node_ctx lives in a per-file map captured once per Flow stage, giving O(stages) copies instead of O(nodes). Dropped the unused root_tokens from node_ctx while here.
2. The incremental aggregate and project languages were rebuilt per file over all file_results, which is O(files^2). They are now built once and shared.

Phase B switched from Task.async_stream to Flow.from_enumerable with max_demand, bounding in-flight units per stage (backpressure). Telemetry-measured at 240 files: 54GB -> 10.8GB peak, 103s -> 17s (6.1x). Output stays bit-identical (parallel == serial guard).
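A rough illustration of why the unit payload matters. The field names and the token list are assumptions for the example, and `:erlang.external_size/1` is used only as a proxy for how much data a unit drags along; it does not measure the actual inter-process copy cost.

```elixir
# Illustrative token list; real tokens would come from the tokenizer.
tokens = Enum.map(1..10_000, &{:token, &1})
node_ctx = %{tokens: tokens, cosines: %{}}

# Before (sketch): every unit embeds its file's full context.
heavy_unit = %{index: [0], node_ctx: node_ctx}

# After (sketch): the unit only names its file; the context lives in a per-file map.
light_unit = %{index: [0], file: "lib/a.ex"}
ctx_by_file = %{"lib/a.ex" => node_ctx}

# Serialized size as a rough stand-in for message payload size.
IO.inspect(:erlang.external_size(heavy_unit), label: "heavy unit bytes")
IO.inspect(:erlang.external_size(light_unit), label: "light unit bytes")

# A worker resolves the context by key instead of receiving it in each message.
%{tokens: resolved} = Map.fetch!(ctx_by_file, light_unit.file)
IO.inspect(length(resolved), label: "tokens resolved via file key")
```

In the sketch the per-file map would be captured once by whatever function a stage runs, so the large term travels once per stage rather than once per unit.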

github-actions · 2026-06-10T21:41:13Z

kind: refactoring-tasks
path: /home/runner/work/codeqa-action/codeqa-action
timestamp: 2026-06-10T21:40:41.240500Z
overall_grade: C+
overall_score: 63
task_count: 0
critical: 0
high: 0
instructions: >-
Address the tasks below in order of severity (critical first).
After each fix, run the project's test suite and confirm it passes
before moving on.

No critical or high-severity blocks need attention. ✅

Phase B flat-mapped all prep units into one list before handing it to Flow, materializing the entire unit set (~3000 maps at 240 files) up front. Switched to Stream.flat_map so units are pulled lazily under Flow's max_demand backpressure. Telemetry-measured at 240 files: peak memory 10.8GB -> 7.1GB. Output stays bit-identical (parallel == serial guard).
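The eager versus lazy difference in a few lines. The `preps` data and the `expand` function are made-up stand-ins for the prepared files and their unit expansion; Flow itself is left out so the snippet runs without dependencies.

```elixir
preps = [
  %{file: "lib/a.ex", nodes: [0, 1, 2]},
  %{file: "lib/b.ex", nodes: [0, 1]}
]

# Expands one prepared file into its work units.
expand = fn prep -> Enum.map(prep.nodes, &%{file: prep.file, index: [&1]}) end

# Eager: the complete unit list exists in memory right away.
eager = Enum.flat_map(preps, expand)

# Lazy: nothing is expanded until a consumer asks for elements.
lazy = Stream.flat_map(preps, expand)

IO.inspect(is_list(eager), label: "eager is a list")
IO.inspect(is_list(lazy), label: "lazy is a list")
IO.inspect(Enum.take(lazy, 2), label: "first two units on demand")
IO.inspect(Enum.to_list(lazy) == eager, label: "same units either way")
```

A demand-driven consumer pulls from the lazy stream only as fast as it processes units, which fits the drop in peak memory reported above.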

github-actions · 2026-06-10T21:44:51Z

aspala added 2 commits June 10, 2026 22:51

aspala merged commit f5d62fd into main Jun 10, 2026
8 checks passed

aspala deleted the perf/loo-byte-cap-and-multipath branch June 10, 2026 21:51

Stage	Time
Original	42.5s
+ Byte cap	25.8s
+ Node parallelism	15.4s (−64%)


perf(block-impact): byte cap + node-level parallelism (42.5s→15.4s) #67

aspala merged 4 commits into main from perf/loo-byte-cap-and-multipath
