perf(block-impact): byte cap + node-level parallelism (42.5s → 15.4s) #67
Leave-one-out (LOO) is O(file_bytes) per node, so a few large or generated files (lockfiles, bundled assets) dominate runtime. Add a byte cap (default 32KB, CLI --max-loo-file-bytes, YAML max_loo_file_bytes, nil disables): files over the cap get no refactoring nodes but still flow into the codebase aggregate and grade. Telemetry-measured: assets sample 42.5s -> 25.8s.

The cap was being filtered out by the Keyword.take in Analyzer.do_analyze_codebase before it reached BlockImpactAnalyzer; added it to the take list, plus a regression test at the analyze_codebase layer.

Also lands multi-path args: `health-report <path> [subpath ...]` restricts the walk to the given subpaths (e.g. `. lib test` skips priv/assets/config) while git context stays anchored at <path>.
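A minimal sketch of the two mechanisms described above: an option allow-list that must include the cap, and a size-based split of files. The module name, the option list, and the `{path, content}` file shape are hypothetical and chosen for illustration only; they are not taken from the actual Analyzer code.

```elixir
defmodule ByteCapSketch do
  # Hypothetical allow-list of options forwarded to the block-impact stage.
  # Keyword.take/2 silently drops any key that is not listed here, which is
  # how an option like :max_loo_file_bytes can get lost on the way down.
  @forwarded_opts [:workers, :max_loo_file_bytes]

  # Default cap of 32KB, as stated in the commit message.
  @default_cap 32 * 1024

  # Forward only the allow-listed options.
  def forward_opts(opts), do: Keyword.take(opts, @forwarded_opts)

  # Split files into {loo_files, aggregate_only_files}.
  # Files over the cap get no per-node leave-one-out; a nil cap disables the limit.
  def partition(files, opts) do
    cap = Keyword.get(opts, :max_loo_file_bytes, @default_cap)

    Enum.split_with(files, fn {_path, content} ->
      is_nil(cap) or byte_size(content) <= cap
    end)
  end
end
```

With `files = [{"lib/a.ex", "small"}, {"priv/bundle.js", String.duplicate("x", 40_000)}]`, calling `ByteCapSketch.partition(files, [])` puts only `lib/a.ex` in the first list, while passing `max_loo_file_bytes: nil` keeps both files there.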
Per-node leave-one-out previously parallelized only over files (Task.async_stream) and ran each file's nodes serially, so a few large files (hundreds of sub-nodes each) ground along single-threaded while other cores idled. Split the work into three phases: prepare (tokenize/parse/index, per file), a single shared pool over every node of every file, and reconstruct (rebuild each tree, per file). The hundreds of nodes of one large file now compete with all other nodes for the same worker pool.

The node tree is flattened into indexed work units and rebuilt from results keyed by a globally unique index path, prefixed with the file path. A file-local index collides across files, since every file has a top-level node at index 0; at concurrency >= 2 that let parallel completion order pick a different file's result.

Guard added: node results are bit-identical between workers: 1 and workers: 8. Telemetry-measured on the assets sample: 25.8s -> 15.4s (42.5s -> 15.4s combined with the byte cap).
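The three phases and the file-prefixed key can be sketched roughly as follows. This is a simplified illustration, not the real implementation: it assumes each file is a `{path, nodes}` tuple with a flat list of top-level nodes (the real tree is nested), and `score/1` is a hypothetical stand-in for the leave-one-out computation.

```elixir
defmodule NodePoolSketch do
  # Phase A (prepare): flatten every file's nodes into work units.
  # The key is {path, index_path}; a bare index_path such as [0] would
  # collide across files, because every file has a node at index 0.
  def prepare(files) do
    for {path, nodes} <- files, {node, idx} <- Enum.with_index(nodes) do
      %{key: {path, [idx]}, node: node}
    end
  end

  # Phase B: one shared pool over all units of all files.
  # Results arrive in completion order, so they are collected by key.
  def run(units, workers) do
    units
    |> Task.async_stream(fn unit -> {unit.key, score(unit.node)} end,
      max_concurrency: workers,
      ordered: false
    )
    |> Map.new(fn {:ok, {key, result}} -> {key, result} end)
  end

  # Phase C (reconstruct): rebuild per-file results by looking up each key.
  def reconstruct(files, results) do
    Map.new(files, fn {path, nodes} ->
      scores =
        for {_node, idx} <- Enum.with_index(nodes) do
          Map.fetch!(results, {path, [idx]})
        end

      {path, scores}
    end)
  end

  def analyze(files, workers) do
    reconstruct(files, run(prepare(files), workers))
  end

  # Hypothetical stand-in for the per-node leave-one-out computation.
  defp score(node), do: String.length(node)
end
```

Because results are looked up by `{path, index_path}` rather than by arrival order, `NodePoolSketch.analyze(files, 1)` and `NodePoolSketch.analyze(files, 8)` return the same map, which is the property the parallel == serial guard checks.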
🟠 Code Health: C+ (63/100)
Metric Changes
%%{init: {'theme': 'neutral'}}%%
xychart-beta
title "Code Health Scores"
x-axis ["Readability", "Complexity", "Structure", "Duplication", "Naming", "Magic Numbers", "Combined Metrics"]
y-axis "Score" 0 --> 100
bar [94, 30, 87, 48, 96, 100, 65]
🔍 Top Likely Issues (cosine similarity)
🟢 Readability — A (94/100). Codebase averages: flesch_adapted=97.81, fog_adapted=4.83, avg_tokens_per_line=9.55, avg_line_length=35.77
🔴 Complexity — D- (30/100). Codebase averages: difficulty=41.58, effort=243496.46, volume=4126.56, estimated_bugs=1.38
🟢 Structure — A- (87/100). Codebase averages: branching_density=0.14, mean_depth=3.86, avg_function_lines=8.30, max_depth=9.21, max_function_lines=20.03, variance=6.84, avg_param_count=1.16, max_param_count=2.07
🟠 Duplication — C- (48/100). Codebase averages: redundancy=0.59, bigram_repetition_rate=0.54, trigram_repetition_rate=0.37
🟢 Naming — A (96/100). Codebase averages: entropy=0.89, mean=6.64, variance=18.82, avg_sub_words_per_id=1.17
🟢 Magic Numbers — A (100/100). Codebase averages: density=0.00
🔴 Combined Metrics — D (65/100)
🔴 Code Smells — D- (26/100)
🟡 Consistency — B+ (81/100)
🔴 Dependencies — E+ (19/100)
🟡 Documentation — B+ (83/100)
🟢 Error Handling — A- (91/100)
🟠 File Structure — C- (48/100)
🟡 Function Design — B+ (81/100)
🟢 Naming Conventions — A- (90/100)
🔴 Scope And Assignment — D- (28/100)
🟡 Testing — B+ (83/100)
🟢 Type And Value — A- (89/100)
🟡 Variable Naming — B (74/100)
At ~3000 nodes the node pool drove peak memory to ~54GB and hit a hard slowdown cliff (240 files: 103s). Two causes, both fixed:
1. Each work unit carried its file's full node_ctx (content, tokens, cosines). Dispatching a unit to a worker copies the message, so the node_ctx was copied once per node. Units now carry only the file key; the node_ctx lives in a per-file map captured once per Flow stage, giving O(stages) copies instead of O(nodes). Dropped the unused root_tokens from node_ctx while here.
2. The incremental aggregate and project languages were rebuilt per file over all file_results, which is O(files^2). They are now built once and shared.

Phase B switched from Task.async_stream to Flow.from_enumerable with max_demand, bounding in-flight units per stage (backpressure). Telemetry-measured at 240 files: 54GB -> 10.8GB peak, 103s -> 17s (6.1x). Output stays bit-identical (parallel == serial guard).
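The first fix can be illustrated with a sketch in which units carry only a file key and the per-file context is captured once per chunk of work. This uses only the standard library: chunks processed by `Task.async_stream` stand in for Flow stages, which is a substitution made here for illustration. The module, the file shape `{path, %{content: ..., nodes: ...}}`, and `impact/2` are hypothetical.

```elixir
defmodule LeanUnitsSketch do
  # Units carry only the file key and node-local data, never the file context.
  def units(files) do
    for {path, %{nodes: nodes}} <- files, {node, idx} <- Enum.with_index(nodes) do
      %{file: path, index: idx, node: node}
    end
  end

  # One task per chunk (standing in for a stage). The closure, and with it
  # ctx_by_file, is copied once per chunk rather than once per unit.
  def run(files, stages) do
    ctx_by_file = Map.new(files, fn {path, ctx} -> {path, Map.take(ctx, [:content])} end)
    all = units(files)
    chunk_size = max(1, ceil(length(all) / stages))

    all
    |> Enum.chunk_every(chunk_size)
    |> Task.async_stream(
      fn chunk ->
        Enum.map(chunk, fn unit ->
          ctx = Map.fetch!(ctx_by_file, unit.file)
          {{unit.file, unit.index}, impact(ctx, unit.node)}
        end)
      end,
      max_concurrency: stages
    )
    |> Enum.flat_map(fn {:ok, results} -> results end)
    |> Map.new()
  end

  # Hypothetical stand-in for the leave-one-out impact of removing a node.
  defp impact(ctx, node), do: byte_size(ctx.content) - byte_size(node)
end
```

The design point is that the number of context copies scales with the number of chunks, not with the number of nodes, while the result map stays the same for any `stages` value.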
Phase B flat-mapped all prep units into one list before handing it to Flow, materializing the entire unit set (~3000 maps at 240 files) up front. Switched to Stream.flat_map so units are pulled lazily under Flow's max_demand backpressure. Telemetry-measured at 240 files: peak memory 10.8GB -> 7.1GB. Output stays bit-identical (parallel == serial guard).
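The difference between the eager and lazy variants can be sketched with the standard library alone. The module name and the `{path, node_count}` prep shape are hypothetical; in the change described above the lazy stream feeds Flow, whereas here a plain `Enum` consumer plays that role.

```elixir
defmodule LazyUnitsSketch do
  # Eager: builds the complete unit list before any consumer runs.
  def eager(preps), do: Enum.flat_map(preps, &expand/1)

  # Lazy: returns a stream; units are produced only as a consumer pulls them.
  def lazy(preps), do: Stream.flat_map(preps, &expand/1)

  # Expand one prepared file into its work units (hypothetical shape).
  defp expand({path, node_count}) do
    for idx <- 0..(node_count - 1)//1, do: %{file: path, index: idx}
  end
end
```

Both variants yield the same units in the same order, so `Enum.to_list(LazyUnitsSketch.lazy(preps))` equals `LazyUnitsSketch.eager(preps)`. The lazy one only expands a file once the consumer reaches it, which is what lets a demand-limited consumer keep the number of live units bounded.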

What & Why
`health-report --view actions` without `--base-ref` ran for minutes on large repos. Telemetry (`--telemetry`, already built in) showed that block_impact accounts for 96% of the time and that per-node LOO is O(file_bytes), so a few large or generated files dominate. Three levers, each verified by telemetry:
1. Byte cap for LOO (configurable)
Files over `max_loo_file_bytes` (default 32 KB, CLI `--max-loo-file-bytes`, YAML `max_loo_file_bytes`, `nil` = off) no longer get refactoring nodes, but still flow into the codebase aggregate and the grade. Lockfiles, bundled assets and the like dominated runtime and are never refactoring targets.
Also fixed a pass-through bug: the cap was filtered out by the `Keyword.take` in `Analyzer.do_analyze_codebase` before it reached `BlockImpactAnalyzer`. Regression test at the `analyze_codebase` layer.
2. Node-level parallelism
Previously only the file level was parallelized; the nodes of a file ran serially, so a few large files (hundreds of sub-nodes) ground along single-threaded while other cores idled. Now three phases: prepare (tokenize/parse/index) → one shared pool over all nodes of all files → reconstruct.
Also found and fixed an ID collision bug: file-local index paths collided in the shared pool (every file has a top-level node at index 0), producing deterministically wrong cosine values at concurrency ≥ 2. The ID is now prefixed with the file path. Guard: node output is bit-identical between `workers: 1` and `workers: 8`.
3. Multi-path args
`health-report <path> [subpath ...]` restricts the analysis to subpaths (`. lib test` skips priv/assets/config); git context stays anchored at `<path>`.
Measurement (assets sample, 33 files)
Tests
937/0, credo --strict clean. New guards: byte-cap pass-through, parallel == serial bit-identity.