Optimise AggregateFunctionSumMap functions family by jh0x · Pull Request #107554 · ClickHouse/ClickHouse · GitHub
Skip to content

Optimise AggregateFunctionSumMap functions family#107554

Open
jh0x wants to merge 11 commits into
ClickHouse:masterfrom
jh0x:jh-summap-typed
Open

Optimise AggregateFunctionSumMap functions family#107554
jh0x wants to merge 11 commits into
ClickHouse:masterfrom
jh0x:jh-summap-typed

Conversation

@jh0x

@jh0x jh0x commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Improves #92093

Also fixes: https://fiddle.clickhouse.com/38de3c8b-8da9-4af0-810f-223b3ec395fa

Changelog category (leave one):

  • Performance Improvement

Changelog entry:

Improved performance of sumMap, sumMapWithOverflow, sumMapFiltered, minMap, and maxMap by using typed aggregation paths for supported key and value types.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Decide if to keep it or drop it. Dropping it would change the serialized format - by entries being unsorted.

@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label Jun 15, 2026
@clickhouse-gh

clickhouse-gh Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [4241774]

Summary:

job_name test_name status info comment
AST fuzzer (amd_debug, targeted) FAIL
Logical error: Different order of columns in UNION subquery: A and B (STID: 6860-66f4) FAIL cidb
Stress test (arm_release) FAIL
Logical error: Block structure mismatch in A stream: different columns: (STID: 0993-300c) FAIL cidb

AI Review

Summary

This PR replaces the Field-based hot path for sumMap, sumMapWithOverflow, sumMapFiltered, minMap, and maxMap with typed aggregation for supported key and value types, while keeping the generic fallback for unsupported and filter-sensitive cases. It also adds aggregate-state versioning, deterministic serialization order, and broad stateless plus performance coverage for numeric, string, decimal, float, nullable, filtered, and parallel-merge cases.

I did not find unresolved correctness, compatibility, or safety issues in the current diff. I also rechecked the prior bot findings against the current code; they have been addressed, and the remaining stale bot thread was resolved.

Missing context / blind spots

I did not run the new stateless or performance tests locally in this environment. Also, .git is read-only here, so I could not fetch and check out the PR head directly; I reviewed the GitHub PR diff for head 424177405a02353b3e9899184424ff679633212b together with the local PR merge checkout.

Final Verdict

Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-performance Pull request with some performance improvements label Jun 15, 2026
Comment thread src/AggregateFunctions/AggregateFunctionSumMapTyped.h Outdated
Comment thread src/AggregateFunctions/AggregateFunctionSumMapTyped.h
Comment thread src/AggregateFunctions/AggregateFunctionSumMapTyped.h Outdated
Comment thread src/AggregateFunctions/AggregateFunctionSumMap.cpp
@clickhouse-gh

clickhouse-gh Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

📊 Cloud Performance Report

✅ AI verdict: no_change — no significant changes across 38 queries analysed

This PR adds a typed fast-path for the sumMap/minMap/maxMap/sumMapFiltered aggregate family (plus a BFloat16 quiet_NaN helper); the private and public diffs are identical. None of the flagged queries — ClickBench Q22/Q23 and TPC-H Q4/Q7/Q20 — invoke those aggregate functions, so there is no plausible mechanism for this change to speed them up by 22-49%. The base-side measurements were noisy (CV up to ~88%) and several of these queries are flagged flaky or trending, consistent with run-to-run variance rather than a real PR effect. All five flagged improvements have been downgraded to not-sure.

clickbench

⚠️ 2 inconclusive

Flagged queries (2 of 43)
Query Verdict Baseline med (ms) PR med (ms) Change q-value Hint
⚠️ 22 not_sure 334 260 -22.0% <0.0001 Q22 doesn't call the sumMap/minMap/maxMap family this PR touches; base measurements very noisy (CV ~55%). Off-path.
⚠️ 23 not_sure 297 203 -31.6% <0.0001 Q23 doesn't use the sumMap aggregate family; PR only adds a typed fast-path there. No mechanism for a 32% SELECT speedup

q-value = BH-FDR adjusted p; smaller is stronger evidence. MIRAI flags a query when q < fdr_q (default 0.10) — the value the verdict is based on.

tpch_adapted_1_official

⚠️ 3 inconclusive

Flagged queries (3 of 22)
Query Verdict Baseline med (ms) PR med (ms) Change q-value Hint
⚠️ 4 not_sure 3015 2244 -25.6% 0.0003 TPC-H Q4 uses no sumMap/array aggregation; this PR is off-path. Trending history + 26% base CV point to variance.
⚠️ 7 not_sure 113 72 -36.3% <0.0001 TPC-H Q7 (joins+sum/agg) never calls the sumMap family this PR changes. Query flagged noisy — off-path delta.
⚠️ 20 not_sure 248 127 -48.8% <0.0001 TPC-H Q20 doesn't exercise the sumMap path; base CV ~88% and noisy history make the 49% delta variance.

q-value = BH-FDR adjusted p; smaller is stronger evidence. MIRAI flags a query when q < fdr_q (default 0.10) — the value the verdict is based on.

Debug info
  • StressHouse run: 0506e7ff-0662-4732-b8ac-a93ca972a551
  • MIRAI run: 20a63bc0-3aa8-4d65-90b0-f7c4bb39b5ea
  • PR check IDs:
    • clickbench_65762_1781818301
    • clickbench_65775_1781818301
    • clickbench_65782_1781818301
    • tpch_adapted_1_official_65792_1781818301
    • tpch_adapted_1_official_65800_1781818301
    • tpch_adapted_1_official_65862_1781818301

Comment thread src/AggregateFunctions/AggregateFunctionSumMapTyped.h
Comment thread src/AggregateFunctions/AggregateFunctionSumMap.cpp Outdated
Comment thread src/AggregateFunctions/AggregateFunctionSumMapTyped.h
Comment thread src/AggregateFunctions/AggregateFunctionSumMapTyped.h Outdated
@clickhouse-gh

clickhouse-gh Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.20% 85.30% +0.10%
Functions 92.60% 92.70% +0.10%
Branches 77.50% 77.50% +0.00%

Changed lines: Changed C/C++ lines covered by tests: 755/844 (89.45%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 58 line(s) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-performance Pull request with some performance improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants