Self-deduplication for ActionsDAG by yariks5s · Pull Request #106113 · ClickHouse/ClickHouse · GitHub
Skip to content

Self-deduplication for ActionsDAG#106113

Merged
yariks5s merged 30 commits into
masterfrom
yarik/self-dedup
Jun 18, 2026
Merged

Self-deduplication for ActionsDAG#106113
yariks5s merged 30 commits into
masterfrom
yarik/self-dedup

Conversation

@yariks5s

@yariks5s yariks5s commented May 29, 2026

Copy link
Copy Markdown
Member

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Provides the logic to remove duplicate calculations during query execution in different scenarios.

Version info

  • Merged into: 26.6.1.982

@clickhouse-gh

clickhouse-gh Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-improvement Pull request with some product improvements label May 29, 2026
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread src/Processors/QueryPlan/Optimizations/mergeExpressions.cpp
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
@george-larionov george-larionov self-assigned this May 29, 2026
Comment thread tests/queries/0_stateless/04300_dedup_subtrees_on_filter_merge.sql Outdated
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread tests/queries/0_stateless/04300_dedup_subtrees_on_filter_merge.sql Outdated

@george-larionov george-larionov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me overall (I saw you fixed the final (hopefully) issue that the bot found). However I think it is worth it to add a performance test to show some perf gains for this improvement.

Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread tests/queries/0_stateless/04300_dedup_subtrees_on_filter_merge.sql Outdated
Comment thread tests/queries/0_stateless/04300_dedup_subtrees_on_filter_merge.sql Outdated
yariks5s and others added 2 commits May 30, 2026 00:32
Some functions consume their argument names (e.g. `formatRowNoNewline` puts
column names into JSON keys, `toTypeName` returns names). Looking through
ALIAS chains for those would falsely equate two calls over different aliases
of the same input.

Add an opt-in `isNameInsensitive()` on `IFunctionBase`. Default is false
(conservative). Overridden to true in `FunctionsComparison`, `FunctionsLogical`
and `FunctionBinaryArithmetic` since these depend only on argument values.

`buildStructuralEquivalenceClasses` now looks through aliases only when the
parent function declares `isNameInsensitive() = true`. Recovers the nested
SELECT optimization while keeping name-sensitive functions safe.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Resolved conflicts in favor of the new `isNameInsensitive` API while
keeping the remote's `query_plan_merge_filters=1` setting in the test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
COLUMN nodes with the same value but different names are not interchangeable
for name-sensitive parents (e.g. `formatRowNoNewline` uses argument names as
JSON keys). Defensive fix: include `result_name` in the COLUMN dedup key so
`SELECT 1 AS a, 1 AS b` stays as two distinct nodes.

Auto-named consts (like `0_UInt8` from `WHERE x > 0`) keep deduplicating as
before since their names also match.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Comment thread tests/queries/0_stateless/04300_dedup_subtrees_on_filter_merge.sql
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread src/Functions/IFunction.h
};

using FunctionOverloadResolverPtr = std::shared_ptr<IFunctionOverloadResolver>;
/// mutable beacuse it's set after construction by FunctionFactory, resolvers themselves are otherwise immutable

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an option, we can change buildImpl to return std::shared_ptr<IFunctionBase> (or, even better, std::unique_ptr), and move it into FunctionBasePtr = std::shared_ptrduringbuild`.

I found about 35 places by searching. So let's do it in post-refactoring if needed.

Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
Comment thread src/Interpreters/ActionsDAG.cpp
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
@KochetovNicolai KochetovNicolai self-assigned this Jun 16, 2026
Comment thread src/Interpreters/ActionsDAG.h Outdated
Comment thread src/Functions/IFunction.h
Comment thread src/Interpreters/ActionsDAG.cpp
Comment thread src/Interpreters/ActionsDAG.cpp Outdated
@clickhouse-gh

clickhouse-gh Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.30% 85.30% +0.00%
Functions 92.30% 92.30% +0.00%
Branches 77.50% 77.50% +0.00%

Changed lines: Changed C/C++ lines covered by tests: 206/213 (96.71%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 5 line(s) · Uncovered code

Full report · Diff report

@yariks5s yariks5s added this pull request to the merge queue Jun 18, 2026
Merged via the queue into master with commit 35f0bb2 Jun 18, 2026
484 of 487 checks passed
@yariks5s yariks5s deleted the yarik/self-dedup branch June 18, 2026 14:04
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jun 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants