Add `eval` table function by niyue · Pull Request #104396 · ClickHouse/ClickHouse · GitHub
Skip to content

Add eval table function#104396

Open
niyue wants to merge 50 commits into
ClickHouse:masterfrom
niyue:feat/eval-table-func
Open

Add eval table function#104396
niyue wants to merge 50 commits into
ClickHouse:masterfrom
niyue:feat/eval-table-func

Conversation

@niyue

@niyue niyue commented May 8, 2026

Copy link
Copy Markdown
Contributor

Description

  • This PR addresses Table function eval #101293
  • Add a table function eval for constructing a query from a constant expression or another SELECT query.

Changelog category (leave one):

  • Experimental Feature

Changelog entry:

  • Add experimental table function eval, which evaluates a constant expression or single-row SELECT query to a query string and executes the resulting single SELECT query. The feature is disabled by default and can be enabled with setting allow_experimental_eval_table_function.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@niyue niyue mentioned this pull request May 8, 2026
@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label May 8, 2026
@clickhouse-gh

clickhouse-gh Bot commented May 8, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-experimental Experimental Feature label May 8, 2026
Comment thread src/TableFunctions/TableFunctionEval.cpp Outdated

@alexey-milovidov alexey-milovidov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest LGTM

@alexey-milovidov alexey-milovidov self-assigned this May 8, 2026
@niyue niyue force-pushed the feat/eval-table-func branch 2 times, most recently from 962cf52 to 003efb1 Compare May 9, 2026 02:07
@niyue

niyue commented May 9, 2026

Copy link
Copy Markdown
Contributor Author

Hi @alexey-milovidov,

I've addressed the review comments above. The CI is still failing, but the remaining failed tests appear to be unrelated to this change. Could you please take another look when you have a moment? Thanks.

@niyue niyue force-pushed the feat/eval-table-func branch from 003efb1 to 5f730c6 Compare May 9, 2026 13:16
niyue added 3 commits May 16, 2026 23:20
…n or a subquery returning single string row.
Run the normal set-operation normalization visitors on the generated `SELECT` before passing it to `StorageView`.

This keeps generated `UNION ALL`, `UNION DISTINCT`, `INTERSECT`, and `EXCEPT` queries on the same path as regular queries and avoids planner exceptions from `UNION_DEFAULT`.
…ntalSettings`, add `eval` to the aspell dictionary, and make nested `eval` detection use a case-sensitive comparison.
@niyue niyue force-pushed the feat/eval-table-func branch from 5f730c6 to 86e74bb Compare May 16, 2026 15:24
Allow `StorageView` to assign a temporary table identifier for the `eval` table function before replacing it with the generated subquery. This avoids a `LOGICAL_ERROR` from `StorageView::replaceWithSubquery` for queries such as `SELECT * FROM eval(...)`.
Comment thread src/TableFunctions/TableFunctionEval.cpp Outdated
Avoid resolving nested `eval` table function arguments on the initiator when they are used under `remote` or `cluster`. Like `view`, `eval` can generate a `SELECT` that references tables available only on shards.

Reuse the `view` distributed rewrite path for table functions that expose an inner `SELECT`, and add an integration regression test for a remote-only table.

Addresses reviewer feedback on ClickHouse#104396 (comment)
Comment thread src/Parsers/ExpressionListParsers.cpp
Permit `eval(SELECT ...)` in `QueryNormalizer` for the old analyzer path, matching the parser and `TableFunctionEval` behavior.

Add a stateless regression test with `enable_analyzer` disabled.

Addresses reviewer feedback on ClickHouse#104396 (comment)
Comment thread docs/en/sql-reference/table-functions/eval.md Outdated
Document that `eval` is normally analyzed on the initiator, but nested usage under `remote` or `cluster` is resolved on remote shards.

Addresses reviewer feedback on ClickHouse#104396 (comment)
@alexey-milovidov

Copy link
Copy Markdown
Member

This was fixed by #105146. Let's update the branch.

alexey-milovidov and others added 6 commits May 17, 2026 20:54
Drop the old `kql` table function layer left in `ExpressionListParsers.cpp` after merging `master`. The layer referenced removed `ParserKQLTableFunction` code and broke the parser build, while current `master` parses `kql` through the normal table function path.

Issue: ClickHouse#101293

Tests:
- `ninja -C build clickhouse_parsers`
- `git diff --check -- src/Parsers/ExpressionListParsers.cpp`
- `build/programs/clickhouse local --multiquery < tests/queries/0_stateless/04241_kql_array_index_no_oom.sql`
# Conflicts:
#	ci/jobs/scripts/check_style/aspell-ignore/en/aspell-dict.txt
alexey-milovidov and others added 2 commits June 18, 2026 07:23
The AST fuzzer found that an `eval` argument containing a bare `SELECT`
query node — e.g. a nested `eval(SELECT ...)` used as a scalar
sub-expression — made `evaluateConstantQueryText` call `evaluateConstantExpression`,
which in turn calls `IAST::getColumnName` on the bare `ASTSelectWithUnionQuery`.
That node has no column name, so it raised `Trying to get name of not a column:
SelectWithUnionQuery` (`LOGICAL_ERROR`).

Reject such arguments up front with a clear `BAD_ARGUMENTS` exception. A
properly wrapped scalar subquery (`ASTSubquery`) remains a valid part of a
constant expression and is left untouched.

CI report (AST fuzzer, amd_debug, targeted):
ClickHouse#104396

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`04210_eval_table_function` and `04324_eval_table_function_udf_name` were
added assuming the new analyzer is the default and failed in the
`old analyzer` CI job:

* `04210`: `SELECT count() FROM eval('SELECT number FROM numbers(3) SETTINGS
  limit = 1')` returned `3` instead of `1`, because the `limit` setting
  embedded in the generated query is only scoped to the inner query under the
  new analyzer. Pin that query to `enable_analyzer = 1`.
* `04324`: `SELECT eval(1) FORMAT TSVWithNames` produced the column name
  `plus(1, 1)` instead of `eval(1)`, and `eval('SELECT count() FROM
  numbers(eval(1))')` threw `BAD_ARGUMENTS` because the old analyzer does not
  fold the SQL UDF `eval` inside a table-function argument. Make the
  new-analyzer baseline explicit with `SET enable_analyzer = 1`; the
  old-analyzer column name is still checked by the explicit `SET
  enable_analyzer = 0` case.

Also add a regression case for the nested-argument `LOGICAL_ERROR` fixed in the
previous commit.

CI reports (Stateless tests, old analyzer):
ClickHouse#104396

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/Storages/StorageDistributed.cpp Outdated
niyue added 3 commits June 18, 2026 23:13
Expose `ITableFunction::hasShardSideResolvedQueryArguments` and use it in distributed query analysis and serialized-plan selection instead of hard-coded `view` and `eval` checks.

Mark `view`, `viewIfPermitted`, and `eval` as requiring shard-side query argument resolution, and add remote `viewIfPermitted` coverage.
Apply `ITableFunction::hasShardSideResolvedQueryArguments` in the normal analyzer path for nested table-function arguments. This keeps functions such as `viewIfPermitted` from resolving shard-only subqueries on the initiator when used under `remote`.

Add a regression test for `remote` with `viewIfPermitted` and the analyzer enabled.
Comment thread src/TableFunctions/TableFunctionEval.cpp
Reject generated `SELECT` queries with `FORMAT`, `INTO OUTFILE`, or output compression options in table function `eval`, because these top-level output options are not handled when the query is executed through `StorageView`.

Add stateless coverage for `FORMAT Null` and `INTO OUTFILE` generated queries.
Comment thread src/Analyzer/Resolve/QueryAnalyzer.cpp Outdated
Comment thread src/TableFunctions/TableFunctionEval.cpp Outdated
niyue and others added 4 commits June 23, 2026 17:36
Apply `SETTINGS` from the query generated by `eval` while inferring
the schema and executing the backing `StorageView`.
Detect table functions with shard-side resolved query arguments through
table-expression arguments, so wrappers such as `loop(eval(...))` keep
the generated query on the shard side.
…esHistory section

After merging `master`, the development version advanced from 26.6 to 26.7. The
`02995_new_settings_history` test compares against a v26.6.1 baseline and only
treats a setting as documented when its `SettingsChangesHistory` entry has a
version strictly greater than 26.6. The `allow_experimental_eval_table_function`
entry was in the 26.6 section, so the test would report
`PLEASE ADD THE NEW SETTING TO SettingsChangesHistory.cpp`. Move it to the 26.7
section to reflect the version it is actually introduced in.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Merged the latest master into the branch (it was ~3000 commits / 8 days behind) and fixed the one post-merge issue.

Post-merge fix: the development version advanced from 26.6 to 26.7 while this branch was open, and 02995_new_settings_history compares against a v26.6.1 baseline — it only treats a setting as documented when its SettingsChangesHistory entry has a version strictly greater than 26.6. The allow_experimental_eval_table_function entry was in the 26.6 section, so the test would have reported PLEASE ADD THE NEW SETTING TO SettingsChangesHistory.cpp. Moved it to the 26.7 section.

The merge itself was conflict-free. I verified all 12 changed translation units (including the new TableFunctionEval.cpp) still pass -fsyntax-only against a current master build, so there is no API drift from the 8 days of master movement.

Previous CI failures were all unrelated to this PR:

  • Integration tests (amd_asan_ubsan, db disk, old analyzer, 5/6): the job was externally canceled (The operation was canceled.) after a ~43-minute hang in an unrelated test. All test_eval_table_function cases passed before the hang.
  • Stress test (amd_tsan) / Hung check failed, possible deadlock found: chronic flake #107941 (1326 hits across 539 PRs incl. master in the last 20 days).
  • AST fuzzer (amd_debug) / Block structure mismatch in A stream (STID 0993-27f0): chronic flake #108142 (205 hits across 151 PRs incl. master).

CI will re-run on the updated branch.

Comment thread src/Analyzer/Resolve/QueryAnalyzer.cpp
The shard-side wrapper skip path in `QueryAnalyzer::resolveTableFunction`
deferred a wrapper such as `loop(eval(q))` to the remote shard without first
resolving the inner `eval` table function's non-query expression arguments.

The direct case `remote(..., eval(q))` already resolves `q` on the initiator
before deferring, because the outer `WITH` alias list is stripped when the
table function AST is sent to the shard. But the wrapper case
`remote(..., loop(eval(q)))` took the `hasShardSideResolvedTableFunctionArguments`
branch and shipped `eval(q)` to the shard with `q` still unresolved, so the
shard raised `UNKNOWN_IDENTIFIER` even though direct `eval(q)` and literal
`loop(eval('...'))` are both supported.

Recursively resolve the non-query expression arguments of any `eval` nested
inside shard-side wrappers before taking the skip path, descending through the
table-expression arguments of the wrapper (so `loop(loop(eval(q)))` is handled
too). Add focused distributed regressions for `remote('remote', loop(eval(q)))`
with and without `serialize_query_plan = 1`.

Addresses the AI review finding on `src/Analyzer/Resolve/QueryAnalyzer.cpp`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Addressed the AI review "Request changes" finding on `src/Analyzer/Resolve/QueryAnalyzer.cpp` in 82cc5e9.

Problem: the shard-side wrapper skip path deferred a wrapper such as `loop(eval(q))` to the remote shard without first resolving the inner `eval`'s non-query expression arguments. The direct case `remote(..., eval(q))` already resolves `q` on the initiator (the outer `WITH` alias list is stripped when the AST is shipped to the shard), but `remote(..., loop(eval(q)))` took the `hasShardSideResolvedTableFunctionArguments` branch and sent `eval(q)` to the shard with `q` unresolved → `UNKNOWN_IDENTIFIER`.

Fix: before taking the wrapper skip path, recursively resolve the non-query expression arguments of any `eval` nested inside the wrapper, descending through the wrapper's table-expression arguments (so `loop(loop(eval(q)))` is covered too).

Tests: added `test_remote_loop_eval_resolves_with_alias_argument_on_initiator` and its `serialize_query_plan = 1` variant to `test_eval_table_function`.

Verification (local debug build, single node acting as both initiator and shard via `remote('127.0.0.1:9100', ...)`):

  • `loop(eval(q))`, `loop(eval(q))` + `serialize_query_plan = 1`, `loop(eval(a || b))`, and `loop(loop(eval(q)))` all return the expected row (previously `UNKNOWN_IDENTIFIER`).
  • Direct `eval(q)` / `eval('literal')` / `loop(eval('literal'))` baselines still return the expected row.
  • Stateless tests `04210_eval_table_function`, `04241_eval_table_function_old_analyzer`, `04324_eval_table_function_udf_name` pass against the built binary.

CI: the only red was `Stress test (amd_tsan)`/`(arm_tsan)` → "Hung check failed, possible deadlock found", which the CI summary itself links to the open master-wide flake #107941; all builds and the rest of the suite were green. A fresh CI run is triggered by this push.

Comment thread src/Storages/StorageView.cpp
…ce and execution

The `eval` table function builds its inner `StorageView` from a generated query
that may carry its own `SETTINGS enable_analyzer = ...`. The generated query's
structure is inferred in `TableFunctionEval::getActualTableStructure` using that
inner context (settings copied from the generated query), but
`StorageView::readImpl` selected the interpreter (`InterpreterSelectQueryAnalyzer`
vs `InterpreterSelectWithUnionQuery`) from the outer query's context.

For a generated `SELECT ... SETTINGS enable_analyzer = 0` under an
analyzer-enabled outer query, the structure was therefore inferred with the old
analyzer while execution used the new analyzer. The two disagree on, for
example, scalar subquery column names (`_subquery1` for the old analyzer vs
`_subquery_1` for the new one), so the executed result could not be converted to
the inferred structure.

Build the view's execution context first and select the analyzer mode from it.
For ordinary views the inner and outer settings are the same, so behaviour is
unchanged; for `eval` the execution path now matches the schema-inference path.

Added regression `04491_eval_table_function_analyzer_mode`.

Addresses the AI review "Request changes" finding on `src/Storages/StorageView.cpp`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Addressed the AI review "Request changes" finding on src/Storages/StorageView.cpp in a7c5299.

Problem: eval builds its inner StorageView from a generated query that may carry its own SETTINGS enable_analyzer = .... The structure is inferred in TableFunctionEval::getActualTableStructure using that inner context, but StorageView::readImpl selected the interpreter (InterpreterSelectQueryAnalyzer vs InterpreterSelectWithUnionQuery) from the outer query's context. So a generated SELECT ... SETTINGS enable_analyzer = 0 under an analyzer-enabled outer query was inferred with the old analyzer but executed with the new one — and the two disagree on, for example, scalar subquery column names (_subquery1 vs _subquery_1), so the executed result could not be converted to the inferred structure.

Fix: build the view's execution context first and select the analyzer mode from view_context->getSettingsRef()[allow_experimental_analyzer]. For eval this is the generated query's context — the same one schema inference uses — so the two paths now agree. For ordinary views the inner and outer settings are identical, so behaviour is unchanged.

This is the analyzer-enabled outer path only. Under the old analyzer an eval view is inlined via StorageView::replaceWithSubquery, so readImpl is not reached and the inverse direction (outer disables the analyzer, generated enables it) is handled by the inlining path, not here.

Test: added 04491_eval_table_function_analyzer_mode, which exercises a generated SETTINGS enable_analyzer = 0 under an analyzer-enabled outer query (returns 42; before the fix the converting step failed because of the _subquery1/_subquery_1 mismatch).

The one CI red on 82cc5e9 was Unit tests (msan, function_prop_fuzzer) / FunctionsStress.stress, which failed on SELECT reinterpret(a, CAST('Decimal(18, 7)' AS String)) ... with function returned column of unexpected type or scale: Decimal64 instead of Decimal(18, 7). That is an unrelated reinterpret-to-Decimal-scale regression on master (introduced by merged #108551), with a fix already in flight in #108878 — it has nothing to do with the eval table function. A fresh CI run is triggered by this push.

if (!function || !function->arguments || function->arguments->children.size() <= 1)
return {};

const auto * table_function_argument = function->arguments->children[1]->as<ASTFunction>();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only marks positional remote('addr', eval(...)) / cluster('cluster', eval(...)) arguments as table-expression arguments, but parseArguments also supports the named-collection override form remote(collection, database=<table_function>) and already treats database=merge(...) as a table function (see the existing 02046_remote_table_function_named_collections coverage).

In that syntax argument 1 is the equals(database, ...) wrapper, so this method returns {}. The analyzer then misses the new shard-side handling for eval / view under remote, and non-query eval arguments are not resolved before the AST is shipped to the shard. A query like WITH 'SELECT x FROM remote_only_eval_table' AS q SELECT * FROM remote(remote1, database=eval(q)) SETTINGS allow_experimental_eval_table_function = 1, enable_analyzer = 1 can still send eval(q) to the shard without the outer WITH binding and fail with UNKNOWN_IDENTIFIER, even though the positional remote(..., eval(q)) case is fixed.

Please extend the metadata/analyzer path to handle the named override table-function value as well, and add a regression for remote(named_collection, database=eval(q)) (and ideally the serialize_query_plan = 1 variant).

…c scalar subquery name)

The regression added in the previous commit used an unaliased scalar subquery
(`SELECT (SELECT 42)`) as the generated query's output column. Under the old
analyzer the column of an unaliased subquery is named `_subquery<N>` from a
global `static std::atomic_uint64_t subquery_index` in `QueryAliasesVisitor`,
so the name produced while inferring the structure in
`TableFunctionEval::getActualTableStructure` and the name produced again while
executing the inner query in `StorageView::readImpl` are different
(`_subquery102` vs another counter value). The two never match, so the test
failed with `Unknown column: _subquery102, there are only columns .`
(`UNKNOWN_IDENTIFIER`).

This is not specific to `eval`: a plain `CREATE VIEW v AS SELECT (SELECT 42)`
fails the same way under both analyzers, because view-structure inference and
view read are separate interpreter runs and the subquery counter advances
between them. It is therefore the wrong construct to test analyzer-mode
consistency with.

Replace it with a deterministic divergence that still exercises the
`StorageView::readImpl` analyzer-mode fix: `SELECT 1 AS a, abs(a)`. The old
analyzer substitutes the alias into the auto-generated name of the second
column (`abs(1)`), while the new analyzer keeps the alias (`abs(a)`); both
names are stable across runs (no global counter). With the generated query
pinning `enable_analyzer = 0` under an analyzer-enabled outer query, the
structure is inferred as `abs(1)`, so execution must use the old analyzer too.
Without the fix `readImpl` would re-analyze with the new analyzer, produce
`abs(a)`, and fail to resolve the inferred `abs(1)` column.

CI report (Fast test):
https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104396&sha=a7c529941d71bfda429053eafb90483df6ba1f2a&name_0=PR&name_1=Fast%20test
PR: ClickHouse#104396

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Fixed the Fast test failure on 04491_eval_table_function_analyzer_mode in dfb4ea6.

Root cause: the regression I added in the previous commit used an unaliased scalar subquery (SELECT (SELECT 42)) as the generated query's output column. Under the old analyzer the column of an unaliased subquery is named _subquery<N> from a global static std::atomic_uint64_t subquery_index in QueryAliasesVisitor, so the name produced while inferring the structure in TableFunctionEval::getActualTableStructure and the name produced again while executing the inner query in StorageView::readImpl are different counter values (_subquery102 vs another). They never match, so the test failed with Unknown column: _subquery102, there are only columns . (UNKNOWN_IDENTIFIER). This is not specific to eval — a plain CREATE VIEW v AS SELECT (SELECT 42) fails identically under both analyzers, because view-structure inference and view read are separate interpreter runs and the subquery counter advances between them. It was simply the wrong construct to test analyzer-mode consistency with.

Fix: replaced it with a deterministic divergence that still exercises the StorageView::readImpl analyzer-mode selection: SELECT 1 AS a, abs(a). The old analyzer substitutes the alias into the auto-generated name of the second column (abs(1)), while the new analyzer keeps the alias (abs(a)); both names are stable across runs. With the generated query pinning enable_analyzer = 0 under an analyzer-enabled outer query, the structure is inferred as abs(1), so execution must use the old analyzer too — without the readImpl fix it would re-analyze with the new analyzer, produce abs(a), and fail to resolve the inferred abs(1) column. Verified the behavioral facts on a current binary through the equivalent StorageView read path (create-old/read-new fails with UNKNOWN_IDENTIFIER, create-old/read-old works) and that the abs(1) name is stable across repeated runs.

Remaining AI-review finding (remote(collection, database=eval(q))): this is a real but niche gap. TableFunctionRemote::getTableExpressionArgumentIndexes only reports a direct second argument, so for the named-collection override form argument 1 is equals(database, eval(q)) and the method returns {}. Returning {1} alone would not fix it: the consumers (hasShardSideResolvedTableFunctionArguments, checkNoNestedEval, and the shard-side resolution in QueryAnalyzer.cpp) call …->as<ASTFunction>() on the indexed argument and would see equals, not a table function, so they also need to descend into the database=/db= wrapper to find the table-function value. As the review notes, this needs a representation richer than a flat argument index across the interface and all of its consumers — a cross-cutting change to the distributed-analyzer path. Left for a maintainer decision (and a follow-up with the remote(named_collection, database=eval(q)) regression, including the serialize_query_plan = 1 variant), since it touches several analyzer-critical translation units that need a full build to verify.

Comment thread src/Interpreters/ApplyWithSubqueryVisitor.cpp Outdated
`ApplyWithSubqueryVisitor` substituted `WITH` literal aliases into the
arguments of every `ASTFunction` named `eval`, keyed only by the function
name. Under the old analyzer this visitor runs before `QueryNormalizer`
applies `prefer_column_name_to_alias`, so a scalar SQL UDF named `eval`
used in an expression context was rewritten too: with a source column `q`,
a UDF `CREATE FUNCTION eval AS x -> x + 1`, and
`WITH 10 AS q SELECT eval(q) FROM t SETTINGS enable_analyzer = 0,
prefer_column_name_to_alias = 1`, the argument became `eval(10)` (result
`11`) instead of using the source column (`eval(1)`, result `2`).

The substitution is now driven from the table-function position
(`ASTTableExpression::table_function`) via
`substituteWithLiteralsInEvalTableFunction`, which descends into nested
table functions (e.g. `loop(eval(q))`) but stops at subqueries, so only an
actual `eval` table-function occurrence is rewritten. A scalar SQL UDF with
the same name in an ordinary expression context is left for
`QueryNormalizer`, which honours `prefer_column_name_to_alias`.

Added `04492_eval_scalar_udf_prefer_column` covering the scalar SQL UDF
named `eval` with `prefer_column_name_to_alias` under both analyzers, and
asserting that the `eval` table function still substitutes `WITH` aliases.

Addresses the AI review finding on
`src/Interpreters/ApplyWithSubqueryVisitor.cpp`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Addressed the AI review "Request changes" finding on src/Interpreters/ApplyWithSubqueryVisitor.cpp in eae8b08.

Problem: under the old analyzer, ApplyWithSubqueryVisitor substituted WITH literal aliases into the arguments of every ASTFunction named eval, keyed only by the function name. Because this visitor runs before QueryNormalizer applies prefer_column_name_to_alias, a scalar SQL UDF named eval used in an expression context was rewritten too. With a source column q, CREATE FUNCTION eval AS x -> x + 1, and WITH 10 AS q SELECT eval(q) FROM t SETTINGS enable_analyzer = 0, prefer_column_name_to_alias = 1, the argument was rewritten to eval(10) (result 11) instead of using the source column eval(1) (result 2).

Fix: the substitution is now driven from the table-function position (ASTTableExpression::table_function) via the new substituteWithLiteralsInEvalTableFunction, which descends into nested table functions (e.g. loop(eval(q))) but stops at subqueries, so only an actual eval table-function occurrence is rewritten. A scalar SQL UDF with the same name in an ordinary expression context is left to QueryNormalizer, which honours prefer_column_name_to_alias.

Test: added 04492_eval_scalar_udf_prefer_column (old-analyzer regression for a scalar SQL UDF named eval with prefer_column_name_to_alias = 1, plus the new-analyzer mirror and an assertion that the eval table function still substitutes WITH aliases under both analyzers).

Verification (local Debug build): the six covered queries return 2 / 11 / 2 / 11 / 42 / 43 as expected (the regression returned 11 instead of 2 before the fix), and the existing 04210 WITH-alias / 04241 old-analyzer / 04324 scalar-UDF eval cases all still pass.

The remaining AI-review finding (remote(named_collection, database=eval(q))) is unchanged: as noted above it needs a richer table-expression-argument representation across TableFunctionRemote and all of its distributed-analyzer consumers, and is left for a maintainer decision plus a follow-up. A fresh CI run is triggered by this push.

@alexey-milovidov

Copy link
Copy Markdown
Member

Merged master into the branch to resolve the merge conflict — the branch had fallen ~1216 commits behind and GitHub marked it CONFLICTING/DIRTY, which blocks merging.

This is a pure master merge with no code changes: the net diff versus master is still exactly the eval feature (36 files, all eval sources/tests/docs), and the setting allow_experimental_eval_table_function remains in the 26.7 section of SettingsChangesHistory (dev version is still 26.7, so 02995_new_settings_history stays green). I verified all 17 changed translation units still compile against the merged tree with -fsyntax-only — no API drift from the master churn.

The two Stress test (amd_tsan/arm_tsan) failures on the previous run were the chronic Hung-check flake (#107941), not related to this PR. Fresh CI is now running on the up-to-date base.

@clickhouse-gh

clickhouse-gh Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.50% 85.50% +0.00%
Functions 92.60% 92.60% +0.00%
Branches 77.60% 77.70% +0.10%

Changed lines: Changed C/C++ lines covered: 450/481 (93.56%) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-experimental Experimental Feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants