Make server-side AST fuzzer respect KILL, timeout and shutdown by Algunenano · Pull Request #107640 · ClickHouse/ClickHouse · GitHub
Skip to content

Make server-side AST fuzzer respect KILL, timeout and shutdown#107640

Merged
Algunenano merged 2 commits into
masterfrom
fuzzer-respect-kill-shutdown
Jun 18, 2026
Merged

Make server-side AST fuzzer respect KILL, timeout and shutdown#107640
Algunenano merged 2 commits into
masterfrom
fuzzer-respect-kill-shutdown

Conversation

@Algunenano

Copy link
Copy Markdown
Member

The server-side AST fuzzer (ast_fuzzer_runs) runs as a query finish callback, after the outer query's pipeline executor has stopped enforcing limits. The fuzzing loop never consulted the outer query status, so it ignored a KILL QUERY, the outer max_execution_time, and server shutdown (which cancels all running queries). The outer query then kept spawning fuzzed queries and lingered in the processlist, which can trigger a false-positive stress test hung check.

Now QueryStatus::checkTimeLimitSoft is checked at the top of each fuzzing iteration. It returns false without throwing on a kill, on the outer query's time limit, or on shutdown (ProcessList::killAllQueries marks every query killed), so the loop stops promptly.

Observed in the stress test hung check of an unrelated PR (#107475):
https://s3.amazonaws.com/clickhouse-test-reports/PRs/107475/6271a231daa9d7adfda7f2c725b71ef672a3feb1/stress_test_arm_asan_ubsan_s3/hung_check.log

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

...

`executeASTFuzzerQueries` runs as a query finish callback, after the
outer query's pipeline executor has stopped enforcing limits. Its
fuzzing loop never consulted the outer query status, so it ignored a
`KILL QUERY`, the outer `max_execution_time`, and server shutdown
(which cancels all queries). The query then lingered in the
processlist while spawning more fuzzed queries, which can trip the
stress test hung check.

Check `QueryStatus::checkTimeLimitSoft` at the top of each iteration.
It returns false without throwing on a kill, on the outer query's time
limit, and on shutdown (`ProcessList::killAllQueries` marks every query
killed), so the loop stops promptly.

Observed in the stress test hung check on an unrelated PR:
https://s3.amazonaws.com/clickhouse-test-reports/PRs/107475/6271a231daa9d7adfda7f2c725b71ef672a3feb1/stress_test_arm_asan_ubsan_s3/hung_check.log

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@clickhouse-gh

clickhouse-gh Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-ci label Jun 16, 2026
Comment thread src/Interpreters/executeQuery.cpp
Some fuzzable queries (e.g. `SHOW PROCESSLIST`) are not inserted into
the `ProcessList`, so the outer `QueryStatus` is null and the
`checkTimeLimitSoft` guard is a no-op. Such a query with a large
`ast_fuzzer_runs` could keep spawning fuzzed queries through server
shutdown.

Check the `IsServerShuttingDown` metric at the top of each iteration.
It is set at the very start of shutdown, before queries are killed, and
does not depend on the outer query having a process-list owner.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/Interpreters/executeQuery.cpp
@clickhouse-gh

clickhouse-gh Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.20% 85.20% +0.00%
Functions 92.30% 92.30% +0.00%
Branches 77.40% 77.50% +0.10%

Changed lines: Changed C/C++ lines covered by tests: 8/12 (66.67%) | Lost baseline coverage: none · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-ci pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants