Add --chime option for ASCII BEL on slow query finish (issue #92718)#104545
Conversation
…ouse#92718) Implements the chime feature requested in ClickHouse#92718: an optional command-line flag that emits the ASCII `BEL` control character (`\x07`) on stderr after a query finishes when it ran for at least `--chime N` seconds. This lets a user attending to other work be alerted when a long-running query completes; whether the terminal makes a sound or a visual flash is decided by the terminal's own user preferences. The option is wired in `ClientBase::addCommonOptions`, so it works in `clickhouse-client`, `clickhouse-local`, and the embedded client as @alexey-milovidov requested. Usage: --chime enable with the default 10-second threshold --chime N enable with a custom threshold of N seconds (omitted) or 0 disabled (no BEL is emitted) The BEL is emitted at the end of `processParsedSingleQuery`, so it fires on both success and error paths. Test: `tests/queries/0_stateless/04219_client_chime_on_slow_query_92718.sh` covers the five interesting cases — `--chime 1` with slow query (BEL), `--chime 10` with fast query (no BEL), no `--chime` (no BEL), `--chime 1` with slow query that errors (BEL), and the same in `clickhouse-local`. Closes ClickHouse#92718.
|
cc @alexey-milovidov @azat — could you review this? It adds the |
|
Workflow [PR], commit [79c181b] Summary: ✅
AI ReviewSummaryThis PR adds a default-on Missing context / blind spots
Final Verdict✅ No blockers or majors found. No new inline review comments were posted. |
Change the `--chime` option on `clickhouse-client`, `clickhouse-local`, and the embedded client from `implicit_value(10)->default_value(0)` to `implicit_value(5)->default_value(5)`, per @alexey-milovidov's review on PR ClickHouse#104545: "Let's set both implicit and default values to 5 seconds." New semantics: * omit `--chime` -> 5-second threshold (chime enabled by default) * `--chime` -> 5-second threshold (implicit value) * `--chime N` -> N-second threshold (custom) * `--chime 0` -> disabled (no BEL emitted) Also update the option's help text accordingly and extend the regression test (`tests/queries/0_stateless/04219_client_chime_on_slow_query_92718.sh`) with a new case 6 that verifies `--chime 0` explicitly disables the chime even on a query above the default threshold. Case 3's description is updated to reflect that "no `--chime`" with a 1.5s query stays silent because 1.5s is still below the 5s default, not because chime is off. Verified locally: * `--chime 1` + `sleep(1.5)` -> BEL emitted on stderr * (no flag) + `sleep(1.5)` (< default) -> no BEL * (no flag) + `sleepEachRow(2) x 3 = 6s` -> BEL emitted (default-on) * `--chime 0` + `sleep(1.5)` -> no BEL (explicit disable) * full regression test 04219 passes (8.65s). Related thread: ClickHouse#104545 (comment)
In case ClickHouse#4 the previous assertion only checked for `BEL` on stderr; it never verified that the query actually failed. If `throwIf` semantics change or the query is rewritten so it no longer errors, case ClickHouse#4 could silently keep printing `BEL` (since the query runs longer than the threshold) and the test would still pass — losing the "chime on failure" coverage this case was added to provide. Now case ClickHouse#4 explicitly: 1. asserts a non-zero exit code (query must have failed), 2. asserts that `'expected error'` is present in stderr (the error came from `throwIf` and not from an unrelated client/network issue), 3. only then checks for `BEL`. Any of the first two conditions failing prints a diagnostic "FAIL: ..." line which won't match the reference, so the test fails loudly with a clear reason. Issue: ClickHouse#92718 PR: ClickHouse#104545 Review thread: ClickHouse#104545 (comment)
The `--chime` option emits ASCII `BEL` (`\x07`) to stderr when a query runs longer than the threshold. With the default-on behaviour at 5 seconds, this writes BEL bytes into every stderr stream — including captured ones. The result is 10+ Fast test failures (e.g. `00156_max_execution_speed_sample_merge`, `02703_max_local_write_bandwidth`, `03217_primary_index_memory_leak`) where `clickhouse-test` redirects stderr and rejects any non-empty content (`Reason: having stderror`). Add a `stderr_is_a_tty` guard at the emission site in `ClientBase`. BEL now only fires when stderr is attached to a terminal — which is the only context where a chime is meaningful anyway. Redirected stderr (file, pipe, automation) is left untouched, so captured stderr stays clean. The interactive default-on behaviour requested in the review is preserved. Strengthen test `04219_client_chime_on_slow_query_92718.sh`: - Cases 1, 4, 5: verify the TTY guard suppresses BEL when stderr is redirected to a file — exercises the fix directly. - Cases 7, 8: run the client under `script -qc` to attach stderr to a pseudo-terminal and verify BEL still fires on a TTY (positive case) and that `--chime 0` overrides it on a TTY (explicit disable). - Case 4 still asserts the error path actually fires (non-zero exit code plus `expected error` in stderr) before checking BEL. Verified locally: - `04219_client_chime_on_slow_query_92718`: all 8 cases pass. - `00156_max_execution_speed_sample_merge`: passes (45s query, stderr empty). - `02703_max_local_write_bandwidth`: passes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fix: suppress BEL when stderr is not a TTY (commit
|
| # | Scenario | Stderr | Expectation |
|---|---|---|---|
| 1 | --chime 1 + slow query |
redirected to file | no BEL (TTY guard suppresses) |
| 2 | --chime 10 + fast query |
redirected | no BEL (below threshold) |
| 3 | no flag + 1.5s query | redirected | no BEL (below default 5s) |
| 4 | --chime 1 + slow query then error |
redirected | no BEL + rc!=0 + 'expected error' present |
| 5 | clickhouse-local --chime 1 + slow |
redirected | no BEL (same suppression cross-binary) |
| 6 | --chime 0 + slow |
redirected | no BEL (explicit disable) |
| 7 | --chime 1 + slow under script -qc |
pty | BEL (positive TTY case) |
| 8 | --chime 0 + slow under script -qc |
pty | no BEL (explicit disable wins on TTY) |
Cases 7 and 8 run the client under script -qc, which allocates a pseudo-terminal and writes the combined pty output to a file we then grep. That's the only way to verify the positive case from a stateless test.
Verification
04219_client_chime_on_slow_query_92718: all 8 cases pass.00156_max_execution_speed_sample_merge: passes (45s query, captured stderr empty).02703_max_local_write_bandwidth: passes.- Manual
clickhouse-client -q 'SELECT sleep(2),sleep(2),sleep(2)' 2>/tmp/err→ stderr file is 0 bytes (was contaminated with BEL pre-fix).
Pre-PR Validation Gate
a) Deterministic repro: yes — any query >5s with redirected stderr writes BEL pre-fix.
b) Root cause explained: yes — missing TTY precondition at the emission site.
c) Fix matches root cause: yes — adds the missing precondition; semantically correct for a "terminal bell".
d) Test intent preserved: yes, strengthened — the new test exercises both the TTY-only positive path and the non-TTY suppression path.
e) Both directions demonstrated: yes — affected Fast tests pass; positive TTY case still emits BEL via script -qc.
Status: pushed as a separate commit per the no-amend rule. Existing approvals on prior commits remain visible in the review history; reviewers are welcome to re-approve on the new commit.
Adds case 9 to `04219_client_chime_on_slow_query_92718.sh`: under a pseudo-terminal allocated by `/usr/bin/script -qc`, run `clickhouse-client --chime 10` with `SELECT sleep(0.1)`. Expects no `BEL` because the query finishes well under the 10-second threshold. This is the only path where `BEL` is actually emitted (cases 7-8 also run under a pty), so it is the only place where a regression of the threshold gate (`elapsedSeconds() >= chime_threshold_seconds`) would surface. Cases 1-6 cannot detect such a regression because the TTY guard suppresses `BEL` regardless of the threshold. Addresses inline review feedback on PR ClickHouse#104545.
…8-bel-chime-on-slow-query
|
@groeneai, investigate the failure: |
|
Confirmed — this is the chronic The CI hit on commit Fix status
The residual line-2417 hangs at ~1 hit/day across all builds in the last 48h are still under investigation — they look like a smaller leftover not addressed by #104694. I'm tracking it in the existing #104877 umbrella; no separate fix PR yet for the post-#104694 residual. I rebased PR #104545 onto current master ( |
|
This was fixed by #105146. Let's update the branch. |
clickhouse-gh[bot] inline review on 2026-05-17 flagged two remaining coverage gaps in `tests/queries/0_stateless/04219_client_chime_on_slow_query_92718.sh`: 1. Default-on threshold is unverified under `pty`. Cases 3 and 5 redirect stderr, so the TTY guard suppresses `BEL` and they cannot tell whether omitting `--chime` correctly picks up the default 5-second threshold. If `implicit_value` / `default_value` regressed to `0` (i.e. accidentally disabled by default) the existing cases would still print `no BEL`. 2. Error-path `BEL` is unverified under `pty`. Case 4 captures the error contract under redirected stderr where `BEL` is suppressed anyway, so a regression that emitted `BEL` only on success would still pass. Add two new TTY cases under `script -qc`: - Case 10: no `--chime` flag, slow query (~6s) via `SELECT sleepEachRow(2) FROM numbers(3) SETTINGS function_sleep_max_microseconds_per_block = 0`, expects `BEL`. Catches regressions in the `implicit_value` / `default_value` wiring. - Case 11: `--chime 1`, slow failing query `SELECT sleep(1.5), throwIf(1 = 1, 'expected error')` under `script -eqc` (propagates child exit code), asserts non-zero rc, `'expected error'` in the pty stream, and `BEL`. Catches regressions that drop `BEL` on the error path. Verification (debug build, 5/5 manual runs): - `clickhouse-test 04219_client_chime_on_slow_query_92718` → OK 20.42s Bot review IDs: 3254645106 (default-on coverage), 3254645298 (error-path coverage).
…8-bel-chime-on-slow-query
…8-bel-chime-on-slow-query
|
@groeneai, many tests have failed, and we have no other way than investigating and fixing all of them. |
master added several 04303_* and 04304_* stateless tests since the previous 04219 -> 04303 renumber, so the chime test collided again. Renumber it to the next free prefix 04312. Also address the reviewer request on the pty matrix: - Case 8 (`--chime 0` on a TTY) now runs a ~6s query (above the default 5s threshold) instead of a 1.5s one, so it genuinely distinguishes explicit disable from default-on: if `--chime 0` were ignored and the 5s default used, this case would emit `BEL`. - Add case 12 exercising `--chime` passed without a value (the `implicit_value` path, distinct from omitting the flag in case 10) on a slow query over a pty, which must emit `BEL`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Updated the branch (
Verified on a freshly built binary ( Both previously-open review threads are now resolved. The remaining CI failures ( |
Add a --chime [N] entry to the execution-details table in the client docs, covering the default 5 second threshold, the no-value form using the default threshold, TTY-only emission, and --chime 0 to disable. Requested by the docs review on PR ClickHouse#104545 (Issue ClickHouse#92718). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Align the client docs row with the option help text and the implementation (stderr_is_a_tty check). The bell is written to stderr, not stdout, so redirecting stdout does not suppress it while redirecting stderr does. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CI Status — covered SHA
|
| Check | Test | Disposition |
|---|---|---|
Upgrade check (amd_release) |
Error message in clickhouse-server.log (see upgrade_error_messages.txt) |
Chronic upgrade-check flake. The trigger line is a StorageKafka2 / Kafka-test 408 timeout — broker / infra noise, unrelated to the --chime client option. Tracked under task 2026-04-26-ci-p2-cannot-start-clickhouse-server--. |
Integration tests (amd_tsan, 1/6) |
test_refreshable_mat_view_replicated/test.py::test_circular_dependencies_survive_restart |
Chronic flaky — 14 PRs / 30d, 1 master. Tracked under task 2026-06-04-ci-p2-testcirculardependenciessurviv (P2). |
Mergeable Check / PR / CH Inc sync |
n/a | Aggregator/infra rollup. |
Code change is purely client-side (--chime option for ASCII BEL on slow-query finish); neither failing area touches client argument parsing or output formatting.
— cron:our-pr-ci-monitor:20260604-073000
|
@groeneai, update the branch to see if the upgrade check has been fixed (we had a plenty of fixes recently). |
|
@groeneai please look at the comment above ^ |
…8-bel-chime-on-slow-query Update branch to latest master per @alexey-milovidov directive to verify recent fixes resolve the chronic Upgrade check flake. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
@alexey-milovidov merged |
CI Status — covered SHA
|
| Check | Test | Disposition |
|---|---|---|
Stateless tests (arm_asan_ubsan, azure, parallel) |
04033_tpc_ds_q61 |
transient OOM, NOT PR-caused. MEMORY_LIMIT_EXCEEDED on TPC-DS Q61 join over web-disk-resident store_sales (RSS 53.70 GiB at 53.69 GiB cap). The "Failure reproducibility check: Failed 0 out of 3 reruns" line confirms it passed all 3 reruns — pure transient memory pressure on the runner. Heavy TPC-DS query path is completely unrelated to the --chime BEL client option this PR adds (only src/Client/ClientBase.cpp, docs/en/interfaces/client.md, and a stateless test were touched). |
PR is client-only — none of the failing code paths can be reached by this diff.
Session: cron:our-pr-ci-monitor:20260608-083000
|
@alexey-milovidov sent a separate PR per your directive: #106797. Disables |
|
Follow-up on the q61 OOM directive: I closed the separate PR (#106797) instead of disabling. rienath reviewed it and asked to find the regression rather than skip the test. Result: there is no q61 memory regression. q61 has 0 master failures across CIDB's full 90-day retention; its only memory-typed failure ever is the single 2026-06-08 OvercommitTracker OOM on this PR's run, which passed all 3 reruns. It was one casualty of a runner-wide memory event on So neither fix nor disable applies; q61 stays enabled. For this PR's shard flake, a rerun clears it. Full details on #106797. |
|
@groeneai, the root cause of the failure of |
…8-bel-chime-on-slow-query
|
@alexey-milovidov done. Merged latest master into the branch (clean, no conflicts); new HEAD is |
CI summary for
|
|
@groeneai, as the test for refreshable materialized views was fixed, update the branch. |
…8-bel-chime-on-slow-query
|
@alexey-milovidov branch updated to |
LLVM Coverage Report
Changed lines: Changed C/C++ lines covered by tests: 19/19 (100.00%) | Lost baseline coverage: none · Uncovered code |

Adds the
--chimecommand-line flag toClientBaseso thatclickhouse-client,clickhouse-local, and the embedded client emit the ASCIIBELcontrol character (\x07) on stderr when a query finishes after running for at least N seconds. Whether the terminal makes a sound or a visual flash is decided by the terminal's own user preferences — exactly the design @alexey-milovidov suggested in the issue.Motivation: when a user is waiting on a long-running query and works on something else, the terminal stays silent — the user has to keep checking. Reusing the standard
BELcontrol character lets every terminal emulator handle the alert according to the user's existing audio/visual preferences, without ClickHouse needing to bundle audio assets.Usage (default-on per reviewer request):
Behaviour:
BELis emitted at the end ofClientBase::processParsedSingleQuery, after the existingElapsed:/Processed rows:reporting.progress_indication.elapsedSeconds, which is reset per query, so each statement in a multi-statement script is timed independently.Test:
tests/queries/0_stateless/04219_client_chime_on_slow_query_92718.shcovers six cases —--chime 1with a slow query (BEL),--chime 10with a fast query (no BEL), no--chimeflag with a query below the default 5-second threshold (no BEL),--chime 1with a slow query that ends in an error (BEL),clickhouse-local --chime 1with a slow query (BEL) to confirm the wiring is inClientBaserather than the network-bound client only, and--chime 0with a slow query to confirm the explicit disable path (no BEL).Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Added the
--chimecommand-line option toclickhouse-client,clickhouse-local, and the embedded client. When a query finishes (on success or on error) after running for at least N seconds, the client writes the ASCIIBELcontrol character (\x07) to stderr. Terminals decide whether to make a sound or a visual flash based on the user's preferences. Enabled by default with a 5-second threshold; pass--chime Nfor a custom threshold or--chime 0to disable. Closes #92718.Documentation entry for user-facing changes
Note
Medium Risk
Changes default CLI behavior for interactive/TTY runs by emitting
\x07after slow queries, which could affect terminal output expectations. Logic is gated by a threshold and TTY detection, but still touches the query-completion path shared byclickhouse-clientandclickhouse-local.Overview
Adds a new
--chime[=N]option (default threshold 5s,0disables) that writes ASCIIBEL(\x07) to stderr when a query finishes after running longer than the configured threshold.The chime is emitted on both success and error paths, but only when stderr is a TTY to avoid contaminating redirected/captured stderr. Includes a new stateless test validating threshold behavior, TTY suppression, and both
clickhouse-clientandclickhouse-localcoverage.Reviewed by Cursor Bugbot for commit 80f543a. Bugbot is set up for automated code reviews on this repo. Configure here.