`/schema`: fix UI issues and adopt `play.html`-style auth form by alexey-milovidov · Pull Request #108724 · ClickHouse/ClickHouse · GitHub
Skip to content

/schema: fix UI issues and adopt play.html-style auth form#108724

Merged
alexey-milovidov merged 19 commits into
masterfrom
fix-schema-handler-ui
Jul 3, 2026
Merged

/schema: fix UI issues and adopt play.html-style auth form#108724
alexey-milovidov merged 19 commits into
masterfrom
fix-schema-handler-ui

Conversation

@alexey-milovidov

Copy link
Copy Markdown
Member

Fixes a batch of user-reported usability issues in the /schema web UI (the schema visualizer served by the HTTP handler, e.g. on play.clickhouse.com), and aligns its connection/authentication form with play.html.

UI fixes:

  • Enter to load — pressing Enter in any connection field now loads the schema, not only the Load button.
  • Theme-aware scrollbarscolor-scheme is set per theme and the custom scroll areas (viewport, node column lists, sidebar) are styled with the theme colours, so scrollbars no longer render light on the dark theme.
  • Readable node headers — nodes show just the table name (the qualified name stays on hover and in the sidebar), and a long engine label such as ReplicatedReplacingMergeTree can no longer squeeze the name out of the header.
  • Consistent grouping — every database gets one section holding all of its tables: dependency chains laid out as graphs on top, independent tables in a grid below. Previously independent tables were hoisted into a single global "Independent tables" bucket while tables with dependencies were grouped per database.
  • No parasitic selection — added user-select: none on nodes so dragging a table doesn't leave a text selection behind.

Authentication form, matching play.html:

  • Credential Management API — entered credentials are remembered via PasswordCredential / navigator.credentials.store so the browser can offer to save and autofill them (guarded for browsers without support).
  • URL credentialsurl, user, and password are read from the query string, and the password is stripped back out of the address bar via history.replaceState.
  • Live validation — the user/password fields turn green on a successful SELECT version() probe and red on failure; status resets on input and re-checks on paste.

Tested against the live play.clickhouse.com schema; the grouping and auth logic were additionally exercised by loading the page's own functions under a DOM stub.

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Several usability fixes for the /schema web UI: load the schema on Enter, theme-aware scrollbars, readable table names, consistent per-database grouping of independent tables, no text selection while dragging, and a play-style authentication form (Credential Management API, URL-supplied credentials with password stripping, and live credential validation).

🤖 Generated with Claude Code

User-reported fixes for the `/schema` web UI:

- Load the schema when pressing Enter in any connection field, not only
  when clicking `Load`.
- Theme-aware scrollbars: set `color-scheme` per theme and style the custom
  scroll areas (viewport, node column lists, sidebar) with the theme colours.
- Readable node headers: show just the table name (the qualified name stays
  on hover and in the sidebar), and stop a long engine label (e.g.
  `ReplicatedReplacingMergeTree`) from squeezing the name out of the header.
- Consistent grouping: every database section now holds all of its tables —
  dependency chains laid out as graphs on top, independent tables in a grid
  below — instead of hoisting independent tables into one global bucket.
- Add `user-select: none` on nodes to prevent parasitic text selection while
  dragging tables.

Make the authentication form behave like `play.html`:

- Credential Management API: remember entered credentials via
  `PasswordCredential` and `navigator.credentials.store` so the browser can
  offer to save and autofill them (guarded for browsers without support).
- Read `url`, `user`, and `password` from the query string, and strip the
  password back out of the address bar via `history.replaceState`.
- Live credential validation: paint the user/password fields green on a
  successful `SELECT version()` probe and red on failure; reset on input and
  re-check on paste.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@clickhouse-gh

clickhouse-gh Bot commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-improvement Pull request with some product improvements label Jun 28, 2026
Comment thread programs/server/schema.html
Address AI review: accepting `?url` from the query string together with the
same-origin auto-load path let `/schema?url=https://attacker.example` fire
`loadAll` against an arbitrary origin on page load. If the browser's password
manager had filled credentials for the trusted page origin, `chQuery` would
forward them as `user`/`password` to the attacker-chosen server (which controls
CORS).

Gate auto-load on the target being same-origin via `isSameOriginTarget`;
cross-origin targets now require an explicit `Load` click.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the security finding in 4c4380e.

The auto-load on page load is now gated on the target being same-origin (isSameOriginTarget). Opening /schema?url=https://attacker.example no longer issues any request on load — a cross-origin target requires an explicit Load click, so password-manager-filled credentials for the trusted page origin are never auto-forwarded to an attacker-chosen server. The no-?url and explicit same-origin ?url cases still auto-load as before.

Added focused harness coverage for this path (a cross-origin ?url makes zero requests on page load, an explicit Load does issue the request, and the same-origin default still auto-loads).

Comment thread programs/server/schema.html
Address AI review (Request changes) on PR #108724: the live credential
validation could mark the current `url` / `user` / `password` fields `ok`
or `fail` based on a superseded probe.

A paste schedules a `SELECT version()` probe. `checkCredentials` only
suppressed an older probe once a *newer* probe started (`checkCredentialsNum`);
`resetCredentialsStatus` (on `input`) just cleared the classes without
invalidating the pending request. So a probe that returned after the user
edited the credentials, or after `loadAll` ran with different values, repainted
the now-stale result onto the current fields.

Snapshot the `url` / `user` / `password` the probe is testing and require them
(and the probe sequence number) to still be current before calling
`setCredentialStatus`. `chQuery` reads those fields synchronously when it builds
the request, so the snapshot is exactly what the probe queried.

Verified with a DOM-stub harness reproducing the race: a stale ok/fail probe
arriving after a field edit no longer paints, while a matching probe and the
newest of two concurrent probes still paint as before.
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the AI review (Request changes) credential-probe race in fedcf27774c.

checkCredentials now snapshots the url / user / password the probe is testing and requires them — and the probe sequence number — to still be current before calling setCredentialStatus. chQuery reads those fields synchronously when it builds the request, so the snapshot is exactly what the probe queried. A SELECT version() result that arrives after the user edits a field, or after loadAll runs with different credentials, can therefore no longer repaint the now-stale ok/fail status onto the current fields. Verified with a DOM-stub harness reproducing the race (a stale ok/fail probe after a field edit no longer paints, while a matching probe and the newest of two concurrent probes still paint as before).

Also merged master (0c8d84e54fd).

The two red checks are pre-existing flakes unrelated to this HTML-only change:

alexey-milovidov and others added 2 commits June 30, 2026 00:06
…All`

Follow-up to the credential-probe race fix: the live credential status and the
Credential Management store must describe the values that were actually probed.

Two paths still leaked a stale outcome:
- Only `#user` / `#password` edits cleared the status, so changing `#url` after a
  green probe left the credential fields green for an unvalidated target. The
  `input` listener now also covers `#url`.
- `loadAll` called `setCredentialStatus` / `storeCredentials` after its awaited
  request without re-checking the fields. It now snapshots the credentials it
  used (via the shared `credentialSnapshot`) and only paints or stores while the
  fields still hold those exact values; the failure path is guarded the same way.

The snapshot logic is factored into `credentialSnapshot`, shared by
`checkCredentials` and `loadAll`. `chQuery` reads the fields synchronously when
it builds the request, so the snapshot is exactly what was queried.

Verified with a DOM-stub harness exercising the extracted functions: a URL edit
clears a green probe; a stale `loadAll` (target changed mid-load) neither paints
nor stores; a matching `loadAll` paints and stores; and the existing
stale-probe / newer-probe-supersedes invariants still hold.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the AI review (Request changes) follow-up on the credential-status invariant in 61460e39635.

The remaining two stale-status paths are now closed:

  1. URL change no longer leaves a stale green. The input listener that calls resetCredentialsStatus was extended from ['user', 'password'] to ['url', 'user', 'password'], so editing the target URL after a successful probe clears the credential status — it is no longer green for an unvalidated target.

  2. loadAll is guarded. The snapshot logic from checkCredentials is factored into a shared credentialSnapshot. loadAll captures the snapshot before its awaited chQuery and only calls setCredentialStatus/storeCredentials (success) or setCredentialStatus(false) (failure) while the current url/user/password still match the values that were loaded. chQuery reads those fields synchronously when building the request, so the snapshot is exactly what was queried.

This makes every credential-status update and credential store conditional on the current fields matching the values that were actually probed or loaded — the verdict's minimum required action.

Verified with a DOM-stub harness over the extracted functions: a URL edit clears a green probe; a loadAll whose target changed mid-load neither paints nor stores; a matching loadAll paints and stores; the stale-probe and newer-probe-supersedes invariants still hold.

Also merged master (5e59dec4851, clean — master does not touch schema.html).

The one red check, 00409_shard_limit_by, is an unrelated pre-existing timeout flake: it failed 3× directly on master (PR=0) on 2026-06-17 with the same Timeout context, and this is an HTML-only change that cannot affect a LIMIT BY SQL test. The fresh push re-runs CI.

Comment thread programs/server/schema.html Outdated
The `paste` listener installed on the `url`/`user`/`password` fields
scheduled `checkCredentials` unconditionally, and `checkCredentials`
calls `chQuery`, which reads `$('url').value` as the host and forwards
`user`/`password` to it. So on a trusted `/schema` page whose credentials
were filled in by the browser's password manager, pasting a cross-origin
URL such as `https://attacker.example` into the `url` field issued a
`SELECT version()` POST with those credentials to the attacker-chosen
server before any explicit `Load` — reopening the cross-origin credential
leak through the paste path and bypassing the `isSameOriginTarget` gate
added for page-load auto-load.

Gate the paste probe with `isSameOriginTarget($('url').value)`, mirroring
the page-load auto-load invariant: a cross-origin target is now contacted
only after an explicit `Load` (or `Enter`) click, while same-origin
targets keep live credential validation on paste.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the AI review (Block) cross-origin credential-leak finding on the URL paste path in abf020a8e47.

The paste listener on the url/user/password fields scheduled checkCredentials unconditionally, and checkCredentialschQuery reads $('url').value as the host and forwards user/password to it. So a trusted /schema page with password-manager-filled credentials would POST SELECT version() with those credentials to a freshly pasted https://attacker.example before any explicit Load, bypassing the isSameOriginTarget gate added for page-load auto-load.

The paste probe is now gated on isSameOriginTarget($('url').value), mirroring the page-load invariant: a cross-origin target is contacted only after an explicit Load/Enter, while same-origin targets keep live credential validation on paste. Verified with a DOM-stub harness over the extracted functions: pasting a cross-origin URL (or pasting into user/password while the URL is cross-origin) issues zero requests, while same-origin and relative/empty (default /schema) targets still issue exactly one probe.

The one red check, Hung check failed, possible deadlock found (arm_tsan stress), is the known master-wide flake tracked by #107941 — the hung stacks are Poco TCP-server threads under the AST fuzzer, and this is an HTML-only change that cannot affect a server-side deadlock.

Comment thread programs/server/schema.html Outdated
The AI review flagged a remaining cross-origin credential leak in `loadAll`:
`chQuery` re-read the live `url`/`user`/`password` fields on every call, so a
URL pasted into the form between two of the batch's awaited queries — `columnsQ`,
`dictsQ`, `refreshesQ`, or the `loadViewsLoad` query — would POST the
password-manager-filled credentials to the newly pasted, possibly cross-origin,
server without an explicit `Load`. The earlier `isSameOriginTarget` gates closed
the page-load and paste paths, but an already-running same-origin load batch was
still steerable mid-flight.

`loadAll` now captures the connection once via `connectionSnapshot` and passes it
to every `chQuery` in the batch, so the whole load targets the server the user
committed to; a mid-load field change can no longer redirect later queries — or
their credentials — to a different origin. `loadViewsLoad` reuses that captured
connection (and captures its own when invoked interactively from the lookback
selector). `chQuery` falls back to the live fields only for the single-shot
credential probe. The `isCurrent` predicate still gates the credential-status
paint/store so a stale result cannot repaint the form.

Verified with a DOM-stub harness over the page's own `loadAll`/`chQuery`: a
same-origin load whose `url` is changed to a cross-origin target after the first
query issues zero requests to the new origin (the pre-fix code leaked the three
remaining batch queries there), and no stale credential status is painted.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the AI review (Block) in-flight cross-origin credential-leak finding on the loadAll batch in 250f826666e.

The earlier commits gated the page-load auto-load (4c4380e3bc8) and the paste probe (abf020a8e47) on isSameOriginTarget, but an already-running same-origin load batch was still steerable mid-flight: chQuery re-read the live url/user/password on every call, so a URL pasted into the field between two of the batch's awaited queries (columnsQ, dictsQ, refreshesQ, or the loadViewsLoad query) would POST the password-manager-filled credentials to the freshly pasted, possibly cross-origin, server without an explicit Load.

loadAll now captures the connection once via connectionSnapshot and passes that snapshot to every chQuery in the batch through the new chQuery(query, conn) parameter, so the whole load is bound to the target the user committed to — a mid-load field change can no longer redirect later queries or their credentials to a different origin. loadViewsLoad reuses the batch's captured connection (and snapshots its own when invoked interactively from the lookback selector); chQuery reads the live fields only for the single-shot credential probe. The isCurrent predicate still gates the credential-status paint/store, so a stale result cannot repaint the form.

This is the verdict's minimum required action: every query in one loadAll batch is now bound to the same captured connection values.

Verified with a DOM-stub harness over the page's own loadAll/chQuery: a same-origin load whose url is changed to a cross-origin target after the first query issues zero requests to the new origin and paints no stale credential status, whereas the pre-fix code leaked the three remaining batch queries to the attacker origin. (As with the other programs/server/*.html web UIs, there is no committed CI runner for these DOM-stub checks, so the harness is run locally.)

Comment thread programs/server/schema.html
`connectionSnapshot` bound each load batch's requests — and its credential
status — to one captured connection, but `loadAll` still applied its results
unconditionally. A same-origin auto-load could start, the user could edit `url`
/ credentials and press `Load`, and if the newer load rendered first the older
batch's awaited queries could still return and overwrite `state.tables`, the
status text, the `query_views_log` heat-map, and the rendered graph with data
from the superseded connection — leaving a schema that no longer matched the
current connection fields.

Add a monotonic `loadGeneration`: each load captures the generation it started
in and applies its results (`state`, status, heat-map, render) only while it is
still the latest one, so a superseded batch quietly drops its stale result.
`loadViewsLoad` reuses the caller's generation when invoked from `loadAll`, and
reads the current one (without bumping it) for a standalone lookback reload, so
a standalone heat-map reload never cancels an in-flight `loadAll` while a newer
`loadAll` still supersedes it.

Verified with a DOM-stub harness over the page's own
`loadAll`/`loadViewsLoad`/`chQuery`: two overlapping loads where the older
connection becomes stale before it applies — the older load updates neither
`state` nor the graph after the newer load wins, a load superseded during its
heat-map query does not overwrite the newer heat-map, and a single un-raced
load still applies and renders normally. (As with the other
`programs/server/*.html` web UIs, there is no committed CI runner for these
DOM-stub checks, so the harness is run locally.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the AI review (Request changes) follow-up on the async load race in eb1f10f4d3b.

The earlier connectionSnapshot commit (250f826666e) bound a batch's requests and credential status to one captured connection, but loadAll still applied its results unconditionally: a same-origin auto-load could start, the user could edit url / credentials and press Load, and if the newer load rendered first the older batch's awaited queries could still return and overwrite state.tables, the status text, the query_views_log heat-map, and the rendered graph with data from the superseded connection.

The fix takes the verdict's monotonic-load-id option. A new loadGeneration is bumped by each loadAll; the load captures the generation it started in and applies its result (state, status, heat-map, render) only while it is still the latest one — a superseded batch returns before touching state or rendering. loadViewsLoad reuses the caller's generation from loadAll, and reads the current one (without bumping it) for a standalone lookback reload, so its heat-map application is gated the same way while a standalone reload never cancels an in-flight loadAll.

Verified with a DOM-stub harness over the page's own loadAll / loadViewsLoad / chQuery (the test the verdict asks for): two overlapping loads where the first connection goes stale before it applies — the first load updates neither state nor the graph after the newer load wins; a load superseded during its heat-map query does not overwrite the newer heat-map; and a single un-raced load still applies and renders normally. The same harness against the pre-fix code does clobber state.tables back to the superseded connection and wipes the newer heat-map, so the check is meaningful. (As with the other programs/server/*.html web UIs there is no committed CI runner for these DOM-stub checks, so the harness is run locally.)

The three red checks are pre-existing flakes unrelated to this single-file HTML change (programs/server/schema.html, +245/-118 — it cannot affect a C++ unit test or server-side mutations):

  • FunctionsStress.stress / AllTests (Unit tests, msan function_prop_fuzzer) — a master-wide function-property-fuzzer flake (a reinterpret(CAST(... AS DateTime64(2)), 'DateTime64(3)') const-folding mismatch). On 2026-06-30 it failed across ~50 PRs and master itself (pull_request_number = 0); tracked by FunctionsStress #107242.
  • Upgrade check (amd_release)Error message in clickhouse-server.log: a throwIf-based test mutation logs FUNCTION_THROW_IF_VALUE_IS_NON_ZERO from MutatePlainMergeTreeTask; a master-wide upgrade-check flake that hit 32 PRs on 2026-06-30.

The fresh push re-runs CI.

alexey-milovidov and others added 2 commits June 30, 2026 22:25
Cosmetic-only follow-up to the previous commit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ction fields change

`setCredentialStatus(true, version)` writes a `Connected to ClickHouse v...`
message to the shared `#status` line, but `resetCredentialsStatus` only removed
the `ok`/`fail` classes from the `user`/`password` fields. After a successful
paste probe, editing `url`, `user`, or `password` cleared the field border color
while the page still claimed to be connected to the previous server/version —
the credential-status invariant that live status must describe the current
fields was violated.

`#status` is shared with the load status (`Loading tables...`, `Loaded N
tables`, `Failed: ...`), so the message cannot be wiped unconditionally. Track
whether the line currently holds a credential-probe message via
`statusIsCredentialProbe`: `setStatus` clears the flag (any load status takes
over the line), `setCredentialStatus` re-claims it after writing the
`Connected` text, and `resetCredentialsStatus` clears the text only while the
flag is set. Editing a field now drops a stale `Connected` message but leaves a
load status untouched.

Addresses the AI review (Request changes) finding on
programs/server/schema.html:737.

Verified with a DOM-stub harness over the extracted functions: a field edit
after a successful probe clears the `Connected` text and the field colors; a
`Loaded N tables`, `Loading...`, or `Failed:` status survives a field edit; a
probe superseded by a load loses ownership so the load status also survives; a
failed probe does not falsely claim ownership.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

@alexey-milovidov alexey-milovidov requested a review from pamarcos July 1, 2026 06:04
Addresses the AI review (Request changes) follow-up on the async heat-map
race at `programs/server/schema.html:1055`.

`loadGeneration` (`eb1f10f4d3b`) guards a whole `loadAll` batch, but the
`system.query_views_log` heat-map is also reloaded standalone whenever the
lookback selector changes, and that path reused the current `loadGeneration`
without a token of its own. Two heat-map reloads therefore shared a generation
and both passed `isLatestLoad`, so a stale one could apply after the graph or
lookback changed:

- A standalone reload captures `quotedToKey` from the current `state.nodes`,
  then a same-generation `loadAll` rebuilds `state.nodes` and runs its own
  heat-map; when the older selector query returns, `isLatestLoad` still passed
  and it wrote keys from the superseded graph onto the new one.
- Two quick lookback changes had the same stale-apply problem because neither
  call got a unique sequence — the older query returning last corrupted the
  newer heat-map.

Add a monotonic `viewsLoadGeneration` bumped by every `loadViewsLoad` (both the
`loadAll`-driven and standalone ones). A reload is the latest only while both
its `loadGeneration` and its `viewsLoadGeneration` are still current, so a stale
batch neither resets nor applies `state.loadByMv` / `loadByEdge` / `loadMax`.
Since `state.nodes` only ever changes via a `loadAll` that also bumps the
heat-map token, "the graph changed under me" is exactly "the token moved", so
the token detects both races. The map reset is moved behind the guard so a
superseded reload leaves the current heat-map intact instead of blanking it
(the `query_views_log`-unavailable and disabled-heat-map paths still clear it,
gated the same way).

Verified with a DOM-stub harness over the page's own extracted `loadViewsLoad`:
a standalone reload superseded by a same-generation `loadAll` graph swap no
longer leaks the old-graph key (only the new node's heat-map remains), and two
quick lookback changes keep the newest result while the older query is dropped.
The same harness against the pre-fix code does leak the stale key and corrupt
the newer heat-map, so the guard is meaningful. As with the other
`programs/server/*.html` web UIs there is no committed CI runner for these
DOM-stub checks, so the harness is run locally.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the AI review (Request changes) follow-up on the async heat-map race (programs/server/schema.html:1055) in a5bc6caac14.

loadGeneration (eb1f10f4d3b) guards a whole loadAll batch, but the system.query_views_log heat-map is also reloaded standalone on a lookback change, and that path reused the current loadGeneration without a token of its own — so two heat-map reloads could share a generation and both pass isLatestLoad.

The fix adds a monotonic viewsLoadGeneration bumped by every loadViewsLoad (both the loadAll-driven and the standalone one). A reload is the latest only while both its loadGeneration and its viewsLoadGeneration are still current, so a stale batch neither resets nor applies state.loadByMv / loadByEdge / loadMax. Because state.nodes is only ever replaced by a loadAll that then runs its own loadViewsLoad (bumping the token), "the graph changed under me" is exactly "the token moved", so the token catches both flagged races: a standalone reload superseded by a same-generation loadAll graph swap, and two quick lookback changes where the older query returns last. The map reset moves behind the isLatestLoad guard so a superseded reload leaves the current heat-map intact instead of blanking it (the unavailable/disabled paths still clear it, gated the same way).

Verified with a DOM-stub harness over the page's own extracted loadViewsLoad (the test the verdict asks for): a standalone reload superseded by a same-generation graph swap no longer leaks the old-graph key (only the new node's heat-map remains), and two quick lookback changes keep the newest result while the older query is dropped. The same harness against the pre-fix code does leak the stale key and corrupt the newer heat-map, so the guard is meaningful. As with the other programs/server/*.html web UIs there is no committed CI runner for these DOM-stub checks, so the harness is run locally.

The one red on the previous head, Stress test (arm_asan_ubsan) / AddressSanitizer (STID: 3243-1df9), is an unrelated pre-existing flake: it is an allocation-size-too-big inside MsgPackSchemaReader::readObject (src/Processors/Formats/Impl/MsgPackRowInputFormat.cpp) under a fuzzed schema-inference input — a member of the recurring stress-test large-allocation class (8 such stress failures across 7 distinct STIDs over 60 days), and this single-file HTML change cannot affect a C++ format reader. The fresh push re-runs CI.

@groeneai

groeneai commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

@alexey-milovidov the Stress test (arm_asan_ubsan) red (AddressSanitizer (STID: 3243-1df9)) is a real, deterministic trunk bug, not a flake — and a fix is already open: #109019.

Root cause: MsgPackSchemaReader::readObject() (src/Processors/Formats/Impl/MsgPackRowInputFormat.cpp) unpacks with the default msgpack::unpack_limit() (cap 0xffffffff). msgpack-c's create_object_visitor eagerly malloc(count * sizeof) from a fuzzed array32/map32/str32/bin32/ext32 header before reading the payload, bypassing the CH memory tracker. An over-declared header requests a single ~128 GiB allocation — exactly the 0x2000000008 in the report.

Deterministic repro (any build):

printf '\xdd\xff\xff\xff\xff' | clickhouse local --input-format MsgPack --file - -q "desc file(stdin)"

\xdd = array32, \xff\xff\xff\xff = declared length 0xffffffff -> 0xffffffff * 8 + 8 = 0x2000000008.

#109019 bounds unpack_limit by the bytes currently buffered, so an over-declaration throws msgpack::size_overflow (routed through the existing grow-and-retry loop, rejected as UNEXPECTED_END_OF_FILE) before any eager alloc. Covers all 5 eager-alloc paths (array/map/str/bin/ext); validated both directions on an ASan build; regression test 04366_msgpack_schema_inference_huge_container.

You're right that this PR (HTML-only) cannot cause it — the fix belongs in #109019, not here. CIDB: STID 3243-1df9 = 1 hit / 1 PR (this one) / 0 master in 60d.

@alexey-milovidov

Copy link
Copy Markdown
Member Author

The new Upgrade check (amd_release) red is unrelated to this HTML-only change — it is known master-side log noise, not a regression from this PR.

The flagged test is Error message in clickhouse-server.log, and the <Error> lines are remote-database-engine connection errors from a single stateless test:

<Error> mysqlxx::Pool: Failed to connect to MySQL (fake_db@192.0.2.1:3306 as user user): mysqlxx::ConnectionFailed
<Error> Application: Connection to mysql failed 1 times
<Error> DatabaseMySQL: Code: 279. ... ALL_CONNECTION_TRIES_FAILED

192.0.2.1 is an RFC 5737 test address; a DatabaseMySQL engine pointed at it logs <Error> connection retries, and the Upgrade check asserts the server log contains no <Error> lines after the upgrade restart. This is pervasive across master and unrelated PRs (the same test failed on 12 PRs today and 38 yesterday) and cannot be affected by a change to programs/server/schema.html.

It is already being fixed by an open PR that allowlists exactly these lines in tests/docker_scripts/upgrade_runner.sh: #108560 . Once that merges, a master re-merge here will clear the Mergeable Check.

@alexey-milovidov

Copy link
Copy Markdown
Member Author

The CH Inc sync red (private #62490, run 28500606703) is two unrelated Cloud-only integration-test flakes, not a regression from this HTML-only change (programs/server/schema.html):

  • test_distributed_cache/test.py::test_idle_client_ttlAssertionError: client2 cache was purged despite continuous activity (assert 0 > 0), a distributed-cache TTL timing flake. It passed on the in-run retry ([gw0] PASSED); the shard was still marked FAIL because this test is not in the flaky allowlist.
  • test_refreshable_mv_no_multi_read/test.py::test_refreshable_mv_attach_without_multi_readassert 'Disabled' in 'Scheduled'; the harness itself marked it flaky, matching issue #62697.

Neither a distributed-cache client TTL test nor a refreshable-MV attach test can be affected by a change to programs/server/schema.html. I re-ran the two failed shards (gh run rerun 28500606703 --failed) to clear the sync gate.

@alexey-milovidov

Copy link
Copy Markdown
Member Author

Re-merged `master` (7a74bdb83af) to clear the last red check.

The only failing check was Upgrade check (amd_release) / Error message in clickhouse-server.log, i.e. the pervasive DatabaseMySQL-connection log noise (mysqlxx::Pool: Failed to connect to MySQL (fake_db@192.0.2.1:3306 …)), unrelated to this HTML-only change. Its allowlist fix, #108560, merged into master at 03:19Z today and adds exactly these lines to tests/docker_scripts/upgrade_runner.sh:

| grep -av -e "mysqlxx::Pool.*Failed to connect to MySQL" \
| grep -av -e "Application: Connection to mysql failed" \
| grep -av -e "DatabaseMySQL.*Connections to mysql failed" \

which cover all three <Error> lines from the failure. master does not touch programs/server/schema.html, so the merge was conflict-free and the net diff is unchanged (only schema.html). This should turn the Upgrade check and the Mergeable Check green on the re-run.

Comment thread programs/server/schema.html Outdated
…probes and loads

The credential-status paint (the green/red on the `user`/`password` fields and
the `Connected to ...` line) was keyed only by field equality: `checkCredentials`
used a probe-only `checkCredentialsNum`, and `loadAll` gated its paint/store on
`conn.isCurrent()` alone. Two operations against the *same* `url`/`user`/`password`
both satisfy field equality, so an older callback could repaint the status after a
newer one had already won. Reproduced (DOM-stub harness over the page's own
functions): start two `loadAll` batches for the same connection, let the newer one
succeed, then let the older one fail — the page keeps `Loaded 3 tables.` while the
stale callback flips the credential fields to `fail`. The same happens with a
paste-triggered probe racing a newer load.

Replace `checkCredentialsNum` with a single monotonic `credentialStatusGeneration`
bumped at the start of every operation that can repaint the credential status — the
`checkCredentials` probe and each `loadAll` batch. Each captures the token and
applies its paint/store only while it is still the latest (and the fields still hold
the probed values via `conn.isCurrent()`), so the latest-started probe-or-load is the
sole authority over the credential status. A superseded operation, even one on an
identical connection, no longer repaints a stale `ok`/`fail`.

Verified with a DOM-stub harness reproducing the flagged load-vs-load and
probe-vs-load races (both now suppress the stale paint) while the pre-existing
invariants still hold: a single un-raced load still paints and reports, a field edit
before a probe returns still suppresses the paint, and a newer probe still supersedes
an older one.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the AI review (Request changes) credential-status freshness finding (programs/server/schema.html:776) in f1bcccfde75.

The remaining stale-status path is now closed. The credential status (the green/red on the user/password fields and the Connected to ... line) was keyed only by field equality: checkCredentials used a probe-only checkCredentialsNum, and loadAll gated its paint/store on conn.isCurrent() alone. Two operations against the same url/user/password both satisfy field equality, so an older callback could repaint the status after a newer one had already won — the finding's exact reproduction (start two loads for the same connection, let the newer succeed, then let the older fail; the page keeps Loaded 3 tables. while the stale callback flips the credential fields to fail).

The fix takes the verdict's shared-token option: checkCredentialsNum is replaced by a single monotonic credentialStatusGeneration bumped at the start of every operation that can repaint the credential status — the checkCredentials probe and each loadAll batch. Each captures the token and applies its paint/store only while it is still the latest one and the fields still hold the probed values (conn.isCurrent()), so the latest-started probe-or-load is the sole authority over the status. A superseded operation — even one on an identical connection — no longer repaints a stale ok/fail.

Verified with a DOM-stub harness over the page's own loadAll / checkCredentials / chQuery (the test the verdict asks for): the flagged load-vs-load race and the paste-probe-vs-newer-load race now both suppress the stale paint (the newer result stays), while the pre-existing invariants still hold — a single un-raced load still paints and reports, a field edit before a probe returns still suppresses the paint, and a newer probe still supersedes an older one. The same harness against the pre-fix code does flip the fields to fail on both races, so the check is meaningful. (As with the other programs/server/*.html web UIs there is no committed CI runner for these DOM-stub checks, so the harness is run locally.)

The two red checks are the known AST-fuzzer flake, unrelated to this single-file HTML change (programs/server/schema.html, which cannot affect a server-side query pipeline):

The fresh push re-runs CI.

Comment thread programs/server/schema.html Outdated
…ection

The AI review (Request changes) flagged the last open path that violates the
`play.html`-style explicit-`Load` contract this PR establishes
(`programs/server/schema.html:1074`).

`loadViewsLoad` reloads the `system.query_views_log` heat-map both at the end of
every `loadAll` (with the batch's captured connection) and standalone whenever the
lookback selector changes. The standalone path took `conn = conn || connectionSnapshot()`,
so with `conn` omitted it read the *live* form fields. After a successful load a user
could edit `url` to a cross-origin target and then merely change the lookback — without
any explicit `Load` — and `chQuery(q, conn)` would POST `user` / `password` to that
target. Even for a benign different server it overlays that server's `query_views_log`
onto the graph still belonging to the previous load.

Fix: record the connection that produced the current `state.nodes` in `loadedConnection`
(set by `loadAll` the moment it commits the batch as the latest load), and bind a
standalone `loadViewsLoad` to it instead of the live fields. A lookback change now
re-queries only the server that produced the displayed graph; contacting a newly-edited
target still requires an explicit `Load` (which runs `loadAll` and re-commits both
`state.nodes` and `loadedConnection`). Fail closed — contact nothing — if no load has
committed a connection yet.

Verified with a DOM-stub harness over the page's own extracted `loadViewsLoad` /
`connectionSnapshot` (the test the verdict asks for): after a same-origin load, editing
`url` to an attacker origin and changing the lookback no longer contacts that origin — the
reload queries the committed connection and keys the heat-map to the displayed graph;
the `loadAll`-driven path (explicit `conn`) is unchanged; and a standalone reload with no
committed connection contacts nothing. The same harness against the pre-fix code does POST
to the attacker origin, so the check is meaningful. As with the other
`programs/server/*.html` web UIs there is no committed CI runner for these DOM-stub checks,
so the harness is run locally.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the AI review (Request changes) blocker on the standalone heat-map reload (programs/server/schema.html:1074) in 2aa81014dfd.

loadGeneration / viewsLoadGeneration / credentialStatusGeneration bind a load's requests and status to one captured connection, but the system.query_views_log heat-map is also reloaded standalone when the lookback selector changes, and that path took conn = conn || connectionSnapshot() — with conn omitted it read the live form fields. So after a successful load a user could edit url to a cross-origin target and then merely change the lookback, without any explicit Load, and chQuery(q, conn) would POST user / password to that target. Even for a benign different server it overlays that server's query_views_log onto the graph still belonging to the previous load — the verdict's exact reproduction.

The fix takes the verdict's "bind to the last successfully loaded connection" option. A new loadedConnection records the connection that produced the current state.nodes; loadAll sets it the moment it commits its batch as the latest load (right after the isLatestLoad guard). A standalone loadViewsLoad now binds to loadedConnection instead of the live fields, so a lookback change re-queries only the server that produced the displayed graph. Contacting a newly-edited (possibly cross-origin) target still requires an explicit Load, which runs loadAll and re-commits both state.nodes and loadedConnection together. If no load has committed a connection yet it fails closed and contacts nothing (there is no graph to key a heat-map against anyway). The loadAll-driven path is unchanged — it still passes its own captured conn.

Verified with a DOM-stub harness over the page's own extracted loadViewsLoad / connectionSnapshot (the test the verdict asks for): after a same-origin load, editing url to an attacker origin and changing the lookback no longer contacts that origin — the reload queries the committed connection and keys the heat-map to the displayed graph; the loadAll-driven call (explicit conn) is unchanged; and a standalone reload with no committed connection contacts nothing. The same harness against the pre-fix code does POST to the attacker origin, so the check is meaningful. As with the other programs/server/*.html web UIs there is no committed CI runner for these DOM-stub checks, so the harness is run locally.

The fresh push re-runs CI.

Comment thread programs/server/schema.html
The AI review (Request changes) flagged that `storeCredentials` persisted the
entered credentials via `navigator.credentials.store` for **any** target,
including a cross-origin `#url` (`programs/server/schema.html:718`).

`PasswordCredential` is scoped to the page origin, not to the backend named in
`#url` — its `name` field is a display label, not an isolation key. So an
explicit cross-origin load stored those credentials under `/schema`'s own
origin. On a later plain visit (no `?url=`), `#url` defaults back to
`location.origin` and the page auto-loads it; the browser can autofill the
remembered cross-origin credentials into the fields, so that same-origin
auto-load would POST them to this page's own server instead of the cross-origin
one they belong to.

`play.html` can store unconditionally because it never auto-contacts a server on
open; `/schema` does, so it must not persist a credential that its own auto-load
could later misdirect. The fix gates `storeCredentials` on the existing
`isSameOriginTarget` invariant that already guards the page-load auto-load and
the paste probe, so only same-origin credentials are ever remembered.

Verified with a DOM-stub harness over the page's own `storeCredentials` /
`isSameOriginTarget`: a same-origin login is stored, a cross-origin login (both
`https://other` and protocol-relative `//other`) is not, an unparseable `#url`
fails closed, and empty-password / no-`PasswordCredential` cases store nothing;
the same harness against the pre-fix code does store the cross-origin
credentials, so the check is meaningful. As with the other
`programs/server/*.html` web UIs there is no committed CI runner for these
DOM-stub checks, so the harness is run locally.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the AI review (Request changes) credential-scoping blocker (programs/server/schema.html:718) in 159cc187245.

storeCredentials persisted the entered credentials via navigator.credentials.store for any target, including a cross-origin #url. PasswordCredential is scoped to the page origin, not to the backend named in #url (its name is a display label, not an isolation key), so an explicit cross-origin Load stored those credentials under /schema's own origin. On a later plain visit (no ?url=), #url defaults back to location.origin and the page auto-loads it, so the browser could autofill the remembered cross-origin credentials into the fields and that same-origin auto-load would POST them to this page's own server instead of the cross-origin one they belong to — the verdict's exact reproduction.

The fix takes the verdict's first option: storeCredentials now stores only when isSameOriginTarget($('url').value) — the same invariant that already gates the page-load auto-load and the paste probe. play.html can store unconditionally because it never auto-contacts a server on open; /schema does, so it must not persist a credential that its own auto-load could later misdirect.

Verified with a DOM-stub harness over the page's own storeCredentials / isSameOriginTarget (the regression the verdict asks for): a same-origin login is stored, a cross-origin login (both https://other and protocol-relative //other) is not, an unparseable #url fails closed, and the empty-password / no-PasswordCredential cases store nothing. The same harness against the pre-fix code does store the cross-origin credentials, so the check is meaningful. As with the other programs/server/*.html web UIs there is no committed CI runner for these DOM-stub checks, so the harness is run locally.

The fresh push re-runs CI.

Comment thread programs/server/schema.html
…in auto-load

The AI review (Request changes) flagged that `/schema` only implements the
`store()` half of the `play.html`-style remembered-auth flow
(`programs/server/schema.html:726`). Unlike `play.html`, the connection inputs
were bare `<input>`s outside a `<form>` and without `name=` attributes, and the
page never requested the stored credential before the same-origin auto-load at
the bottom of the script. So the first `loadAll` snapshotted whatever happened
to be in `#user` / `#password` at script time; a remembered login could be
missing when that automatic request fired, leaving `Failed: ...` onscreen even
though a valid saved credential existed.

This adopts `play.html`'s form semantics and adds the missing retrieval step:

- Wrap the connection inputs in a `<form>` with `name="url"` / `name="user"` /
  `name="password"` (as in `play.html`) so the browser offers to save and later
  autofill the credentials, and handle the form `submit` (Enter in a field) by
  loading the schema instead of navigating away.
- Before the same-origin auto-load, when the fields are empty, retrieve a saved
  credential via `navigator.credentials.get` and fill the fields, then start the
  load. Retrieval is gated on the same `isSameOriginTarget` invariant that guards
  the auto-load, so the credential stays bound to the origin it was stored for; a
  cross-origin `?url=` still neither retrieves a credential nor auto-loads.
  `mediation: 'silent'` never shows UI, so opening the page stays non-interactive,
  and failures are caught so the load still runs. `play.html` needs no retrieval
  step because it never auto-contacts a server on open.

Verified with a DOM-stub harness over the page's own extracted `autoLoadOnOpen` /
`isSameOriginTarget` (27 checks): a same-origin open with empty fields and a
stored credential fills the fields and loads; with no stored credential it still
loads with empty fields; URL-supplied credentials are never overridden; a
cross-origin target neither retrieves a credential nor loads; `file:` open,
missing `PasswordCredential`, and a throwing `get()` all degrade to the prior
behavior. As with the other `programs/server/*.html` web UIs there is no
committed CI runner for these DOM-stub checks, so the harness is run locally.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the AI review's sole remaining blocker (the ⚠️ Request changes on the remembered-credentials flow, programs/server/schema.html:726) in 0dc76ad2005.

The play.html-style saved login now actually works on revisit:

  • The connection inputs are wrapped in a <form> with name= attributes (as in play.html), so the browser can offer to save and later autofill them, and the form submit (Enter) loads the schema.
  • Before the same-origin auto-load, when the fields are empty, the page retrieves the saved credential via navigator.credentials.get({ password: true, mediation: 'silent' }) and fills the fields, then loads — reusing the existing isSameOriginTarget gate so a cross-origin ?url= still neither retrieves a credential nor auto-loads, and mediation: 'silent' keeps the open non-interactive.

Verified with a DOM-stub harness over the page's own autoLoadOnOpen / isSameOriginTarget (27 checks), covering the stored-credential, no-credential, URL-supplied-credential, cross-origin, file:, no-PasswordCredential, and throwing-get() cases. The remaining CI reds are the known unrelated flakes (Server died / AddressSanitizer (STID: 0883-4595) #107487 and test_rabbitmq_mv_combo timeout), which an HTML-only change cannot cause.

Comment thread programs/server/schema.html Outdated
The last remaining AI-review blocker (`programs/server/schema.html`,
`autoLoadOnOpen`) was a cross-origin credential leak via a check-then-act race.

`autoLoadOnOpen` validated `isSameOriginTarget` and the empty user/password
fields only *before* awaiting `navigator.credentials.get`. That retrieval is
async, so while it was pending the user could paste a cross-origin URL into
`#url` or start typing their own credentials; the callback then unconditionally
filled the saved same-origin login and called `loadAll`, which reads the *live*
fields via `connectionSnapshot`. A same-origin login remembered by the
Credential Management API could therefore be POSTed as `user`/`password` to a
user-chosen (possibly attacker) server on page open — the same leak
`isSameOriginTarget` guards against on this path.

Snapshot the URL up front and, after the retrieval, re-establish both
invariants (URL unchanged, fields still empty). If either changed, abort the
whole auto-load — the user is now driving and will load explicitly via the form
(`Load` / Enter). The re-check is placed after the `try`/`catch` so it also
guards the path where `navigator.credentials.get` throws.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member Author

Addressed the AI review's sole remaining blocker (the ⚠️ Request changes on the credential auto-load, programs/server/schema.html:2226) in e906759af92.

autoLoadOnOpen checked isSameOriginTarget and the empty user/password fields only before awaiting navigator.credentials.get. That retrieval is async, so while it was pending the user could paste a cross-origin URL into #url (or start typing their own credentials); the callback then unconditionally filled the saved same-origin login and called loadAll, which reads the live fields via connectionSnapshot. A same-origin login remembered by the Credential Management API could therefore be POSTed as user/password to a user-chosen (possibly attacker) target on page open — the same leak isSameOriginTarget guards against on this path.

The fix snapshots the URL up front and, after the retrieval, re-establishes both invariants (URL unchanged, fields still empty). If either changed, the whole auto-load is aborted — the user is now driving and will load explicitly via the form (Load / Enter). The re-check sits after the try/catch, so it also guards the path where navigator.credentials.get throws.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current guard only proves that the user did not edit the live fields while navigator.credentials.get was pending; it does not prove that the returned PasswordCredential belongs to the current backend URL. PasswordCredential is scoped to the page origin, while plain revisits initialize #url to location.origin at programs/server/schema.html:689. Concrete trace: explicitly load https://example.com/ch-proxy, let this PR save that login, then reopen plain /schema; get(...) can return the saved credential here and loadAll will POST it to https://example.com/?..., not to /ch-proxy. Please bind remembered-logins to the exact backend URL before auto-reusing them here (or only persist credentials for the exact default target).

@alexey-milovidov alexey-milovidov self-assigned this Jul 3, 2026
@alexey-milovidov alexey-milovidov added this pull request to the merge queue Jul 3, 2026
Merged via the queue into master with commit d384449 Jul 3, 2026
174 checks passed
@alexey-milovidov alexey-milovidov deleted the fix-schema-handler-ui branch July 3, 2026 02:01
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jul 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants