Add jemalloc profiling web UI for ClickHouse Keeper by murphy-4o · Pull Request #100606 · ClickHouse/ClickHouse · GitHub
Skip to content

Add jemalloc profiling web UI for ClickHouse Keeper#100606

Merged
murphy-4o merged 18 commits into
ClickHouse:masterfrom
murphy-4o:keeper-jemalloc-webui
Apr 15, 2026
Merged

Add jemalloc profiling web UI for ClickHouse Keeper#100606
murphy-4o merged 18 commits into
ClickHouse:masterfrom
murphy-4o:keeper-jemalloc-webui

Conversation

@murphy-4o

@murphy-4o murphy-4o commented Mar 24, 2026

Copy link
Copy Markdown
Member

Expose jemalloc heap profiling and statistics through Keeper's HTTP control interface, matching the functionality already available in the ClickHouse Server web UI.

Closes #100544

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Added a jemalloc profiling web UI for ClickHouse Keeper, available at /jemalloc on the HTTP control port.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Version info

  • Merged into: 26.4.1.954

Expose jemalloc heap profiling and statistics through Keeper's HTTP
control interface, matching the functionality already available in the
ClickHouse Server web UI.

New REST API endpoints on the Keeper HTTP control port:
- GET /jemalloc — serves the interactive web UI (shared with server via
  `isKeeper` mode flag injected at serve time)
- GET /jemalloc/stats — returns raw `malloc_stats_print` output
- GET /jemalloc/status — returns profiling state as JSON
- GET /jemalloc/profile?format={collapsed|raw} — flushes a heap profile,
  symbolizes it, and returns collapsed stacks for flame graph rendering

The web UI reuses the server's `jemalloc.html` with a thin `isKeeper`
abstraction that swaps SQL-based data fetching for REST API calls and
hides server-specific features (Query Profiler, connection params, output
format options). This avoids duplicating ~3000 lines of HTML/CSS/JS.

New shared helpers in `Jemalloc::heapProfileToCollapsedStacks` (parses
raw jemalloc heap dump, symbolizes addresses via `StackTrace::forEachFrame`,
aggregates into FlameGraph-compatible collapsed stacks) and
`Jemalloc::getStats` (captures `malloc_stats_print` output as a string).

Closes ClickHouse#100544

Made-with: Cursor
@clickhouse-gh

clickhouse-gh Bot commented Mar 24, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-feature Pull request with new product feature label Mar 24, 2026
Add Keeper tab to the jemalloc web UI section showing the HTTP control
port URL and the REST API endpoints for programmatic access.

Made-with: Cursor
Comment thread src/Server/KeeperJemallocHandler.cpp Outdated
Comment thread programs/server/jemalloc.html Outdated
…oc UI

Add config example showing that `keeper_server.http_control.port` must be
explicitly enabled, and a security warning noting the lack of
application-level auth on the Keeper HTTP control port with recommended
network-level mitigations.

Made-with: Cursor
… duplicate comment

- `handleStatus` no longer silently swallows `mallctl` exceptions.
  Failures are logged via `tryLogCurrentException` and surfaced in the
  JSON response as an `errors` array so operators can distinguish
  "profiling disabled" from "status unknown".
- Remove triple-duplicated "Substitute user name" comment in
  `jemalloc.html`.

Made-with: Cursor
Comment thread src/Server/KeeperJemallocHandler.cpp Outdated
…d integration tests

- Inject the Keeper config `<script>` after `<head>` instead of before
  `<!DOCTYPE html>` to avoid triggering quirks mode in browsers.
- Redirect `/jemalloc/` (trailing slash) to `/jemalloc` with HTTP 301 so
  the UI is reachable from both paths.
- Add integration tests for all Keeper jemalloc HTTP endpoints: web UI,
  trailing-slash redirect, /stats, /status, bad profile format (400),
  and unknown API path (404).

Made-with: Cursor
Comment thread tests/integration/test_keeper_http_jemalloc/test.py Outdated
- Avoid `bugprone-suspicious-stringview-data-usage` by building the full
  HTML string with the injected config script before writing.
- Use const reference for `path` to avoid unnecessary copy
  (`performance-unnecessary-copy-initialization`).

Made-with: Cursor
Comment thread src/Server/KeeperJemallocHandler.cpp
- Call `setResponseDefaultHeaders` on 400/404/301 responses for
  consistent CORS/cache headers across all paths.
- Use `data()` instead of `c_str()` for consistency with the rest of the
  file and the codebase.

Made-with: Cursor
Comment thread src/Server/KeeperHTTPHandlerFactory.h Outdated
- Register `KeeperJemallocAPIHandler` unconditionally so it works in
  sanitizer builds where `USE_JEMALLOC` is off.  API routes return
  HTTP 501 when jemalloc is not linked; the `/jemalloc/` trailing-slash
  redirect works in all builds.
- Short-circuit HEAD on `/jemalloc/profile` before `flushProfile` to
  avoid expensive side effects from lightweight probes.
- Fix wrong error key `thread.prof.active` → `prof.thread_active_init`
  in `handleStatus` JSON errors array.
- Move `#include <Server/KeeperJemallocHandler.h>` from the header to
  `KeeperHTTPHandlerFactory.cpp` to reduce transitive compile cost.
- Rewrite integration tests: always validate correct behavior for both
  jemalloc and non-jemalloc builds (no skips in sanitizer CI), add
  HEAD-method coverage, add positive-path profile tests, parametrize
  to eliminate duplication.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100606&sha=6ef7cf2a5d0ff347554cf731107fc69d631dc38b&name_0=PR

Made-with: Cursor
Comment thread programs/server/jemalloc.html
…h routing

Replace the monolithic `KeeperJemallocAPIHandler` with dedicated handler
classes (`KeeperJemallocProfileHandler`, `KeeperJemallocStatsHandler`,
`KeeperJemallocStatusHandler`, `KeeperJemallocRedirectHandler`,
`KeeperJemallocNotAvailableHandler`) each registered on a strict path via
the factory.  This eliminates manual URL dispatching inside `handleRequest`
and follows the same pattern used by other Keeper HTTP handlers.

Also fixes `#include` indentation inside `#if USE_JEMALLOC`.

Made-with: Cursor
Comment thread src/Server/KeeperJemallocHandler.cpp
Comment thread src/Server/KeeperJemallocHandler.cpp
- Fix `attachStrictPath` not matching URIs with query parameters: replace
  with a custom filter that matches the path portion, ignoring `?...`
  suffixes.  This caused `/jemalloc/profile?format=bad` to 404 in
  non-jemalloc builds instead of returning 501.

- Fix HEAD test assertions: HEAD responses carry no body, so checking
  `response.text` for "not available" fails on 501.  Only assert the
  status code for HEAD.

- Add HEAD short-circuit to `KeeperJemallocStatsHandler` to avoid the
  expensive `je_malloc_stats_print` call when no body is needed.

- Handle `status.errors` in the web UI's `checkProfilingEnabled`: when
  the `/jemalloc/status` response includes errors, treat profiling state
  as unknown rather than silently defaulting to disabled.

Made-with: Cursor
@murphy-4o murphy-4o force-pushed the keeper-jemalloc-webui branch from 69cb391 to ede64f4 Compare March 27, 2026 06:44
Comment thread src/Server/KeeperJemallocHandler.cpp Outdated
When `std::filesystem::remove` fails in the `SCOPE_EXIT` of
`KeeperJemallocProfileHandler`, log a `LOG_WARNING` with the file path
and error message instead of silently dropping the `std::error_code`.

https: //github.com/ClickHouse/pull/100606#discussion_r2999333464
Made-with: Cursor
`KeeperJemallocStatusHandler` previously substituted default values
(false/0) when `mallctl` reads failed, making it impossible for consumers
to distinguish "profiling is disabled" from "status unknown".

Replace bare values with `std::optional<T>` and serialize failed reads as
JSON `null` instead of silent defaults.  When `opt.prof` itself fails,
all dependent fields (`prof_active`, `thread_active_init`, `lg_sample`)
are also `null`.  When `opt.prof` is definitively `false`, dependent
fields carry authoritative false/0 values.

The `errors` array is still present for failed keys, providing
machine-readable diagnostics alongside the per-field tri-state.

https: //github.com/ClickHouse/pull/100606#discussion_r2999333464
Made-with: Cursor
@murphy-4o murphy-4o marked this pull request as ready for review April 1, 2026 03:27
@murphy-4o murphy-4o requested a review from antonio2368 April 1, 2026 03:27
@antonio2368 antonio2368 self-assigned this Apr 1, 2026
@alexey-milovidov

Copy link
Copy Markdown
Member

The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.

@antonio2368 antonio2368 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check src/CMakeLists.txt and how we use OBJECT_DEPENDS with html files

Comment thread src/Common/Jemalloc.cpp Outdated
}


std::string heapProfileToCollapsedStacks(const std::string & input_filename)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

look symbolizeJemallocHeapProfile. Maybe we can generalize it more to use it here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment thread programs/server/jemalloc.html Outdated
const statusColor = isEnabled ? 'var(--success-color)' : 'var(--error-color)';
const statusText = isEnabled ? 'Enabled' : 'Disabled';
const statusIcon = isEnabled ? '✓' : '✗';
const instanceName = isKeeper ? 'Keeper' : 'server';

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this used?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@alexey-milovidov

Copy link
Copy Markdown
Member

The MSan stress test failure (MemorySanitizer: use-of-uninitialized-value, STID 4179-5154 or 4148-3044) is a known pre-existing issue unrelated to this PR. Fix: #102158

@alexey-milovidov

Copy link
Copy Markdown
Member

Azure was broken, I'm trying to untie our tests from it: #102354 (comment)

…build deps, remove dead code

- Add `OBJECT_DEPENDS` for `KeeperJemallocHandler.cpp` in
  `src/CMakeLists.txt` so that changes to the `#embed`-ed
  `jemalloc.html` trigger recompilation, matching the existing pattern
  for other handlers.

- Replace the 130-line `heapProfileToCollapsedStacks` in
  `Common/Jemalloc.cpp` with a call to the existing
  `JemallocProfileSource` collapsed-stack pipeline via the new
  `symbolizeJemallocHeapProfileToString` helper.  This eliminates
  duplicated heap-dump parsing and address symbolization logic.

- Remove the unused `instanceName` JavaScript variable from
  `jemalloc.html` (assigned but never referenced in any template).

https: //github.com/ClickHouse/pull/100606
Made-with: Cursor
@clickhouse-gh

clickhouse-gh Bot commented Apr 15, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.00% -0.10%
Functions 90.90% 90.90% +0.00%
Branches 76.60% 76.50% -0.10%

Changed lines: 4.76% (9/189) · Uncovered code

Full report · Diff report

@murphy-4o murphy-4o enabled auto-merge April 15, 2026 10:25
@murphy-4o murphy-4o added this pull request to the merge queue Apr 15, 2026
Merged via the queue into ClickHouse:master with commit 7ce6ac7 Apr 15, 2026
161 checks passed
@murphy-4o murphy-4o deleted the keeper-jemalloc-webui branch April 15, 2026 10:53
@robot-clickhouse robot-clickhouse added the pr-synced-to-cloud The PR is synced to the cloud repo label Apr 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-feature Pull request with new product feature pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add jemalloc profiling web UI for ClickHouse Keeper

4 participants