Always build ClickHouse as PIE and emit stack traces as file offsets by nikitamikhaylov · Pull Request #103347 · ClickHouse/ClickHouse · GitHub
Skip to content

Always build ClickHouse as PIE and emit stack traces as file offsets#103347

Open
nikitamikhaylov wants to merge 18 commits into
masterfrom
always-pie-stack-traces-offsets
Open

Always build ClickHouse as PIE and emit stack traces as file offsets#103347
nikitamikhaylov wants to merge 18 commits into
masterfrom
always-pie-stack-traces-offsets

Conversation

@nikitamikhaylov

@nikitamikhaylov nikitamikhaylov commented Apr 22, 2026

Copy link
Copy Markdown
Member

Drop the conditional ENABLE_POSITION_INDEPENDENT_BINARY / -no-pie logic and always build as PIE via CMAKE_POSITION_INDEPENDENT_CODE ON plus explicit -pie -Wl,-pie. The previous "disabled" cases (Android, S390X, sanitizers) all required PIE anyway.

Under PIE, raw frame pointers are ASLR-randomized per run and unusable with addr2line. Symbolized fatal lines and system.stack_trace / system.trace_log already subtract the object base; the two bare dumps in SignalHandlers::onFault did not. Add a helper StackTrace::getPhysicalAddress and use it in the bare traces and in StorageSystemStackTrace so every address we print is a stable file offset that addr2line -e clickhouse <hex> can resolve directly.

Made-with: Cursor

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

ClickHouse is now always built as a Position Independent Executable.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Drop the conditional `ENABLE_POSITION_INDEPENDENT_BINARY` / `-no-pie`
logic and always build as `PIE` via `CMAKE_POSITION_INDEPENDENT_CODE ON`
plus explicit `-pie -Wl,-pie`. The previous "disabled" cases (Android,
S390X, sanitizers) all required `PIE` anyway.

Under `PIE`, raw frame pointers are `ASLR`-randomized per run and
unusable with `addr2line`. Symbolized fatal lines and `system.stack_trace`
/ `system.trace_log` already subtract the object base; the two bare
dumps in `SignalHandlers::onFault` did not. Add a helper
`StackTrace::getPhysicalAddress` and use it in the bare traces and in
`StorageSystemStackTrace` so every address we print is a stable file
offset that `addr2line -e clickhouse <hex>` can resolve directly.

Made-with: Cursor
@nikitamikhaylov nikitamikhaylov requested a review from azat April 22, 2026 10:54
@clickhouse-gh

clickhouse-gh Bot commented Apr 22, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-improvement Pull request with some product improvements label Apr 22, 2026
@azat azat self-assigned this Apr 22, 2026
{
writeChar(' ', bare_stacktrace);
writePointerHex(stack_trace.getFramePointers()[i], bare_stacktrace);
writePointerHex(StackTrace::getPhysicalAddress(stack_trace.getFramePointers()[i]), bare_stacktrace);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using StackTrace::getPhysicalAddress inside SignalListener::onFault makes the fallback bare trace depend on SymbolIndex::instance(), which can still throw during first-time initialization (it may allocate tens of MiB). If that happens while handling a fault, we lose the raw-address fallback and only log the exception from the handler.

Can we keep this path non-throwing by falling back to the original raw pointer when object lookup is unavailable (or by using a tryGetPhysicalAddress variant that never initializes/throws in the fault path)? This should apply to both bare-trace loops in this function.

DB::MainThreadStatus::getInstance();

/// Initialize SymbolIndex explicitly.
DB::SymbolIndex::instance();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DB::SymbolIndex::instance is called before the try block, so any exception from symbol index initialization escapes mainEntryClickHouseLocal uncaught.

This turns symbol-index warmup into a hard startup failure path for clickhouse-local (previously it was only needed for stack-trace symbolization paths). Please move this call inside the existing try block, or make it best-effort (try/catch + warning log) so local startup remains robust.

@nikitamikhaylov

Copy link
Copy Markdown
Member Author

riscv64, musl, longaarch64, ppc64le and riscv64 builds are failing because sysroot contents (libs, archives) are not build as PIC.

@clickhouse-gh clickhouse-gh Bot added the submodule changed At least one submodule changed in this PR. label Apr 22, 2026
Comment thread programs/server/Server.cpp
Comment thread src/Common/StackTrace.cpp Outdated
Switch '-static -lc' to '-static-pie -lc' for the musl build so
the clang driver picks the PIC CRT variants (rcrt1.o,
crtbeginS.o, crtendS.o) that contrib/sysroot now provides. With
plain '-static', the driver picks the non-PIC crtbeginT.o and lld
rejects its absolute relocations when linking with -pie.
@clickhouse-gh clickhouse-gh Bot added pr-performance Pull request with some performance improvements and removed pr-improvement Pull request with some product improvements labels Apr 23, 2026
(cherry picked from commit 154397c)
#include <Common/remapExecutable.h>

#if defined(OS_LINUX) && defined(__amd64__) && defined(__SSE2__) && !defined(SANITIZER) && defined(NDEBUG)
#if 0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With #if 0 this file now always compiles the stub remapExecutable that returns 0, so enabling the remap_executable server setting becomes a silent no-op. Server::main still logs "The code (...) has been successfully remapped.", which is now misleading.

Please either keep a working implementation for PIE or explicitly disable the setting path (and log that remapping is unsupported under PIE) instead of reporting success.

Comment thread CMakeLists.txt

include(cmake/dbms_glob_sources.cmake)

# Build ClickHouse as a Position-Independent Executable.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog category in the PR body is currently Performance Improvement, but this change is primarily a build/runtime-behavior change (PIE defaults and stack trace address representation), not a measurable query/runtime speedup. Please switch it to Build/Testing/Packaging Improvement (or Improvement if you want to emphasize user-visible behavior).

Comment thread src/Common/StackTrace.cpp
const void * StackTrace::getPhysicalAddress(const void * virtual_addr)
{
#if defined(__ELF__)
/// Subtract the load base of the object the address falls into, turning a

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Changelog entry currently mentions only always-on PIE, but this hunk also changes user-visible stack trace semantics (system.stack_trace and bare fatal traces now emit normalized file offsets instead of raw virtual addresses). Please include that behavioral change explicitly so operators know why addresses look different and how to resolve them.

DB::MainThreadStatus::getInstance();

/// Initialize SymbolIndex explicitly.
DB::SymbolIndex::instance();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SymbolIndex::instance is called before the try block here. SymbolIndex initialization can throw (for example while loading/parsing symbol data), and in clickhouse-local this would bypass the existing exception reporting path and terminate early with an uncaught exception.

Please move this initialization inside the existing try/catch path (or wrap it with equivalent handling) so startup failures are reported consistently.

{
writeChar(' ', bare_stacktrace);
writePointerHex(stack_trace.getFramePointers()[i], bare_stacktrace);
writePointerHex(StackTrace::getPhysicalAddress(stack_trace.getFramePointers()[i]), bare_stacktrace);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "bare stack trace" path is intended to be a fallback, but it now depends on SymbolIndex through StackTrace::getPhysicalAddress.

If symbol-index resolution is unavailable/fails in a fault context, this fallback can fail too and we lose raw addresses entirely. Please make the address normalization best-effort here (fallback to the original pointer when conversion cannot be done) so the emergency trace remains robust.

@clickhouse-gh

clickhouse-gh Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Dear @azat, you haven't been active on this PR for 30 days. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself.

@clickhouse-gh clickhouse-gh Bot unassigned azat Jun 1, 2026
Comment thread ci/defs/job_configs.py
# TODO: switch to stateless-test image
run_in_docker="clickhouse/performance-comparison",
# --privileged for mlock/CAP_IPC_LOCK
run_in_docker="clickhouse/performance-comparison+--privileged",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance job now runs its whole container with --privileged, but the stated reason is only mlock/CAP_IPC_LOCK. That grants all capabilities and device access for every performance comparison run, which is much broader than the operation needs.

Can we keep this job least-privileged and use a narrower Docker option instead, e.g. +--cap-add=IPC_LOCK (and only add any extra limit option if the job proves it needs one)?

@clickhouse-gh

clickhouse-gh Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.60% 84.50% -0.10%
Functions 92.30% 92.30% +0.00%
Branches 77.20% 77.20% +0.00%

Changed lines: Changed C/C++ lines covered by tests: 23/34 (67.65%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 3 line(s) · Uncovered code

Full report · Diff report

@clickhouse-gh

clickhouse-gh Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

📊 Cloud Performance Report

✅ AI verdict: no_change — no significant changes across 38 queries analysed

This is a build-only change: it switches the binary to a Position-Independent Executable, enables executable memory-locking/remapping in the benchmark config, and normalizes stack-trace addresses for symbolization. None of it touches query execution. A binary-wide build change would move queries uniformly, but the three flagged TPC-H queries swing in opposite directions (Q7 -29%, Q8 +18%, Q20 -43%), and Q20's measurements were very noisy with only 6 source runs. We're treating all three as run-to-run variance rather than real PR effects; no action needed.

clickbench

🟢 No significant changes

tpch_adapted_1_official

⚠️ 3 inconclusive

Flagged queries (3 of 22)
Query Verdict Baseline med (ms) PR med (ms) Change q-value Hint
⚠️ 7 not_sure 113 80 -29.2% <0.0001 PIE/mlock build change is binary-wide, can't cause a query-specific 29% gain; run-to-run variance
⚠️ 8 not_sure 158 187 +18.4% <0.0001 Build/linker-only change (PIE), no execution-path edit; opposing-sign deltas point to variance
⚠️ 20 not_sure 246 140 -43.1% <0.0001 Off-path build change; query is very noisy (wide spread) with only 6 source runs

q-value = BH-FDR adjusted p; smaller is stronger evidence. MIRAI flags a query when q < fdr_q (default 0.10) — the value the verdict is based on.

Debug info
  • StressHouse run: b7366487-fa9e-432d-a5a1-17e99dc80d81
  • MIRAI run: 61231b7c-3bfb-48fd-a5cc-f81c4d9d4f3e
  • PR check IDs:
    • clickbench_30735_1780936463
    • clickbench_30830_1780936494
    • clickbench_30917_1780936524
    • tpch_adapted_1_official_31013_1780936561
    • tpch_adapted_1_official_31109_1780936626
    • tpch_adapted_1_official_31205_1780936718

@nikitamikhaylov

Copy link
Copy Markdown
Member Author

@groeneai Why does this PR affect tests? Both functional and performance?

@groeneai

Copy link
Copy Markdown
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance pr-performance Pull request with some performance improvements submodule changed At least one submodule changed in this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants