Always build ClickHouse as PIE and emit stack traces as file offsets#103347
Always build ClickHouse as PIE and emit stack traces as file offsets#103347nikitamikhaylov wants to merge 18 commits into
Conversation
Drop the conditional `ENABLE_POSITION_INDEPENDENT_BINARY` / `-no-pie` logic and always build as `PIE` via `CMAKE_POSITION_INDEPENDENT_CODE ON` plus explicit `-pie -Wl,-pie`. The previous "disabled" cases (Android, S390X, sanitizers) all required `PIE` anyway. Under `PIE`, raw frame pointers are `ASLR`-randomized per run and unusable with `addr2line`. Symbolized fatal lines and `system.stack_trace` / `system.trace_log` already subtract the object base; the two bare dumps in `SignalHandlers::onFault` did not. Add a helper `StackTrace::getPhysicalAddress` and use it in the bare traces and in `StorageSystemStackTrace` so every address we print is a stable file offset that `addr2line -e clickhouse <hex>` can resolve directly. Made-with: Cursor
| { | ||
| writeChar(' ', bare_stacktrace); | ||
| writePointerHex(stack_trace.getFramePointers()[i], bare_stacktrace); | ||
| writePointerHex(StackTrace::getPhysicalAddress(stack_trace.getFramePointers()[i]), bare_stacktrace); |
There was a problem hiding this comment.
Using StackTrace::getPhysicalAddress inside SignalListener::onFault makes the fallback bare trace depend on SymbolIndex::instance(), which can still throw during first-time initialization (it may allocate tens of MiB). If that happens while handling a fault, we lose the raw-address fallback and only log the exception from the handler.
Can we keep this path non-throwing by falling back to the original raw pointer when object lookup is unavailable (or by using a tryGetPhysicalAddress variant that never initializes/throws in the fault path)? This should apply to both bare-trace loops in this function.
| DB::MainThreadStatus::getInstance(); | ||
|
|
||
| /// Initialize SymbolIndex explicitly. | ||
| DB::SymbolIndex::instance(); |
There was a problem hiding this comment.
DB::SymbolIndex::instance is called before the try block, so any exception from symbol index initialization escapes mainEntryClickHouseLocal uncaught.
This turns symbol-index warmup into a hard startup failure path for clickhouse-local (previously it was only needed for stack-trace symbolization paths). Please move this call inside the existing try block, or make it best-effort (try/catch + warning log) so local startup remains robust.
|
|
Switch '-static -lc' to '-static-pie -lc' for the musl build so the clang driver picks the PIC CRT variants (rcrt1.o, crtbeginS.o, crtendS.o) that contrib/sysroot now provides. With plain '-static', the driver picks the non-PIC crtbeginT.o and lld rejects its absolute relocations when linking with -pie.
…-pie-stack-traces-offsets
(cherry picked from commit 154397c)
| #include <Common/remapExecutable.h> | ||
|
|
||
| #if defined(OS_LINUX) && defined(__amd64__) && defined(__SSE2__) && !defined(SANITIZER) && defined(NDEBUG) | ||
| #if 0 |
There was a problem hiding this comment.
With #if 0 this file now always compiles the stub remapExecutable that returns 0, so enabling the remap_executable server setting becomes a silent no-op. Server::main still logs "The code (...) has been successfully remapped.", which is now misleading.
Please either keep a working implementation for PIE or explicitly disable the setting path (and log that remapping is unsupported under PIE) instead of reporting success.
|
|
||
| include(cmake/dbms_glob_sources.cmake) | ||
|
|
||
| # Build ClickHouse as a Position-Independent Executable. |
There was a problem hiding this comment.
Changelog category in the PR body is currently Performance Improvement, but this change is primarily a build/runtime-behavior change (PIE defaults and stack trace address representation), not a measurable query/runtime speedup. Please switch it to Build/Testing/Packaging Improvement (or Improvement if you want to emphasize user-visible behavior).
| const void * StackTrace::getPhysicalAddress(const void * virtual_addr) | ||
| { | ||
| #if defined(__ELF__) | ||
| /// Subtract the load base of the object the address falls into, turning a |
There was a problem hiding this comment.
The Changelog entry currently mentions only always-on PIE, but this hunk also changes user-visible stack trace semantics (system.stack_trace and bare fatal traces now emit normalized file offsets instead of raw virtual addresses). Please include that behavioral change explicitly so operators know why addresses look different and how to resolve them.
…-pie-stack-traces-offsets
| DB::MainThreadStatus::getInstance(); | ||
|
|
||
| /// Initialize SymbolIndex explicitly. | ||
| DB::SymbolIndex::instance(); |
There was a problem hiding this comment.
SymbolIndex::instance is called before the try block here. SymbolIndex initialization can throw (for example while loading/parsing symbol data), and in clickhouse-local this would bypass the existing exception reporting path and terminate early with an uncaught exception.
Please move this initialization inside the existing try/catch path (or wrap it with equivalent handling) so startup failures are reported consistently.
| { | ||
| writeChar(' ', bare_stacktrace); | ||
| writePointerHex(stack_trace.getFramePointers()[i], bare_stacktrace); | ||
| writePointerHex(StackTrace::getPhysicalAddress(stack_trace.getFramePointers()[i]), bare_stacktrace); |
There was a problem hiding this comment.
This "bare stack trace" path is intended to be a fallback, but it now depends on SymbolIndex through StackTrace::getPhysicalAddress.
If symbol-index resolution is unavailable/fails in a fault context, this fallback can fail too and we lose raw addresses entirely. Please make the address normalization best-effort here (fallback to the original pointer when conversion cannot be done) so the emergency trace remains robust.
|
Dear @azat, you haven't been active on this PR for 30 days. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself. |
…-pie-stack-traces-offsets
| # TODO: switch to stateless-test image | ||
| run_in_docker="clickhouse/performance-comparison", | ||
| # --privileged for mlock/CAP_IPC_LOCK | ||
| run_in_docker="clickhouse/performance-comparison+--privileged", |
There was a problem hiding this comment.
The performance job now runs its whole container with --privileged, but the stated reason is only mlock/CAP_IPC_LOCK. That grants all capabilities and device access for every performance comparison run, which is much broader than the operation needs.
Can we keep this job least-privileged and use a narrower Docker option instead, e.g. +--cap-add=IPC_LOCK (and only add any extra limit option if the job proves it needs one)?
LLVM Coverage Report
Changed lines: Changed C/C++ lines covered by tests: 23/34 (67.65%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 3 line(s) · Uncovered code |
|
📊 Cloud Performance Report ✅ AI verdict: This is a build-only change: it switches the binary to a Position-Independent Executable, enables executable memory-locking/remapping in the benchmark config, and normalizes stack-trace addresses for symbolization. None of it touches query execution. A binary-wide build change would move queries uniformly, but the three flagged TPC-H queries swing in opposite directions (Q7 -29%, Q8 +18%, Q20 -43%), and Q20's measurements were very noisy with only 6 source runs. We're treating all three as run-to-run variance rather than real PR effects; no action needed. clickbench🟢 No significant changes tpch_adapted_1_officialFlagged queries (3 of 22)
q-value = BH-FDR adjusted p; smaller is stronger evidence. MIRAI flags a query when q < fdr_q (default 0.10) — the value the verdict is based on. Debug info
|
|
@groeneai Why does this PR affect tests? Both functional and performance? |

Drop the conditional
ENABLE_POSITION_INDEPENDENT_BINARY/-no-pielogic and always build asPIEviaCMAKE_POSITION_INDEPENDENT_CODE ONplus explicit-pie -Wl,-pie. The previous "disabled" cases (Android, S390X, sanitizers) all requiredPIEanyway.Under
PIE, raw frame pointers areASLR-randomized per run and unusable withaddr2line. Symbolized fatal lines andsystem.stack_trace/system.trace_logalready subtract the object base; the two bare dumps inSignalHandlers::onFaultdid not. Add a helperStackTrace::getPhysicalAddressand use it in the bare traces and inStorageSystemStackTraceso every address we print is a stable file offset thataddr2line -e clickhouse <hex>can resolve directly.Made-with: Cursor
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
ClickHouse is now always built as a Position Independent Executable.
Documentation entry for user-facing changes