Fix wrong shared library symbol resolution in stack traces by azat · Pull Request #101435 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix wrong shared library symbol resolution in stack traces#101435

Draft
azat wants to merge 14 commits into
ClickHouse:masterfrom
azat:fix-shared-library-symbol-resolution
Draft

Fix wrong shared library symbol resolution in stack traces#101435
azat wants to merge 14 commits into
ClickHouse:masterfrom
azat:fix-shared-library-symbol-resolution

Conversation

@azat

@azat azat commented Mar 31, 2026

Copy link
Copy Markdown
Member

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix wrong shared library symbol resolution in stack traces

SymbolIndex stored all symbols from all loaded objects (main binary + shared libs) in a single flat vector sorted by file-relative offset (st_value). Different shared libraries have overlapping file offset ranges, so a binary search could return a symbol from the wrong library.

Fix by storing absolute virtual addresses (info->addr + st_value) instead of raw file offsets. This matches the approach already used on macOS (n_value + slide). For non-PIE builds info->addr is 0 for the main executable, so the change is a no-op for main binary symbols.

Consistently store virtual addresses in all consumers:

  • SymbolIndex: store info->addr + st_value in both collectSymbolsFromProgramHeaders and collectSymbolsFromELFSymbolTable
  • findSymbol: simplify to a direct lookup (no findObject + subtraction)
  • StackTrace: pass virtual_addr instead of physical_addr to findSymbol
  • StorageSystemStackTrace: store virtual addresses in the trace column (remove findObject + subtraction)
  • TraceCollector: remove address normalization (was subtracting address_begin, which breaks PIE builds where address_begin != 0)
  • addressToLine, TraceLog: convert virtual address to file offset before DWARF lookups (addr - address_begin), since DWARF uses file-relative addresses

Note, I've checked this and it looks valid I also manually tested it and it looks ok:

@clickhouse-gh

clickhouse-gh Bot commented Mar 31, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-improvement Pull request with some product improvements label Mar 31, 2026
@azat azat marked this pull request as draft April 1, 2026 06:52
`SymbolIndex` stored all symbols from all loaded objects (main binary +
shared libs) in a single flat vector sorted by file-relative offset
(`st_value`). Different shared libraries have overlapping file offset
ranges, so a binary search could return a symbol from the wrong library.

Fix by storing absolute virtual addresses (`info->addr + st_value`)
instead of raw file offsets. This matches the approach already used on
macOS (`n_value + slide`). For non-PIE builds `info->addr` is 0 for the
main executable, so the change is a no-op for main binary symbols.

Consistently store virtual addresses in all consumers:
- `SymbolIndex`: store `info->addr + st_value` in both
  `collectSymbolsFromProgramHeaders` and `collectSymbolsFromELFSymbolTable`
- `findSymbol`: simplify to a direct lookup (no `findObject` + subtraction)
- `StackTrace`: pass `virtual_addr` instead of `physical_addr` to `findSymbol`
- `StorageSystemStackTrace`: store virtual addresses in the `trace` column
  (remove `findObject` + subtraction)
- `TraceCollector`: remove address normalization (was subtracting
  `address_begin`, which breaks PIE builds where `address_begin` != 0)
- `addressToLine`, `TraceLog`: convert virtual address to file offset
  before DWARF lookups (`addr - address_begin`), since DWARF uses
  file-relative addresses

v2: fix system.symbols.symbol_demangled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@azat azat force-pushed the fix-shared-library-symbol-resolution branch from 12c1975 to 1ded14c Compare April 1, 2026 09:12
@azat azat marked this pull request as ready for review April 1, 2026 09:12
@azat

azat commented Apr 1, 2026

Copy link
Copy Markdown
Member Author

The problem with XRAY was that the symbols has relative offsets, fixed in the new patch version

@azat azat marked this pull request as draft April 2, 2026 15:08
@azat

azat commented Apr 2, 2026

Copy link
Copy Markdown
Member Author

Actually now we have a virtual addresses in trace_log, and after restart they cannot be resolved anymore

@alexey-milovidov

Copy link
Copy Markdown
Member

The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.

@alexey-milovidov

Copy link
Copy Markdown
Member

Merged latest master into the branch (it was ~14k commits behind and red).

  • The merge is conflict-free; the net diff is unchanged (the same 8 source files).
  • The prior CI failures were all the flaky 03172_error_log_table_not_empty, which is unrelated to this change (this PR does not touch that test) and was fixed on master by e7449c8e5eb ("fix flaky 03172_error_log_table_not_empty"), now included via the merge. The Stress test (arm_msan) fix from Fix MSan use-of-uninitialized-value in SimSIMD SVE functions #101239 (merged) is also picked up, as requested.
  • Built clickhouse cleanly (0 errors) and empirically verified symbol resolution with a fresh server:
    • system.stack_traceaddressToSymbol/demangle resolves correct names across objects (e.g. libc __syscall_cancel_arch/start_thread mixed with DB::TCPHandler::run()).
    • addressToLine resolves correct src/...:line locations (DWARF file-relative conversion via addr - address_begin).
    • system.trace_log (query profiler / TraceCollector path) resolves correct names too.

Note for reviewers: this still leaves the open question you raised — with absolute virtual addresses stored in trace_log, entries collected before a server restart can no longer be symbolized (PIE/ASLR changes the base). That is a design tradeoff for @azat / maintainers to decide; I did not change the approach here.

#endif

trace.emplace_back(static_cast<UInt64>(addr) - offset);
trace.emplace_back(static_cast<UInt64>(addr));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stores ASLR-dependent virtual addresses in durable system.trace_log.trace rows, but the row does not keep the load base needed to rebase them later. After a server restart, addressToSymbol, addressToLine, and flameGraph search the current process ranges, so rows collected before the restart can no longer symbolize even when build_id matches; before this change, main-binary frames were normalized to stable file-relative offsets.

Please keep a stable persisted representation for trace (for example, object identity plus file-relative offset, or at least the previous main-binary file offsets) or store enough load-base metadata to rebase old rows before symbolization.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a real trade-off worth a maintainer decision, so leaving it open for @azat / @al13n321 rather than resolving it here.

The concern is valid: storing absolute (ASLR-dependent) virtual addresses in the durable system.trace_log.trace means rows collected before a restart can no longer be symbolized after a PIE binary is reloaded at a different base, whereas the previous code normalized main-binary frames to stable file-relative offsets.

However, the previous representation was not stable/correct either: it only normalized main-binary frames and stored raw virtual addresses for shared-library frames, and — more importantly — file offsets overlap between objects, which is exactly the wrong-symbol bug this PR fixes. There is no single UInt64 representation that is both unambiguous across objects and stable across restarts.

A proper persisted fix would need object identity (e.g. build-id) plus a file-relative offset per frame, which is a larger trace_log schema change and out of scope for this PR. The live consumers (system.stack_trace, exception traces) are unaffected since they symbolize against the current process. Flagging for a decision on whether to accept the cross-restart limitation here or follow up with a schema change.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed 73a65a9163a taking the "explicitly document" path the bot offered as an alternative to a schema change, since the proper fix (object identity + file-relative offset per frame) is a trace_log schema change that is a maintainer call:

  • Added an explanatory comment in TraceCollector::run next to where the address is stored, spelling out that we now persist the absolute (ASLR-dependent) virtual address — unambiguous across objects (file offsets overlap between the main binary and shared libraries), which is the bug this PR fixes — and the trade-off that such rows symbolize only while the collecting process is alive and may not re-symbolize after a restart of a PIE binary.
  • Documented the same on the trace column in docs/en/operations/system-tables/trace_log.md.

The cross-object correctness fix is the point of this PR and the live consumers (system.stack_trace, exception traces, in-process system.trace_log symbolization) are unaffected. Leaving this thread open for @azat / @al13n321 to decide whether the cross-restart limitation is acceptable here or warrants the follow-up schema change.

Comment thread src/Functions/addressToLine.h
Comment thread src/Interpreters/TraceLog.cpp
After switching SymbolIndex to absolute virtual addresses, a trace frame may
belong to a shared library rather than the main binary. `addressToLine`,
`addressToLineWithInlines` and `TraceLog`'s line resolution still hardcoded
`thisObject()` (the main binary) for every DWARF lookup, so a `libc` frame
would be looked up in ClickHouse DWARF and return the ClickHouse binary path or
an empty result, even though `findSymbol` already resolved the correct symbol.

Use `findObject` to resolve the object that actually contains the address and
use that object's `elf`/`address_begin` (or `dsym`/`slide` on macOS). For
main-binary frames `findObject` returns the same object as `thisObject`, so
behavior is unchanged there.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

Picked this up to move it forward (CI was green, branch ~719 commits behind master but no conflicts, so no re-merge needed).

Addressed two of the three review-bot threads (pushed 9215b8ffd54):

  • addressToLine.h and TraceLog.cpp line resolution still hardcoded thisObject() (the main binary) for every DWARF lookup. Now that SymbolIndex stores absolute virtual addresses and findSymbol resolves shared-library frames, these lookups have to use the containing object too — otherwise a libc frame symbolizes correctly but its line is queried against ClickHouse DWARF. Switched both to findObject(addr) and use that object's elf/address_begin (and dsym/slide on macOS). For main-binary frames findObject returns the same object as thisObject, so behavior is unchanged there.

Left the TraceCollector.cpp persistence thread open for a maintainer call — storing ASLR-dependent virtual addresses in the durable system.trace_log is a real trade-off (old rows can't symbolize across a restart) but the previous file-offset representation had the cross-object overlap bug this PR fixes, and a fully stable fix needs an object-identity + offset schema change. Details in the thread.

Verification (locally built clickhouse, fresh server):

  • 02420_stracktrace_debug_symbols — main-binary file:line:col resolves (Common/Exception.cpp:139:7: DB::Exception::Exception).
  • 02161_addressToLineWithInlines — matches reference (has inlines: 1) + FUNCTION_NOT_ALLOWED guard (code 446).
  • Shared-library resolution confirmed: start_thread / thread_start now resolve to the correct glibc symbol and addressToLine returns the glibc debug object path instead of the ClickHouse binary (this is the behavior the bot's addressToLine/TraceLog threads asked for).

Comment thread src/Common/SymbolIndex.cpp
Address review feedback on src/Common/SymbolIndex.cpp: add a focused unit
test that resolves a known address from a loaded shared library (the
runtime address of `getpid`, obtained via `dlsym` to avoid a main-binary
PLT stub) and verifies that `SymbolIndex::findSymbol` returns the symbol
that actually contains the address and starts exactly at the C library
symbol reported by `dladdr`.

This exercises the cross-object lookup path that the fix repairs: before
storing absolute virtual addresses, the file-relative offset ranges of
the main binary and shared libraries overlapped, so the lookup could
return a symbol from the wrong object. The test skips on fully static
builds (for example, musl) where the C library is part of the main
binary.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexey-milovidov

Copy link
Copy Markdown
Member

This is very low priority - the only shared library we use is libc, and even that we will remove soon.

alexey-milovidov and others added 5 commits June 11, 2026 05:49
…log.trace

Switching `SymbolIndex` and its consumers to absolute virtual addresses fixes
wrong cross-object symbol resolution, but it also changes the durable
`system.trace_log.trace` representation: it now stores ASLR-dependent runtime
addresses instead of the previous main-binary file-offset normalization.

Document the trade-off explicitly (as raised in the review thread): such rows
symbolize correctly only while the collecting process is alive and may no longer
symbolize after a restart of a PIE binary. Add an explanatory comment in
`TraceCollector::run` and a note on the `trace` column in the system table docs.
A representation that is both unambiguous across objects and stable across
restarts would require object identity plus a file-relative offset (a
`trace_log` schema change), left as a follow-up for maintainers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread docs/en/operations/system-tables/trace_log.md
…description

The warning that `system.trace_log.trace` stores absolute (ASLR-dependent)
virtual addresses was added only to the autogenerated block of
`docs/en/operations/system-tables/trace_log.md`, while
`TraceLogElement::getColumnsDescription` still carried the old shorter `trace`
contract.

That block is regenerated from `system.columns` comments by
`utils/generate-system-tables-docs`, so the warning would be overwritten on the
next regeneration, and `system.columns.comment` for `trace` would keep exposing
the incomplete description.

Move the exact warning text into the `trace` column description in
`src/Interpreters/TraceLog.cpp` so the canonical source carries it. The text
matches the existing `.md` block byte-for-byte, so regenerating the docs leaves
the autogenerated block unchanged.

Addresses the AI review finding on `docs/en/operations/system-tables/trace_log.md:37`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/Common/SymbolIndex.h
Comment thread src/Functions/addressToLine.h
alexey-milovidov and others added 3 commits June 29, 2026 13:42
After switching `SymbolIndex` to store absolute virtual addresses instead
of file-relative offsets, `system.symbols.address_begin`/`address_end`
expose those absolute addresses. Update the `StorageSystemSymbols` column
descriptions and the matching `docs/en/operations/system-tables/symbols.md`
to say so, and spell out that for position-independent (PIE) builds these
are ASLR-dependent runtime addresses (for the default non-PIE build they
still match the address in the binary).

Addresses the AI review finding on src/Common/SymbolIndex.h.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Now that `addressToLine`/`addressToLineWithInlines` resolve the object that
contains the address via `findObject`, the argument must be an absolute
virtual address from the running process (e.g. `system.trace_log.trace` or
`system.stack_trace`). Note in the function documentation that the numeric
addresses in the examples come from one specific build and will not match
the reader's binary, and that for position-independent (PIE) builds they are
ASLR-dependent and change between restarts.

Addresses the AI review finding on src/Functions/addressToLine.h.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@clickhouse-gh

clickhouse-gh Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.40% 85.40% +0.00%
Functions 92.60% 92.60% +0.00%
Branches 77.60% 77.60% +0.00%

Changed lines: Changed C/C++ lines covered by tests: 95/104 (91.35%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 2 line(s) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants