Print newly covered code in LLVM Coverage Job by alexbakharew · Pull Request #104714 · ClickHouse/ClickHouse · GitHub
Skip to content

Print newly covered code in LLVM Coverage Job#104714

Merged
alexbakharew merged 12 commits into
masterfrom
alexey-bakharev/coverage-comment-for-test-changes
May 20, 2026
Merged

Print newly covered code in LLVM Coverage Job#104714
alexbakharew merged 12 commits into
masterfrom
alexey-bakharev/coverage-comment-for-test-changes

Conversation

@alexbakharew

@alexbakharew alexbakharew commented May 12, 2026

Copy link
Copy Markdown
Contributor

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

...

Version info

  • Merged into: 26.5.1.914

@clickhouse-gh

clickhouse-gh Bot commented May 12, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-ci label May 12, 2026
alexbakharew and others added 2 commits May 12, 2026 17:34
Add a regression test for `NOT_FOUND_COLUMN_IN_BLOCK` from #103508 (column
`timestamp` renamed to `timestamp_0` in distributed CTE when underlying
table is non-empty). The bug no longer reproduces on master.

The analyzer still renames `__table1.timestamp` to `__table1.timestamp_0`
when `optimize_read_in_order = 0`, which random-settings runs in CI hit:

  Code: 10. DB::Exception: Not found column __table1.timestamp in block.
  There are only columns: __table1.timestamp_0, __global_row_index.

That is a separate, narrower path through the analyzer than the one
covered by this regression test. Pin the offending setting to `1` in the
query so the test reliably exercises the originally reported scenario.

Also cover a multi-shard cluster: the single-shard `test_shard_localhost`
does not exercise the remote read path that the original report observed.
A second query against `test_cluster_two_shards_localhost` (two shards on
`localhost:9000`) covers the case where the `Distributed` query fans out
to multiple shards and merges the results on the initiator.

Closes #103508.

Squashed from #104548 (commits
8c80956, 70a1ed2, d1e9c7e).

Co-Authored-By: Alexey Milovidov <milovidov@clickhouse.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ts-only PRs

The previous trigger required *every* changed path to be a test file, but a
PR that adds a test alongside CI script / config / doc edits (a common shape
for "add regression test" or "fix CI" PRs) leaves the production ClickHouse
binary identical to the master baseline just the same — only the set of
tests running against that binary differs.

Replace `_only_tests_changed` (all paths are tests) with
`_binary_unchanged AND _tests_changed`, where `_binary_unchanged` is true iff
no path is production C/C++ source (i.e. `.cpp` / `.h` / … outside `tests/`,
`src/**/tests/`, and `contrib/`). This admits the common mixed PR while
still skipping PRs that touch real C++ sources (where the per-file diff
coverage report is the appropriate signal).

Observed on #104548 alongside
ci/jobs/llvm_coverage_job.py changes: the analysis was skipped because the
PR was no longer "tests-only" after the script edits, even though the
binary was unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread ci/jobs/llvm_coverage_job.py Outdated
alexbakharew and others added 2 commits May 13, 2026 14:05
…essage

The truncation footer used to read "see full coverage report for the rest" —
but the [Details] link in the GH comment *is* this log, and the linked
genhtml report shows overall coverage without distinguishing newly-covered
from always-covered lines, so there is no "full report" to defer to.

Two changes:

  * The "Top files" inventory now lists every file with newly-covered code,
    not just the top 20. The inventory is one line per file, so it stays
    cheap even for large outputs and gives an honest, complete picture.
  * Raise the snippet-preview cap from 400 lines to 2000 (the previous limit
    was reached after roughly a dozen files), and replace the misleading
    truncation footer with one that points to the inventory above and
    suggests opening source files at the listed line numbers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ol list

Two follow-ups on the newly-covered-code log:

  * Print the overall line and function coverage delta vs master baseline at
    the very top of the log, with 4 decimal places of precision so that
    fractional-percentage-point movements remain visible (e.g. `+0.0004 pp,
    +96 covered`). Computed from the parsed tracefiles directly — fast, and
    consistent within the log even though it may diverge from `lcov
    --summary` at the rounding level.
  * Drop the trailing list of "Newly covered functions". It printed mangled
    C++ symbol names (`_ZNK2DB12_GLOBAL__N_1...`) that were unreadable and
    added no signal beyond the per-file function counts that the inventory
    section already shows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread ci/jobs/llvm_coverage_job.py Outdated
Two newly-covered groups separated by only 1-2 lines used to render as two
separate stanzas with overlapping context windows, so the same surrounding
code printed two or three times in a row — visually noisy and hard to read
across a long function body.

After grouping consecutive newly-covered lines, also merge groups whose
context windows would touch or overlap (gap <= 2*context + 1). Inside the
merged block, only the lines that were actually newly covered carry the ">>"
marker; gap lines are shown as ordinary context. The resulting snippet is
one contiguous, scrollable region instead of several near-duplicate stanzas.

For example, in MergeTreeReaderTextIndex.cpp where lines 183, 184, 186, 188,
190-194, 197-199, 201-203, 205-208, 210-212 are newly covered, output drops
from nine stanzas (each re-printing 5-7 surrounding lines) to one block
183-212 with the right ">>" markers in place.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread ci/jobs/llvm_coverage_job.py Outdated
alexbakharew and others added 3 commits May 15, 2026 09:04
Verification on PR #104714 showed that the single-baseline "newly covered"
count is dominated by run-to-run variance of the coverage build, not by the
PR's actual test contribution. Two adjacent master commits with no PR in
between produce ~1000 lines of `0 -> >0` flutter (and ~1000 of the reverse),
caused by randomized test settings, flaky-test sampling, and timing-
sensitive code paths covering different lines on each run.

Fix: download a second, older master baseline alongside the nearest one,
and only count a line as newly covered if it is uncovered in BOTH master
baselines and covered in the current PR build. Lines that flicker between
the two master runs are dropped as flutter.

Implementation:

  * `generate_diff_coverage_report.sh`: after downloading
    `base_llvm_coverage.info`, continue scanning `PREV_30_COMMITS` for the
    next available `.info`. Save as `base_llvm_coverage_2.info`. The second
    baseline is optional — if no second commit has a published report, the
    analysis falls back to single-baseline mode with a printed warning.
  * `print_newly_covered_code.py`: when `base_llvm_coverage_2.info` exists,
    build a compact `set[(rel, ln)]` of zero-coverage lines from it (memory-
    cheaper than the full nested dict) and require every claimed transition
    to also be in that set. Same logic for function-level transitions.

Limitations: this filters out ~13% of the false positives on a typical
tests-only run (measured locally against PR #104714: 2256 -> 1970 lines
with noise filter, 3304 -> 2886 lines without). The remaining ~87% of the
variance is inside the build itself and would need per-test attribution
(llvm-cov `.profraw` mapping) to remove fully.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verification on https://s3.amazonaws.com/clickhouse-test-reports/PRs/104714/21e2b6ff1a8a727eea769ce5fa0d08d0a878afd6/llvm_coverage/job.log
showed that after merging master into the PR branch the test file got
absorbed into the merge base, so the GitHub `compare` API for the PR no
longer lists it as a changed file. The PR diff then contained only CI
script edits, the `_tests_changed` gate stayed False, and no coverage
comment was posted even though the job ran end-to-end and both baseline
plus current `.info` files were available.

Widen the gate from "tests changed" to "production binary is unchanged".
Any PR that does not touch C/C++ sources outside `tests/`,
`src/**/tests/`, and `contrib/` leaves the binary identical to baseline,
so the global delta is a meaningful signal worth surfacing on its own:

  * production C/C++ PR -> existing differential coverage report path
  * tests-only / mixed-with-scripts -> newly-covered analysis + comment
  * scripts-only / docs-only / config-only -> comment with global delta only
  * contrib-only -> comment with global delta only

The newly-covered analysis itself still requires `_tests_changed`,
because without test changes there is no per-line attribution to make.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread ci/jobs/llvm_coverage_job.py Outdated
@alexbakharew alexbakharew force-pushed the alexey-bakharev/coverage-comment-for-test-changes branch 2 times, most recently from 738e5ba to 1a8ee23 Compare May 20, 2026 11:22
@clickhouse-gh

clickhouse-gh Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.10% +0.00%
Functions 91.40% 91.40% +0.00%
Branches 76.50% 76.60% +0.10%

Changed lines: No C/C++ source files changed — skipping uncovered code analysis.

Full report

Comment thread ci/jobs/llvm_coverage_job.py
…list

Three review comments rolled into one:

  * `_is_test_path` previously returned True for everything under `tests/`,
    which included `tests/ci/**`, `tests/integration/helpers/**`,
    `tests/jepsen.clickhouse/**` and similar infrastructure. A change that
    only touched a helper would still trip `_tests_changed = True` and post
    a comment titled "newly covered by added/modified tests" even though no
    runnable test changed. Narrow `_is_test_path` to a strict allowlist:
    `tests/queries/`, `tests/integration/test_*/`, `tests/performance/`,
    `tests/stress/`, and `src/**/tests/`.

  * Replace the previous `_affects_binary` denylist with a conservative
    `_is_non_binary_path` allowlist (tests + tests infrastructure under
    `tests/`, `src/**/tests/`, `ci/`, `.github/`, `docs/`, and a small set
    of top-level metadata files). Anything outside the allowlist — cmake
    files at any depth, top-level `CMakeLists.txt`, parser/grammar/codegen
    inputs, Rust sources, `utils/**`, `docker/**`, `.gitmodules`, and
    crucially `contrib/**` — now defaults to "potentially affects the
    production binary" and suppresses the newly-covered analysis. The
    differential coverage report (or no comment) is the right signal when
    the binary may have changed.

  * Require `_tests_changed` in `_has_coverage_data`, not just
    `_binary_unchanged`. The previous gate posted a comment for any
    binary-unchanged PR including docs-only and helper-only changes; the
    global delta there is dominated by run-to-run coverage variance, which
    is misleading signal. Restore the prior behavior of skipping pure
    non-C/C++ PRs while still posting when a real runnable test changed.

Verified behavior on 14 scenarios (FT/IT/Unit additions, helper-only,
contrib-only, contrib+tests, docs-only, CI-script-only, production C++,
cmake/, top-level CMakeLists.txt, mixed tests+scripts). All classify as
expected: comments only appear when a runnable test changed and the
production binary is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread ci/jobs/llvm_coverage_job.py
Comment thread ci/jobs/llvm_coverage_job.py
@alexbakharew alexbakharew enabled auto-merge May 20, 2026 18:43
@alexbakharew alexbakharew added this pull request to the merge queue May 20, 2026
Merged via the queue into master with commit c88659b May 20, 2026
166 checks passed
@alexbakharew alexbakharew deleted the alexey-bakharev/coverage-comment-for-test-changes branch May 20, 2026 20:38
@robot-clickhouse robot-clickhouse added the pr-synced-to-cloud The PR is synced to the cloud repo label May 20, 2026
DavidHe-2008 pushed a commit to DavidHe-2008/ClickHouse that referenced this pull request Jun 1, 2026
…coverage-comment-for-test-changes

Print newly covered code in LLVM Coverage Job
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-ci pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants