Support classic PromQL histogram_quantile #104496
Add the common PromQL compliance reporting and developer notes used by the follow-up TimeSeries PromQL lowering slices. Keep the parser compliance corpus expectation explicit for selectors Prometheus rejects.
Add focused Prometheus protocol tests for user-facing native lowering errors: unsupported-but-parseable functions and invalid date-function arity. These are semantic guardrails for clear errors, not broad coverage padding.
Add native PromQL lowering for classic histogram_quantile over _bucket series using the vector-grid query shape. The lowering groups buckets by timestamp and output labels, drops le and __name__, coalesces duplicate bounds, and applies Prometheus bucket quantile edge semantics through a time-series helper.
Add the missing unordered PromQL evaluation helper used by the classic histogram tests and make NaN bucket bounds deterministic before sorting. The NaN case is a semantic guard for parseable NaN le labels, not coverage padding.
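The sorting and coalescing steps described in the two commits above can be sketched as follows. This is a minimal illustration, not the PR's helper: the `Bucket` struct, `bucketLess`, and `sortAndCoalesce` are hypothetical names, and placing NaN bounds after every other bound is an assumption made only so that the comparator is a valid strict weak ordering. Equal bounds are merged by summing counts, which is how upstream Prometheus coalesces duplicate buckets as far as I understand it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical bucket representation: cumulative count up to upper_bound.
struct Bucket
{
    double upper_bound;
    double count;
};

// Strict weak ordering over bounds. Plain `<` is not one when NaN is present,
// so NaN is ordered after every other bound (an assumption of this sketch).
inline bool bucketLess(const Bucket & a, const Bucket & b)
{
    const bool a_nan = std::isnan(a.upper_bound);
    const bool b_nan = std::isnan(b.upper_bound);
    if (a_nan || b_nan)
        return !a_nan && b_nan;
    return a.upper_bound < b.upper_bound;
}

// Sort by bound, then merge neighbours with equal bounds by summing counts.
// NaN bounds never compare equal, so they are never merged.
inline std::vector<Bucket> sortAndCoalesce(std::vector<Bucket> buckets)
{
    std::stable_sort(buckets.begin(), buckets.end(), bucketLess);
    std::vector<Bucket> out;
    for (const Bucket & bucket : buckets)
    {
        if (!out.empty() && out.back().upper_bound == bucket.upper_bound)
            out.back().count += bucket.count;
        else
            out.push_back(bucket);
    }
    return out;
}
```

Sorting with a comparator that is not a strict weak ordering is undefined behaviour for `std::sort` and `std::stable_sort`, which is why the NaN case needs explicit handling before the sort.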
Avoid fork-local wording in the PromQL review base. These are ClickHouse PromQL regression cases and evidence requirements, not fork-specific artifacts.
Remove the standalone native histogram discovery Markdown note from the PromQL split branches.
Remove downstream promshim-specific evidence and non-goal language from the native PromQL lowering README so the note stays focused on ClickHouse review guidance.
Rewrite the native PromQL lowering notes around external SQL-transpiler measurements. Document measured and mixed SQL patterns with target PromQL shapes, pseudocode, and the ClickHouse-side validation expected for in-tree changes.
Replace external corpus row names with concrete PromQL expressions and query-range context in the native PromQL lowering notes.
Match Prometheus parser behavior by requiring vector selectors to contain at
least one matcher that cannot match an empty label value. This prevents broad
selectors like `{job=~".*"}` or `{__name__=~".*"}` from silently selecting every
metric when the user likely intended an explicit non-empty constraint.
Users who intentionally want all real metrics can still use `{__name__=~".+"}`,
and empty-matching label matchers remain valid when combined with that explicit
metric-name matcher.
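The rule can be sketched as a check over a selector's matchers. This is an illustration under stated assumptions: `MatchOp`, `Matcher`, `matchesEmpty`, and `selectorIsValid` are hypothetical names, and `std::regex` with full-match semantics stands in for the RE2 matching that ClickHouse actually uses. A metric name written outside the braces is treated here as an implicit `__name__="..."` matcher.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

enum class MatchOp { Equal, NotEqual, Regex, NotRegex };

struct Matcher
{
    std::string label;
    MatchOp op;
    std::string value;
};

// True if the matcher accepts an empty (that is, absent) label value.
// std::regex_match requires the whole input to match, mirroring anchored PromQL regexes.
inline bool matchesEmpty(const Matcher & m)
{
    static const std::string empty;
    switch (m.op)
    {
        case MatchOp::Equal:    return m.value.empty();
        case MatchOp::NotEqual: return !m.value.empty();
        case MatchOp::Regex:    return std::regex_match(empty, std::regex(m.value));
        case MatchOp::NotRegex: return !std::regex_match(empty, std::regex(m.value));
    }
    return false;
}

// A selector is accepted only if at least one matcher cannot match the empty value.
// The loop deliberately has no early exit, so every regex is still compiled.
inline bool selectorIsValid(const std::vector<Matcher> & matchers)
{
    bool has_non_empty_matcher = false;
    for (const Matcher & m : matchers)
        if (!matchesEmpty(m))
            has_non_empty_matcher = true;
    return has_non_empty_matcher;
}
```

Under this sketch `{job=~".*"}` is rejected, while `{__name__=~".+", job=~".*"}` is accepted because the `.+` matcher cannot match an empty value.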
Deduplicate expanded compliance cases by query text before checking the `should_fail` expectation. A query listed once as expected success and once as expected failure is now reported as conflicting metadata instead of slipping through as two separate cases.
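A minimal sketch of that deduplication, assuming a flat list of expanded cases. The actual compliance tooling may be written in another language; `ComplianceCase`, `DedupResult`, and `deduplicate` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

struct ComplianceCase
{
    std::string query;
    bool should_fail;
};

struct DedupResult
{
    std::vector<ComplianceCase> cases;   // first occurrence of each query text
    std::vector<std::string> conflicts;  // queries listed with both expectations
};

// Deduplicate by query text first, then compare the should_fail expectation of
// any repeat against the first occurrence. Each conflicting query is reported once.
inline DedupResult deduplicate(const std::vector<ComplianceCase> & expanded)
{
    std::map<std::string, bool> first_expectation;
    std::set<std::string> already_reported;
    DedupResult result;
    for (const ComplianceCase & c : expanded)
    {
        auto inserted = first_expectation.emplace(c.query, c.should_fail);
        if (inserted.second)
            result.cases.push_back(c);
        else if (inserted.first->second != c.should_fail && already_reported.insert(c.query).second)
            result.conflicts.push_back(c.query);
    }
    return result;
}
```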
…plit/promql-classic-histograms
The histogram quantile tests intentionally compare unordered vectors, but the helper still required the two Prometheus fixtures to return series in the same order before sorting. Sort both Prometheus responses before comparison so the helper only checks series content.
Mirror the upstream PromQL compliance corpus for label transformation invalid label-name cases. These cases should remain should-fail entries so ClickHouse acceptance of invalid destination labels stays visible as a rejection mismatch rather than being counted as expected coverage.
Do not stop selector validation after finding the first matcher that cannot match an empty label value. Later regex matchers still need syntax validation so equivalent selector reorderings cannot change parse validity. Add parser regressions for invalid regex matchers after an implicit metric-name matcher and after explicit non-empty matchers.
…plit/promql-classic-histograms
Add parser and HTTP API coverage for Prometheus selector validation rules: negated non-empty matchers, invalid regex validation after non-empty matchers, and duplicate metric-name matchers. Rejecting duplicate outside/in-brace metric names keeps ClickHouse parsing aligned with Prometheus expression parsing.
Keep `__name__` as part of classic histogram identity while grouping buckets for `histogram_quantile`, then drop it through the normal result path so duplicate output labelsets are rejected like Prometheus. Expand focused histogram tests for q edge values, malformed and sparse buckets, duplicate bounds, monotonicity repair, tiny deltas, transformed rate inputs, and post-name-drop collisions.
Ignore string literals and selector matchers when routing compliance failures by query shape. This keeps report categories from treating label values or exponent signs as binary operators, and recognizes bare '<' and '>' comparisons.
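One way to picture the masking step is to blank out string literals and brace-delimited matcher lists before looking for operators. The sketch below covers only that masking; it does not handle exponent signs, and `maskLiteralsAndMatchers` is a hypothetical name rather than the report's actual routine. It assumes PromQL's three string quote styles (double, single, and raw backtick).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replace string literals and {...} matcher lists with spaces so that a later
// operator scan only sees expression-level characters.
inline std::string maskLiteralsAndMatchers(const std::string & query)
{
    std::string out;
    int brace_depth = 0;
    for (std::size_t i = 0; i < query.size(); ++i)
    {
        const char c = query[i];
        if (c == '"' || c == '\'' || c == '`')
        {
            const char quote = c;
            ++i;
            // Skip to the closing quote; backslash escapes apply except in raw strings.
            while (i < query.size() && query[i] != quote)
            {
                if (query[i] == '\\' && quote != '`')
                    ++i;
                ++i;
            }
            out += ' ';
            continue;
        }
        if (c == '{')
        {
            ++brace_depth;
            out += ' ';
            continue;
        }
        if (c == '}')
        {
            if (brace_depth > 0)
                --brace_depth;
            out += ' ';
            continue;
        }
        if (brace_depth == 0)
            out += c;
    }
    return out;
}
```

With the input `up{job="a-b"} > 1`, the masked text no longer contains `-` but still contains `>`, so only the real comparison would be seen by an operator scan.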
Mirror Prometheus BucketQuantile by treating only exactly zero observations as empty and by letting interpolation handle negative bucket counts. Malformed arithmetic over classic buckets can produce negative cumulative counts, and Prometheus still returns the computed quantile.
Use the existing eps parameter when checking live Prometheus HTTP results against expected JSON, and compare ClickHouse HTTP output against the live Prometheus result. This keeps numeric drift bounded to eps while preserving exact shape, label, and timestamp checks.
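The comparison policy can be sketched as follows: labels and timestamps must match exactly, and only the numeric value is allowed to drift by at most `eps`. Treating `eps` as an absolute tolerance, requiring NaN to match only NaN, and requiring infinities to match exactly are assumptions of this sketch; `Sample`, `valuesClose`, and `resultsMatch` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Sample
{
    std::map<std::string, std::string> labels;
    long long timestamp_ms;
    double value;
};

// Numeric drift is bounded by eps; NaN and infinities must agree exactly.
inline bool valuesClose(double a, double b, double eps)
{
    if (std::isnan(a) || std::isnan(b))
        return std::isnan(a) && std::isnan(b);
    if (std::isinf(a) || std::isinf(b))
        return a == b;
    return std::fabs(a - b) <= eps;
}

// Shape, labels, and timestamps are compared exactly; only values use eps.
inline bool resultsMatch(const std::vector<Sample> & lhs, const std::vector<Sample> & rhs, double eps)
{
    if (lhs.size() != rhs.size())
        return false;
    for (std::size_t i = 0; i < lhs.size(); ++i)
    {
        if (lhs[i].labels != rhs[i].labels || lhs[i].timestamp_ms != rhs[i].timestamp_ms)
            return false;
        if (!valuesClose(lhs[i].value, rhs[i].value, eps))
            return false;
    }
    return true;
}
```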
Matcher validation intentionally turns malformed label regexes into PromQL parse errors. Construct RE2 with log_errors disabled so rejected selectors do not also emit internal RE2 error log lines on every parse attempt.
…plit/promql-classic-histograms
Prometheus only treats exactly zero classic-histogram observations as empty. If the cumulative +Inf bucket count is NaN, BucketQuantile continues into the bucket search and returns the highest finite upper bound path instead of short-circuiting to NaN. Remove the extra NaN short-circuit and cover this malformed bucket-vector case in the Prometheus protocol integration tests.
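The control flow described here can be sketched as below. This is a simplified reading of upstream `BucketQuantile`, not the PR's time-series helper: it assumes the buckets are already sorted, coalesced, and monotonic, it uses a linear scan where upstream uses a binary search, and the names `Bucket` and `bucketQuantile` are illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Bucket
{
    double upper_bound;
    double count;  // cumulative count up to upper_bound
};

// Assumes `buckets` is sorted by upper_bound, coalesced, and monotonic.
inline double bucketQuantile(double q, const std::vector<Bucket> & buckets)
{
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double inf = std::numeric_limits<double>::infinity();
    if (std::isnan(q)) return nan;
    if (q < 0) return -inf;
    if (q > 1) return inf;
    if (buckets.size() < 2) return nan;
    if (!(std::isinf(buckets.back().upper_bound) && buckets.back().upper_bound > 0))
        return nan;

    const double observations = buckets.back().count;
    if (observations == 0)  // only exactly zero is "empty"; a NaN count falls through
        return nan;

    double rank = q * observations;
    // First bucket (excluding the last) whose count reaches the rank.
    // With a NaN rank the comparison is always false, so b ends at the last bucket.
    std::size_t b = 0;
    while (b + 1 < buckets.size() && !(buckets[b].count >= rank))
        ++b;

    if (b == buckets.size() - 1)
        return buckets[buckets.size() - 2].upper_bound;  // highest finite upper bound
    if (b == 0 && buckets[0].upper_bound <= 0)
        return buckets[0].upper_bound;

    double start = 0;
    const double end = buckets[b].upper_bound;
    double count = buckets[b].count;
    if (b > 0)
    {
        start = buckets[b - 1].upper_bound;
        count -= buckets[b - 1].count;
        rank -= buckets[b - 1].count;
    }
    return start + (end - start) * (rank / count);  // linear interpolation in the bucket
}
```

With buckets `(1, 5)` and `(+Inf, NaN)` and `q = 0.5`, the rank is NaN, no bucket satisfies `count >= rank`, and the sketch returns `1`, the highest finite upper bound, instead of short-circuiting to NaN.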
    if (buckets.empty())
        return std::numeric_limits<Float64>::quiet_NaN();

    if (std::any_of(buckets.begin(), buckets.end(), [](const Bucket & bucket) { return std::isnan(bucket.upper_bound); }))
This early return diverges from Prometheus BucketQuantile semantics.
Upstream does not reject all NaN bucket boundaries at this stage, and some malformed/transformed vectors still produce numeric results. For example, with q=0 and buckets (0,2), (NaN,0), (+Inf,0.5), Prometheus returns 0, while this branch forces NaN.
Because this PR targets compatibility for classic histogram_quantile behavior (including malformed bucket inputs), please remove the unconditional NaN-upper-bound short-circuit and follow upstream bucket handling/order instead.
LLVM Coverage Report
Changed lines: 83.70% (385/460)
Keep the PromQL parser, modifier, and selector fixes from the local matrix pass, but move the high-value regression coverage into the existing protocol and compliance suites instead of adding the standalone matrix harness to the PR branch.

This keeps fixed-time range evaluation covered through the existing evaluation tests and records the parser, matcher, timestamp-modifier, and sparse range cases in the shared compliance corpus.
Build (arm_tidy) flagged the vector-grid finalization predicate because one call cloned an AST pointer while another argument moved the same pointer. C++ argument evaluation order can make that a use-after-move.

Create separate AST nodes for the null and stale-marker checks so the generated predicate stays the same without relying on argument evaluation order.
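The hazard and the fix can be illustrated with a toy AST. The `Node` type, `makeNode`, `buildPredicate`, and the `isNull` / `isStale` function names are hypothetical stand-ins, not ClickHouse's AST classes. The fragile form is shown only in a comment: because function arguments are evaluated in an unspecified order, the moved-from pointer may be dereferenced by the `clone()` call.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<Node>;

struct Node
{
    std::string name;
    std::vector<NodePtr> children;

    // Deep copy of this subtree.
    NodePtr clone() const
    {
        auto copy = std::make_shared<Node>();
        copy->name = name;
        for (const auto & child : children)
            copy->children.push_back(child->clone());
        return copy;
    }
};

inline NodePtr makeNode(std::string name, NodePtr a = nullptr, NodePtr b = nullptr)
{
    auto node = std::make_shared<Node>();
    node->name = std::move(name);
    if (a) node->children.push_back(std::move(a));
    if (b) node->children.push_back(std::move(b));
    return node;
}

inline NodePtr buildPredicate(const NodePtr & value)
{
    // Fragile (do not do this):
    //   makeNode("or", makeNode("isNull", value->clone()), makeNode("isStale", std::move(value)));
    // If the second argument is evaluated first, `value` is already empty when clone() runs.

    // Safe: create both argument nodes up front, then move each one exactly once.
    NodePtr null_arg = value->clone();
    NodePtr stale_arg = value->clone();
    return makeNode("or",
                    makeNode("isNull", std::move(null_arg)),
                    makeNode("isStale", std::move(stale_arg)));
}
```

Each local is moved exactly once in the final call, so the result no longer depends on which argument the compiler evaluates first.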
Parse `@ start()` and `@ end()` as timestamp modifier variants instead of rewriting the raw query text before parsing. This keeps label values and regular expressions untouched while resolving symbolic bounds in the query range planner. Propagate symbolic `@` modifiers through the PromQL tree and SQL lowering so instant selectors and range-vector functions bind to the query start or end consistently with Prometheus.
The selector-matcher validation branch should not also introduce symbolic `@ start()` and `@ end()` modifier parsing. Remove that parser/evaluation feature from the base branch so the behavior can be reviewed independently. Move the symbolic timestamp-modifier compliance cases out with the feature. Keep the offset-only case here because it exercises existing modifier behavior.
Symbolic `@ start()` and `@ end()` modifiers need parser-level handling so selectors, string literals, and regular expressions are not rewritten as raw text. Remove the temporary text-substitution path from the base branch; the symbolic modifier implementation belongs with the dedicated parser change.
…tograms # Conflicts: # src/Storages/TimeSeries/PrometheusQueryToSQL/applyFunction.cpp
Prometheus instant-vector selectors treat a stale marker as the series being absent, but this filter only drops NULL values. For histogram_quantile over a direct bucket selector, last_over_time can leave the stale-marker payload in values, so a stale bucket is grouped as a real bucket count and passed to timeSeriesPrometheusHistogramQuantile.
That changes results whenever one bucket goes stale independently of the others: for example, if le="1" is stale at the evaluation time and le="+Inf" is still present, Prometheus omits the finite bucket and returns NaN because fewer than two buckets remain, while this path feeds a NaN count and can fall through to a numeric upper bound. Please filter the Prometheus stale marker here as well, before grouping buckets.
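A sketch of the requested filter, assuming bucket samples arrive as a flat list with a NULL flag. Prometheus encodes the stale marker as a NaN with the specific bit pattern `0x7ff0000000000002`, so it has to be detected by comparing bits rather than with `std::isnan`, which would also drop ordinary NaN samples. `BucketSample`, `isStaleMarker`, `makeStaleMarker`, and `dropAbsent` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <vector>

// Bit pattern Prometheus uses for the stale marker (a specific NaN payload).
constexpr std::uint64_t STALE_NAN_BITS = 0x7ff0000000000002ULL;

inline bool isStaleMarker(double value)
{
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits == STALE_NAN_BITS;
}

inline double makeStaleMarker()
{
    double value;
    std::memcpy(&value, &STALE_NAN_BITS, sizeof(value));
    return value;
}

struct BucketSample
{
    double upper_bound;
    double count;
    bool is_null;  // no sample in the lookback window
};

// Drop NULL and stale samples before grouping, so a stale bucket is treated as absent.
// Ordinary NaN counts are kept and handled later by the quantile logic.
inline std::vector<BucketSample> dropAbsent(const std::vector<BucketSample> & samples)
{
    std::vector<BucketSample> present;
    for (const BucketSample & sample : samples)
        if (!sample.is_null && !isStaleMarker(sample.count))
            present.push_back(sample);
    return present;
}
```

In the scenario from the comment, a stale `le="1"` bucket next to a live `le="+Inf"` bucket leaves a single bucket after filtering, which then yields NaN because fewer than two buckets remain.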

Adds native PromQL lowering for classic `histogram_quantile` over `_bucket{le=...}` float series.

The lowering groups bucket samples by timestamp and output labels, drops `le` and `__name__`, coalesces duplicate bucket bounds, and applies Prometheus-compatible bucket quantile edge behavior. Native histogram support is intentionally not included here.

Split out from the integration draft #104271; shared base #104484.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Added native PromQL-to-SQL lowering for classic `histogram_quantile` over `_bucket{le=...}` float series in ClickHouse `TimeSeries` queries. Native histogram support is not included.

Documentation entry for user-facing changes