Support PromQL changes and resets #104487
Add the common PromQL compliance reporting and developer notes used by the follow-up TimeSeries PromQL lowering slices. Keep the parser compliance corpus expectation explicit for selectors Prometheus rejects.
Add focused Prometheus protocol tests for user-facing native lowering errors: unsupported-but-parseable functions and invalid date-function arity. These are semantic guardrails for clear errors, not broad coverage padding.
Wire PromQL changes() and resets() through the existing TimeSeries aggregate helpers so the native converter can evaluate those range functions instead of reporting them unsupported. Match Prometheus handling for consecutive NaN samples in changes(), and add differential coverage for reset windows, sparse or empty inputs, special values, and query_range alignment. Validation: Build (amd_debug); test_prometheus_protocols/test_evaluation.py::test_function_over_time; test_prometheus_protocols/test_compliance.py::test_promql_compliance.
Avoid fork-local wording in the PromQL review base. These are ClickHouse PromQL regression cases and evidence requirements, not fork-specific artifacts.
Remove the standalone native histogram discovery Markdown note from the PromQL split branches.
Remove downstream promshim-specific evidence and non-goal language from the native PromQL lowering README so the note stays focused on ClickHouse review guidance.
Rewrite the native PromQL lowering notes around external SQL-transpiler measurements. Document measured and mixed SQL patterns with target PromQL shapes, pseudocode, and the ClickHouse-side validation expected for in-tree changes.
Replace external corpus row names with concrete PromQL expressions and query-range context in the native PromQL lowering notes.
Match Prometheus parser behavior by requiring vector selectors to contain at
least one matcher that cannot match an empty label value. This prevents broad
selectors like `{job=~".*"}` or `{__name__=~".*"}` from silently selecting every
metric when the user likely intended an explicit non-empty constraint.
Users who intentionally want all real metrics can still use `{__name__=~".+"}`,
and empty-matching label matchers remain valid when combined with that explicit
metric-name matcher.
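The rule above can be sketched in a few lines. This is a minimal illustration, not the ClickHouse parser: the `Matcher` tuple, `matches_empty`, and `validate_selector` are hypothetical names, and Python's `re` stands in for RE2, so regex syntax details may differ. It assumes regex matchers are fully anchored, as in PromQL.

```python
import re
from typing import List, Tuple

# Hypothetical representation: (label, operator, value),
# where operator is one of "=", "!=", "=~", "!~".
Matcher = Tuple[str, str, str]


def matches_empty(op: str, value: str) -> bool:
    """Return True if the matcher accepts an empty (missing) label value."""
    if op == "=":
        return value == ""
    if op == "!=":
        return value != ""
    # Regex matchers are treated as fully anchored.
    # re.fullmatch raises re.error for a malformed pattern.
    hit = re.fullmatch(value, "") is not None
    if op == "=~":
        return hit
    if op == "!~":
        return not hit
    raise ValueError(f"unknown matcher operator: {op}")


def validate_selector(matchers: List[Matcher]) -> None:
    """Reject selectors in which every matcher can match an empty value."""
    # Evaluate every matcher (no short-circuit) so that a malformed regex
    # is reported regardless of matcher order.
    results = [matches_empty(op, value) for _, op, value in matchers]
    if all(results):
        raise ValueError(
            "vector selector must contain at least one non-empty matcher"
        )
```

Under these assumptions, `{job=~".*"}` is rejected, while `{__name__=~".+", job=~".*"}` passes because the metric-name matcher cannot match an empty value.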
Deduplicate expanded compliance cases by query text before checking the `should_fail` expectation. A query listed once as expected success and once as expected failure is now reported as conflicting metadata instead of slipping through as two separate cases.
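A minimal sketch of that deduplication step, assuming each expanded case reduces to a `(query, should_fail)` pair. The function name and return shape are hypothetical and are not taken from the test suite.

```python
from typing import Dict, Iterable, List, Tuple


def dedupe_cases(
    cases: Iterable[Tuple[str, bool]],
) -> Tuple[Dict[str, bool], List[str]]:
    """Collapse cases by query text and collect conflicting expectations.

    Returns (unique, conflicts): unique maps each query to the first
    should_fail value seen; conflicts lists queries that were declared
    with both expectations.
    """
    unique: Dict[str, bool] = {}
    conflicts: List[str] = []
    for query, should_fail in cases:
        if query not in unique:
            unique[query] = should_fail
        elif unique[query] != should_fail and query not in conflicts:
            conflicts.append(query)
    return unique, conflicts
```

A caller would treat a non-empty `conflicts` list as a metadata error before running any query.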
…plit/promql-range-functions
The nested aggregate function can canonicalize its parameter list during factory creation. Wrap the nested function with that canonical parameter set instead of the original parser parameters so the Array wrapper and nested aggregate keep the same definition.
Mirror the upstream PromQL compliance corpus for label transformation invalid label-name cases. These cases should remain should-fail entries so ClickHouse acceptance of invalid destination labels stays visible as a rejection mismatch rather than being counted as expected coverage.
The Array combinator now keeps the nested aggregate's canonical parameters, so it no longer reads the original factory parameter list. Leave the argument unnamed to satisfy warning-as-error builds while preserving the override signature.
Do not stop selector validation after finding the first matcher that cannot match an empty label value. Later regex matchers still need syntax validation so equivalent selector reorderings cannot change parse validity. Add parser regressions for invalid regex matchers after an implicit metric-name matcher and after explicit non-empty matchers.
Add parser and HTTP API coverage for Prometheus selector validation rules: negated non-empty matchers, invalid regex validation after non-empty matchers, and duplicate metric-name matchers. Rejecting duplicate outside/in-brace metric names keeps ClickHouse parsing aligned with Prometheus expression parsing.
Prometheus skips stale markers before evaluating `changes` and `resets`, while regular `NaN` samples still participate in `changes`. Windows with one usable sample should return zero instead of disappearing. Selector `offset` also shifts raw `DateTime64` samples during range-function lowering, so pass integer interval values to the `toInterval*` family instead of decimal literals.

Validation:
- `ninja -C build_debug clickhouse`
- `python -m ci.praktika run "integration" --test tests/integration/test_prometheus_protocols/test_evaluation.py::test_function_over_time`
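The intended per-window semantics can be written as a small reference model. This is a sketch in Python, not the TimeSeries aggregate helpers themselves: `is_stale`, `changes`, and `resets` are illustrative names, and returning `None` for a window with no usable samples is an assumption about how "no result" is represented. The stale marker is identified by its bit pattern, `0x7ff0000000000002`.

```python
import math
import struct
from typing import List, Optional

# Bit representation of the Prometheus stale marker.
STALE_NAN_BITS = 0x7FF0000000000002
STALE_NAN = struct.unpack("<d", struct.pack("<Q", STALE_NAN_BITS))[0]


def is_stale(value: float) -> bool:
    """Compare raw bits; a regular NaN has a different payload."""
    return struct.unpack("<Q", struct.pack("<d", value))[0] == STALE_NAN_BITS


def changes(samples: List[float]) -> Optional[int]:
    """Count value changes in one window, skipping stale markers."""
    usable = [v for v in samples if not is_stale(v)]
    if not usable:
        return None  # assumption: an empty window yields no sample
    count = 0
    for prev, cur in zip(usable, usable[1:]):
        # Two consecutive NaN samples do not count as a change.
        if cur != prev and not (math.isnan(cur) and math.isnan(prev)):
            count += 1
    return count


def resets(samples: List[float]) -> Optional[int]:
    """Count decreases between consecutive usable samples in one window."""
    usable = [v for v in samples if not is_stale(v)]
    if not usable:
        return None  # assumption: an empty window yields no sample
    return sum(1 for prev, cur in zip(usable, usable[1:]) if cur < prev)
```

With this model, a window holding a single usable sample yields `0` for both functions, and a stale marker between two equal samples does not register as a change.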
Ignore string literals and selector matchers when routing compliance failures by query shape. This keeps report categories from treating label values or exponent signs as binary operators, and recognizes bare '<' and '>' comparisons.
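One way to picture the masking step is the heuristic below. It is a sketch only: the regular expressions and the `has_binary_operator` name are assumptions made for illustration, not the report code. It deliberately ignores keyword operators such as `and`, `or`, and `unless`, and it would count a unary minus as an operator.

```python
import re

# String literals in the three PromQL quoting styles.
_STRING = re.compile(
    r'"(?:\\.|[^"\\])*"'     # double-quoted
    r"|'(?:\\.|[^'\\])*'"    # single-quoted
    r"|`[^`]*`"              # raw backtick string
)
# Selector matcher lists, evaluated after strings are masked.
_BRACES = re.compile(r"\{[^{}]*\}")
# Numeric literals, including a signed exponent such as 1e-5.
_NUMBER = re.compile(r"\b\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")
# Symbolic binary operators, including bare '<' and '>'.
_BINARY_OP = re.compile(r"==|!=|<=|>=|[-+*/%^<>]")


def has_binary_operator(query: str) -> bool:
    """Detect a symbolic binary operator outside literals and selectors."""
    masked = _STRING.sub('""', query)   # drop label values and patterns
    masked = _BRACES.sub("{}", masked)  # drop matcher lists
    masked = _NUMBER.sub("0", masked)   # drop exponent signs
    return _BINARY_OP.search(masked) is not None
```

Masking strings before braces matters: a label value may itself contain `{` or `}`, which would otherwise confuse the brace pattern.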
Use the existing eps parameter when checking live Prometheus HTTP results against expected JSON, and compare ClickHouse HTTP output against the live Prometheus result. This keeps numeric drift bounded to eps while preserving exact shape, label, and timestamp checks.
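A comparison of that kind might look like the sketch below, written against the instant-vector `result` layout of the Prometheus HTTP API (`metric` labels plus a `[timestamp, "value"]` pair). Treating `eps` as an absolute tolerance and assuming both result lists are already in the same order are illustrative choices, not facts about the test helper; the function names are hypothetical.

```python
import math
from typing import Any, Dict, List


def values_close(a: str, b: str, eps: float) -> bool:
    """Compare two sample values given as strings, within eps."""
    x, y = float(a), float(b)
    if math.isnan(x) or math.isnan(y):
        return math.isnan(x) and math.isnan(y)
    if math.isinf(x) or math.isinf(y):
        return x == y
    return abs(x - y) <= eps  # assumption: eps is an absolute tolerance


def results_match(
    expected: List[Dict[str, Any]],
    actual: List[Dict[str, Any]],
    eps: float,
) -> bool:
    """Exact shape, labels, and timestamps; values bounded by eps."""
    if len(expected) != len(actual):
        return False
    for exp, act in zip(expected, actual):
        if exp["metric"] != act["metric"]:
            return False
        exp_ts, exp_val = exp["value"]
        act_ts, act_val = act["value"]
        if exp_ts != act_ts or not values_close(exp_val, act_val, eps):
            return False
    return True
```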
Matcher validation intentionally turns malformed label regexes into PromQL parse errors. Construct RE2 with log_errors disabled so rejected selectors do not also emit internal RE2 error log lines on every parse attempt.
…plit/promql-range-functions
This was fixed by #105146. Let's update the branch.
Keep the PromQL parser, modifier, and selector fixes from the local matrix pass, but move the high-value regression coverage into the existing protocol and compliance suites instead of adding the standalone matrix harness to the PR branch.

This keeps fixed-time range evaluation covered through the existing evaluation tests and records the parser, matcher, timestamp-modifier, and sparse range cases in the shared compliance corpus.
Build (arm_tidy) flagged the vector-grid finalization predicate because one call cloned an AST pointer while another argument moved the same pointer. C++ argument evaluation order can make that a use-after-move.

Create separate AST nodes for the null and stale-marker checks so the generated predicate stays the same without relying on argument evaluation order.
Parse `@ start()` and `@ end()` as timestamp modifier variants instead of rewriting the raw query text before parsing. This keeps label values and regular expressions untouched while resolving symbolic bounds in the query range planner. Propagate symbolic `@` modifiers through the PromQL tree and SQL lowering so instant selectors and range-vector functions bind to the query start or end consistently with Prometheus.
The selector-matcher validation branch should not also introduce symbolic `@ start()` and `@ end()` modifier parsing. Remove that parser/evaluation feature from the base branch so the behavior can be reviewed independently. Move the symbolic timestamp-modifier compliance cases out with the feature. Keep the offset-only case here because it exercises existing modifier behavior.
Symbolic `@ start()` and `@ end()` modifiers need parser-level handling so selectors, string literals, and regular expressions are not rewritten as raw text. Remove the temporary text-substitution path from the base branch; the symbolic modifier implementation belongs with the dedicated parser change.
    makeASTFunction("isNotNull", std::move(array_element_for_null_check)),
    makeASTFunction(
        "notEquals",
        makeASTFunction("reinterpretAsUInt64", makeASTFunction("assumeNotNull", std::move(array_element_for_stale_check))),
This only filters stale markers when the VECTOR_GRID is finalized as an instant vector. The same VECTOR_GRID can also become a range vector through applySubquery, which only flips expression.type and leaves the values array intact. For a query like stale_counter_values[60s:10s], fromSelector builds the grid via last_over_time with stale markers preserved, applySubquery retypes it to RANGE_VECTOR, and finalizeRangeVectorAsSQL calls timeSeriesFromGrid without this check, so the Prometheus stale NaN is emitted as a real matrix sample (and can also be consumed by later range functions). Please convert stale-marker grid entries to NULL before a VECTOR_GRID is reused/finalized as a range vector, and add a subquery/range-vector test with a stale marker.
…anges-resets
# Conflicts:
# tests/integration/test_prometheus_protocols/test_compliance.py
# tests/integration/test_prometheus_protocols/test_evaluation.py
An instant-vector `VECTOR_GRID` (built by `last_over_time` for an instant selector) keeps Prometheus stale markers in its `values` array: they are filtered out only later, when the grid is finalized as an instant vector (see `finalizeInstantVectorAsSQL`).

A subquery such as `stale_counter_values[60s:10s]` reuses the same grid as a range vector via `applySubquery`, which only flips the result type to `RANGE_VECTOR`. The downstream consumers, `finalizeRangeVectorAsSQL` and the `VECTOR_GRID` branch of `applyFunctionOverRange`, feed the array straight into `timeSeriesFromGrid` without the stale-marker check, so the Prometheus stale `NaN` surfaced as a real matrix sample (and could also feed range functions applied over the subquery).

Fix it at the single conversion point: when `applySubquery` turns a `VECTOR_GRID` instant vector into a range vector, replace stale-marker grid entries with `NULL`. `timeSeriesFromGrid` skips `NULL` entries, which matches Prometheus dropping the stale step entirely, while regular `NaN` samples stay visible. Adds `test_subquery_drops_stale_marker` covering an instant-selector subquery whose grid contains a stale marker.
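The effect of that conversion can be modeled in Python. The functions below are hypothetical stand-ins: `time_series_from_grid` only imitates the described behavior of `timeSeriesFromGrid` (skipping `NULL` entries), and the grid is an ordinary list with `None` for `NULL`.

```python
import struct
from typing import List, Optional, Tuple

# Bit representation of the Prometheus stale marker.
STALE_NAN_BITS = 0x7FF0000000000002
STALE_NAN = struct.unpack("<d", struct.pack("<Q", STALE_NAN_BITS))[0]

Grid = List[Optional[float]]


def is_stale(value: float) -> bool:
    """Compare raw bits; a regular NaN has a different payload."""
    return struct.unpack("<Q", struct.pack("<d", value))[0] == STALE_NAN_BITS


def null_stale_markers(grid: Grid) -> Grid:
    """Replace stale-marker entries with None, keeping regular NaN."""
    return [None if v is not None and is_stale(v) else v for v in grid]


def time_series_from_grid(
    start: int, step: int, grid: Grid
) -> List[Tuple[int, float]]:
    """Turn a grid into (timestamp, value) samples, skipping None."""
    return [
        (start + i * step, v) for i, v in enumerate(grid) if v is not None
    ]
```

Applying `null_stale_markers` before `time_series_from_grid` drops the stale step from the resulting matrix while a regular `NaN` sample remains in it.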
    /// isNotNull(x) AND reinterpretAsUInt64(assumeNotNull(x)) = 0x7ff0000000000002
    /// (0x7ff0000000000002 is the bit representation of the Prometheus stale marker.)
    ASTPtr is_stale_marker = makeASTFunction(
This only catches grids that still contain the exact Prometheus stale-marker payload. An instant-vector operator or function can consume the selector first and destroy that payload before the subquery boundary. For example, (stale_counter_values == bool 2)[30s:10s] goes through applyComparisonOperator's arrayMap, turns the stale step at 110 into 0, and then this filter keeps it, so the matrix contains a real sample where Prometheus would omit the step. The stale sample needs to become NULL before elementwise vector operations run, not only here.
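The ordering problem described in this comment can be reproduced with a small model. It is a sketch under stated assumptions: `compare_bool_eq` is a hypothetical stand-in for the elementwise `== bool` mapping, grids are plain lists with `None` for `NULL`, and the sample values are invented for illustration.

```python
import struct
from typing import List, Optional

# Bit representation of the Prometheus stale marker.
STALE_NAN_BITS = 0x7FF0000000000002
STALE_NAN = struct.unpack("<d", struct.pack("<Q", STALE_NAN_BITS))[0]

Grid = List[Optional[float]]


def is_stale(value: float) -> bool:
    """Compare raw bits; a regular NaN has a different payload."""
    return struct.unpack("<Q", struct.pack("<d", value))[0] == STALE_NAN_BITS


def null_stale_markers(grid: Grid) -> Grid:
    """Replace stale-marker entries with None."""
    return [None if v is not None and is_stale(v) else v for v in grid]


def compare_bool_eq(grid: Grid, scalar: float) -> Grid:
    """Elementwise `== bool scalar`: 1.0 or 0.0, None passes through."""
    return [
        None if v is None else (1.0 if v == scalar else 0.0) for v in grid
    ]


grid: Grid = [2.0, STALE_NAN, 2.0]

# Filtering after the comparison: the stale payload is already gone,
# so the stale step survives as a real 0.0 sample.
filtered_late = null_stale_markers(compare_bool_eq(grid, 2.0))

# Filtering before the comparison: the stale step stays None.
filtered_early = compare_bool_eq(null_stale_markers(grid), 2.0)
```

In this model only `filtered_early` keeps the stale step absent, which is the behavior the comment asks for.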
LLVM Coverage Report
Changed C/C++ lines covered: 298/308 (96.75%)

Adds native lowering for the PromQL counter state range functions `changes` and `resets` over ClickHouse `TimeSeries` data.

The implementation uses dedicated time-series aggregate helpers so reset/change detection is handled in the same range-grid execution shape as the existing rate and delta family.
Split out from the integration draft #104271; shared base #104484.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Added native PromQL-to-SQL lowering for `changes` and `resets` over ClickHouse `TimeSeries` data.

Documentation entry for user-facing changes