Add functional tests to improve coverage by fm4v · Pull Request #103456 · ClickHouse/ClickHouse · GitHub
Add functional tests to improve coverage #103456

Open
fm4v wants to merge 6 commits into master from coverage-improvements-2026-04-23

fm4v (Member) commented Apr 23, 2026

Add 28 new stateless functional tests (04113..04140) targeting modules with the largest uncovered-line counts in master coverage data. All tests pass locally.

The targets are SQL-reachable code paths with significant gaps — parsers, aggregate functions, column types, formats, and functions. Integration-heavy modules (Kafka, Hive, Iceberg, Kubernetes, LDAP, Keeper) are excluded because they need running external services.

Coverage highlights by test:

  • 04113 timeSlots vector/constant combinations for DateTime and DateTime64
  • 04114 flameGraph arity, merge, empty trace, error paths
  • 04115 tumble / hop / windowID error and type paths
  • 04116 formatDateTime / formatDateTimeInJodaSyntax / fromUnixTimestamp edges
  • 04117 + 04124 ParserSystemQuery and ASTSystemQuery formatters
  • 04118 KQL make-series parser (ParserKQLMakeSeries.cpp, was 0%)
  • 04119 ParserSnapshotQuery round-trip
  • 04120 parseDateTime mysql and Joda format specifiers + error paths
  • 04121 JSONExtractTree across NumericNode / BoolNode / StringNode / DecimalNode / DateNode / UUIDNode / IPv4Node / EnumNode / LowCardinalityNode / NullableNode / ArrayNode / TupleNode / MapNode / VariantNode / DynamicNode / ObjectNode
  • 04122 SerializationTime64 across TSV / CSV / JSONEachRow / Values / TSKV
  • 04123 KQL string operators (contains / startswith / endswith / has / has_any / has_all / =~ / !~ / in~)
  • 04125 arrayLevenshteinDistance / arrayLevenshteinDistanceWeighted / arraySimilarity type dispatch
  • 04126 ParserAlterQuery variants not covered by existing tests
  • 04127 bitmap functions across integer element types
  • 04128 sequenceNextNode direction × base combos, merge, type dispatch
  • 04129 distinctJSONPaths / distinctJSONPathsAndTypes
  • 04130 Variant column operations (variantType / variantElement, MergeTree round-trip)
  • 04131 PrometheusQueryParsingUtil via the prometheusQuery table function
  • 04132 ColumnSparse aggregate / filter / sort / JOIN / mutation
  • 04133 groupArray / groupArrayLast / groupArraySample / groupArrayMoving* / groupArrayInsertAt
  • 04134 SingleValueDataGenericWithColumn over Tuple / Array / Map / Decimal / IPv4 / IPv6
  • 04135 BSONEachRow input type soup, numeric coercion, nullable, schema inference
  • 04136 window function frame bounds (ROWS / RANGE, PRECEDING / FOLLOWING, rank / ntile / nth_value)
  • 04137 conversion edges: toXxxOrZero / toXxxOrNull / toXxxOrDefault / accurateCast across numeric / Decimal / Date* / UUID / IPv4 / IPv6 / FixedString / big-integer
  • 04138 higher-order array lambdas (arrayMap / arrayFilter / arrayFold / arraySort with captures)
  • 04139 aggregate-function state round-trip via AggregatingMergeTree and -State / -Merge
  • 04140 Parquet type soup, compression codecs, multiple row groups, filter pushdown, column pruning, nullable, schema inference

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Not for changelog (changelog entry is not required)

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Add 28 new stateless tests (04113..04140) targeting modules with the
largest uncovered-line counts from master coverage data:

- KQL: make-series parser, string operators
- Parsers: ParserSnapshotQuery, ParserSystemQuery extras, ParserAlterQuery variants
- Functions: parseDateTime format specifiers, JSONExtract type tree,
  conversion OrZero/OrNull/accurateCast, bitmap integer type dispatch,
  arrayLevenshtein/Weighted/Similarity, higher-order lambdas, timeSlots,
  time window (tumble/hop), formatDateTime edge cases
- AggregateFunctions: flameGraph arity/merge paths, sequenceNextNode
  direction/base combos, distinctJSONPaths, groupArray family,
  SingleValueData complex types, state/merge lifecycle
- Columns: Variant operations, sparse serialization, aggregate-function column
- Formats: BSONEachRow type soup + coercion, Parquet type soup + codecs +
  row groups + pruning
- Transforms: window function frame bounds
- DataTypes: SerializationTime64 across TSV/CSV/JSON/Values/TSKV
- Storages: Prometheus/PromQL parser via prometheusQuery table function

clickhouse-gh (Bot, Contributor) commented Apr 23, 2026

clickhouse-gh Bot added the pr-ci label Apr 23, 2026
Comment thread tests/queries/0_stateless/04117_parser_system_query_variants.sql
Comment thread tests/queries/0_stateless/04117_parser_system_query_variants.sql Outdated
Comment thread tests/queries/0_stateless/04132_column_sparse_ops.sql
Comment thread tests/queries/0_stateless/04124_parser_system_query_extra.sql Outdated
Comment thread tests/queries/0_stateless/04124_parser_system_query_extra.sql
fm4v added 5 commits April 23, 2026 19:46
- 04117, 04124: add no-parallel tag (SYSTEM DROP in the file trips the
  style check), and drop the bare `SYSTEM STOP MERGES` / `SYSTEM FLUSH LOGS`
  lines that the style check rejects wherever they appear.
- 04131: add no-fasttest tag (ANTLR4 is disabled in the fast-test build).
- 04132: add `database = currentDatabase()` to the `system.parts_columns`
  query.
- 04140: add no-fasttest tag (fast-test build lacks some Parquet codecs).

See #103456

The 'system\s*flush\s*logs\s*(;|$|")' grep rule also matched the
literal 'SYSTEM FLUSH LOGS;' inside comments I added in the previous
commit, so the style check flagged those comment lines.
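
The false positive can be reproduced outside CI. The following is a minimal sketch in Python, assuming the rule is applied line by line and case-insensitively; the real style check is a grep invocation whose exact flags are not shown here, and `flagged_lines` is a hypothetical helper name.

```python
import re

# Pattern copied from the commit message above. IGNORECASE is an
# assumption about how the style check invokes grep.
STYLE_RULE = re.compile(r'system\s*flush\s*logs\s*(;|$|")', re.IGNORECASE)


def flagged_lines(sql_text):
    """Return (line_number, line) pairs that the rule would match."""
    return [
        (number, line)
        for number, line in enumerate(sql_text.splitlines(), start=1)
        if STYLE_RULE.search(line)
    ]


sample = "\n".join([
    "-- Removed the bare SYSTEM FLUSH LOGS; statement (style check).",
    "SELECT 1;",
    "-- Removed the bare log-flush statement (style check).",
])

for number, line in flagged_lines(sample):
    print(f"{number}: {line}")
```

Only the first comment line is reported. The rule is purely textual and has no notion of SQL comments, so rewording the comment is enough to avoid the match.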

Stateless tests randomize `--session_timezone` (Africa/Juba, Mexico/BajaSur,
etc.). Tests 04118, 04120 and 04137 rendered zero/epoch DateTime values and
bucketed datetimes whose output changed with the timezone, which tripped the
reference diff in the flaky and parallel jobs. Adding
`SET session_timezone = 'UTC'` at the top makes the output independent of the
CI override.
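
The effect can be illustrated without a server. This is a rough sketch in Python, not ClickHouse code: the two non-UTC zones are fixed-offset stand-ins chosen for illustration (one east and one west of UTC), and `render` and `day_bucket` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins (assumptions): they ignore real tz-database
# rules such as DST and historical offset changes.
UTC = timezone.utc
PLUS_2 = timezone(timedelta(hours=2))    # a zone east of UTC, like Africa/Juba
MINUS_7 = timezone(timedelta(hours=-7))  # a zone west of UTC, like Mexico/BajaSur

EPOCH = datetime(1970, 1, 1, tzinfo=UTC)


def render(epoch_seconds, tz):
    """Format a Unix timestamp as text in the given zone."""
    moment = (EPOCH + timedelta(seconds=epoch_seconds)).astimezone(tz)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def day_bucket(epoch_seconds, tz):
    """Calendar day the timestamp falls into in the given zone."""
    moment = (EPOCH + timedelta(seconds=epoch_seconds)).astimezone(tz)
    return moment.date().isoformat()


for name, tz in [("UTC", UTC), ("+02:00", PLUS_2), ("-07:00", MINUS_7)]:
    print(name, render(0, tz), day_bucket(0, tz))
```

The same instant prints as three different strings, and in the zone west of UTC it even lands in a different day bucket, so a reference file can only match when the zone is pinned.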

See #103456

- 04127: wrap bitmapSubsetInRange/Limit output in arraySort (roaring bitmap
  iteration order is not guaranteed).
- 04133: sort groupArrayArray result (UNION ALL block order is not stable).
- 04134: replace any()/anyLast() over tuples with ORDER BY + LIMIT 1 since
  any*() are documented as returning an unspecified row.
- 04136: add a rowNumberInAllBlocks() tiebreaker so row_number() over ties
  has a deterministic assignment.
- 04139: feed topKState() a weighted distribution instead of unique values
  so the top-3 is unambiguous.
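
The tiebreaker fix for 04136 can be pictured with a small Python model. This is only a sketch under stated assumptions: `ROWS`, `row_numbers` and `distinct_outcomes` are hypothetical, and the second tuple field is a unique per-row value standing in for the tiebreaker column.

```python
from itertools import permutations

# Hypothetical rows: (score, unique_id). Two rows tie on score.
ROWS = [(10, 0), (20, 1), (10, 2)]


def row_numbers(rows, key):
    """Assign 1-based row numbers after a stable sort of `rows` by `key`."""
    ordered = sorted(rows, key=key)
    return {row: number for number, row in enumerate(ordered, start=1)}


def distinct_outcomes(key):
    """Collect every numbering produced over all possible input orders."""
    outcomes = set()
    for order in permutations(ROWS):
        numbering = row_numbers(order, key)
        outcomes.add(tuple(sorted(numbering.items())))
    return outcomes


by_score_only = distinct_outcomes(key=lambda row: row[0])
with_tiebreaker = distinct_outcomes(key=lambda row: (row[0], row[1]))

print(len(by_score_only))    # more than one numbering is possible
print(len(with_tiebreaker))  # exactly one numbering
```

Sorting on the score alone leaves the relative order of the tied rows to the input order, while adding a unique second key makes the numbering independent of it. The arraySort and sorted groupArrayArray changes apply the same idea to unordered collections.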

See #103456

Those aggregators intentionally return an unspecified row and remain
non-deterministic under parallel execution, so they can't be asserted on
in a reference test. Keep min/max/count on the Nullable branch, which
still exercise SingleValueDataGenericWithColumn paths.
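
A toy Python model shows why min/max/count remain safe to assert on. It is a simplification, not ClickHouse's implementation: "any" is modeled as "first row seen", and `BLOCKS` and `aggregate` are hypothetical names.

```python
from itertools import permutations

# Hypothetical data blocks; in a parallel pipeline their arrival order
# is not fixed.
BLOCKS = [[(3, "c"), (1, "a")], [(2, "b")], [(5, "e"), (4, "d")]]


def aggregate(blocks):
    """Aggregate all rows of the blocks in the order they arrive."""
    rows = [row for block in blocks for row in block]
    return {
        "any": rows[0],      # first row seen: depends on arrival order
        "min": min(rows),    # order-independent
        "max": max(rows),    # order-independent
        "count": len(rows),  # order-independent
    }


results = [aggregate(order) for order in permutations(BLOCKS)]

for name in ("any", "min", "max", "count"):
    print(name, len({result[name] for result in results}))
```

Across all block orders the modeled "any" yields several different rows, while min, max and count each yield a single value.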

See #103456