Add columns in columns.sql for INFORMATION_SCHEMA by FArthur-cmd · Pull Request #29637 · ClickHouse/ClickHouse · GitHub
Skip to content

Add columns in columns.sql for INFORMATION_SCHEMA#29637

Merged
FArthur-cmd merged 6 commits into
ClickHouse:masterfrom
FArthur-cmd:add_sql_columns
Oct 22, 2021
Merged

Add columns in columns.sql for INFORMATION_SCHEMA#29637
FArthur-cmd merged 6 commits into
ClickHouse:masterfrom
FArthur-cmd:add_sql_columns

Conversation

@FArthur-cmd

Copy link
Copy Markdown
Contributor

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
...

Detailed description / Documentation draft:
add columns to avoid errors.

@robot-clickhouse robot-clickhouse added the pr-not-for-changelog This PR should not be mentioned in the changelog label Oct 1, 2021
@tavplubix tavplubix self-assigned this Oct 1, 2021
@alexey-milovidov alexey-milovidov changed the title Add columns in columns.sql Add columns in columns.sql for INFORMATION_SCHEMA Oct 2, 2021
@FArthur-cmd FArthur-cmd marked this pull request as ready for review October 5, 2021 11:39

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but I'm not sure that NULL is better than data_type

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we already have type AS data_type.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So why do we need column_type column if we already have data_type column? :)

@FArthur-cmd

Copy link
Copy Markdown
Contributor Author

@FArthur-cmd FArthur-cmd merged commit 1c72421 into ClickHouse:master Oct 22, 2021
pull Bot pushed a commit to AKJUS/ClickHouse that referenced this pull request May 18, 2026
The test asserted a deterministic row-by-row output from:

  SELECT val, avg(toUInt32(val))
  FROM t_group_by_lowcardinality
  GROUP BY val
  LIMIT 10
  SETTINGS max_threads = 1, max_rows_to_group_by = 100, group_by_overflow_mode = 'any';

With `group_by_overflow_mode = 'any'`, only the first `max_rows_to_group_by` distinct
keys make it into the hash table — and which keys those are depends on the order in
which rows arrive at the aggregator. `LIMIT 10` without `ORDER BY` then selects ten of
those keys in hash-iteration order. The `max_threads = 1` clause does not help under
`ParallelReplicas` (where the table is sharded across replicas and partial aggregations
are merged) or under S3 storage (where reads can be split into independent ranges), so
the resulting row set varies between runs.

Originally tracked in ClickHouse#36069 (2022). The `-- Tags: no-random-settings` workaround masked
the issue for random-settings randomization but the underlying non-determinism returned
once `ParallelReplicas` was enabled by default in some CI configurations. Concretely the
test now fails ~10–20 times per day on master under
`Stateless tests (amd_llvm_coverage, ParallelReplicas, s3 storage, parallel)` —
@alexey-milovidov flagged this on ClickHouse#102039.

The test is a crash-regression test for
"Avoid crash in case of GROUP BY LowCardinality(Nullable(String)) column and
group_by_overflow_mode='any'" (PR ClickHouse#29637 / ClickHouse#56057). The query is preserved verbatim,
including all the problematic settings; only the assertion is changed to
`SELECT count() FROM (...) `, which is always 10. If the underlying crash ever returns,
the test will detect it; the specific row values were never deterministic to begin with.

Verified locally that the old reference fails under `max_threads = 4` (extra null bucket
appears, last row drops — exact diff matches the CI report). The new assertion returns
`{"count()":10}` for all parallelism levels.

Closes ClickHouse#36069
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-not-for-changelog This PR should not be mentioned in the changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants