Antalya-26.1: Fix row policies silently ignored on Iceberg tables with PREWHERE enabled by mkmkme · Pull Request #1597 · Altinity/ClickHouse · GitHub
Antalya-26.1: Fix row policies silently ignored on Iceberg tables with PREWHERE enabled #1597

Merged
zvonand merged 7 commits into antalya-26.1 from mkmkme/antalya-26.1/iceberg-fix-prewhere on Mar 31, 2026

Conversation

@mkmkme

@mkmkme mkmkme commented Mar 30, 2026

Collaborator

The Iceberg read optimization (allow_experimental_iceberg_read_optimization) identifies constant columns from Iceberg metadata and removes them from the read request. When all requested columns become constant, it sets need_only_count = true, which tells the Parquet reader to skip all initialization — including preparePrewhere — and just return the raw row count from file metadata.

This completely bypasses row_level_filter (row policies) and prewhere_info, returning unfiltered row counts. The InterpreterSelectQuery relies on the storage to apply these filters when supportsPrewhere is true and does not add a fallback FilterStep to the query plan, so the filter is silently lost.

The fix prevents need_only_count from being set when an active row_level_filter or prewhere_info exists in the format filter info.
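
The guard described above can be sketched roughly as follows. This is a minimal illustration and not the actual ClickHouse code: the `FilterExpr` and `FormatFilterInfo` structs and the `hasActiveFilter` and `canUseCountOnly` functions are hypothetical stand-ins, and only the names `row_level_filter`, `prewhere_info`, and `need_only_count` come from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for a compiled filter expression.
struct FilterExpr
{
    std::string column_name;
};

// Hypothetical, simplified model of the format filter info.
struct FormatFilterInfo
{
    std::shared_ptr<FilterExpr> row_level_filter; // set when a row policy applies
    std::shared_ptr<FilterExpr> prewhere_info;    // set when PREWHERE is pushed to the reader
};

// True when the reader has to evaluate some filter against the rows.
bool hasActiveFilter(const FormatFilterInfo * info)
{
    return info && (info->row_level_filter || info->prewhere_info);
}

// Models the decision behind need_only_count: the metadata row count is
// only a correct answer when nothing is left to read AND nothing filters rows.
bool canUseCountOnly(std::size_t columns_left_to_read, const FormatFilterInfo * info)
{
    return columns_left_to_read == 0 && !hasActiveFilter(info);
}
```

Under this model, a query whose columns are all constant still goes through the regular read path whenever a row policy or PREWHERE is present, so the filter is evaluated instead of being skipped.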

Fixes #1595

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix row policies silently ignored on Iceberg tables with PREWHERE enabled

Documentation entry for user-facing changes

...

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

@mkmkme added the bugfix, port-antalya (PRs to be ported to all new Antalya releases), antalya-26.1, and antalya-26.1.6.20001 labels Mar 30, 2026
@mkmkme

mkmkme commented Mar 30, 2026

Collaborator Author

@mkmkme

mkmkme commented Mar 30, 2026

Collaborator Author

I've also checked the regression test is passing with this change.

ianton-ru
ianton-ru previously approved these changes Mar 30, 2026

@ianton-ru ianton-ru left a comment

LGTM

@CarlosFelipeOR

CarlosFelipeOR commented Mar 30, 2026

Collaborator

QA Verification: Partial fix — issue found

PR #1597 successfully fixes the row policy issue (#1595): the 76 row_policy regression test failures and 122 integration test failures from PR #1581 are now all passing.

However, the branch (mkmkme/antalya-26.1/iceberg-fix-prewhere), which also includes the backport of upstream 100361, introduces a new issue: a server crash (SIGABRT) on all amd_debug stateless test jobs. The crash occurs in Reader::applyPrewhere() → addDummyColumnWithRowCount() in the Parquet V3 reader (Signal 6, STID 2938-44c8). It is deterministic and reproduces on every debug build run. Release builds pass all tests.

Additionally, the regression test "prewhere clause" (REST and Glue catalogs, both architectures) fails because it expects exitcode=182 (ILLEGAL_PREWHERE) for versions < 26.2, but PREWHERE now works on Iceberg. This is a false positive; the test needs updating.

Details:

Adding verified-with-issue label.

Follow-up validation — issues resolved

Following additional validation performed before and after the merge of PRs #1581 and #1597, we confirmed that:

With these fixes, all previously identified issues related to this PR are resolved.

The remaining CI failures are not related to this PR.

@CarlosFelipeOR added the verified-with-issues (Verified by QA and issues found.) label Mar 30, 2026
@mkmkme mkmkme changed the title Fix row policies silently ignored on Iceberg tables with PREWHERE enabled Antalya-26.1: Fix row policies silently ignored on Iceberg tables with PREWHERE enabled Mar 31, 2026
@mkmkme

mkmkme commented Mar 31, 2026

Collaborator Author

The crash is fixed by @CarlosFelipeOR's commit in #1581 (we probably need to move it here).

In a private discussion with Carlos, we found another issue that remained even with that fix: one of the regression tests did not pass with the old analyzer. I have fixed that test failure with the latest commit.

@mkmkme

mkmkme commented Mar 31, 2026

Collaborator Author

The audit report for the last commit shows:

No confirmed defects in reviewed scope.

alexey-milovidov and others added 7 commits March 31, 2026 22:58
…ere-external-columns

Fix exception in Parquet PREWHERE when column is not in file
…o-assertion

Fix exception in `updateFormatPrewhereInfo` when only row-level filter is set
…bled

…t NULLs

The Altinity-specific constant column optimization
(`allow_experimental_iceberg_read_optimization`) scans `requested_columns`
for nullable columns absent from the Iceberg file metadata and replaces
them with constant NULLs. However, `requested_columns` can also contain
columns produced by `prewhere_info` or `row_level_filter` expressions
(e.g. `equals(boolean_col, false)`). These computed columns are not in
the file metadata, and their result type is often `Nullable(UInt8)`, so
the optimization incorrectly treats them as missing file columns and
replaces them with NULLs.

This corrupts the prewhere pipeline: the Parquet reader evaluates the
filter expression correctly, but the constant column optimization then
overwrites the result with NULLs. With `need_filter = false` (old planner,
PREWHERE + WHERE), all rows appear to fail the filter, producing empty
output. With `need_filter = true`, the filter column is NULL so all rows
are filtered out.

The fix skips columns that match the `prewhere_info` or `row_level_filter`
column names, since these are computed at read time and never stored in
the file.
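
The skip described in this commit message can be sketched roughly as below. This is an illustration under stated assumptions and not the actual patch: `RequestedColumn`, `columnsToReplaceWithNull`, and the set-based parameters are hypothetical names, and the filter result columns are assumed to be available as plain strings.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified view of one entry of requested_columns.
struct RequestedColumn
{
    std::string name;
    bool is_nullable;
};

// Returns the names of columns that may be replaced with constant NULLs:
// nullable, absent from the file metadata, and not produced by a
// prewhere_info or row_level_filter expression.
std::vector<std::string> columnsToReplaceWithNull(
    const std::vector<RequestedColumn> & requested_columns,
    const std::unordered_set<std::string> & columns_in_file_metadata,
    const std::unordered_set<std::string> & filter_result_columns)
{
    std::vector<std::string> result;
    for (const auto & column : requested_columns)
    {
        if (!column.is_nullable)
            continue; // only nullable columns can become constant NULL
        if (columns_in_file_metadata.count(column.name))
            continue; // present in the file, read it normally
        if (filter_result_columns.count(column.name))
            continue; // computed at read time, never stored in the file
        result.push_back(column.name);
    }
    return result;
}
```

With a request for `id`, a nullable `new_col` missing from the file, and the computed column `equals(boolean_col, false)`, only `new_col` would be substituted in this sketch, and the filter result is left for the reader to compute.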
@CarlosFelipeOR added the verified (Approved for release) label and removed the verified-with-issues (Verified by QA and issues found.) label Mar 31, 2026
@mkmkme force-pushed the backports/antalya-26.1/95476 branch from 9fbbcaf to 33a1a86 on March 31, 2026 21:09
@mkmkme force-pushed the mkmkme/antalya-26.1/iceberg-fix-prewhere branch from 341e1ff to b7696a3 on March 31, 2026 21:10
@mkmkme mkmkme changed the base branch from backports/antalya-26.1/95476 to antalya-26.1 March 31, 2026 21:12
@mkmkme mkmkme dismissed ianton-ru’s stale review March 31, 2026 21:12

The base branch was changed.

@mkmkme

mkmkme commented Mar 31, 2026

Collaborator Author

@zvonand zvonand merged commit fd7dc16 into antalya-26.1 Mar 31, 2026
224 of 233 checks passed
@zvonand removed the port-antalya (PRs to be ported to all new Antalya releases) label Apr 30, 2026
Development

Successfully merging this pull request may close these issues.

Row policies silently ignored on Iceberg tables after enabling PREWHERE (PR #1581)

6 participants