DeltaLake: replace cached snapshot on update() so kernel sees fresh creds by ahmadov · Pull Request #107480 · ClickHouse/ClickHouse · GitHub

DeltaLake: replace cached snapshot on update() so kernel sees fresh creds #107480

Merged
ahmadov merged 16 commits into master from ahmadov/deltalake-cache-update on Jun 22, 2026

Conversation

@ahmadov

@ahmadov ahmadov commented Jun 15, 2026

Member

Closes https://github.com/ClickHouse/support-escalation/issues/7739.

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

The fresh snapshot rebuilt inside update() already captures the C++ S3 client's current credentials via createBuilder. Keep it instead of discarding it when the version matches, so the next kernel operation runs on a freshly credentialed engine and does not outlive the STS TTL.

Besides DeltaLake, this also changes AwsAuthSTSAssumeRoleCredentialsProvider::GetAWSCredentials/Reload to honor and clear SetNeedRefresh() (matching the Web-Identity and SSO providers), which affects any S3 access that uses STS assume-role on auth retries. It also adds the new IObjectStorage::tryRefreshCredentialsViaCallback virtual and the credentials-fingerprint rebuild path in TableSnapshot.
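
The "honor and clear" behavior for the need-refresh flag can be illustrated with a minimal, dependency-free sketch. This is not the ClickHouse or AWS SDK code: `AssumeRoleProviderSketch`, `SessionCredentials`, `setNeedRefresh`, and the `Fetch` callback are hypothetical names standing in for the provider, its cached session, and the STS call.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for temporary STS session credentials.
struct SessionCredentials
{
    std::string token;
    std::chrono::steady_clock::time_point expires_at;
};

// Sketch of a provider that reloads when the session expired OR when a
// caller flagged the cached credentials as needing a refresh.
class AssumeRoleProviderSketch
{
public:
    using Fetch = std::function<SessionCredentials()>;

    explicit AssumeRoleProviderSketch(Fetch fetch_) : fetch(std::move(fetch_)) {}

    // Called by an auth-retry path after the server rejected the credentials.
    void setNeedRefresh() { need_refresh.store(true); }

    SessionCredentials getCredentials()
    {
        std::lock_guard<std::mutex> lock(mutex);
        const bool expired = std::chrono::steady_clock::now() >= cached.expires_at;
        // Honor the flag: without this check a forced refresh would be ignored
        // until the locally tracked expiry passes.
        if (!loaded || expired || need_refresh.load())
            reload();
        return cached;
    }

private:
    void reload()
    {
        cached = fetch();
        loaded = true;
        // Clear the flag so one retry triggers exactly one reload.
        need_refresh.store(false);
    }

    Fetch fetch;
    std::mutex mutex;
    SessionCredentials cached{};
    bool loaded = false;
    std::atomic<bool> need_refresh{false};
};
```

In this sketch, a call to `setNeedRefresh()` makes the next `getCredentials()` fetch a new session, and later calls reuse that session until it expires or the flag is set again.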

Version info

  • Merged into: 26.6.1.1101
  • Backported to: 26.5.4.19, 26.4.5.74, 26.3.16.15

…reds

The fresh snapshot rebuilt inside `update()` already captures the
C++ S3 client's current creds via `createBuilder`. Keep it instead of
discarding it when the version matches, so the next kernel op runs on
a freshly credentialed engine and doesn't outlive the STS TTL.
@ahmadov ahmadov requested a review from kssenii June 15, 2026 07:38
@clickhouse-gh

clickhouse-gh Bot commented Jun 15, 2026

Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Jun 15, 2026
Comment thread src/Storages/ObjectStorage/DataLakes/DeltaLakeMetadataDeltaKernel.cpp Outdated
@kssenii kssenii self-assigned this Jun 16, 2026
Comment thread src/Storages/ObjectStorage/DataLakes/DeltaLakeMetadataDeltaKernel.cpp Outdated
Compare a SipHash128 of the C++ S3 client's current credentials against the
one captured when kernel_snapshot_state was built, and rebuild only the kernel
engine + snapshot + scan when they differ. Schema, stats, and the snapshot
LRU cache are preserved across rebuilds.
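
The fingerprint comparison described in this commit message can be sketched roughly as follows. The types and names here (`Credentials`, `TableSnapshotSketch`, `rebuildIfCredentialsChanged`) are hypothetical, and `std::hash` is used only as a dependency-free stand-in for the SipHash128 the commit mentions; the real code lives in `TableSnapshot.cpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in for the C++ S3 client's current credentials.
struct Credentials
{
    std::string access_key_id;
    std::string secret_access_key;
    std::string session_token;
};

// Stand-in fingerprint: std::hash over a length-prefixed concatenation.
// The commit uses SipHash128; this is an assumption-level replacement.
std::size_t fingerprint(const Credentials & c)
{
    std::string buf;
    auto append = [&buf](const std::string & field)
    {
        buf += std::to_string(field.size());
        buf += ':';
        buf += field;
    };
    append(c.access_key_id);
    append(c.secret_access_key);
    append(c.session_token);
    return std::hash<std::string>{}(buf);
}

class TableSnapshotSketch
{
public:
    explicit TableSnapshotSketch(const Credentials & creds) : schema("cached-schema")
    {
        rebuildKernelState(creds);
    }

    // Rebuilds the kernel-side state only when the credentials changed.
    // Returns true if a rebuild happened.
    bool rebuildIfCredentialsChanged(const Credentials & current)
    {
        if (fingerprint(current) == creds_fingerprint)
            return false;
        rebuildKernelState(current);
        return true;
    }

    int engineBuilds() const { return engine_builds; }
    const std::string & cachedSchema() const { return schema; }

private:
    // Stands in for rebuilding the kernel engine, snapshot, and scan.
    void rebuildKernelState(const Credentials & creds)
    {
        creds_fingerprint = fingerprint(creds);
        ++engine_builds;
    }

    std::string schema;  // preserved across rebuilds, like schema and stats
    std::size_t creds_fingerprint = 0;
    int engine_builds = 0;
};
```

Storing only a hash means the snapshot does not need to keep a second copy of the secret material to detect a rotation.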
Comment thread src/Storages/ObjectStorage/DataLakes/DeltaLake/TableSnapshot.cpp
@ahmadov ahmadov added v26.2-must-backport v26.3-must-backport v26.4-must-backport v26.5-must-backport pr-must-backport Pull request should be backported intentionally. Use this label with great care! labels Jun 16, 2026
Comment thread src/Storages/ObjectStorage/DataLakes/DeltaLake/TableSnapshot.cpp
ahmadov added 4 commits June 17, 2026 11:29
The previous wording change broke
test_database_delta::test_complex_table_schema's contains_in_log
assertion. Split the init and rebuild branches so the init path emits
the original message verbatim.
Comment thread src/Storages/ObjectStorage/DataLakes/DeltaLake/KernelHelper.cpp Outdated
Comment thread src/Storages/ObjectStorage/DataLakes/DeltaLake/KernelHelper.cpp
Vended catalog credentials (Glue / Unity / REST) are static in the C++
S3 client until `credentials_refresh_callback` fires, but delta-kernel's
Rust object_store bypasses the ClickHouse error handlers that normally
invoke it. So an expired vended session surfaces as `DELTA_KERNEL_ERROR`
with `ExpiredToken` and never recovers.

Plumb the callback through `IObjectStorage::tryRefreshCredentialsViaCallback`
to `IKernelHelper::refreshCredentials`, and add a one-shot retry on
stale-token errors at both kernel touchpoints:
  - `TableSnapshot::initOrUpdateSnapshot` (engine build)
  - `Iterator::scanDataFunc` (per-batch listing, guarded by
    `total_data_files == 0` so we don't replay already-emitted paths)
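
A one-shot retry of the kind described above might look like the sketch below. It is not the code from the PR: `withOneShotCredentialRefresh`, `isStaleTokenError`, and the `safe_to_replay` parameter are hypothetical names, and matching on the `ExpiredToken` substring is an assumption based on the error text quoted in this commit message. The `safe_to_replay` flag plays the role of the `total_data_files == 0` guard.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Assumption: a stale-token failure can be recognized by this substring.
bool isStaleTokenError(const std::string & message)
{
    return message.find("ExpiredToken") != std::string::npos;
}

// Runs kernel_op once; on a stale-token error, refreshes credentials and
// retries exactly once. Any other failure, or a failed refresh, is rethrown.
template <typename Op, typename Refresh>
auto withOneShotCredentialRefresh(Op && kernel_op, Refresh && refresh_credentials, bool safe_to_replay)
    -> decltype(kernel_op())
{
    try
    {
        return kernel_op();
    }
    catch (const std::runtime_error & e)
    {
        // Do not replay if results were already emitted downstream.
        if (!safe_to_replay || !isStaleTokenError(e.what()) || !refresh_credentials())
            throw;
    }
    // Second and final attempt; an exception here propagates to the caller.
    return kernel_op();
}
```

Limiting the retry to a single attempt keeps a persistently failing refresh from looping, and the replay guard avoids emitting the same data file paths twice.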
@clickgapai

Contributor

@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Jun 22, 2026
@ahmadov ahmadov removed pr-synced-to-cloud The PR is synced to the cloud repo pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels Jun 22, 2026
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Jun 22, 2026
@robot-clickhouse robot-clickhouse added the pr-synced-to-cloud The PR is synced to the cloud repo label Jun 22, 2026
@fm4v fm4v removed pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels Jun 22, 2026
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Jun 22, 2026
robot-ch-test-poll added a commit that referenced this pull request Jun 25, 2026
Cherry pick #107480 to 26.3: DeltaLake: replace cached snapshot on update() so kernel sees fresh creds
robot-clickhouse added a commit that referenced this pull request Jun 25, 2026
robot-ch-test-poll added a commit that referenced this pull request Jun 25, 2026
Cherry pick #107480 to 26.4: DeltaLake: replace cached snapshot on update() so kernel sees fresh creds
robot-clickhouse added a commit that referenced this pull request Jun 25, 2026
robot-ch-test-poll added a commit that referenced this pull request Jun 25, 2026
Cherry pick #107480 to 26.5: DeltaLake: replace cached snapshot on update() so kernel sees fresh creds
robot-clickhouse added a commit that referenced this pull request Jun 25, 2026
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Jun 25, 2026
alexey-milovidov added a commit that referenced this pull request Jun 27, 2026
Backport #107480 to 26.5: DeltaLake: replace cached snapshot on update() so kernel sees fresh creds
alexey-milovidov added a commit that referenced this pull request Jun 27, 2026
Backport #107480 to 26.4: DeltaLake: replace cached snapshot on update() so kernel sees fresh creds
alexey-milovidov added a commit that referenced this pull request Jun 27, 2026
Backport #107480 to 26.3: DeltaLake: replace cached snapshot on update() so kernel sees fresh creds

Labels

  • pr-backports-created: Backport PRs are successfully created, it won't be processed by CI script anymore
  • pr-bugfix: Pull request with bugfix, not backported by default
  • pr-must-backport-synced: The `*-must-backport` labels are synced into the cloud Sync PR
  • pr-synced-to-cloud: The PR is synced to the cloud repo
  • v26.2-must-backport
  • v26.3-must-backport
  • v26.4-must-backport
  • v26.5-must-backport


8 participants