Fix flaky 02504_regexp_dictionary_ua_parser on parallel replicas + s3 by murphy-4o · Pull Request #104666 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix flaky 02504_regexp_dictionary_ua_parser on parallel replicas + s3#104666

Merged
murphy-4o merged 1 commit into
ClickHouse:masterfrom
murphy-4o:fix-02504-regexp-dict-flaky
May 12, 2026
Merged

Fix flaky 02504_regexp_dictionary_ua_parser on parallel replicas + s3#104666
murphy-4o merged 1 commit into
ClickHouse:masterfrom
murphy-4o:fix-02504-regexp-dict-flaky

Conversation

@murphy-4o

@murphy-4o murphy-4o commented May 12, 2026

Copy link
Copy Markdown
Member

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry:

Not required (CI fix).

Documentation entry for user-facing changes

  • Documentation is unnecessary (test-only change).

Qualifies the three dictGet calls in 02504_regexp_dictionary_ua_parser.sh with ${CLICKHOUSE_DATABASE}. so the test passes deterministically under parallel replicas.

Why it flakes: dictGet's first argument is kept as a string literal end-to-end. On parallel-replica workers the parallel_replicas cluster has no <default_database> (and ClientInfo doesn't carry one), so the worker's current_database is empty. When parallel_replicas_local_plan=0 (e.g. randomized) routes the getReturnTypeImpl typing call to a worker, ExternalDictionariesLoader::resolveDictionaryName can't find regexp_os in the test database and throws Dictionary (regexp_os) not found. Qualifying the literal with the database name skips the current_database fallback in the resolver.

Smallest-blast-radius fix; the underlying engine issue (workers should inherit current_database, or dict literals should be UUID-rewritten before dispatch) is left for a separate PR. Pairs with #104627 which already took this test off the blacklist.

Version info

  • Merged into: 26.5.1.551

@murphy-4o murphy-4o marked this pull request as draft May 12, 2026 07:09
Qualify the three dictGet dictionary names with ${CLICKHOUSE_DATABASE}
so the worker can resolve them without depending on its current_database.

On parallel-replica runs (the parallel_replicas cluster has no
<default_database>), the worker's current_database is empty, so an
unqualified dictGet('regexp_os', ...) — kept as a string literal through
formattedAST — fails resolveDictionaryNameFromDatabaseCatalog and throws
"Dictionary (regexp_os) not found" during getReturnTypeImpl on the
worker. parallel_replicas_local_plan=1 normally hides this because the
typing call stays on the initiator; randomized settings expose it.

Test-side workaround. Removes the test from parallel_replicas_blacklist
so the FAIL→SKIPPED conversion no longer masks regressions. The
underlying engine bug — workers should inherit the initiator's
current_database, or unqualified dict literals should be rewritten to
UUIDs before dispatch — is left as a separate fix.
@murphy-4o murphy-4o force-pushed the fix-02504-regexp-dict-flaky branch from 66bdf37 to c2b8abe Compare May 12, 2026 07:13
@clickhouse-gh

clickhouse-gh Bot commented May 12, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-ci label May 12, 2026
@murphy-4o murphy-4o marked this pull request as ready for review May 12, 2026 10:08
@alesapin alesapin self-assigned this May 12, 2026
@murphy-4o murphy-4o added this pull request to the merge queue May 12, 2026
Merged via the queue into ClickHouse:master with commit 2b2d7c7 May 12, 2026
165 checks passed
@murphy-4o murphy-4o deleted the fix-02504-regexp-dict-flaky branch May 12, 2026 11:29
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-ci pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants