Fix test_disks_app_interactive ls helper to be robust to disks app sort order by groeneai · Pull Request #106730 · ClickHouse/ClickHouse · GitHub
Skip to content

Fix test_disks_app_interactive ls helper to be robust to disks app sort order#106730

Merged
nikitamikhaylov merged 2 commits into
ClickHouse:masterfrom
groeneai:groeneai/fix-tests-unstable-sort-106650
Jun 9, 2026
Merged

Fix test_disks_app_interactive ls helper to be robust to disks app sort order#106730
nikitamikhaylov merged 2 commits into
ClickHouse:masterfrom
groeneai:groeneai/fix-tests-unstable-sort-106650

Conversation

@groeneai

@groeneai groeneai commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Per @alexey-milovidov directive on #106650, harden integration tests that could wrongly assert results dependent on the sort algorithm.

What changes

tests/integration/test_disks_app_interactive/test.py — the ls helper in DisksClient now sorts the parsed listings on the Python side. The disks app sorts entries with std::sort in programs/disks/CommandList.cpp before printing, and test_disks_app_interactive_list_directories_default asserts dict values against fixed alphabetically ordered lists. If the underlying sort changes (different algorithm, different ordering on equal-key inputs), the exact-list assertions would deterministically fail.

The added Python-side sorted is idempotent for the current alphabetical server-side sort. All existing expected lists are already alphabetical; no test assertions need to change.

test_profile_max_sessions_for_user

The directive also mentioned test_profile_max_sessions_for_user. Reviewed each assertion and found nothing order-dependent: the test relies on assert_logs_contain_with_retry (substring search), client.expect(...) (sequential prompt/string matching), and KILL QUERY (output discarded). No list comparisons. If a specific assertion is in scope, please point it out and I will follow up.

Validation

  • Local run with the new ls helper: 25 passed in 11s

    python3 -m ci.praktika run "integration" --test "test_disks_app_interactive/test.py" --count 5
    
  • Simulated three server-side sort orderings (alphabetical, reverse, shuffle):

    Sort mode Old ls passes New ls passes
    alphabetical yes yes
    reverse no yes
    shuffle no yes

Related: #106650

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

...

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Version info

  • Merged into: 26.6.1.546

The recursive `disks list --recursive` output is sorted server-side by
`std::sort` in `programs/disks/CommandList.cpp`, and the test asserts dict
values against alphabetically ordered lists. If that sort algorithm ever
changes (e.g. to an unstable sort that does not happen to produce
alphabetical order on equal-key inputs, or to a different ordering), the
exact-list assertions in `test_disks_app_interactive_list_directories_default`
would fail.

Sort listings on the Python side in the `ls` helper. The change is
idempotent for the current alphabetical server-side sort, and decouples
test assertions from the underlying sort algorithm.

Verified locally with `python3 -m ci.praktika run "integration" --test
"test_disks_app_interactive/test.py" --count 5` (25 passed in 11s).

Related: ClickHouse#106650
@groeneai

groeneai commented Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

Pre-PR validation gate

# Question Answer
a Deterministic repro? Yes — simulated three server-side sort orderings (alphabetical, reverse, random shuffle) against the assertion in test_disks_app_interactive_list_directories_default. Old ls helper passes only on alphabetical; reverse and shuffle deterministically fail. New helper passes on all three. Simulation script driven by the actual disks list --recursive output format from programs/disks/CommandList.cpp.
b Root cause explained? The disks app sorts entries in programs/disks/CommandList.cpp with std::sort and prints them. The Python ls helper in DisksClient parses that output but does not re-sort, so test assertions like traversed_dir == {"./dir1": ["dir11", "dir13"], ...} rely on the server-side sort producing exactly that order. Any change to that sort (algorithm change, switch to unstable sort, different ordering on equal-key inputs) breaks every exact-list assertion in test_disks_app_interactive_list_directories_default.
c Fix matches root cause? Yes — the helper now sorts each parsed listing on the Python side, decoupling the test from the disks app sort order entirely. Not a band-aid: directly addresses the mechanism from (b) by removing the order coupling rather than working around it.
d Test intent preserved / new tests added? Preserved. The assertions still verify the same dict structure and the same set of entries per directory. The change only relaxes ordering. The expected lists in the test were already alphabetical, so the assertion semantics are unchanged. No new test needed — this is a test-helper hardening, not a code bug fix.
e Both directions demonstrated? Yes — proof above (a). Old helper FAILS on reverse/shuffle, NEW helper PASSES on all three. Local run with the new helper: python3 -m ci.praktika run "integration" --test "test_disks_app_interactive/test.py" --count 5 → 25 passed in 11s.
f Fix is general, not a narrow patch? Yes — ls is the single helper through which all client.ls(...) calls flow, so sorting there fixes every order-dependent assertion in the test file at once. No sibling code path missed. test_profile_max_sessions_for_user was reviewed for order-dependent assertions (substring expect, assert_logs_contain_with_retry, discarded KILL QUERY output) — none found, see PR description.

Session id: cron:clickhouse-ci-task-worker:20260608-145700

@groeneai

groeneai commented Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

cc @azat — could you review this? It is a test-only fix per @ alexey-milovidov directive: makes the ls helper in test_disks_app_interactive sort listings on the Python side so assertions do not depend on the disks app sort order.

@davenger davenger self-assigned this Jun 8, 2026
@azat azat added the can be tested Allows running workflows for external contributors label Jun 8, 2026
@clickhouse-gh

clickhouse-gh Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Workflow [PR], commit [2924efe]

Summary:


AI Review

Summary
  • The PR makes test_disks_app_interactive independent of clickhouse disks listing order and scopes the test's host filesystem state by PYTEST_XDIST_WORKER for flaky-check workers. I found no correctness, isolation, or metadata issues that warrant inline review comments.
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-ci label Jun 8, 2026
Under `Integration tests (amd_asan_ubsan, flaky)` praktika runs pytest-xdist
with `--dist=each` and many workers concurrently. All workers share the runner
host filesystem, so the test must scope every persistent path by
`PYTEST_XDIST_WORKER`. The previous patch only sorted `ls` output on the
client side; that addresses sort-order assertions but does nothing for
concurrent workers stomping `/var/lib/clickhouse`, the shared `test/`
directory, and the temporary `a.txt` next to `tests/integration/`.

Changes:
- `DisksClient.default_disk_root_directory` is now per-worker
  (`/var/lib/clickhouse-<gwN>`).
- `getLocalDisksClient` wipes the per-worker root on every refresh, terminates
  any orphan `clickhouse disks` subprocess from the previous iteration, and
  writes a per-worker `config_<gwN>.xml` whose `<path>` points to that root.
- `test_disks_app_interactive_list_directories_default` and
  `test_disks_app_interactive_test_move_and_write` now use per-worker
  directory and file names instead of the shared `test`/`a.txt`.
- `test_disks_app_interactive_cp_and_read` uses per-worker file and directory
  names so concurrent workers cannot delete each other's `a.txt` mid-test.

Verified locally with `python3 -m ci.praktika run "Integration tests
(amd_asan_ubsan, flaky)" --test test_disks_app_interactive` at workers=4 and
workers=8: 20/20 and 40/40 pass, vs. the previous 5/20 fail. Also verified
single-worker mode (5/5 pass) and `--count 10 --workers 4` (50/50 pass).

CI report on the previous flaky-check failure:
https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=106730&sha=110670122e495cdcbc23b0fc7aba7a70679fa5c6&name_0=PR&name_1=Integration%20tests%20%28amd_asan_ubsan%2C%20flaky%29

Related: ClickHouse#106650

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai

groeneai commented Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

Pre-PR validation gate

Follow-up commit 2924efe3aadf addresses the flaky-check failures left by the original commit. Validation table for the new commit:

# Question Answer
a Deterministic repro? Yes. python3 -m ci.praktika run "Integration tests (amd_asan_ubsan, flaky)" --test test_disks_app_interactive --workers 4 deterministically reproduces 5/20 failures on the previous commit 1106701 (FileExistsError: '/var/lib/clickhouse', cannot stat 'tests/integration/a.txt' on disk 'local', missing ./dir2, [] == ['a.txt']).
b Root cause explained? The flaky check uses --dist=each with N pytest-xdist workers running ALL changed tests concurrently on the SAME runner host. DisksClient runs clickhouse disks LOCALLY (not in Docker) with <path>/var/lib/clickhouse/</path> — a single shared root. Workers race on os.mkdir('/var/lib/clickhouse'), on mkdir('test') inside the disks app, and on tests/integration/a.txt (which one worker os.removes while another is mid-test). The cluster.py per-worker isolation does not apply because DisksClient does not use the Docker cluster.
c Fix matches root cause? Yes. The fix scopes every persistent path by PYTEST_XDIST_WORKER: default_disk_root_directory becomes /var/lib/clickhouse-<gwN>, the disks-app config is generated per-worker (config_<gwN>.xml), mkdir('test')/a.txt use per-worker names, and getLocalDisksClient(refresh=True) terminates the old subprocess and wipes its per-worker root before spawning a new one.
d Test intent preserved / new tests added? Preserved. Each test still exercises the same disks-app commands; only the path names differ. No assertions are weakened. Sort-order robustness from the previous commit is kept (sorted on the client side).
e Both directions demonstrated? Yes. With --workers 4 the previous commit produced 5 failures across 4 workers; with this commit 20/20 pass (4 workers x 5 tests). Stress-tested at workers=8 (40/40 pass) and --count 10 --workers 4 (50/50 pass). Single-worker normal-mode also passes (5/5).
f Fix is general, not a narrow patch? Yes. The fix targets the ONE shared subsystem (filesystem state outside Docker) and applies the worker-scoping pattern that cluster.py already uses for Docker isolation, so any other test that uses DisksClient.getLocalDisksClient would inherit the same isolation. No sibling code paths use the same shared /var/lib/clickhouse literal.

Session id: cron:clickhouse-ci-task-worker:20260608-175200

@clickhouse-gh

clickhouse-gh Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.60% 84.50% -0.10%
Functions 92.30% 92.30% +0.00%
Branches 77.20% 77.20% +0.00%

Changed lines: No C/C++ source files changed — skipping uncovered code analysis.

Newly covered by added/modified tests: 268 line(s), 34 function(s) across 113 file(s) · Details

Top files
  • src/Storages/MaterializedView/RefreshTask.cpp: 13 line(s)
  • src/Storages/StorageReplicatedMergeTree.cpp: 11 line(s)
  • src/Storages/ObjectStorageQueue/ObjectStorageQueueIFileMetadata.cpp: 8 line(s), 1 function(s)
  • src/Storages/MergeTree/MergeTreePartsMover.cpp: 8 line(s)
  • src/Functions/array/arrayNorm.cpp: 7 line(s)

Full report

@groeneai

groeneai commented Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

CI fully finished on 2924efe3aadf. Finish Workflow + Mergeable Check + LLVM Coverage all SUCCESS at 2026-06-08T20:42:30Z (≥20-min CIDB ingestion buffer met).

CI Ledger — covered SHA: 2924efe

Result Notes
All 166 checks SUCCESS or SKIPPED Zero failures recorded in CIDB on this SHA
Stateless / Integration / AST fuzzer / Unit tests / Build matrix All green across all sanitizer variants and architectures

PR APPROVED by @davenger. No outstanding reviewer feedback. Awaiting merge.

@nikitamikhaylov nikitamikhaylov added this pull request to the merge queue Jun 9, 2026
Merged via the queue into ClickHouse:master with commit 447efcc Jun 9, 2026
166 checks passed
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jun 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-ci pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants