CI: add experimental serverfuzz stress test and BuzzHouse jobs #101399
Add new `Stress test (experimental, serverfuzz, ...)` and `BuzzHouse (experimental, serverfuzz, ...)` CI jobs that re-enable the server-side AST fuzzer (`ast_fuzzer_runs`), which was disabled for regular jobs in #101274. The new jobs cover the same build variants as the existing stress and buzz jobs.

The `ast_fuzzer_runs` / `ast_fuzzer_any_query` settings are activated only when `serverfuzz` is present in `Info().job_name`:
- `stress_job.py` passes `ENABLE_SERVER_FUZZER=1` to docker, and `stress_tests.lib` writes a separate XML config when that env is set.
- `ast_fuzzer_job.py` passes `SERVER_FUZZER_ENABLED=1` to docker, and `run-fuzzer.sh` writes a separate XML config when that env is set.

New jobs are added to both `master.py` and `pull_request.py` (for testing).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
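The job-name gating described above can be illustrated with a minimal sketch. This is not the code from `stress_job.py` or `ast_fuzzer_job.py`: the function name `server_fuzzer_docker_args` is hypothetical, and passing the variable through docker's `-e` flag is an assumption about how the scripts hand it to the container.

```python
from typing import List


def server_fuzzer_docker_args(job_name: str, env_var: str) -> List[str]:
    """Return extra `docker run` arguments for a CI job.

    Hypothetical helper: only jobs whose name contains "serverfuzz"
    get the environment variable that switches the server fuzzer on.
    """
    if "serverfuzz" in job_name:
        # Assumption: the variable is forwarded with docker's `-e NAME=VALUE`.
        return ["-e", f"{env_var}=1"]
    # Regular jobs get nothing extra, so the fuzzer stays disabled for them.
    return []


if __name__ == "__main__":
    print(server_fuzzer_docker_args(
        "Stress test (experimental, serverfuzz, ...)", "ENABLE_SERVER_FUZZER"))
    print(server_fuzzer_docker_args(
        "Stress test (arm_msan)", "ENABLE_SERVER_FUZZER"))
```

Keying on a substring of the job name means a regular job cannot pick up the fuzzer settings unless its name is changed to include `serverfuzz`.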
The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.
Keep the new serverfuzz jobs only in the master workflow. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the server dies, the stress test reports both a generic "Server died" from the test runner and a specific error parsed from server logs (e.g. Logical error, sanitizer finding). Skip the generic "Server died" entry since the log parsing always produces a more specific result. #101399 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
If the server died but log parsing produced no specific result, re-add "Server died" so the failure is never silently lost. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
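The two commits above describe a small filtering rule for the stress test report. A minimal sketch of that rule follows. The `Result` type, the `merge_results` name, and the idea that runner results and log-derived results arrive as two separate lists are all assumptions made for illustration, not the actual structure of the stress test scripts.

```python
from dataclasses import dataclass
from typing import List

SERVER_DIED = "Server died"


@dataclass(frozen=True)
class Result:
    """Hypothetical test result record: a name and a status string."""
    name: str
    status: str


def merge_results(runner_results: List[Result],
                  log_results: List[Result],
                  server_died: bool) -> List[Result]:
    """Combine runner and log-derived results into one report.

    The generic "Server died" entry is dropped in favour of the specific
    errors parsed from the server logs. If the server died but the logs
    yielded nothing specific, the generic entry is re-added so the
    failure still appears in the report.
    """
    merged = [r for r in runner_results if r.name != SERVER_DIED]
    merged.extend(log_results)
    if server_died and not log_results:
        merged.append(Result(SERVER_DIED, "FAIL"))
    return merged


if __name__ == "__main__":
    generic = [Result(SERVER_DIED, "FAIL")]
    specific = [Result("Logical error", "FAIL")]
    print([r.name for r in merge_results(generic, specific, True)])
    print([r.name for r in merge_results(generic, [], True)])
```

The fallback branch is what guarantees that a server death is never silently lost when log parsing comes back empty.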
@groeneai please read this PR.
Read the PR. Here is my understanding:
New jobs (MasterCI only):
Mechanism:
Monitoring updated:
Will start monitoring for failures as they come in.
@nikitamikhaylov Here's a comprehensive catalog of all crashes found by the experimental serverfuzz jobs on master. I've been tracking these since the jobs went live (April 15).
Serverfuzz Crash Catalog: Master Branch (last 30 days)
22 total failures across 7 master commits, mapping to 11 distinct crash types. All are tracked with fix PRs or under active investigation.
Summary
I'll continue working through the queue. The highest priority remaining items are:

Re-enables the server-side AST fuzzer (`ast_fuzzer_runs`) that was disabled in #101274, but only for new dedicated CI jobs that carry the `serverfuzz` and `experimental` keywords in their names. Regular stress and BuzzHouse jobs are not affected.

New jobs added to `master.py` only (not run on PRs):
- `Stress test (experimental, serverfuzz, ...)`: mirrors all existing stress test variants (11 build configs + 2 Azure)
- `BuzzHouse (experimental, serverfuzz, ...)`: mirrors all existing BuzzHouse variants (4 build configs)

The setting is activated by checking `Info().job_name` in the Python job scripts:
- `stress_job.py` passes `ENABLE_SERVER_FUZZER=1` into docker; `stress_tests.lib` writes a separate XML profile when that env is set.
- `ast_fuzzer_job.py` passes `SERVER_FUZZER_ENABLED=1` into docker; `run-fuzzer.sh` writes a separate XML profile when that env is set.

This enables both `ast_fuzzer_runs` and `ast_fuzzer_any_query` to cover write/DDL paths.

Additionally, when the server dies and a specific error is extracted from logs (e.g. Logical error, sanitizer finding), the redundant generic "Server died" test result is now excluded from the report.
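The env-gated XML profile mentioned above can be sketched as follows. In the PR this step lives in shell (`stress_tests.lib` and `run-fuzzer.sh`); the Python below only illustrates the idea. The function names, the output path, the `profiles/default` layout, and the setting values are all assumptions, since the PR text does not state them.

```python
import os
import xml.etree.ElementTree as ET
from typing import Mapping


def render_fuzzer_profile(runs: str, any_query: str) -> str:
    """Build a users-config fragment that sets the two fuzzer settings.

    Assumption: the settings go into the `default` profile; the values
    are supplied by the caller because the PR does not state them.
    """
    root = ET.Element("clickhouse")
    default = ET.SubElement(ET.SubElement(root, "profiles"), "default")
    ET.SubElement(default, "ast_fuzzer_runs").text = runs
    ET.SubElement(default, "ast_fuzzer_any_query").text = any_query
    return ET.tostring(root, encoding="unicode")


def maybe_write_profile(env: Mapping[str, str], env_var: str, path: str,
                        runs: str, any_query: str) -> bool:
    """Write the profile only when the gating variable is set to "1"."""
    if env.get(env_var) != "1":
        return False  # regular job: leave the configuration untouched
    with open(path, "w", encoding="utf-8") as f:
        f.write(render_fuzzer_profile(runs, any_query))
    return True


if __name__ == "__main__":
    # Placeholder values and file name, chosen only for the demonstration.
    written = maybe_write_profile(os.environ, "ENABLE_SERVER_FUZZER",
                                  "server_fuzzer_profile.xml", "1", "1")
    print("profile written" if written else "fuzzer not enabled")
```

Writing a separate file, rather than editing a shared one, keeps the regular jobs' configuration byte-for-byte unchanged when the variable is absent.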
Reverts the disable from: #101274
Changelog category (leave one):
Version info
26.4.1.972