Fix server abort on bad argument count for object-storage table engines (STID 4283-5f31)#103544
Conversation
The defensive arg-count checks inside `addStructureAndFormatToArgsIfNeeded*`
(Azure, HDFS, S3) and `ITableFunctionFileLike::updateStructureAndFormatArgumentsIfNeeded`
threw `LOGICAL_ERROR`. That code path is reachable from user input via
`StorageObjectStorage::addInferredEngineArgsToCreateQuery`, after the
storage is constructed - notably for `DeltaLakeAzure(<single-arg>)` where
`AzureStorageParsedArguments::fromAST` accepts 1 argument as a "lightweight
loading" case but the structure/format helper expects at least 3.
`LOGICAL_ERROR` triggers `abortOnFailedAssertion` in debug/sanitizer
builds, so a malformed `CREATE TABLE` aborts the server, which is what
BuzzHouse hit (STID 4283-5f31):
Logical error: Expected 3 to 10 arguments in table function
azureBlobStorage, got 1
Stack: addStructureAndFormatToArgsIfNeededAzure -> addInferredEngineArgsToCreateQuery
-> InterpreterCreateQuery::doCreateTable
Replaced `LOGICAL_ERROR` with `NUMBER_OF_ARGUMENTS_DOESNT_MATCH` in all
four call sites - matching the upstream `*StorageParsedArguments::fromAST`
validation, which already uses `NUMBER_OF_ARGUMENTS_DOESNT_MATCH` for the
same kind of check. Users now get a graceful, well-typed error instead of
a server abort.
Added stateless test `04119_object_storage_bad_arg_count_no_logical_error`
documenting the expected error code for `file`, `url`, `s3`, `hdfs`, and
`azureBlobStorage` table functions with bad argument counts.
Failure source:
https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=98377&sha=402e066f308294fe00a31ac763f9a72d38f3e96e&name_0=PR&name_1=BuzzHouse%20%28amd_debug%29
|
Pre-PR validation (per task TASK.md Phase 4 Step 9): (a) Deterministic repro? Yes via stack trace from CIDB STID 4283-5f31. The user-facing path (b) Root cause explained? (c) Fix matches root cause? Yes — direct fix: change error code from (d) Test intent preserved? / New tests added? New stateless test (e) Both directions demonstrated? Test passes 30/30 with the fix applied. Without the fix, the test would still pass for the user-facing table-function paths (because (f) Fix is general, not a narrow patch? Yes. Sibling check done across the entire ObjectStorage / TableFunctions surface: all 4 places with the Session: cron:clickhouse-ci-task-worker:20260425-234500 |
|
Workflow [PR], commit [704a360] Summary: ❌
AI ReviewSummaryThis PR changes user-reachable object-storage argument-count checks from Final VerdictStatus: ✅ Approve |
Address review feedback on PR ClickHouse#103544: the original test exercised only the table function paths (`file`, `url`, `s3`, `hdfs`, `azureBlobStorage`); the BuzzHouse failure surfaced through the engine path (`InterpreterCreateQuery::doCreateTable` -> `StorageObjectStorage::addInferredEngineArgsToCreateQuery` -> `addStructureAndFormatToArgsIfNeededAzure`). Added engine-path tests using `CREATE TABLE ... ENGINE = X(<bad arg count>)` for `S3`, `HDFS`, `AzureBlobStorage`, and `DeltaLakeAzure`. Each asserts `NUMBER_OF_ARGUMENTS_DOESNT_MATCH`, ensuring the CREATE TABLE codepath stays covered for future regressions in either `*StorageParsedArguments::fromAST` (the upstream validation) or `addStructureAndFormatToArgsIfNeeded*` (the helper that previously aborted with `LOGICAL_ERROR`). Added `no-msan` tag because `delta-kernel-rs` is disabled under MSan, which makes `DeltaLakeAzure` unavailable. Reviewer: clickhouse-gh[bot] PR: ClickHouse#103544
|
Addressed the engine-path coverage feedback in commit 9ae6712. What's added: Direct engine-path tests via CREATE TABLE t_engine_s3_zero (x UInt32) ENGINE = S3(); -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
CREATE TABLE t_engine_s3_many (x UInt32) ENGINE = S3('a','b','c','d','e','f','g','h','i','j','k','l');
CREATE TABLE t_engine_hdfs_zero (x UInt32) ENGINE = HDFS();
CREATE TABLE t_engine_hdfs_many (x UInt32) ENGINE = HDFS('a','b','c','d','e','f');
CREATE TABLE t_engine_azure_zero (x UInt32) ENGINE = AzureBlobStorage();
CREATE TABLE t_engine_azure_many (x UInt32) ENGINE = AzureBlobStorage('a','b','c','d','e','f','g','h','i','j','k','l');
CREATE TABLE t_engine_delta_azure_zero (x UInt32) ENGINE = DeltaLakeAzure();
CREATE TABLE t_engine_delta_azure_many (x UInt32) ENGINE = DeltaLakeAzure('a','b','c','d','e','f','g','h','i','j','k','l');Each asserts
Added Local verification: All non-DeltaLake test cases pass via |
…nt check These tests call addStructureAndFormatToArgsIfNeededAzure directly via the single-argument lightweight-loading path and an over-the-maximum arg list, the only shapes that reach the helper's arg-count guard. They assert NUMBER_OF_ARGUMENTS_DOESNT_MATCH; reverting the guard to LOGICAL_ERROR aborts the test in debug/ASan builds. The stateless test cannot reach this branch because fromAST rejects those argument counts earlier. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
LLVM Coverage Report
Changed lines: Changed C/C++ lines covered by tests: 29/42 (69.05%) | Lost baseline coverage: none · Uncovered code |
3db3ad2
04119_object_storage_bad_arg_count_no_logical_error is a regression test for PR ClickHouse#103544: every query in it throws on purpose (expected serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH). On a thrown query, executeQuery logs the full query text -- via toOneLineQuery, which keeps comment tokens verbatim -- at Error level, so the file's leading comment ends up in clickhouse-server.err.log. That comment quoted the original pre-fix crash message word for word ("Logical error: Expected 3 to 10 arguments in table function azureBlobStorage, got 1"). The CI server-log scanner (ci/jobs/scripts/log_parser.py) greps the server log with `rg --text -A 10 -o 'Logical error.*'`, matched the comment, used its first line as the failure title and a nearby unrelated stack for the STID, and reported a bogus blocking failure. After ClickHouse#103544 merged on 2026-06-09 the server can no longer emit that message (it now throws NUMBER_OF_ARGUMENTS_DOESNT_MATCH), so the only remaining source of the string was this comment -- which is why the failure appeared on master and across dozens of unrelated PRs starting exactly that day (STID 3574-4812 / 2508-4994). log_parser.py already documents this SQL-comment false-match class. Reword the comment so it no longer contains the verbatim runtime message (refer to the error by its code name LOGICAL_ERROR), and add a note telling future editors to keep the file free of scanner trigger strings. Test-only change; the queries and their expected errors are unchanged and the test still passes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

BuzzHouse hit a server abort when
CREATE TABLE ... ENGINE = DeltaLakeAzure(<single-arg>)reachedaddStructureAndFormatToArgsIfNeededAzurewith one argument. The defensive arg-count check there (and in three sibling helpers) usedLOGICAL_ERROR, whichabortOnFailedAssertions in debug/sanitizer builds.Stack:
The path is reachable because
AzureStorageParsedArguments::fromASTaccepts a single argument as a "lightweight loading" case, while the structure/format helper assumes at least 3 arguments. The same anti-pattern (usingLOGICAL_ERRORfor an arg-count check) lived in four places:src/Storages/ObjectStorage/Azure/Configuration.cpp->addStructureAndFormatToArgsIfNeededAzuresrc/Storages/ObjectStorage/HDFS/Configuration.cpp->addStructureAndFormatToArgsIfNeededHDFSsrc/Storages/ObjectStorage/S3/Configuration.cpp->addStructureAndFormatToArgsIfNeededS3src/TableFunctions/ITableFunctionFileLike.h->updateStructureAndFormatArgumentsIfNeededAll four now throw
NUMBER_OF_ARGUMENTS_DOESNT_MATCH, which is what the upstream*StorageParsedArguments::fromASTvalidation already uses for the same kind of check. Users get a graceful, well-typed error instead of a server abort.Added stateless test
04119_object_storage_bad_arg_count_no_logical_errorthat documents the expected error code forfile,url,s3,hdfs, andazureBlobStoragetable functions with bad argument counts.CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=98377&sha=402e066f308294fe00a31ac763f9a72d38f3e96e&name_0=PR&name_1=BuzzHouse%20%28amd_debug%29
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Fixed a server abort when creating a table with an object-storage engine (
AzureBlobStorage,DeltaLakeAzure,S3,HDFS) using an unsupported number of arguments. The server now returns a cleanNUMBER_OF_ARGUMENTS_DOESNT_MATCHerror instead of aborting in debug/sanitizer builds.Documentation entry for user-facing changes
Version info
26.6.1.548