Add validation for expand command on scalar types (#5065) by penghuo · Pull Request #5089 · opensearch-project/sql · GitHub

Add validation for expand command on scalar types (#5065) #5089

Closed
penghuo wants to merge 6 commits into opensearch-project:main from penghuo:fix-issue-5065

@penghuo penghuo commented Jan 29, 2026

Collaborator

Description

Add validation to reject the expand command on scalar types with a clear error message. This addresses issue #5065, where expand fails with a confusing "UNNEST argument must be a collection" error when used on OpenSearch multi-value fields.

Root Cause:
OpenSearch has no dedicated ARRAY type. Any field can hold multiple values, so a multi-value field such as [1, 2, 3] is mapped with a scalar type (e.g., long). When Calcite's uncollect operation is triggered during codegen, it expects an ArraySqlType but receives BasicSqlType(BIGINT), which raises a CalciteException.

Solution:
Validate the field type at planning time and fail fast with a descriptive error message explaining that expand only works on explicitly defined array types, not OpenSearch's implicit multi-value fields.
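The fail-fast check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `FieldKind`, `validateExpandTarget`, and the error text are hypothetical stand-ins, and the schema is modeled as a plain map instead of Calcite's row type so the example runs without any dependencies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ExpandValidationSketch {

  /** Hypothetical, simplified stand-in for planner field types (not Calcite's RelDataType). */
  enum FieldKind { BIGINT, VARCHAR, ARRAY }

  /** Hypothetical planning-time check: only ARRAY-typed fields may be expanded. */
  static void validateExpandTarget(String fieldName, Map<String, FieldKind> schema) {
    FieldKind kind = schema.get(fieldName);
    if (kind == null) {
      throw new IllegalArgumentException("Unknown field: " + fieldName);
    }
    if (kind != FieldKind.ARRAY) {
      // Fail fast, before any uncollect/UNNEST node would be built.
      // The message wording is illustrative, not the PR's actual text.
      throw new UnsupportedOperationException(
          "Cannot expand field '" + fieldName + "' of type " + kind
              + ": expand requires an explicitly defined array type, "
              + "not an implicit multi-value field.");
    }
  }

  public static void main(String[] args) {
    Map<String, FieldKind> schema = new LinkedHashMap<>();
    schema.put("tags", FieldKind.ARRAY);   // explicitly declared array type
    schema.put("nums", FieldKind.BIGINT);  // multi-value field mapped as long

    validateExpandTarget("tags", schema);  // accepted, no exception
    try {
      validateExpandTarget("nums", schema);
    } catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The point of checking at planning time is that the user sees a message naming the field and its declared type, instead of an exception raised later from generated code.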

Impact:

  • Only affects expand command
  • No performance impact
  • Existing expand tests on true array types continue to work
  • Users get clear guidance instead of cryptic Calcite errors

Related Issues

Resolves #5065

Testing

  • Added integration test Issue5065IT that verifies the error message
  • All existing expand tests pass (CalciteExpandCommandIT)
  • Test command: ./gradlew :integ-test:integTest --tests "org.opensearch.sql.calcite.Issue5065IT"

Check List

  • New functionality includes testing
  • Commits are signed per the DCO using --signoff
  • Public documentation issue/PR created

Signed-off-by: Peng Huo <penghuo@gmail.com>
…/index keywords

When a PPL query contains duplicate 'source' or 'index' keywords
(e.g., 'source source=index_name'), the parser was accepting it as
valid syntax, treating the first keyword as a search expression.
This caused confusing errors later when OpenSearch tried to expand
fields.

This fix adds validation in AstBuilder.visitSearchFrom() to detect
when reserved keywords 'source' or 'index' appear as search
expressions before the fromClause. It now throws a clear
SyntaxCheckException with a helpful error message suggesting the
correct syntax.

Signed-off-by: Peng Huo <penghuo@gmail.com>
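The keyword check described in the commit message above might look roughly like the sketch below. It is an assumption-laden illustration, not the actual AstBuilder.visitSearchFrom() code: the terms before the from-clause are modeled as a plain list of strings, `checkLeadingSearchTerms` is a hypothetical helper, `SyntaxCheckException` is a local stand-in class, the error text is invented, and case-insensitive matching is assumed.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class SearchFromValidationSketch {

  /** Reserved keywords that should not appear as bare search expressions. */
  private static final Set<String> RESERVED = Set.of("source", "index");

  /** Local stand-in for the parser's SyntaxCheckException (hypothetical here). */
  static final class SyntaxCheckException extends RuntimeException {
    SyntaxCheckException(String message) {
      super(message);
    }
  }

  /**
   * Rejects a bare 'source' or 'index' term that appears before the
   * from-clause, e.g. the first token in "source source=index_name".
   */
  static void checkLeadingSearchTerms(List<String> termsBeforeFrom) {
    for (String term : termsBeforeFrom) {
      // Assumption: keywords are matched case-insensitively.
      if (RESERVED.contains(term.toLowerCase(Locale.ROOT))) {
        throw new SyntaxCheckException(
            "Unexpected keyword '" + term + "' before the from clause; "
                + "did you mean 'source=<index>'?");
      }
    }
  }

  public static void main(String[] args) {
    checkLeadingSearchTerms(List.of("error"));  // ordinary search term, accepted
    try {
      checkLeadingSearchTerms(List.of("source"));  // duplicate keyword, rejected
    } catch (SyntaxCheckException e) {
      System.out.println(e.getMessage());
    }
  }
}
```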
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
…#5065)

- Validate field type in buildExpandRelNode before uncollect operation
- Throw UnsupportedOperationException with clear message for scalar types
- OpenSearch multi-value fields stored as scalars cannot be expanded when codegen is triggered
- Add integration test to verify error message

Signed-off-by: Peng Huo <penghuo@gmail.com>
coderabbitai Bot commented Jan 29, 2026

Contributor

@penghuo penghuo closed this Jan 29, 2026

Development

Successfully merging this pull request may close these issues.

[BUG] Calcite PPL doesn't handle array value columns if codegen triggered
