ML command supports category_field parameter by gaobinlong · Pull Request #3909 · opensearch-project/sql · GitHub

ML command supports category_field parameter#3909

Merged
LantaoJin merged 1 commit into
opensearch-project:mainfrom
gaobinlong:fix_test
Jan 6, 2026

Conversation

@gaobinlong

@gaobinlong gaobinlong commented Jul 22, 2025

Contributor

Description

The documentation for the ML command lists a category_field parameter, but the parameter does not actually work. This PR makes the ML command support the category_field parameter.

Request:

POST _plugins/_ppl?format=jdbc
{
  "query":"source = abcd_test | eval value = cast(value as double) | fields value, category | ml action='trainandpredict' algorithm='rcf' input='value' category_field='category'"
}

Response:

{
  "schema": [
    {
      "name": "value",
      "type": "double"
    },
    {
      "name": "category",
      "type": "string"
    },
    {
      "name": "score",
      "type": "double"
    },
    {
      "name": "anomalous",
      "type": "boolean"
    }
  ],
  "datarows": [
    [
      1,
      "a",
      0,
      false
    ],
    [
      2,
      "b",
      0,
      false
    ]
  ],
  "total": 2,
  "size": 2
}

Related Issues

#3406

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Binlong Gao <gbinlong@amazon.com>

@opensearch-trigger-bot

Contributor

This PR is stalled because it has been open for 30 days with no activity.

@songkant-aws

Collaborator

LGTM

@songkant-aws

Collaborator

@LantaoJin @qianheng-aws @yuancu Need other reviews.

String categoryField =
    arguments.containsKey(CATEGORY_FIELD)
        ? (String) arguments.get(CATEGORY_FIELD).getValue()
        : null;
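
The optional-argument lookup in this snippet can be sketched in isolation. This is a minimal illustration rather than the plugin's actual code: the nested `Literal` class and the `"category_field"` key value are assumptions made for the example, and the helper name `categoryField` is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class CategoryFieldLookup {
  // Assumed key; it mirrors the parameter name used in the PPL query.
  static final String CATEGORY_FIELD = "category_field";

  // Hypothetical stand-in for the plugin's literal argument type.
  static final class Literal {
    private final Object value;

    Literal(Object value) {
      this.value = value;
    }

    Object getValue() {
      return value;
    }
  }

  // Returns the configured category field, or null when the argument is absent.
  static String categoryField(Map<String, Literal> arguments) {
    return arguments.containsKey(CATEGORY_FIELD)
        ? (String) arguments.get(CATEGORY_FIELD).getValue()
        : null;
  }

  public static void main(String[] args) {
    Map<String, Literal> arguments = new HashMap<>();
    System.out.println(categoryField(arguments)); // argument absent
    arguments.put(CATEGORY_FIELD, new Literal("category"));
    System.out.println(categoryField(arguments)); // argument present
  }
}
```

Because a missing argument yields null, every caller that receives the result has to tolerate a null category field.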

Member

If categoryField is null, generateCategorizedInputDataset will throw an NPE.

Collaborator

How so? generateCategorizedInputDataset has a null check:

ExprValue categoryValue = categoryField == null ? null : tupleValue.get(categoryField);

If we want, we can add a @Nullable annotation to that field to document the contract in the signature.
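
To illustrate the null check described above, here is a rough sketch of grouping rows by a nullable category field. It is not the plugin's implementation: rows are modelled as plain `Map<String, Object>` values instead of ExprValue tuples, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CategorizedGrouping {
  // Groups rows by the value of categoryField. When categoryField is null,
  // the field lookup is skipped and every row lands under the null key.
  static Map<String, List<Map<String, Object>>> groupByCategory(
      List<Map<String, Object>> rows, String categoryField) {
    // HashMap is chosen deliberately because it accepts a null key.
    Map<String, List<Map<String, Object>>> groups = new HashMap<>();
    for (Map<String, Object> row : rows) {
      Object raw = categoryField == null ? null : row.get(categoryField);
      String category = raw == null ? null : raw.toString();
      groups.computeIfAbsent(category, k -> new ArrayList<>()).add(row);
    }
    return groups;
  }

  public static void main(String[] args) {
    List<Map<String, Object>> rows =
        List.of(
            Map.of("value", 1.0, "category", "a"),
            Map.of("value", 2.0, "category", "b"));
    System.out.println(groupByCategory(rows, "category").size()); // one group per category
    System.out.println(groupByCategory(rows, null).size()); // a single group
  }
}
```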

Member

Oh, it depends on what kind of Map is used. It seems HashMap can handle a null key in computeIfAbsent(key), but ConcurrentHashMap and other kinds of Map throw an NPE.
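
The difference in null-key handling can be checked with a small standalone program. This is only a demonstration of the standard-library behaviour; the class and helper names are hypothetical and unrelated to the plugin code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class NullKeyDemo {
  // Returns true if computeIfAbsent(null, ...) on the given map throws an NPE.
  static boolean rejectsNullKey(Map<String, Integer> map) {
    try {
      map.computeIfAbsent(null, k -> 0);
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("HashMap rejects null key: " + rejectsNullKey(new HashMap<>()));
    System.out.println(
        "ConcurrentHashMap rejects null key: " + rejectsNullKey(new ConcurrentHashMap<>()));
  }
}
```

HashMap stores the entry under the null key, while ConcurrentHashMap throws, so the null-tolerant grouping is only safe as long as the backing map stays a HashMap.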

@LantaoJin LantaoJin added enhancement New feature or request and removed stalled labels Oct 15, 2025
@opensearch-trigger-bot

Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

@opensearch-trigger-bot

Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

@Swiddis

Swiddis commented Nov 25, 2025

Collaborator

@LantaoJin can you re-review?

@opensearch-trigger-bot

Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

@LantaoJin LantaoJin merged commit 661cb8d into opensearch-project:main Jan 6, 2026
37 of 42 checks passed

LantaoJin pushed a commit to LantaoJin/search-plugins-sql that referenced this pull request Jan 7, 2026
Signed-off-by: Binlong Gao <gbinlong@amazon.com>
(cherry picked from commit 661cb8d)
LantaoJin pushed a commit to LantaoJin/search-plugins-sql that referenced this pull request Jan 7, 2026
Signed-off-by: Binlong Gao <gbinlong@amazon.com>
(cherry picked from commit 661cb8d)
Signed-off-by: Lantao Jin <ltjin@amazon.com>
@LantaoJin LantaoJin added the backport-manually Filed a PR to backport manually. label Jan 7, 2026
LantaoJin added a commit that referenced this pull request Jan 7, 2026
(cherry picked from commit 661cb8d)

Signed-off-by: Binlong Gao <gbinlong@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Co-authored-by: gaobinlong <gbinlong@amazon.com>
aalva500-prog pushed a commit to aalva500-prog/sql that referenced this pull request Jan 12, 2026
Signed-off-by: Binlong Gao <gbinlong@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

Labels

backport 2.19-dev backport-failed backport-manually Filed a PR to backport manually. enhancement New feature or request

4 participants