Add category_field to AD command in PPL by joshuali925 · Pull Request #952 · opensearch-project/sql · GitHub

Add category_field to AD command in PPL#952

Merged
joshuali925 merged 6 commits into opensearch-project:2.x from joshuali925:ad-agg-field on Oct 26, 2022
Conversation

joshuali925 (Member) commented Oct 21, 2022

Description

Add a category_field argument to the AD command. When it is set, PPL groups the input rows by category and sends each group to AD separately for prediction. This is a temporary solution until ml-commons supports two-dimensional data.

Re-enable the doctest for AD, which was disabled by #575.
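
The grouping described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `CategoryGrouping`, `Detector`, and the use of plain maps as rows are hypothetical stand-ins for the plugin's DataFrame and ml-commons types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CategoryGrouping {

  /** Hypothetical stand-in for the anomaly detection call; not the ml-commons API. */
  public interface Detector {
    List<Map<String, Object>> predict(List<Map<String, Object>> rows);
  }

  /** Splits rows by the value of categoryField; a null field yields a single group. */
  public static Map<String, List<Map<String, Object>>> groupByCategory(
      List<Map<String, Object>> rows, String categoryField) {
    // LinkedHashMap keeps groups in the order their category first appears.
    Map<String, List<Map<String, Object>>> groups = new LinkedHashMap<>();
    for (Map<String, Object> row : rows) {
      // Without a category field, every row falls into one shared group.
      String key = categoryField == null ? "" : String.valueOf(row.get(categoryField));
      groups.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
    }
    return groups;
  }

  /** Runs the detector once per group and concatenates the per-group results. */
  public static List<Map<String, Object>> detectPerCategory(
      List<Map<String, Object>> rows, String categoryField, Detector detector) {
    List<Map<String, Object>> results = new ArrayList<>();
    for (List<Map<String, Object>> group : groupByCategory(rows, categoryField).values()) {
      results.addAll(detector.predict(group));
    }
    return results;
  }
}
```

Because each group is scored independently, output rows come back grouped by category rather than in their original input order.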

Issues Resolved

[List any issues this PR will resolve]

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

joshuali925 added the v2.4.0 'Issues and PRs related to version v2.4.0' label Oct 21, 2022
joshuali925 self-assigned this Oct 21, 2022
codecov-commenter commented Oct 21, 2022

Signed-off-by: Joshua Li <joshuali925@gmail.com>
joshuali925 marked this pull request as ready for review October 25, 2022 20:00
joshuali925 requested a review from a team as a code owner October 25, 2022 20:00
super.open();
DataFrame inputDataFrame = generateInputDataset(input);
String categoryField = arguments.containsKey(CATEGORY_FIELD)
    ? (String) arguments.get(CATEGORY_FIELD).getValue() : null;

Collaborator

Can we avoid using a null value here?

Member Author

What do you suggest? I thought about Optional, but the code style says not to use Optional as a function parameter, and the other arguments default to null:

return BatchRCFParams.builder()
    .numberOfTrees(arguments.containsKey(NUMBER_OF_TREES)
        ? ((Integer) arguments.get(NUMBER_OF_TREES).getValue())
        : null)
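
One possible middle ground for the null-versus-Optional question is to keep Optional inside a small lookup helper so it never appears as a parameter. The sketch below is only an illustration under assumptions: `ArgumentLookup` and `getOrNull` are hypothetical names, and the argument map is simplified to `Map<String, Object>` rather than the literal-valued map used in the snippets above.

```java
import java.util.Map;
import java.util.Optional;

public class ArgumentLookup {

  /**
   * Returns the argument stored under key, cast to the requested type,
   * or null when the key is absent. Optional is used only internally,
   * so callers still see the same null default as the other arguments.
   */
  public static <T> T getOrNull(Map<String, Object> arguments, String key, Class<T> type) {
    return Optional.ofNullable(arguments.get(key))
        .map(type::cast) // throws ClassCastException if the stored value has another type
        .orElse(null);
  }
}
```

With a helper like this, each `containsKey(...) ? (Type) get(...) : null` ternary would collapse to a single call such as `getOrNull(arguments, "category_field", String.class)`.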

penghuo (Collaborator) commented Oct 26, 2022

Agreed, this is a temporary solution. Please add an issue to track the AD change in a future release.

joshuali925 (Member, Author) commented Oct 26, 2022

joshuali925 merged commit 2a16227 into opensearch-project:2.x Oct 26, 2022
dai-chen added the PPL (Piped processing language) and ml (Issues related to integration with ML commons and plugin) labels Oct 27, 2022
joshuali925 mentioned this pull request Nov 2, 2022
6 tasks
Labels

ml Issues related to integration with ML commons and plugin PPL Piped processing language v2.4.0 'Issues and PRs related to version v2.4.0'

5 participants