Allow to specify custom tags to decide the assignment of servers by zuston · Pull Request #30 · apache/uniffle · GitHub
Skip to content

Allow to specify custom tags to decide the assignment of servers#30

Merged
roryqi merged 5 commits into
apache:masterfrom
zuston:tagsAdd
Jul 8, 2022
Merged

Allow to specify custom tags to decide the assignment of servers#30
roryqi merged 5 commits into
apache:masterfrom
zuston:tagsAdd

Conversation

@zuston

@zuston zuston commented Jul 5, 2022

Copy link
Copy Markdown
Member

What changes were proposed in this pull request?

Allow to specify custom tags to decide data placement.
Changelog

  1. Introduce the conf of rss.server.tags for shuffle server
  2. Introduce the conf of rss.client.assignment.tags for client to choose the data placement

Why are the changes needed?

Sometimes, we hope the specified spark/mr job's shuffle data can be stored the specified group of shuffle servers, maybe due to DC location and so on.

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

UTs.

@zuston

zuston commented Jul 5, 2022

Copy link
Copy Markdown
Member Author

@roryqi

roryqi commented Jul 5, 2022

Copy link
Copy Markdown
Contributor

You should add some integration test cases.

@zuston

zuston commented Jul 5, 2022

Copy link
Copy Markdown
Member Author

You should add some integration test cases.

Yes, i’m doing it. Sorry not to add it firstly.

@roryqi

roryqi commented Jul 5, 2022

Copy link
Copy Markdown
Contributor

Em...If we use HDFS to store data, the shuffle server will not store data, data placement is a little inappropriate, Could we give a better name?

Comment thread server/src/main/java/org/apache/uniffle/server/ShuffleServer.java
Comment thread server/src/main/java/org/apache/uniffle/server/ShuffleServer.java Outdated
Comment thread client-mr/src/main/java/org/apache/hadoop/mapreduce/v2/app/RssMRAppMaster.java Outdated
// it will incur humongous allocation, so we set it to 14m.
public static String RSS_CLIENT_READ_BUFFER_SIZE_DEFAULT_VALUE = "14m";
// The tags specified by rss client to determine shuffle data placement.
public static String RSS_CLIENT_DATA_PLACEMENT_TAGS = "rss.client.data.placement.tags";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we give it a better name ?
The reason is as below:
#30 (comment)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rss.client.shuffle-server.assignment.tags ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rss.client.assignment.tags will be better.

public static String RSS_CLIENT_READ_BUFFER_SIZE_DEFAULT_VALUE = "14m";
// The tags specified by rss client to determine shuffle data placement.
public static String RSS_CLIENT_DATA_PLACEMENT_TAGS = "rss.client.data.placement.tags";
public static String RSS_CLIENT_DATA_PLACEMENT_TAGS_DEFAULT_VALUES = Constants.SHUFFLE_SERVER_VERSION;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A shuffle server can have multiple tags. The version tag shouldn't be overrided the configurable tags.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok,I will remove the default value of this conf

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, our client tags should contain version tag in any time.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got your thought.

@codecov-commenter

codecov-commenter commented Jul 8, 2022

Copy link
Copy Markdown

Codecov Report

Merging #30 (d7bb0fc) into master (bba68a9) will increase coverage by 0.03%.
The diff coverage is 67.74%.

@@             Coverage Diff              @@
##             master      #30      +/-   ##
============================================
+ Coverage     56.83%   56.87%   +0.03%     
- Complexity     1204     1207       +3     
============================================
  Files           152      152              
  Lines          8401     8429      +28     
  Branches        813      816       +3     
============================================
+ Hits           4775     4794      +19     
- Misses         3368     3376       +8     
- Partials        258      259       +1     
Impacted Files Coverage Δ
...apache/hadoop/mapreduce/v2/app/RssMRAppMaster.java 0.00% <0.00%> (ø)
...rg/apache/uniffle/client/util/RssClientConfig.java 0.00% <0.00%> (ø)
.../java/org/apache/uniffle/server/ShuffleServer.java 72.26% <71.42%> (-0.30%) ⬇️
.../java/org/apache/hadoop/mapreduce/RssMRConfig.java 91.66% <100.00%> (+0.75%) ⬆️
.../java/org/apache/spark/shuffle/RssSparkConfig.java 88.88% <100.00%> (+1.38%) ⬆️
...org/apache/spark/shuffle/RssSparkShuffleUtils.java 49.20% <100.00%> (+6.34%) ⬆️
...he/uniffle/client/impl/ShuffleWriteClientImpl.java 25.98% <100.00%> (ø)
...a/org/apache/uniffle/server/ShuffleServerConf.java 99.30% <100.00%> (+0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bba68a9...d7bb0fc. Read the comment docs.

@zuston zuston requested a review from roryqi July 8, 2022 05:02

@roryqi roryqi left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for your contribution.

@roryqi roryqi changed the title Allow to specify custom tags to decide data placement Allow to specify custom tags to decide the assignment of servers Jul 8, 2022
@roryqi

roryqi commented Jul 8, 2022

Copy link
Copy Markdown
Contributor

Would you like to add some docs about this feature in another pr? Because this feature brings some user-facing changes, we need some docs.

@zuston

zuston commented Jul 8, 2022

Copy link
Copy Markdown
Member Author

@roryqi roryqi merged commit 4a51ff3 into apache:master Jul 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants