[ISSUE-378][HugePartition][Part-2] Introduce memory usage limit and data flush by zuston · Pull Request #471 · apache/uniffle · GitHub
Skip to content

[ISSUE-378][HugePartition][Part-2] Introduce memory usage limit and data flush#471

Merged
zuston merged 7 commits into
apache:masterfrom
zuston:memoryLimit
Jan 17, 2023
Merged

[ISSUE-378][HugePartition][Part-2] Introduce memory usage limit and data flush#471
zuston merged 7 commits into
apache:masterfrom
zuston:memoryLimit

Conversation

@zuston

@zuston zuston commented Jan 11, 2023

Copy link
Copy Markdown
Member

What changes were proposed in this pull request?

  1. Introduce memory usage limit for huge partition to keep the regular partition writing stable
  2. Once partition is marked as huge-partition, when its buffer size is greater than rss.server.single.buffer.flush.threshold value, single-buffer flush will be triggered whatever the single buffer flush is enabled or not

Why are the changes needed?

  1. To solve the problems mentioned by [Improvement] Optimize data flushing and memory usage for huge partitions to improve stability  #378

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

  1. UTs

@zuston zuston changed the title [ISSUE-378][HugePartition][Part-2] Introduce memory usage limit [ISSUE-378][HugePartition][Part-2] Introduce memory usage limit and data flush Jan 11, 2023
@zuston zuston requested a review from roryqi January 11, 2023 11:37
@advancedxy

advancedxy commented Jan 12, 2023

Copy link
Copy Markdown
Contributor

@advancedxy

Copy link
Copy Markdown
Contributor

I think these two similar configurations will bring more confusion to end user.

@zuston

zuston commented Jan 12, 2023

Copy link
Copy Markdown
Member Author

I don't think we should introduce a new configuration to control buffer flush. rss.server.single.buffer.flush.enabled can be used for the huge partition buffer flush purpose. If the huge partition limit is enabled, the single buffer flush could be enabled automatically.

If single buffer flush could be enabled automatically, how to set the flush threshold size? I don't hope the rss.server.single.buffer.flush.enabled is enabled for regular partitions, which could be flushed to ssd/hdd directly instead of cold storage. And huge partition could be flushed to HDFS.

I think these two similar configurations will bring more confusion to end user.

Emm. Yes.

@zuston

zuston commented Jan 12, 2023

Copy link
Copy Markdown
Member Author

It seems I got your point. The conf of rss.server.single.buffer.flush.threshold should be specified especially for huge partition whatever the single buffer flush is enabled or not. Right?

@codecov-commenter

codecov-commenter commented Jan 12, 2023

Copy link
Copy Markdown

Codecov Report

Merging #471 (cfffd7a) into master (19a8bac) will increase coverage by 0.07%.
The diff coverage is 75.47%.

@@             Coverage Diff              @@
##             master     #471      +/-   ##
============================================
+ Coverage     58.78%   58.86%   +0.07%     
- Complexity     1704     1714      +10     
============================================
  Files           206      206              
  Lines         11471    11517      +46     
  Branches       1024     1033       +9     
============================================
+ Hits           6743     6779      +36     
- Misses         4317     4324       +7     
- Partials        411      414       +3     
Impacted Files Coverage Δ
...pache/uniffle/server/ShuffleServerGrpcService.java 0.79% <0.00%> (-0.02%) ⬇️
.../org/apache/uniffle/server/ShuffleTaskManager.java 76.50% <76.92%> (-0.06%) ⬇️
...he/uniffle/server/buffer/ShuffleBufferManager.java 83.21% <90.47%> (+0.45%) ⬆️
...a/org/apache/uniffle/server/ShuffleServerConf.java 99.31% <100.00%> (+0.02%) ⬆️
...rg/apache/uniffle/server/ShuffleServerMetrics.java 97.05% <0.00%> (-0.14%) ⬇️
...org/apache/uniffle/server/ShuffleFlushManager.java 84.04% <0.00%> (+0.08%) ⬆️
...g/apache/uniffle/server/ShuffleDataFlushEvent.java 83.67% <0.00%> (+0.34%) ⬆️
...ava/org/apache/uniffle/server/ShuffleTaskInfo.java 100.00% <0.00%> (+5.55%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@advancedxy

Copy link
Copy Markdown
Contributor

I don't think we should introduce a new configuration to control buffer flush. rss.server.single.buffer.flush.enabled can be used for the huge partition buffer flush purpose. If the huge partition limit is enabled, the single buffer flush could be enabled automatically.

If single buffer flush could be enabled automatically, how to set the flush threshold size? I don't hope the rss.server.single.buffer.flush.enabled is enabled for regular partitions, which could be flushed to ssd/hdd directly instead of cold storage. And huge partition could be flushed to HDFS.

I think these two similar configurations will bring more confusion to end user.

Emm. Yes.

Sorry, I forgot to reply this comment.

If I understand the code and design correctly, single.buffer.flushed.enabled is added to support flushing big buffer to cold storage, such as hdfs directly, which I think it perfectly matches you intention when serving huge partitions.

So, I think we could reuse the rss.server.single.buffer.flush.threshold settings, which is 64MB by default. We don't have to introduce rss.server.huge-partition.memory.limit.ratio ?

@zuston

zuston commented Jan 12, 2023

Copy link
Copy Markdown
Member Author

So, I think we could reuse the rss.server.single.buffer.flush.threshold settings, which is 64MB by default. We don't have to introduce rss.server.huge-partition.memory.limit.ratio ?

Memory limit is required. Usually the huge partition writing speed is fast, and the HDFS flushing speed is slower than writing. If missing memory limit, the regular partitions will be affected.

By the way, this design has been introduced into our internal version, it works well. Before this PR, sometimes one huge partition will make other regular partitions buffer require fail.

@advancedxy

Copy link
Copy Markdown
Contributor

So, I think we could reuse the rss.server.single.buffer.flush.threshold settings, which is 64MB by default. We don't have to introduce rss.server.huge-partition.memory.limit.ratio ?

Memory limit is required. Usually the huge partition writing speed is fast, and the HDFS flushing speed is slower than writing. If missing memory limit, the regular partitions will be affected.

By the way, this design has been introduced into our internal version, it works well. Before this PR, sometimes one huge partition will make other regular partitions buffer require fail.

I'm ok to add rss.server.huge-partition.size.threshold setting, which is used to add memory limit. I just think maybe we should reuse rss.server.single.buffer.flush.threshold instead of another rss.server.huge-partition.memory.limit.ratio settings?

@zuston

zuston commented Jan 12, 2023

Copy link
Copy Markdown
Member Author

I'm ok to add rss.server.huge-partition.size.threshold setting, which is used to add memory limit. I just think maybe we should reuse rss.server.single.buffer.flush.threshold instead of another rss.server.huge-partition.memory.limit.ratio settings?

This suggestion has been accepted 😁 (Here: #471 (comment)).

Please review the latest code.

@advancedxy

Copy link
Copy Markdown
Contributor

I'm ok to add rss.server.huge-partition.size.threshold setting, which is used to add memory limit. I just think maybe we should reuse rss.server.single.buffer.flush.threshold instead of another rss.server.huge-partition.memory.limit.ratio settings?

This suggestion has been accepted 😁 (Here: #471 (comment)).

Please review the latest code.

en. I will take a look in details by tomorrow. But on the surface, the rss.server.huge-partition.memory.limit.ratio setting still existed?
image

@zuston

zuston commented Jan 12, 2023

Copy link
Copy Markdown
Member Author

rss.server.huge-partition.memory.limit.ratio is the conf for memory limitation rather than buffer flush threshold. The original conf to control flush for huge partition has been removed. 49c2400

Comment thread server/src/main/java/org/apache/uniffle/server/ShuffleServerGrpcService.java Outdated
Comment thread server/src/main/java/org/apache/uniffle/server/ShuffleTaskManager.java Outdated
@zuston

zuston commented Jan 16, 2023

Copy link
Copy Markdown
Member Author

PTAL @advancedxy. After this PR is merged, I will introduce some metrics about huge partition.

@zuston zuston requested a review from advancedxy January 16, 2023 02:33
@zuston zuston requested a review from advancedxy January 16, 2023 06:18

@advancedxy advancedxy left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@zuston zuston merged commit e802b93 into apache:master Jan 17, 2023
@zuston

zuston commented Jan 17, 2023

Copy link
Copy Markdown
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants