Fix AbstractStorage#containsWriteHandler#281
Conversation
…ehandler when select storage
There was a problem hiding this comment.
Why do we remove this exception?
There was a problem hiding this comment.
I don't know what is the purpose of this exception. And i think it should continue if one storage is corrupted.
There was a problem hiding this comment.
The removal of the Exception seems irreverent with the title, can you revert this change?
In another word, you should change it in another PR if it's really needed.
There was a problem hiding this comment.
DiskErrorToleranceTest#diskErrorTest will fail without this change. https://github.com/apache/incubator-uniffle/actions/runs/3328036207/jobs/5503556615
There was a problem hiding this comment.
Yes. This is a problem, but maybe we should open another issue to solve it.
There was a problem hiding this comment.
DiskErrorToleranceTest#diskErrorTestwill fail without this change. https://github.com/apache/incubator-uniffle/actions/runs/3328036207/jobs/5503556615
But how do I deal with this problem. I think this ut is reasonable. Merge another pr first?
There was a problem hiding this comment.
@jerqi What do you think? fix in another pr or in current pr? if fix in another pr, i will fix DiskErrorToleranceTest#diskErrorTest in current pr.
There was a problem hiding this comment.
I think if #297 fixed, this exception should not be thrown. And this exception will never be thrown in original logic. So i think is ok to remove this exception
There was a problem hiding this comment.
How do we determine that disk is broken?
I think the disk can't be read or written. So I don't think this is a bug.
So we should let the application fail fast. But I already have multiple replicas, so it's unnecessary to fail fast here.
It's ok for me after I think twice.
kaijchen
left a comment
There was a problem hiding this comment.
Thanks @xianjingfeng for the improvement, I have left some suggestions.
| event.getStartPartition())); | ||
| if (storage.containsWriteHandler(event.getAppId(), event.getShuffleId(), event.getStartPartition()) | ||
| && storage.isCorrupted()) { | ||
| throw new RuntimeException("storage " + storage.getBasePath() + " is corrupted"); |
There was a problem hiding this comment.
The removal of the Exception seems irreverent with the title, can you revert this change?
In another word, you should change it in another PR if it's really needed.
Codecov Report
@@ Coverage Diff @@
## master #281 +/- ##
============================================
+ Coverage 59.71% 60.50% +0.79%
- Complexity 1377 1426 +49
============================================
Files 166 175 +9
Lines 8918 9085 +167
Branches 853 873 +20
============================================
+ Hits 5325 5497 +172
+ Misses 3318 3295 -23
- Partials 275 293 +18
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
kaijchen
left a comment
There was a problem hiding this comment.
LGTM, thanks @xianjingfeng for updating the patch.
### What changes were proposed in this pull request? Fix AbstractStorage#containsWriteHandler ### Why are the changes needed? It is a bug, and it is obvious. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Feel unnecessary

What changes were proposed in this pull request?
Fix AbstractStorage#containsWriteHandler
Why are the changes needed?
It is a bug, and it is obvious.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Feel unnecessary