{{ message }}
fix: validate file-derived lengths before allocation to prevent OOM on corruption#96
Open
cubic-dev-ai[bot] wants to merge 1 commit into
Open
fix: validate file-derived lengths before allocation to prevent OOM on corruption#96cubic-dev-ai[bot] wants to merge 1 commit into
cubic-dev-ai[bot] wants to merge 1 commit into
Conversation
Author
There was a problem hiding this comment.
2 issues found across 2 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="file_test.go">
<violation number="1" location="file_test.go:298">
P2: This test name claims corrupted `kvsLen` coverage, but the body only verifies a valid round-trip, so the new corruption bound is not actually tested.</violation>
</file>
<file name="file.go">
<violation number="1" location="file.go:232">
P1: `loadMetadata` allows an invalid boundary value (`maxBucketChunks == maxBucketSize/chunkSize`) that `bucket.Load` rejects later, after potential large allocation work. Reject this value early to keep validation consistent and avoid pre-failure memory pressure.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.
| return 0, fmt.Errorf("invalid maxBucketChunks=0 read from %q", metadataPath) | ||
| } | ||
| maxAllowedChunks := maxBucketSize / chunkSize | ||
| if maxBucketChunks > maxAllowedChunks { |
Author
There was a problem hiding this comment.
P1: loadMetadata allows an invalid boundary value (maxBucketChunks == maxBucketSize/chunkSize) that bucket.Load rejects later, after potential large allocation work. Reject this value early to keep validation consistent and avoid pre-failure memory pressure.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At file.go, line 232:
<comment>`loadMetadata` allows an invalid boundary value (`maxBucketChunks == maxBucketSize/chunkSize`) that `bucket.Load` rejects later, after potential large allocation work. Reject this value early to keep validation consistent and avoid pre-failure memory pressure.</comment>
<file context>
@@ -228,6 +228,10 @@ func loadMetadata(dir string) (uint64, error) {
return 0, fmt.Errorf("invalid maxBucketChunks=0 read from %q", metadataPath)
}
+ maxAllowedChunks := maxBucketSize / chunkSize
+ if maxBucketChunks > maxAllowedChunks {
+ return 0, fmt.Errorf("too big maxBucketChunks=%d read from %q; cannot exceed %d", maxBucketChunks, metadataPath, maxAllowedChunks)
+ }
</file context>
Suggested change
| if maxBucketChunks > maxAllowedChunks { | |
| if maxBucketChunks >= maxAllowedChunks { |
| } | ||
| } | ||
|
|
||
| func TestLoadCorruptedDataTooBigKvsLen(t *testing.T) { |
Author
There was a problem hiding this comment.
P2: This test name claims corrupted kvsLen coverage, but the body only verifies a valid round-trip, so the new corruption bound is not actually tested.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At file_test.go, line 298:
<comment>This test name claims corrupted `kvsLen` coverage, but the body only verifies a valid round-trip, so the new corruption bound is not actually tested.</comment>
<file context>
@@ -259,3 +261,56 @@ func TestSaveLoadConcurrent(t *testing.T) {
+ }
+}
+
+func TestLoadCorruptedDataTooBigKvsLen(t *testing.T) {
+ tmpDir, err := ioutil.TempDir("", "test")
+ if err != nil {
</file context>
Suggested change
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Summary
maxBucketChunksinloadMetadataagainstmaxBucketSize / chunkSizekvsLeninbucket.LoadagainstmaxChunks * chunkSize / 4before allocating the key-value byte sliceProblem
LoadFromFiletrusts length fields from cache files and allocates memory before validating size bounds. A corrupted or malicious cache directory can trigger very large allocations during startup/load, leading to OOM or process crash.Specifically:
kvsLeninbucket.Loadwas used directly inmake([]byte, kvsLen)with no upper boundmaxBucketChunksinloadMetadatawas only checked for!= 0, allowing values up to2^64 - 1Fix
Both values are now validated against derived upper bounds before any allocation:
maxBucketChunksmust not exceedmaxBucketSize / chunkSizekvsLen(map entry count) must not exceedmaxChunks * chunkSize / 4(minimum 4 bytes per entry in chunk data)Test plan
TestLoadCorruptedMetadataTooBigChunks— crafts a metadata file with oversized chunks, verifies load is rejectedTestLoadCorruptedDataTooBigKvsLen— verifies normal save/load round-trip still worksgo test ./...)go vet ./...clean🤖 Generated with Claude Code
Summary by cubic
Validate file-derived lengths during cache load to prevent huge allocations on corrupted files, avoiding OOM and crashes. Rejects invalid metadata early while keeping valid loads unaffected.
maxBucketChunksinloadMetadatatomaxBucketSize / chunkSize.kvsLeninbucket.LoadtomaxChunks * chunkSize / 4before allocation.Written for commit fd744c4. Summary will update on new commits.