feat(debug-files): scan inside .zip archives on upload #1141
Merged
Add ZIP archive scanning to `debug-files upload`, matching the legacy sentry-cli behavior. `.zip` files encountered during a scan are expanded in memory and their entries run through the same filter pipeline as on-disk files. Disable with `--no-zips`.

- New `src/lib/dif/zip.ts`: `readZipDifEntries()` detects archives by `.zip` extension + `PK` magic, extracts entries via `fflate.unzipSync`, and bounds decompression with a pre-decompression size filter (zip-bomb guard). Directory and empty entries are dropped; nested archives are not recursed.
- `prepareDifs()` gains a `scanZips` option (default true); the per-file path is extracted into `prepareFileDif()` and the parse step into the shared `difFromBuffer()` so on-disk and in-memory entries share logic.
- `upload` command: add `--no-zips` flag; thread `scanZips` through.

Tests: 12 zip unit tests + 2 command wiring tests. Docs/skill updated.
Codecov Results 📊

✅ Patch coverage is 89.80%. Project has 5114 uncovered lines.

Files with missing lines (2)

Coverage diff

| Metric | main | #PR | +/- |
| --- | --- | --- | --- |
| Coverage | 81.48% | 81.52% | +0.04% |
| Files | 396 | 397 | +1 |
| Lines | 27588 | 27671 | +83 |
| Branches | 17912 | 17966 | +54 |
| Hits | 22478 | 22557 | +79 |
| Misses | 5110 | 5114 | +4 |
| Partials | 1866 | 1868 | +2 |
Source bundles (and JVM bundles / .src.zip) are a ZIP preceded by an 8-byte SYSB+version header, so they start with SYSB, not PK. Verify readZipDifEntries returns null for them so they upload as sourcebundle DIFs rather than being expanded into their inner source files.
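The distinction raised in this comment can be illustrated with a small magic-byte check. This is only a sketch: the helper names (`startsWith`, `looksLikeScannableZip`, `isSourceBundle`) are hypothetical and are not the PR's code, the version bytes in the sample are arbitrary placeholders, and the case-insensitive extension match is an assumption. Only the `PK` local-header signature and the `SYSB` prefix come from the discussion.

```typescript
// Hypothetical helpers for illustration; not the actual readZipDifEntries() code.
const ZIP_LOCAL_HEADER = [0x50, 0x4b, 0x03, 0x04]; // "PK\x03\x04"
const SOURCE_BUNDLE_MAGIC = [0x53, 0x59, 0x53, 0x42]; // "SYSB"

/** True when `bytes` begins with every byte of `magic`. */
function startsWith(bytes: Uint8Array, magic: number[]): boolean {
  if (bytes.length < magic.length) return false;
  return magic.every((value, index) => bytes[index] === value);
}

/** Only files named *.zip that also begin with a ZIP local header are expanded. */
function looksLikeScannableZip(fileName: string, bytes: Uint8Array): boolean {
  // Assumption: the extension match is case-insensitive.
  return fileName.toLowerCase().endsWith(".zip") && startsWith(bytes, ZIP_LOCAL_HEADER);
}

/** Source bundles carry the SYSB header in front of the embedded ZIP. */
function isSourceBundle(bytes: Uint8Array): boolean {
  return startsWith(bytes, SOURCE_BUNDLE_MAGIC);
}

// "SYSB" + 4 placeholder version bytes, followed by the embedded ZIP's local header.
const sourceBundle = new Uint8Array([...SOURCE_BUNDLE_MAGIC, 2, 0, 0, 0, ...ZIP_LOCAL_HEADER]);
// A plain archive starts directly with the local header.
const plainZip = new Uint8Array([...ZIP_LOCAL_HEADER, 0x14, 0x00]);

const bundleIsExpanded = looksLikeScannableZip("app.src.zip", sourceBundle);
const zipIsExpanded = looksLikeScannableZip("symbols.zip", plainZip);
```

Under these assumptions `app.src.zip` is left alone because its first bytes are `SYSB`, so it can still be parsed as a single source bundle, while `symbols.zip` is eligible for expansion.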
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 4a41f2e.
CI status: all green except an external Anthropic API outage

Every check passes except

This PR does not touch the skill or any LLM path. Will re-run once the Anthropic API recovers.
…ession

Node 24.17.0 / 22.23.0 (CVE-2026-48931 http.Agent fix) added a `data` listener on idle sockets that makes keep-alive fetch reuse throw false ERR_STREAM_PREMATURE_CLOSE errors. The skill-eval E2E planner hits this as 'Premature close' talking to api.anthropic.com. Fixed in 24.18.0 (nodejs/node#64004). A floating `node-version: "24"` silently reuses the runner's pre-cached buggy 24.17.0, so pin the exact patched version.
Addresses Seer/Bugbot/warden comments and an adversarial review of the in-place .zip scanner:

- Bound cumulative decompression per archive (`maxTotalSize`, default 2 GiB). `unzipSync` materializes every accepted entry at once, so the per-entry gate alone could not stop a flat zip-bomb or a huge legit archive from exhausting memory.
- Gate the container's on-disk size before reading it into memory, mirroring the on-disk peek-then-read discipline (warden M2).
- Skip entries using a compression method fflate cannot inflate instead of letting `unzipSync` throw and discard the whole archive (and its valid sibling DIFs).
- Make oversized zip entries advisory-only: they warn per entry but no longer feed the exit-driving `oversizedCount`. A compressed entry's format is unknown pre-decompression, so counting it produced false "all matched files too large" failures when an unrelated large asset sat inside a .zip (Bugbot/H2).
- Guard non-finite `originalSize` as defense-in-depth (Seer).

Adds regression tests for unsupported-compression survival, the cumulative budget, the container gate, and the advisory oversized accounting.
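The gating rules listed in this commit can be sketched as a pure planning step that decides, before any inflation, which entries to accept. This is a hedged illustration, not the PR's code: `EntryInfo`, `ScanLimits`, `planExtraction`, `rejectReason` and the reason strings are hypothetical names, the entry shape is only loosely modeled on the metadata a ZIP reader exposes before decompressing, and the choice to keep scanning after the budget is hit (so a later, smaller entry can still fit) is an assumption.

```typescript
// Hypothetical types and names; only the rules themselves come from the commit message.
interface EntryInfo {
  name: string;
  originalSize: number; // declared uncompressed size
  compression: number; // ZIP method id: 0 = stored, 8 = deflate
}

interface ScanLimits {
  maxEntrySize?: number; // omitted = no per-entry gate
  maxTotalSize: number; // cumulative budget per archive
}

type SkipReason =
  | "invalid-size"
  | "directory-or-empty"
  | "unsupported-compression"
  | "entry-too-large"
  | "budget-exceeded";

interface Plan {
  accepted: string[];
  skipped: { name: string; reason: SkipReason }[];
}

// Assumption: only stored and deflate entries can be inflated.
const SUPPORTED_METHODS = new Set([0, 8]);

/** Returns why an entry must be skipped, or undefined when it may be inflated. */
function rejectReason(entry: EntryInfo, limits: ScanLimits, usedSoFar: number): SkipReason | undefined {
  if (!Number.isFinite(entry.originalSize) || entry.originalSize < 0) return "invalid-size";
  if (entry.name.endsWith("/") || entry.originalSize === 0) return "directory-or-empty";
  if (!SUPPORTED_METHODS.has(entry.compression)) return "unsupported-compression";
  if (limits.maxEntrySize !== undefined && entry.originalSize > limits.maxEntrySize) return "entry-too-large";
  if (usedSoFar + entry.originalSize > limits.maxTotalSize) return "budget-exceeded";
  return undefined;
}

/** Walks the entries in order, charging each accepted entry against the budget. */
function planExtraction(entries: EntryInfo[], limits: ScanLimits): Plan {
  const plan: Plan = { accepted: [], skipped: [] };
  let total = 0;
  for (const entry of entries) {
    const reason = rejectReason(entry, limits, total);
    if (reason) {
      plan.skipped.push({ name: entry.name, reason });
      continue;
    }
    total += entry.originalSize;
    plan.accepted.push(entry.name);
  }
  return plan;
}

const plan = planExtraction(
  [
    { name: "dir/", originalSize: 0, compression: 0 },
    { name: "a.debug", originalSize: 600, compression: 8 },
    { name: "b.debug", originalSize: 600, compression: 8 },
    { name: "c.debug", originalSize: 100, compression: 99 },
    { name: "d.debug", originalSize: 300, compression: 0 },
    { name: "e.debug", originalSize: Number.NaN, compression: 8 },
    { name: "big.bin", originalSize: 5000, compression: 8 },
  ],
  { maxEntrySize: 1000, maxTotalSize: 1000 },
);
// plan.accepted is ["a.debug", "d.debug"]; the other five entries are skipped with a reason.
```

Keeping the decision separate from decompression means a single unsupported or oversized entry only removes itself from the plan, which matches the goal of not discarding valid sibling DIFs.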
BYK added a commit that referenced this pull request on Jul 2, 2026
## Summary

Follow-up to #1140, addressing Warden finding **9LL-87A** (a comment we missed before merge).

A **partial** size-drop during `debug-files upload` silently exited `0` and printed "Uploaded N debug file(s)". The *all-dropped* cases already fail loudly (`filterBySize` throws `ValidationError`, and the scan-time all-dropped path exits via `doNothingToUpload(... oversizedCount > 0 → exit 1)`), but a partial drop did not honor that same "oversized ⇒ non-zero exit" contract. Oversized files left no result entry, so `doUpload` never counted them as failures.

Two disjoint drop sites both had the gap:

1. **Upload-time (`filterBySize`)**: most relevant for in-memory `--include-sources` source bundles, which bypass the scan-time size gate and can only be caught here. `uploadDebugFiles` now returns these as `error` results so the existing `doUpload` failure path sets exit 1 and lists them.
2. **Scan-time (`prepareDifs` → `oversizedCount`)**: regular oversized files are dropped before the upload queue is built. `doUpload` now receives `oversizedCount`/`maxFileSize` and exits non-zero on a partial drop, matching `doNothingToUpload`.

Accepted (in-cap) files are still uploaded; only the exit code and hint change so a partial drop is no longer mistaken for a clean success.

## Warden findings on #1140

- **9LL-87A** (partial size-drop silently exits 0): **fixed here.**
- **ANP-RKY** (size gate applied after full file read): **already fixed** by #1141. `prepareFileDif` now gates on `peeked.size` (from `fd.stat()`) *before* `readFile`, so oversized files are never buffered.

## Tests

- `test/lib/api/debug-files.test.ts`: the partial-drop test now asserts the oversized file is **not** assembled but **is** returned as an `error` result.
- `test/commands/debug-files/upload.test.ts`: new test; a partial scan-time size-drop still uploads the in-cap file but exits `1`.

`pnpm run typecheck`, `biome check`, and both affected test files (40 tests) pass.
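The "oversized ⇒ non-zero exit" contract described in this commit can be sketched as a single exit-code decision that takes both drop sites into account. The `UploadResult` shape and the `exitCodeFor` helper are hypothetical and are not the signatures of `doUpload` or `uploadDebugFiles`; the sketch only assumes that upload-time drops surface as `error` results and scan-time drops arrive as a count.

```typescript
// Hypothetical shapes for illustration; not the real doUpload() signature.
interface UploadResult {
  name: string;
  status: "ok" | "error";
}

/**
 * Exit non-zero when any file failed at upload time (including files dropped
 * by the upload-time size filter and reported as errors) or when any file was
 * dropped at scan time for being oversized.
 */
function exitCodeFor(results: UploadResult[], oversizedCount: number): number {
  const anyFailed = results.some((result) => result.status === "error");
  return anyFailed || oversizedCount > 0 ? 1 : 0;
}

// Clean run: everything uploaded, nothing dropped.
const cleanExit = exitCodeFor([{ name: "a.debug", status: "ok" }], 0);
// Partial scan-time drop: the in-cap file uploaded, one oversized file was skipped.
const partialScanDropExit = exitCodeFor([{ name: "a.debug", status: "ok" }], 1);
// Partial upload-time drop: an oversized source bundle came back as an error result.
const partialUploadDropExit = exitCodeFor(
  [
    { name: "a.debug", status: "ok" },
    { name: "bundle.src.zip", status: "error" },
  ],
  0,
);
```

In this sketch the accepted files are unaffected; only the exit code changes, which mirrors the stated goal that a partial drop should not read as a clean success.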


## Summary

Adds ZIP archive scanning to `sentry debug-files upload`, matching the legacy `sentry-cli` behavior (`try_open_zip` / `walk_difs_zip`). This is Tier B of the `debug-files upload` roadmap (Tier A core upload landed in #1139, limits + `--derived-data` in #1140).

`.zip` files encountered during a scan are now expanded in memory and their entries run through the same filter pipeline (`--type` / `--id` / feature filters, size gate) as on-disk files. Pass `--no-zips` to skip them.

## Implementation

- `src/lib/dif/zip.ts`, `readZipDifEntries()`:
  - Detects archives by `.zip` extension and `PK` local-header magic (a misnamed non-archive falls back to normal parsing; a real DIF named `foo.zip` still works).
  - Extracts entries via `fflate.unzipSync`.
  - `filter` rejects entries whose declared uncompressed size exceeds the server's `max_file_size` (or omitted = no gate) before inflation, so oversized entries are never materialized. Directory/empty entries are dropped.
  - Nested archives are not recursed (a `.zip` inside a `.zip` is an opaque, non-object entry), matching the legacy tool.
- `prepareDifs()` gains a `scanZips` option (default `true`). The on-disk per-file logic is extracted into `prepareFileDif()`, and the parse step into a shared `difFromBuffer()` so on-disk files and in-memory ZIP entries share the same parse + filter path. Entry display paths are synthetic `"<zip>/<entry>"` so logs and DIF names stay meaningful.
- `upload` command: adds the `--no-zips` flag and threads `scanZips: !flags["no-zips"]` through.

## Notes / scope

- `fflate` added as a devDependency (bundled at build time; `check:deps` passes).
- Archives are expanded in memory (`unzipSync` is non-streaming); typical symbol zips are tens/hundreds of MB. Per-entry decompression remains bounded by the size gate.
- Not in scope: `--symbol-maps` (BCSymbolMap) and `--il2cpp-mapping` line mappings; these need new `symbolic` WASM exports.

## Testing

- `test/lib/dif/zip.test.ts`: 12 unit tests (magic/extension detection, malformed input, size gate, no-recursion, `scanZips` toggle, `--type` filtering, container not double-counted, misnamed-archive fallback). Archives built in memory with `fflate.zipSync`; no binary fixtures.
- `test/commands/debug-files/upload.test.ts`: 2 command-level tests (`.zip` scanned by default; `--no-zips` ignores entries).
- `dif` + `debug-files` tests pass; `typecheck`, `lint`, `check:deps`, `check:patches`, `check:fragments` clean.