fix(transfer_manager): Prevent path traversal in `download_many_to_path` by chandra-siri · Pull Request #1768 · googleapis/python-storage · GitHub
This repository was archived by the owner on Mar 31, 2026. It is now read-only.

fix(transfer_manager): Prevent path traversal in download_many_to_path#1768

Merged
chandra-siri merged 14 commits into
mainfrom
disallow_folder_traversal
Mar 11, 2026
Conversation

@chandra-siri chandra-siri commented Mar 9, 2026

Collaborator

fix(transfer_manager): Prevent path traversal in download_many_to_path

This PR addresses a security vulnerability where download_many_to_path could be exploited to write files outside the intended destination directory.

The fix ensures that the resolved path for each blob download remains within the bounds of the user-provided destination_directory. If a blob name would result in a path outside this directory (e.g., by using ../), a warning is issued, and that specific blob download is skipped. This prevents directory traversal attacks.

Absolute paths in blob names (e.g., /etc/passwd) are now treated as relative to the destination_directory, so /etc/passwd will be downloaded to destination_directory/etc/passwd.
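The behaviour described above can be sketched as follows. This is a minimal illustration, not the code merged in this PR: the helper name `resolve_within` and the wording of the warning are assumptions. It treats a leading `/` in the blob name as relative to the destination directory and skips (with a warning) any name that resolves outside it.

```python
import warnings
from pathlib import Path
from typing import Optional


def resolve_within(destination_directory: str, blob_name: str) -> Optional[Path]:
    """Return the local path for blob_name, or None if it escapes the directory.

    Hypothetical helper for illustration only; not the library's implementation.
    """
    base = Path(destination_directory).resolve()
    # Strip leading slashes so "/etc/passwd" becomes "etc/passwd"
    # under the destination directory instead of an absolute path.
    candidate = (base / blob_name.lstrip("/")).resolve()
    try:
        # relative_to raises ValueError when candidate is not under base.
        candidate.relative_to(base)
    except ValueError:
        warnings.warn(f"Skipping blob {blob_name!r}: it resolves outside {base}")
        return None
    return candidate
```

Under these assumptions, `resolve_within("downloads", "../secret.txt")` returns `None` and emits a warning, while `resolve_within("downloads", "/etc/passwd")` returns a path ending in `downloads/etc/passwd`.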

See b/449616593 for more details.

BREAKING CHANGE: Blobs that would resolve to a path outside the destination_directory are no longer downloaded. While this is a security fix, users relying on the previous behavior to write files outside the target directory will see a change.

@product-auto-label product-auto-label Bot added the size: m and api: storage labels Mar 9, 2026
@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

The pull request introduces a Path object and a _resolve_path function to resolve the destination path for downloaded blobs in the download_many_to_path function, preventing path traversal vulnerabilities. The tests were updated to reflect the change in destination path. Review feedback:

* The implementation of _resolve_path is incorrect for relative paths and will raise a ValueError when blob_path is a relative path.
* The pathlib.Path.is_relative_to() method was introduced in Python 3.9. Since this library supports Python 3.7+, using this method will cause an AttributeError on Python 3.7 and 3.8.
* The traceback module is imported but never used in the file.
* The TODO comment should be replaced with a concrete example before merging.
* The docstring for destination_directory contains a TODO comment, a reference to the old os.path.join implementation, and a redundant parenthetical note.
* The expression Path(destination_directory).resolve() is evaluated on every iteration of the loop.
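Two of the review points lend themselves to a short sketch: a containment check that does not need `Path.is_relative_to()` (so it runs on Python 3.7 and 3.8), and resolving the destination directory once before the loop. The names `is_within` and `plan_downloads` are hypothetical and used only for illustration; this is not the code that was merged.

```python
from pathlib import Path
from typing import Dict, Iterable


def is_within(path: Path, base: Path) -> bool:
    """Stand-in for path.is_relative_to(base) that works before Python 3.9."""
    try:
        # relative_to raises ValueError when path is not under base,
        # including when one path is relative and the other absolute.
        path.relative_to(base)
        return True
    except ValueError:
        return False


def plan_downloads(destination_directory: str,
                   blob_names: Iterable[str]) -> Dict[str, Path]:
    """Map each safe blob name to its local path; unsafe names are left out."""
    # Resolve the base directory once, outside the loop.
    base = Path(destination_directory).resolve()
    plan = {}
    for name in blob_names:
        # Resolve each target to an absolute path before comparing.
        target = (base / name.lstrip("/")).resolve()
        if is_within(target, base):
            plan[name] = target
    return plan
```

Resolving both paths before calling `relative_to` avoids the `ValueError` the review describes for relative inputs, since the comparison is then always between two absolute paths.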

@product-auto-label product-auto-label Bot added the size: l label and removed the size: m label Mar 10, 2026
@chandra-siri chandra-siri marked this pull request as ready for review March 10, 2026 07:06
@chandra-siri chandra-siri requested review from a team as code owners March 10, 2026 07:06
@chandra-siri chandra-siri changed the title from "fix: prevent downloading file in directory outside" to "fix(transfer_manager): Prevent path traversal in download_many_to_path" Mar 10, 2026
@chandra-siri chandra-siri changed the title fix(transfer_manager): Prevent path traversal in download_many_to_path fix(transfer_manager): Prevent path traversal in download_many_to_path Mar 10, 2026
@chandra-siri chandra-siri merged commit 700fec3 into main Mar 11, 2026
15 checks passed
@chandra-siri chandra-siri deleted the disallow_folder_traversal branch March 11, 2026 12:19
chandra-siri added a commit that referenced this pull request Mar 18, 2026
PR created by the Librarian CLI to initialize a release. Merging this PR
will auto trigger a release.

Librarian Version: v1.0.2-0.20251119154421-36c3e21ad3ac
Language Image:
us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/python-librarian-generator@sha256:8e2c32496077054105bd06c54a59d6a6694287bc053588e24debe6da6920ad91
<details><summary>google-cloud-storage: 3.10.0</summary>

## [3.10.0](v3.9.0...v3.10.0) (2026-03-18)

### Features

* [Bucket Encryption Enforcement] add support for bucket encryption
enforcement config (#1742)
([2a6e8b0](2a6e8b0))

### Perf Improvements

* [Rapid Buckets Reads] Use raw proto access for read resumption
strategy (#1764)
([14cfd61](14cfd61))
* [Rapid Buckets Benchmarks] init mp pool & grpc client once, use
os.sched_setaffinity (#1751)
([a9eb82c](a9eb82c))
* [Rapid Buckets Writes] don't flush at every append, results in bad
perf (#1746)
([ab62d72](ab62d72))


### Bug Fixes

* [Windows] skip downloading blobs whose names contain `":"` (e.g. `C:`,
`D:`) when the application runs on Windows. (#1774)
([5581988](5581988))
* [Path Traversal] Prevent path traversal in `download_many_to_path`
(#1768)
([700fec3](700fec3))
* [Rapid Buckets] pass token correctly, '&' instead of ',' (#1756)
([d8dd1e0](d8dd1e0))


</details>
Labels

api: storage, size: l
