Commit a0da993
feat(storage): Enhance Otel Span Attributes with BucketId and Location details for every Bucket/Blob operation
# feat(storage): Enhance Otel Span Attributes with BucketId and Location details for every Bucket/Blob operation as part of ACO (App-centric Observability)
This PR implements **App-centric Observability (ACO)** tracing
compatibility for the GCS Python SDK (`google-cloud-storage`). All
OpenTelemetry trace spans produced by bucket and blob operations now
carry the required destination resource attributes
(`gcp.resource.destination.id` and `gcp.resource.destination.location`).
---
## Core Architecture & Design
### 1. Centralized, DRY Telemetry Helper (`_helpers.py`)
- All OpenTelemetry span context generation, attribute injection, and
exception trapping are centralized in a module-level context manager
`create_trace_span_helper` in
[`_helpers.py`](file:///usr/local/google/home/chandrasiri/storage_related/org-google-cloud-python/packages/google-cloud-storage/google/cloud/storage/_helpers.py).
- **Zero modifications to the core tracing module**:
[`_opentelemetry_tracing.py`](file:///usr/local/google/home/chandrasiri/storage_related/org-google-cloud-python/packages/google-cloud-storage/google/cloud/storage/_opentelemetry_tracing.py)
remains identical to `main`.
- Wrapped the critical read/write operations across `blob.py`,
`bucket.py`, and `client.py` (e.g., `download_as_bytes`,
`upload_from_string`, `get_bucket`, `lookup_bucket`).
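As a rough illustration of the pattern described above, the sketch below shows a context manager that attaches the two destination attributes to a span and records exceptions before re-raising them. It is not the PR's `create_trace_span_helper`, whose signature is not shown here. `RecordingSpan`, `traced_operation`, and the `project_number` parameter are hypothetical names, the stand-in span replaces a real OpenTelemetry span so the example runs with the standard library alone, and the `projects/{number}/buckets/{name}` ID format is an assumption inferred from the fallback form `projects/_/buckets/{name}`.

```python
import contextlib
from typing import Dict, Iterator, Optional


class RecordingSpan:
    """Hypothetical stand-in for an OpenTelemetry span; it only records data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.attributes: Dict[str, str] = {}
        self.error: Optional[BaseException] = None

    def set_attribute(self, key: str, value: str) -> None:
        self.attributes[key] = value


@contextlib.contextmanager
def traced_operation(
    name: str,
    bucket_name: str,
    location: Optional[str] = None,
    project_number: str = "_",  # "_" mirrors the fallback form in the PR text
) -> Iterator[RecordingSpan]:
    """Open a span, inject destination attributes, and trap exceptions."""
    span = RecordingSpan(name)
    # Assumed ID format; only the fallback variant appears in the PR text.
    span.set_attribute(
        "gcp.resource.destination.id",
        f"projects/{project_number}/buckets/{bucket_name}",
    )
    if location is not None:
        span.set_attribute("gcp.resource.destination.location", location)
    try:
        yield span
    except Exception as exc:
        span.error = exc  # record the failure, then let it propagate
        raise
```

A caller would wrap each operation body in `with traced_operation("Storage.Blob.download", bucket_name, location): ...`, which keeps the attribute logic in one place instead of repeating it in every method.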
### 2. Bounded LRU Metadata Cache (`_lru_cache.py`,
`_bucket_metadata_cache.py`)
- **LRU Capacity Bounding**: Implemented `LRUCache` on top of an
`OrderedDict`, giving O(1) operations and a strict capacity bound that
prevents unbounded memory growth in long-running applications.
- **Concurrent Singleflight Warming**: Implemented `BucketMetadataCache`
to store bucket locations and project numbers. On cache misses, it
spawns background threads (`_fetch_background`) using singleflight
tracking (`_inflight_fetches`) to prevent server stampedes / thundering
herds.
- **Fallback Annotations on 403**: On GCS `403 Forbidden` permission
errors, the cache permanently registers fallback annotations
(`projects/_/buckets/{name}`) to avoid retry storms on
subsequent API calls.
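The three behaviours listed above can be sketched together in a few dozen lines. This is a minimal illustration under stated assumptions, not the code in `_lru_cache.py` or `_bucket_metadata_cache.py`: `SingleflightCache`, its `fetch` callback, and the stored dictionary shape are hypothetical, and Python's built-in `PermissionError` stands in for the GCS `403 Forbidden` exception so the block has no external dependencies.

```python
import threading
from collections import OrderedDict
from typing import Callable, Dict, Optional


class LRUCache:
    """Least-recently-used cache with a hard capacity limit."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._data: "OrderedDict[str, dict]" = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[dict]:
        with self._lock:
            if key not in self._data:
                return None
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]

    def put(self, key: str, value: dict) -> None:
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
            if len(self._data) > self._capacity:
                self._data.popitem(last=False)  # drop least recently used


class SingleflightCache:
    """Non-blocking cache: a miss returns None and warms in the background."""

    def __init__(self, fetch: Callable[[str], dict], capacity: int = 128) -> None:
        self._fetch = fetch  # hypothetical callback that hits the backend
        self._cache = LRUCache(capacity)
        self._inflight: Dict[str, threading.Thread] = {}
        self._lock = threading.Lock()

    def get(self, bucket_name: str) -> Optional[dict]:
        hit = self._cache.get(bucket_name)
        if hit is not None:
            return hit
        with self._lock:
            if bucket_name in self._inflight:
                return None  # a fetch is already running for this bucket
            hit = self._cache.get(bucket_name)  # re-check under the lock
            if hit is not None:
                return hit
            worker = threading.Thread(
                target=self._fetch_background, args=(bucket_name,), daemon=True
            )
            self._inflight[bucket_name] = worker
            worker.start()
        return None

    def _fetch_background(self, bucket_name: str) -> None:
        try:
            try:
                value = self._fetch(bucket_name)
            except PermissionError:  # stand-in for a 403 Forbidden response
                value = {"id": f"projects/_/buckets/{bucket_name}"}
            self._cache.put(bucket_name, value)
        finally:
            with self._lock:
                self._inflight.pop(bucket_name, None)
```

Writing the result to the cache before removing the in-flight marker, combined with the re-check under the lock, means a caller that finds no in-flight fetch will also see the cached value, so a completed fetch is not repeated.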
### 3. Resilient 404 Existence Eviction (`_http.py`, `_helpers.py`,
`bucket.py`)
- **Smart Out-of-band 404 Verification**: When a `404 NotFound` error
occurs during media transfers or REST calls, a background thread is
spawned (with concurrency protection via `_inflight_checks`) to check if
the bucket was deleted out-of-band (`bucket.exists()`). If `exists()`
returns `False`, the bucket is cleanly evicted from the cache.
- **Instant Synchronous Eviction**: Direct `Bucket.delete()` calls
synchronously and instantly evict the bucket name from the cache,
ensuring real-time consistency.
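The two eviction paths above might look roughly like the following sketch. It assumes a hypothetical `EvictingCache` class with an injected `exists` callback in place of `bucket.exists()`; only the `_inflight_checks` name comes from the PR text. Returning the worker thread from `on_not_found` is a convenience added here so callers and tests can wait on it.

```python
import threading
from typing import Callable, Dict, Optional, Set


class EvictingCache:
    """Bucket metadata cache with asynchronous 404 verification."""

    def __init__(self, exists: Callable[[str], bool]) -> None:
        self._exists = exists  # hypothetical stand-in for bucket.exists()
        self._entries: Dict[str, dict] = {}
        self._inflight_checks: Set[str] = set()
        self._lock = threading.Lock()

    def put(self, bucket_name: str, metadata: dict) -> None:
        with self._lock:
            self._entries[bucket_name] = metadata

    def get(self, bucket_name: str) -> Optional[dict]:
        with self._lock:
            return self._entries.get(bucket_name)

    def evict(self, bucket_name: str) -> None:
        """Synchronous eviction, as a direct delete call would trigger."""
        with self._lock:
            self._entries.pop(bucket_name, None)

    def on_not_found(self, bucket_name: str) -> Optional[threading.Thread]:
        """React to a 404: verify in the background whether the bucket is gone."""
        with self._lock:
            if bucket_name not in self._entries:
                return None  # nothing cached, nothing to evict
            if bucket_name in self._inflight_checks:
                return None  # a verification is already running
            self._inflight_checks.add(bucket_name)
        worker = threading.Thread(
            target=self._verify, args=(bucket_name,), daemon=True
        )
        worker.start()
        return worker

    def _verify(self, bucket_name: str) -> None:
        try:
            # A 404 may refer to a missing object, so confirm the bucket
            # itself is gone before dropping its cached metadata.
            if not self._exists(bucket_name):
                self.evict(bucket_name)
        finally:
            with self._lock:
                self._inflight_checks.discard(bucket_name)
```

The existence check matters because a 404 on a blob operation does not by itself mean the bucket was deleted; evicting on every 404 would force needless re-fetches for healthy buckets.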
---
## Extensive Testing Suite
### 1. 100% Sleep-Free System Tests (`test_aco_observability.py`)
Added a comprehensive system test suite
[`test_aco_observability.py`](file:///usr/local/google/home/chandrasiri/storage_related/org-google-cloud-python/packages/google-cloud-storage/tests/system/test_aco_observability.py)
executing against a live GCS backend:
- **Sequential Priming**: Verifies cache miss return times, background
priming, and subsequent span enrichment.
- **403 Fallback**: Verifies minimal fallback registration on Forbidden
responses.
- **Cache Stampede Protection**: Simulates 15 concurrent threads on a
cache miss and asserts only 1 GCS call is fired.
- **Smart 404 Eviction**: Deletes a bucket out-of-band and verifies
async cache clean-up on 404.
- **Synchronous Delete Eviction**: Asserts immediate cache eviction on
SDK deletion.
- **LRU Capacity Bounding**: Populates the cache beyond its limits and
verifies proper LRU eviction.
- **Deterministic Synchronization**: Uses **`threading.Event` (zero
static sleeps)** for thread coordination, which keeps execution fast
and removes timing-dependent flakiness.
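The sleep-free coordination described in the last bullet can be illustrated with a small standalone test. This is a sketch of the general `threading.Event` pattern, not an excerpt from `test_aco_observability.py`; `warm_in_background`, `fake_fetch`, and the plain `dict` cache are hypothetical.

```python
import threading
from typing import Callable, Dict


def warm_in_background(
    cache: Dict[str, dict],
    key: str,
    fetch: Callable[[str], dict],
    done: threading.Event,
) -> threading.Thread:
    """Populate cache[key] on a worker thread and signal completion."""

    def _run() -> None:
        try:
            cache[key] = fetch(key)
        finally:
            done.set()  # always signal, even if fetch raises

    worker = threading.Thread(target=_run, daemon=True)
    worker.start()
    return worker


def test_background_priming_without_sleep() -> None:
    cache: Dict[str, dict] = {}
    fetch_started = threading.Event()
    allow_fetch = threading.Event()
    done = threading.Event()

    def fake_fetch(key: str) -> dict:
        fetch_started.set()  # tell the test the fetch is in flight
        assert allow_fetch.wait(timeout=5)  # block until the test releases us
        return {"location": "US"}

    warm_in_background(cache, "my-bucket", fake_fetch, done)

    # Wait for a known state instead of sleeping for a guessed duration.
    assert fetch_started.wait(timeout=5)
    assert "my-bucket" not in cache  # the miss is observable mid-fetch

    allow_fetch.set()
    assert done.wait(timeout=5)
    assert cache["my-bucket"] == {"location": "US"}
```

Each `wait(timeout=...)` returns as soon as the event is set, so the timeout only bounds the failure case and the passing case does not depend on how fast the machine is.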
### 2. Robust Unit Tests
- Added
[`test__lru_cache.py`](file:///usr/local/google/home/chandrasiri/storage_related/org-google-cloud-python/packages/google-cloud-storage/tests/unit/test__lru_cache.py)
(LRU correctness, bounding, eviction).
- Added
[`test__bucket_metadata_cache.py`](file:///usr/local/google/home/chandrasiri/storage_related/org-google-cloud-python/packages/google-cloud-storage/tests/unit/test__bucket_metadata_cache.py)
(concurrency, location resolution, 403 fallback, singleflight).
- Added `test_delete_hit_evicts_from_cache` inside
[`test_bucket.py`](file:///usr/local/google/home/chandrasiri/storage_related/org-google-cloud-python/packages/google-cloud-storage/tests/unit/test_bucket.py).
---
## Validation Results
All checks, unit tests, and live GCS system tests pass:
- **Unit Tests**: 835 passed in 17.82s
- **System Tests**: 8 passed in 26.94s
- **Format & Linter**: 100% clean (`black` / `flake8`)

1 parent 4d64ebc commit a0da993
14 files changed
Lines changed: 1584 additions & 86 deletions
File tree
- packages/google-cloud-storage
  - google/cloud/storage
  - tests
    - system
    - unit