TomNicholas · 2026-07-01T20:55:23Z

Builds on #4112. BatchedCodecPipeline.read now fetches a whole (non-sharded) request with a single Store.get_many call instead of one get per chunk, so a store can batch/coalesce the underlying reads — independently of codec_pipeline.batch_size, which still governs only decode batching.

The sharding codec's partial-decode path is unchanged, and stores without a specialized get_many fall back to the previous concurrent per-chunk behavior.

Motivation — xref #1758 (request coalescing), #1806 (batched Store API), and zarr-developers/VirtualiZarr#947 (files-as-shards / consolidating small reads).

Stacked on #4112 — its commit is the first one here; review after it. Draft.

Add a public, overridable `Store.get_many` that retrieves many values at once, each request being either a whole key or a `(key, byte_range)` pair. It generalizes `Store.get_ranges` (many ranges of one key) to many keys, and yields batches of `(request_index, Buffer | None)` in completion order so a store can coalesce reads that land in the same underlying object. The ABC default fetches requests concurrently with `get`, so every store works out of the box; stores with a bulk backend override it (`FsspecStore` coalesces via fsspec's `cat_ranges`). Coalescing tuning is left to each store rather than exposed on the interface. This restores and generalizes the batched-fetch capability of the v2 `getitems` Store API (see #1806).
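The default behavior described above can be illustrated with a minimal, self-contained sketch. It does not use zarr's actual `Store` ABC or its real `get_many` signature: `ToyStore`, the `Request` alias, and the use of plain `bytes` in place of `Buffer` are all assumptions made for illustration. It only shows the general shape, in which each request is fetched with its own concurrent `get` and results are yielded as `(request_index, value)` batches in completion order.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator, Sequence
from typing import Union

# Hypothetical request type: a whole key, or a (key, (start, end)) pair.
Request = Union[str, tuple[str, tuple[int, int]]]


class ToyStore:
    """In-memory stand-in for a store. Not zarr's Store ABC."""

    def __init__(self, data: dict[str, bytes]) -> None:
        self._data = data

    async def get(
        self, key: str, byte_range: tuple[int, int] | None = None
    ) -> bytes | None:
        value = self._data.get(key)
        if value is None or byte_range is None:
            return value
        start, end = byte_range
        return value[start:end]

    async def get_many(
        self, requests: Sequence[Request]
    ) -> AsyncIterator[list[tuple[int, bytes | None]]]:
        """Assumed default: one concurrent get per request, yielded as each finishes."""

        async def fetch(index: int, request: Request) -> tuple[int, bytes | None]:
            if isinstance(request, str):
                return index, await self.get(request)
            key, byte_range = request
            return index, await self.get(key, byte_range)

        tasks = [asyncio.create_task(fetch(i, r)) for i, r in enumerate(requests)]
        for finished in asyncio.as_completed(tasks):
            # Each batch here holds one result; a coalescing store could
            # yield several results that came from one underlying read.
            yield [await finished]


async def demo() -> dict[int, bytes | None]:
    store = ToyStore({"c/0": b"abcdef", "c/1": b"ghijkl"})
    out: dict[int, bytes | None] = {}
    async for batch in store.get_many(["c/0", ("c/1", (1, 3)), "missing"]):
        for index, value in batch:
            out[index] = value
    return out


result = asyncio.run(demo())
```

Because each result carries its request index, the caller does not depend on the order in which batches arrive. A store with a bulk backend would override `get_many` and could return several indices per batch.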

BatchedCodecPipeline.read now fetches the encoded bytes for an entire (non-sharded) read with a single Store.get_many call, instead of one Store.get per chunk. It drives get_many over all chunk keys, scatters the completion-ordered (index, buffer) results back into position, and feeds them to the per-batch decode path. This lets a store batch or coalesce the underlying reads (e.g. FsspecStore via cat_ranges, or a custom store such as virtualizarr's ManifestStore / icechunk's IcechunkStore that overrides get_many) regardless of codec_pipeline.batch_size, which still governs only decode batching. The sharding codec's partial-decode path is untouched, and stores without a specialized get_many fall back to the previous concurrent per-chunk gets.
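The scatter-then-decode flow described above can be sketched roughly as follows. This is not the code in `BatchedCodecPipeline.read`: `fake_get_many`, `read_chunks`, and the upper-casing "decode" step are hypothetical stand-ins, chosen only to show how completion-ordered `(index, buffer)` results can be placed back into request order before being decoded in groups of `batch_size`.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator


async def fake_get_many(
    keys: list[str],
) -> AsyncIterator[list[tuple[int, bytes | None]]]:
    """Hypothetical stand-in for a store's get_many; yields out of order."""
    data = {"c/0": b"aa", "c/1": b"bb", "c/2": b"cc"}
    # Reverse order mimics results that complete out of request order.
    for index in reversed(range(len(keys))):
        yield [(index, data.get(keys[index]))]


async def read_chunks(keys: list[str], batch_size: int) -> list[str | None]:
    # 1. One get_many call covers every chunk key of the read.
    encoded: list[bytes | None] = [None] * len(keys)
    async for batch in fake_get_many(keys):
        for index, buf in batch:
            # 2. Scatter each result back to its request position.
            encoded[index] = buf

    # 3. Decode in groups of batch_size; this only affects decode batching,
    #    not how the bytes were fetched.
    decoded: list[str | None] = []
    for start in range(0, len(encoded), batch_size):
        group = encoded[start : start + batch_size]
        decoded.extend(None if b is None else b.decode().upper() for b in group)
    return decoded


decoded = asyncio.run(
    read_chunks(["c/0", "c/1", "missing", "c/2"], batch_size=2)
)
```

In this sketch the fetch step sees all keys at once regardless of `batch_size`, which is the property that lets a store coalesce reads across what would otherwise be separate decode batches.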

codecov · 2026-07-01T21:00:15Z

TomNicholas added 2 commits July 1, 2026 17:00

TomNicholas force-pushed the feat/pipeline-use-get-many branch from d8a292d to 4f1ad9f Compare July 1, 2026 21:00

TomNicholas mentioned this pull request Jul 1, 2026

Coalescing ManifestStore.get_many zarr-developers/VirtualiZarr#1033

Draft

Use Store.get_many for whole-chunk reads in BatchedCodecPipeline#4113
TomNicholas wants to merge 2 commits into zarr-developers:main from TomNicholas:feat/pipeline-use-get-many
