refactor: one Array class with sync+async APIs#4049
Conversation
Alternative to zarr-developers#4034 that delivers every zarr-developers#4027 goal except removing the public AsyncArray class. - Array gains a user-friendly constructor: Array(metadata, store_path, config=...) builds the wrapped AsyncArray internally; Array(async_array) still works for back-compat. - Array gains the full public async surface (*_async twins for every sync method), so callers no longer reach into the private _async_array. - Each operation has a single implementation: the async twin holds the logic and the sync method delegates via sync(). No drift between the pairs; a parity test guards their signatures. - No Runner protocol and no _AsyncArrayView/_rebind_state shared-state machinery: Array still wraps one AsyncArray, so async_array is the held object and mutation coherence is free. - getitem_async/setitem_async mirror the full basic-selection parameter surface (out=, fields=, prototype=). Tests: integrates the async/sync boundary suite (shared-state coupling, codec-pipeline-built-once, selection parity, out= zero-copy, full parameter surface). The loop-affinity test is xfail (strict) as a design-independent limitation of mixing zarr's background loop with a user-owned loop on stores with event-loop affinity. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mkitti
left a comment
There was a problem hiding this comment.
I think we have the pattern backwards here. If the objective is to lower overhead for the synchronous API,
- Instead of calling
syncon the asynchronous API ... - Use
asyncio.to_threadto call a synchronous function.
This way the synchronous API has as little overhead as possible. Meanwhile, we still have an asynchronous API that is still asynchronous.
| 6 | ||
| """ | ||
| return sync(self.async_array.nchunks_initialized()) | ||
| return sync(self.nchunks_initialized_async()) |
There was a problem hiding this comment.
If all the "synchronous" calls just call sync on their asynchronous versions, then we still incur all the overhead of scheduling an async task.
| delete_outside_chunks : bool, optional | ||
| If True (default), chunks that fall entirely outside the new array | ||
| shape are deleted from the underlying store. If False, those chunks | ||
| are left in place. |
There was a problem hiding this comment.
Should we be exposing new keyword arguments in the synchronous API at this time? Could we defer this change to a later pull request?
|
thanks for your review @mkitti! I should've marked this PR as a draft. It's one of a few possible solutions related to #4027, including #4034 and generated sync code. I'd be curious for your overall take on the different options. |

Alternative to #4034 that delivers every #4027 goal except removing the public
AsyncArrayclass.Array(metadata, store_path, config=...)builds the wrappedAsyncArrayinternally;Array(async_array)still works for back-compat.Arraygains the full public async surface (*_asynctwins for every sync method), so callers no longer reach into the private _async_array.sync(). No drift between the pairs; a parity test guards their signatures.Runnerprotocol and no_AsyncArrayView/_rebind_stateshared-state machinery:Arraystill wraps oneAsyncArray, soasync_arrayis the held object and mutation coherence is free.getitem_async/setitem_asyncmirror the full basic-selection parameter surface (out=,fields=,prototype=).Tests: integrates the async/sync boundary suite (shared-state coupling, codec-pipeline-built-once, selection parity, out= zero-copy, full parameter surface). The loop-affinity test is xfail (strict) as a design-independent limitation of mixing zarr's background loop with a user-owned loop on stores with event-loop affinity.
TODO:
docs/user-guide/*.mdchanges/