add numcodecs.zarr3.to_zarr3 method by brokkoli71 · Pull Request #741 · zarr-developers/numcodecs · GitHub

add numcodecs.zarr3.to_zarr3 method#741

Draft
brokkoli71 wants to merge 8 commits into zarr-developers:main from scalableminds:cast-to-zarr3


@brokkoli71

brokkoli71 commented Apr 22, 2025

Member

TODO:

  • Unit tests and/or doctests in docstrings
  • Tests pass locally
  • Docstrings and API docs for any new/modified user-facing classes and functions
  • Changes documented in docs/release.rst
  • Docs build locally
  • GitHub Actions CI passes
  • Test coverage to 100% (Codecov passes)

@codecov

codecov Bot commented Apr 22, 2025


@d-v-b

d-v-b commented Apr 22, 2025

Contributor

Thanks for working on this! We should probably have a conversation about the strategy here. My preference would be to move away from separate zarr 2 and zarr 3 codec classes, which would look somewhat different than the effort here.

@rabernat

Contributor

> My preference would be to move away from separate zarr 2 and zarr 3 codec classes, which would look somewhat different than the effort here.

Are these mutually exclusive?

I see this PR as a shim to solve a pretty urgent problem that Zarr users are experiencing in the V3 transition.

In the future, we could refactor how codec classes work, but that's likely a much slower process.

@d-v-b

d-v-b commented Apr 22, 2025

Contributor

> > My preference would be to move away from separate zarr 2 and zarr 3 codec classes, which would look somewhat different than the effort here.
>
> Are these mutually exclusive?
>
> I see this PR as a shim to solve a pretty urgent problem that Zarr users are experiencing in the V3 transition.
>
> In the future, we could refactor how codec classes work, but that's likely a much slower process.

One way to achieve this shim without adding more problematic zarr 2 / zarr 3 logic to numcodecs would be to implement the changes in this PR in zarr-python, instead of numcodecs. Is there any reason why that would not be possible?

@normanrz

Member

> > > My preference would be to move away from separate zarr 2 and zarr 3 codec classes, which would look somewhat different than the effort here.
> >
> > Are these mutually exclusive?
> >
> > I see this PR as a shim to solve a pretty urgent problem that Zarr users are experiencing in the V3 transition.
> >
> > In the future, we could refactor how codec classes work, but that's likely a much slower process.
>
> One way to achieve this shim without adding more problematic zarr 2 / zarr 3 logic to numcodecs would be to implement the changes in this PR in zarr-python, instead of numcodecs. Is there any reason why that would not be possible?

I would argue that adding this to zarr-python actually increases the problematic coupling, because this to_zarr3 method depends on private numcodecs interfaces. However, I think we can be pragmatic here and implement it on either side until we have resolved #742
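
For readers following the thread, the kind of shim being debated can be pictured as a translation between two metadata layouts. The sketch below is only an illustration, not the code in this PR: the function names are hypothetical, and the `numcodecs.` name prefix is an assumption about how wrapped codecs are named. What it relies on is that a Zarr v2 codec config is a flat dict keyed by `id`, while Zarr v3 codec metadata is an object with `name` and `configuration`.

```python
def v2_config_to_v3_metadata(config, prefix="numcodecs."):
    """Translate a v2-style codec config into v3-style codec metadata.

    Hypothetical helper: `prefix` is an assumed naming convention.
    """
    if "id" not in config:
        raise ValueError("a v2 codec config must contain an 'id' key")
    # Everything except 'id' becomes the v3 'configuration' object.
    configuration = {k: v for k, v in config.items() if k != "id"}
    return {"name": prefix + config["id"], "configuration": configuration}


def v3_metadata_to_v2_config(metadata, prefix="numcodecs."):
    """Inverse translation, for metadata produced by the helper above."""
    name = metadata["name"]
    if not name.startswith(prefix):
        raise ValueError(f"codec name {name!r} does not start with {prefix!r}")
    return {"id": name[len(prefix):], **metadata.get("configuration", {})}


v2_config = {"id": "zlib", "level": 1}
v3_metadata = v2_config_to_v3_metadata(v2_config)
```

The translation itself needs only the config dict, so it could arguably live on either side of the numcodecs / zarr-python boundary; the coupling concern raised above is about whatever else a real implementation has to reach into.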

@d-v-b

d-v-b commented Apr 23, 2025

Contributor

> I would argue that adding this to zarr-python actually increases the problematic coupling, because this to_zarr3 method depends on private numcodecs interfaces.

As numcodecs has so far existed chiefly for zarr-python's benefit, and we control numcodecs, I would argue that effectively all numcodecs interfaces are public to zarr-python. To put it differently, "zarr-python uses numcodecs interface X" would be a valid reason for us not to change that interface, whether interface X was public or not.

This is of course problematic, and ultimately something we should fix. I think the first step would be to extract as much Zarr-specific logic from numcodecs as possible, which argues for putting the code in this PR in zarr-python.

@juntyr

juntyr commented Dec 8, 2025


I created the zarr-any-numcodecs package, which can wrap any existing numcodecs codec as a Zarr v3 codec. It is more general (not limited to the built-in numcodecs codecs), but it cannot be as optimized, since this repo can create wrappers that benefit from implementation details, e.g. by exposing partial decoding support.
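
The trade-off described here can be illustrated with a small sketch. It is not the zarr-any-numcodecs API: `ToyZlibCodec` is a stdlib-only stand-in for a numcodecs-style codec, and `GenericBytesWrapper` with its method names is hypothetical. The point is that a generic wrapper can rely only on the `encode` / `decode` / `get_config` surface, so it has to decode whole chunks.

```python
import zlib


class ToyZlibCodec:
    """Stand-in for a numcodecs-style codec (hypothetical, stdlib only)."""

    codec_id = "zlib"

    def __init__(self, level=1):
        self.level = level

    def encode(self, buf):
        return zlib.compress(bytes(buf), self.level)

    def decode(self, buf):
        return zlib.decompress(bytes(buf))

    def get_config(self):
        return {"id": self.codec_id, "level": self.level}


class GenericBytesWrapper:
    """Hypothetical v3-style wrapper that uses only the duck-typed codec surface."""

    def __init__(self, codec):
        self.codec = codec

    def to_dict(self):
        # Assumed naming convention: prefix the v2 codec id with 'numcodecs.'.
        config = dict(self.codec.get_config())
        codec_id = config.pop("id")
        return {"name": f"numcodecs.{codec_id}", "configuration": config}

    def encode_chunk(self, chunk):
        return self.codec.encode(chunk)

    def decode_chunk(self, data):
        # The wrapper knows nothing about the codec's internals, so it must
        # decode the full chunk; it cannot offer partial decoding.
        return self.codec.decode(data)


wrapper = GenericBytesWrapper(ToyZlibCodec(level=3))
payload = b"zarr" * 100
roundtrip = wrapper.decode_chunk(wrapper.encode_chunk(payload))
```

A wrapper written alongside a specific codec could add methods that exploit that codec's layout, which is the optimization a fully generic wrapper gives up.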

@d-v-b

d-v-b commented Dec 8, 2025

Contributor

5 participants