ENH: use `_get_bin_edges()` on `histogramdd` for consistency. by JosephMehdiyev · Pull Request #31276 · numpy/numpy · GitHub
Skip to content

ENH: use _get_bin_edges() on histogramdd for consistency.#31276

Open
JosephMehdiyev wants to merge 10 commits intonumpy:mainfrom
JosephMehdiyev:hist
Open

ENH: use _get_bin_edges() on histogramdd for consistency.#31276
JosephMehdiyev wants to merge 10 commits intonumpy:mainfrom
JosephMehdiyev:hist

Conversation

@JosephMehdiyev
Copy link
Copy Markdown
Contributor

@JosephMehdiyev JosephMehdiyev commented Apr 19, 2026

PR summary

Basically make bin edge calculation of histogramdd similar to histogram. This will also add possibility of str values on bins
Fixes #20215
Continuation of #20358
I did recycle some stuff like documentation and main fix approach from the unclosed old PR, I am not sure how I could have continued from the old PR with rebase and stuff so I opened a new PR from scratch. Plagiarism is not my intention here, if there is some way to fix this let me know.

Other comments

  1. I changed a test about ValueError.I believe for example 0.95 not being a integer should raise a TypeError in bins, not a ValueError (similar to range(0.95) error)
    Also the error comes from _get_bin_edges() as a TypeError. I can change it of course, but I think TypeError is more logical here.
  2. histogram2d and histogramdd always return a float array #7845 why is this still open? The fix seems like straightforward, but I guess there are some reasons for it
    I found a discussion about this
    https://mail.python.org/archives/list/numpy-discussion@python.org/thread/NIHE7UT4SDPU6KCDSMUIU2UA52PDSEIJ/
    Its been so long, and fix is so simple. Just wanted to point it out here
  3. I still have not written related tests for the new behaviour. I mirrored some histogram tests on histogramdd and it works as intended (except for the float things that I have said above). Will write them later
  4. Wouldn't it better to just implement histogramdd function only and use it for internals ofhistogram similar to histogram2d? Don't know what is the reason here, changing this would fix issues like 2) above if exists

AI Disclosure

I used Claude as sanity check on my fixes. All modifications etc are done solely by my decisions and manually typed or copy pasted documentation from the related PR above.

This will also add additional string arguments for `bin` for
`histogramdd`. Similarly, `histogram2d` will change.
Comment thread numpy/lib/_histograms_impl.py
Comment thread numpy/lib/_twodim_base_impl.py
@JosephMehdiyev
Copy link
Copy Markdown
Contributor Author

JosephMehdiyev commented Apr 19, 2026

@JosephMehdiyev
Copy link
Copy Markdown
Contributor Author

JosephMehdiyev commented Apr 20, 2026

@jorenham (afaik you are the author of these stubs) is there a specific reason why overload stubs of histogram2d are more detailed (a couple of them even looks reduntant to me) and written differently than histogram and histogramdd? While I am here changing the file I might also rewrite it similar to histogram etc, unless these are intended

for example deleting this wont change things will succesfully run tests anyways (also true before PR changes)

@overload
def histogram2d(
    x: _ArrayLike1DNumber_co,
    y: _ArrayLike1DNumber_co,
    bins: _BinKind | Sequence[Sequence[int] | _BinKind],
    range: _ArrayLike2DFloat_co | None = None,
    density: bool | None = None,
    weights: _ArrayLike1DFloat_co | None = None,
) -> _Histogram2D[np.int_]: ...

since this below already handles that case

def histogram2d[ScalarT: _Number_co](
    x: _ArrayLike1DNumber_co,
    y: _ArrayLike1DNumber_co,
    bins: _BinKind | _ArrayLike1D[ScalarT] | Sequence[_ArrayLike1D[ScalarT] | _BinKind],
    range: _ArrayLike2DFloat_co | None = None,
    density: bool | None = None,
    weights: _ArrayLike1DFloat_co | None = None,
) -> _Histogram2D[ScalarT]: ...

I have no experience with .pyi files, I am mirroring my knowledge from .hpp and cpp templates FYI

@jorenham
Copy link
Copy Markdown
Member

jorenham commented Apr 20, 2026

@jorenham (afaik you are the author of these stubs) is there a specific reason why overload stubs of histogram2d are more detailed (a couple of them even looks reduntant to me) and written differently than histogram and histogramdd? While I am here changing the file I might also rewrite it similar to histogram etc, unless these are intended

for example deleting this wont change things

@overload
def histogram2d(
    x: _ArrayLike1DNumber_co,
    y: _ArrayLike1DNumber_co,
    bins: _BinKind | Sequence[Sequence[int] | _BinKind],
    range: _ArrayLike2DFloat_co | None = None,
    density: bool | None = None,
    weights: _ArrayLike1DFloat_co | None = None,
) -> _Histogram2D[np.int_]: ...

since this below already handles that case

def histogram2d[ScalarT: _Number_co](
    x: _ArrayLike1DNumber_co,
    y: _ArrayLike1DNumber_co,
    bins: _BinKind | _ArrayLike1D[ScalarT] | Sequence[_ArrayLike1D[ScalarT] | _BinKind],
    range: _ArrayLike2DFloat_co | None = None,
    density: bool | None = None,
    weights: _ArrayLike1DFloat_co | None = None,
) -> _Histogram2D[ScalarT]: ...

I have no experience with .pyi files, I am mirroring my knowledge from .hpp and cpp templates FYI

The difference here is in the bins, where the first overload here handles int sequences, and the second ScalarT. The important bit is that ScalarT only accept np.number | np.bool, and would therefore reject int.

So these overloads are distinct and non-overlapping.

... or at least, that's what they should be. You added _BinKind to both, which makes them overlap in an incompatible way.

@JosephMehdiyev JosephMehdiyev marked this pull request as ready for review April 21, 2026 22:30
@JosephMehdiyev
Copy link
Copy Markdown
Contributor Author

Not sure the failing tests are PR related.
I did not do anything about #31296, let me know if I should add some documentation about it
PR should be ready to review now. I looked around the stuff I did and did not find any issues (hopefully I am right)

Comment on lines +1005 to +1009
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in second thought this specific warning seems unnecessary, i'll change it

@JosephMehdiyev JosephMehdiyev requested a review from jorenham April 21, 2026 22:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ENH: np.histogram2d() does not support argument 'auto' for bins.

2 participants