Histogram (revisited) by LukeMathWalker · Pull Request #9 · rust-ndarray/ndarray-stats · GitHub
Skip to content

Histogram (revisited)#9

Merged
LukeMathWalker merged 146 commits into
rust-ndarray:masterfrom
LukeMathWalker:histogram-w-edges
Nov 18, 2018
Merged

Histogram (revisited)#9
LukeMathWalker merged 146 commits into
rust-ndarray:masterfrom
LukeMathWalker:histogram-w-edges

Conversation

@LukeMathWalker

Copy link
Copy Markdown
Member

Based on our discussion in #8, I have revised the implementation.

It needs a testing suite and a couple more helper methods on HistogramCounts but the skeleton it's there. Let me know your thoughts @jturner314

@jturner314

jturner314 commented Oct 22, 2018

Copy link
Copy Markdown
Member

@LukeMathWalker

Copy link
Copy Markdown
Member Author

I managed to address all comments (and I fixed the bug you spotted) - they were all on point, thanks for taking the time to go through the PR with that level of attention!

I have also reduced the number of types parameter on BinsBuildingStrategy from 2 to 1 using the strategy you suggested (associated type on trait).

With respect to the issue of bin boundaries I proceeded as follows: all strategies now provide an optimal bin width (either directly through the rule they specify or indirectly recasting the optimal number of bins to an optimal bin width) and we make sure that the last edge is strictly greater than the maximum of the array that has been passed to the builder.
This basically means that we might add an extra bin to the right if it is required to account for the maximum value, but we don't modify the grid parameters to achieve it. What do you think?

@jturner314

Copy link
Copy Markdown
Member

I managed to address all comments (and I fixed the bug you spotted) - they were all on point, thanks for taking the time to go through the PR with that level of attention!

You're welcome. Thanks for working on this! Would you like me to review the updated version, or do you think it's ready to merge?

This basically means that we might add an extra bin to the right if it is required to account for the maximum value, but we don't modify the grid parameters to achieve it. What do you think?

I think that's fine.

For future versions, I think it would be worth investigating how Julia's StatsBase does things, because StatsBase is similar to our implementation in using all half-open intervals (unlike NumPy, which considers the last bin to be a closed interval).

@LukeMathWalker

LukeMathWalker commented Nov 14, 2018 via email

Copy link
Copy Markdown
Member Author

Comment thread src/quantile.rs Outdated
@LukeMathWalker LukeMathWalker merged commit fcbe35a into rust-ndarray:master Nov 18, 2018
@LukeMathWalker LukeMathWalker deleted the histogram-w-edges branch November 18, 2018 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants