ENH: Add 'bitorder' keyword to packbits, unpackbits by mattip · Pull Request #12962 · numpy/numpy · GitHub
Skip to content

ENH: Add 'bitorder' keyword to packbits, unpackbits#12962

Merged
charris merged 4 commits into
numpy:masterfrom
mattip:unpackbits
May 12, 2019
Merged

ENH: Add 'bitorder' keyword to packbits, unpackbits#12962
charris merged 4 commits into
numpy:masterfrom
mattip:unpackbits

Conversation

@mattip

@mattip mattip commented Feb 13, 2019

Copy link
Copy Markdown
Member

Fixes #11172

Adds an 'order' kwarg to packbits, unpackbits. Should this hit the mailing list? It is a small enhancement. I removed the separate handling of NPY_BYTE_ORDER, we only have little endian systems to test on, but it shouldn't matter for uint8 processing.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There probably is a more elegant way to flip the bits

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to rewrite the loop about, and not do build <<= 1, but rather OR-in an already shifted element (i.e., something like build |= (inputr[j] != 0) << shift where shift depends on order and j).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adopted

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mention "adopted" but I think the code is still the same, no? I meant the loop above (not "about"), which now has the build <<= 1. I think that loop can be rewritten to just do the right thing for each order.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could write this as below, is that better? I don't understand why I needed to use unsigned char instead of char for the order != 'b' case, I am only ever shifting right. If unsigned char is required it should be for the left shifts, no?

Details
    for (; index < n_out; index++) {
        unsigned char build = 0;
        int i, maxi;
        npy_intp j;

        maxi = (index == n_out - 1) ? remain : 8;
        if (order == 'b') {
            for (i = 0; i < maxi; i++) {
                build <<= 1;
                for (j = 0; j < element_size; j++) {
                    build |= (inptr[j] != 0);
                }
                inptr += in_stride;
            }
            if (index == n_out - 1) {
                build <<= 8 - remain;
            }
        }
        else
        {
            for (i = 0; i < maxi; i++) {
                build >>= 1;
                for (j = 0; j < element_size; j++) {
                    build |= (inptr[j] != 0) ? 128 : 0;
                }
                inptr += in_stride;
            }
            if (index == n_out - 1) {
                build >>= 8 - remain;
            }
        }
        *outptr = (char)build;
        outptr += out_stride;
    }

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was a bug, if not we need to add it back in. I don't have a big-endian system to test on.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this does get turned into bytes in the end, and those will have a different order in the uint64 here. So, my guess is that you do need to swap (i.e., have the reverse swap for big and little endian). But am a bit too lazy to just write it out... Note that the test cases surely would fail for big endian if this was wrong...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now tested on a big-endian system since I have access to the gcc build farm and a sparc64 machine. See 5d1b9208d

@charris charris changed the title ENH: add 'order' keyword to packbits, unpackbits ENH: Add 'order' keyword to packbits, unpackbits Feb 19, 2019
@mattip

mattip commented Feb 25, 2019

Copy link
Copy Markdown
Member Author

Waiting for #10855 to land, then this will need updating

@mattip

mattip commented Feb 26, 2019

Copy link
Copy Markdown
Member Author

Tests pass, no extra modifications were needed for count

@mhvk mhvk left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea. Some initial comments.

Comment thread numpy/core/multiarray.py Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flip around for packbits [0, 0, 0, 0, 0, 1, 1] => 3

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to rewrite the loop about, and not do build <<= 1, but rather OR-in an already shifted element (i.e., something like build |= (inputr[j] != 0) << shift where shift depends on order and j).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this in an else clause of the if statement below.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused: exactly the same v is constructed here as below, the only difference being the byte swap. Can we just fill both big and little in one loop? (one the unswapped, the other the swapped v)?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this does get turned into bytes in the end, and those will have a different order in the uint64 here. So, my guess is that you do need to swap (i.e., have the reverse swap for big and little endian). But am a bit too lazy to just write it out... Note that the test cases surely would fail for big endian if this was wrong...

@mattip

mattip commented Mar 3, 2019

Copy link
Copy Markdown
Member Author

Following this guide, I set up a mips big-endian qemu machine, translated numpy, and ran the tests in test_packbits.py. They passed.

@eric-wieser eric-wieser self-requested a review March 10, 2019 17:21

@mhvk mhvk left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two comments about comments, otherwise just a question about the initialization.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjust the comment to make clear this is for the "big-endian" case.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed reference to endian, since the 'b' makes the intention clear. Moved the rest of the comment to mark the intrinsic

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mention "adopted" but I think the code is still the same, no? I meant the loop above (not "about"), which now has the build <<= 1. I think that loop can be rewritten to just do the right thing for each order.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to adjust this comment too...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"fixed" by removing "big endian"

@eric-wieser eric-wieser Mar 10, 2019

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we match python here in int.from_bytes (C longobject.c:int_to_bytes_impl), and actually require the full string?

Could consider using the internal _PyUnicode_EqualToASCIIId, which I think is available in the python versions we care about.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adopted

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used strncmp instead, which does not require Py_LIMITED_API

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already depend on Py_LIMITED_API, so I don't think we need to care

@mattip mattip Mar 13, 2019

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one of the test runners failed when I used _PyUnicode_EqualToASCIIString, saying the function was not defined.

Edit: fix function name

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no guarantee that char(256) overflows, CHAR_BIT might not be 8.

Why not just use a simple loop here?

for (int i = 0; i < 8; i++) {
    *outptr = (bool)(*inptr & (1 << i));
    outptr += out_stride;
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adopted

Comment thread numpy/core/src/multiarray/compiled_base.c Outdated
Comment thread numpy/core/src/multiarray/compiled_base.c Outdated
Comment thread numpy/core/src/multiarray/compiled_base.c Outdated
Comment thread numpy/core/src/multiarray/compiled_base.c Outdated
Comment thread numpy/core/src/multiarray/compiled_base.c Outdated
Comment thread numpy/core/multiarray.py Outdated

@eric-wieser eric-wieser Mar 10, 2019

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be 8 items in this list, not 7?

I'm not sure I'd call this the common standard - I frankly find it bizarre that unpackbits(1)[0] does not refer to "bit 0" of x.

Could instead compare to binary literal syntax, stating that 0b00000011 => [0, 0, 0, 0, 0, 0, 1, 1]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing the comment, I remember being bitten by exactly this (which is of course why "little" will be useful!).

Maybe the docstring can be agnostic about which is "standard" - both have advantages. I like the idea of adding 3 = 0b00000011 => [...]

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adopted

@mattip mattip force-pushed the unpackbits branch 2 times, most recently from be968b2 to bf8b7cd Compare March 13, 2019 18:09
@mattip

mattip commented Mar 16, 2019

Copy link
Copy Markdown
Member Author

@eric-wieser ping

1 similar comment
@mattip

mattip commented Mar 28, 2019

Copy link
Copy Markdown
Member Author

@eric-wieser ping

Comment thread numpy/core/multiarray.py Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order refers to the list of bits in the input.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to bitorder?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redid the comment and changed order -> bitorder

@mattip mattip changed the title ENH: Add 'order' keyword to packbits, unpackbits ENH: Add 'bitorder' keyword to packbits, unpackbits May 11, 2019
Comment thread numpy/core/multiarray.py Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be

    bitorder : {'big', 'little'}, optional

Also below.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread numpy/core/multiarray.py Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Periods, Matti, periods.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@charris

charris commented May 12, 2019

Copy link
Copy Markdown
Member

close/reopen

@charris charris closed this May 12, 2019
@charris charris reopened this May 12, 2019
`numpy.packbits` and `numpy.unpackbits` accept an ``order`` keyword
-------------------------------------------------------------------
The ``order`` keyword defaults to ``big``, and will order the **bits**
accordingly. For ``'big'`` 3 will become ``[0, 0, 0, 0, 0, 0, 1, 1]``, and

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Period. But I will fix it when editing the release note.

@charris charris merged commit 9dad348 into numpy:master May 12, 2019
@charris

charris commented May 12, 2019

Copy link
Copy Markdown
Member

Thanks Matti.

}

/* setup lookup table under GIL, big endian 0..256 as bytes */
if (unpack_init == 0) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does the removal of the lookup table impact performance?
As I recall it was significant when I added it.

@juliantaylor

Copy link
Copy Markdown
Contributor
       before           after         ratio
     [08b17aee]       [e6227a03]
     <v1.16.3^0>       <master>  
+      5.80±0.5μs         46.4±1μs     8.00  bench_core.UnpackBits.time_unpackbits
+         120±2μs         911±20μs     7.58  bench_core.UnpackBits.time_unpackbits_axis1

I realize the lookup table is a bunch of extra code, but it is a factor 8 speedup.
this is pretty significant, it should at least be mentioned in the release notes.

@juliantaylor

Copy link
Copy Markdown
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DOC: Bit numbering for 8 bit types

6 participants