
matthieuxyz · 2019-11-04T16:13:03Z

Why:

Fix an intermittent UnicodeDecodeError raised when using AdbCommands.Logcat.

Steps to reproduce

Add multi-byte characters to the log file
Call AdbCommands.Logcat

Expected result

The output is returned as a unicode string (str in Python 3)

Actual result

A UnicodeDecodeError is sometimes raised with the following message:

ERROR: 'utf-8' codec can't decode byte 0xe3 in position 4095: unexpected end of data

The exception is raised in the method AdbMessage.StreamingCommand() in the module adb_protocol.py:

    @classmethod
    def StreamingCommand(cls, usb, service, command='', timeout_ms=None):
        """
        ...
        """
        if not isinstance(command, bytes):
            command = command.encode('utf8')
        connection = cls.Open(
            usb, destination=b'%s:%s' % (service, command),
            timeout_ms=timeout_ms)
        for data in connection.ReadUntilClose():
            yield data.decode('utf-8')  # <---- HERE

Investigation

When streaming the output of a command, a maximum length is passed to UsbHandle.BulkRead. This length is (if I understood correctly) given in bytes.

So a chunk of byte data can end at any byte offset.

But in UTF-8, a character may be encoded as multiple bytes (e.g. あ (U+3042) is encoded as 0xE3 0x81 0x82).

Since a chunk may end at any byte, a multi-byte sequence can be split between two chunks (e.g. [b'...\xe3', b'\x81\x82...']).

Decoding fails if a multi-byte sequence is incomplete (try running b'AAA\xe3'.decode('utf-8') in a Python console).
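The split can be reproduced without a device. This is a minimal sketch: `split_chunks` is a hypothetical helper that stands in for fixed-size reads and is not part of python-adb.

```python
def split_chunks(payload, max_length):
    """Cut payload into pieces of at most max_length bytes, like fixed-size reads."""
    return [payload[i:i + max_length] for i in range(0, len(payload), max_length)]


# 'AAA' followed by U+3042, which UTF-8 encodes as 0xE3 0x81 0x82.
payload = b'AAA' + u'\u3042'.encode('utf-8')

# A 4-byte limit cuts the 3-byte sequence after its first byte.
chunks = split_chunks(payload, 4)

# Decoding the first chunk on its own fails because it ends mid-character.
try:
    chunks[0].decode('utf-8')
    first_chunk_error = None
except UnicodeDecodeError as exc:
    first_chunk_error = exc

# The same bytes decode cleanly once the chunks are joined.
joined_text = b''.join(chunks).decode('utf-8')
```

The byte offset in the real error message (4095) is consistent with the same situation: the lead byte 0xe3 is the last byte of a chunk.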

What:

This pull request contains...

A test case with a @skip decorator to illustrate the problem.
New "Bytes" methods that implement a workaround (i.e. working with bytes and decoding only once the stream is complete)
New tests to cover the new "Bytes" methods
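The idea behind the workaround can be sketched as follows. The function name `decode_when_complete` is hypothetical and is not one of the methods added by this pull request; it only illustrates collecting raw bytes and decoding once.

```python
def decode_when_complete(byte_chunks, encoding='utf-8'):
    """Collect every chunk as bytes and decode once, after the stream has ended.

    byte_chunks is any iterable of bytes objects (an assumption made for
    this sketch, e.g. a generator that yields raw reads).
    """
    return b''.join(byte_chunks).decode(encoding)


# A multi-byte character split across two chunks is decoded correctly.
text = decode_when_complete(iter([b'AAA\xe3', b'\x81\x82BBB']))
```

The trade-off is that no text is available until the stream has closed.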

There is probably a better solution, but this is what worked for me. I hope it helps illustrate the problem with StreamingCommand.
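One possible alternative, which is not what this pull request implements, is the standard library's incremental decoder. It buffers an incomplete trailing sequence until the next chunk arrives, so text can still be yielded while streaming. In this sketch, `iter_decoded` is a hypothetical name.

```python
import codecs


def iter_decoded(byte_chunks, encoding='utf-8'):
    """Yield text for each chunk, carrying incomplete sequences over to the next."""
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in byte_chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # final=True raises UnicodeDecodeError if the stream ended mid-character.
    tail = decoder.decode(b'', final=True)
    if tail:
        yield tail


pieces = list(iter_decoded([b'AAA\xe3', b'\x81\x82BBB']))
```

Here the first chunk yields only the complete characters, and the split character is emitted with the second chunk.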

googlebot · 2019-11-04T16:13:21Z

coveralls · 2019-11-04T16:14:47Z

Coverage increased (+1.2%) to 44.888% when pulling 5f45f7c on matthieuxyz:master into f4e597f on google:master.

googlebot · 2019-11-04T16:52:44Z

CLAs look good, thanks!


JeffLIrion · 2019-11-05T15:39:57Z

@matthieuxyz thanks for the detailed explanation! I implemented this fix for the adb-shell package, although I didn't add a specific test for this scenario. Could you please take a look and let me know if the change looks correct?

JeffLIrion/adb_shell#34

matthieuxyz · 2019-11-05T20:48:35Z

Workaroud for UnicodeDecodeError

4e5e773

Fix for python2

5f45f7c

Workaroud for UnicodeDecodeError #171

matthieuxyz wants to merge 2 commits into google:master from matthieuxyz:master
