This repository was archived by the owner on Jan 10, 2023. It is now read-only.
Workaround for UnicodeDecodeError #171
Open
matthieuxyz wants to merge 2 commits into
CLAs look good, thanks! ℹ️ Googlers: Go here for more info.
Contributor
@matthieuxyz thanks for the detailed explanation! I implemented this fix for the |
Author

Why:
Fix an intermittent UnicodeDecodeError raised when using AdbCommands.Logcat.
Steps to reproduce
Expected result
The output is returned as a unicode string (str in Python 3).
Actual result
UnicodeDecodeError is sometimes raised with the following message:
The exception is raised from the method AdbMessage.StreamingCommand() in the module adb_protocol.py:
Investigation
When streaming the output of a command, a maximum length is passed to UsbHandle.BulkRead. This length is (if I understood correctly) given in bytes.
So a chunk of byte data can end at any byte offset.
But in UTF-8, a character may be encoded as multiple bytes (e.g. あ (U+3042) is encoded as 0xE3 0x81 0x82).
Since a chunk may end at any byte, a multi-byte sequence can be split across two chunks (e.g. [b'...\xe3', b'\x81\x82...']).
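The splitting described here can be simulated without a device. The sketch below is only an illustration: split_into_reads and undecodable_reads are hypothetical helpers, not part of python-adb, and the 4-byte maximum length is an arbitrary value chosen so that the boundary falls inside the encoded あ.

```python
# Hypothetical helpers for illustration only; they are not python-adb APIs.
def split_into_reads(data, max_length):
    """Cut data into consecutive chunks of at most max_length bytes."""
    return [data[i:i + max_length] for i in range(0, len(data), max_length)]


def undecodable_reads(reads):
    """Return the indexes of chunks that are not valid UTF-8 on their own."""
    bad = []
    for index, chunk in enumerate(reads):
        try:
            chunk.decode('utf-8')
        except UnicodeDecodeError:
            bad.append(index)
    return bad


# 'AAA' + U+3042 + 'BBB' is 9 bytes in UTF-8 (3 + 3 + 3).
payload = u'AAA\u3042BBB'.encode('utf-8')

# Assumed maximum read length of 4 bytes: the first read ends after 0xE3.
reads = split_into_reads(payload, 4)

# Chunks that fail when decoded separately.
bad_indexes = undecodable_reads(reads)

# Joining the chunks before decoding restores the original text.
joined_text = b''.join(reads).decode('utf-8')
```

Both the chunk that ends with the lead byte and the chunk that starts with the continuation bytes are invalid on their own, while the joined byte string decodes normally.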
However, decoding fails if a multi-byte sequence is incomplete (try running b'AAA\xe3'.decode('utf-8') in a Python console).
What:
This pull request contains...
A test case with a @skip decorator to illustrate the problem.
New "Bytes" methods that implement a "workaround" (i.e. working with bytes and decoding only once the stream is complete).
New tests to cover the new "Bytes" methods.
There is probably a better solution, but this is what worked for me. And I hope that it will help to illustrate the problem with StreamingCommand.
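As a rough sketch of the idea, and not the code in this pull request, the two functions below contrast decoding once the stream is complete with a possible alternative based on the standard library's incremental decoder. The names decode_when_complete and decode_incrementally are hypothetical, and the chunk list stands in for the data returned by successive reads.

```python
import codecs


def decode_when_complete(chunks):
    """Workaround style: join all byte chunks, then decode once."""
    return b''.join(chunks).decode('utf-8')


def decode_incrementally(chunks):
    """Alternative: yield text per chunk, buffering incomplete sequences."""
    decoder = codecs.getincrementaldecoder('utf-8')()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # final=True raises if the stream ended in the middle of a character.
    tail = decoder.decode(b'', final=True)
    if tail:
        yield tail


# Assumed input: a stream whose chunk boundary splits U+3042 (E3 | 81 82).
chunks = [b'AAA\xe3', b'\x81\x82BBB']

whole = decode_when_complete(chunks)
pieces = list(decode_incrementally(chunks))
```

The first approach is the simplest but only yields text at the end of the stream. The incremental decoder keeps output streaming, because it holds back the trailing 0xE3 until the rest of the character arrives.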