Bugfix for stuck in write method of WiFiClient and WiFiClientSecure until the remote peer closed connection#6104
Merged
Collaborator
earlephilhower
requested changes
May 16, 2019
earlephilhower
left a comment
Collaborator
I think you found a sneaky bug that has been there for a long time, thanks!
However, I would request you change it to a bool and adjust the if (send_waiting==1) statement (which caused the infinite hang once send_waiting got to 2) accordingly. We really want a flag here, not a count, so a bool would reduce technical debt.
Contributor
Author
I've updated _send_waiting to be a clear bool flag.
earlephilhower
approved these changes
May 16, 2019
Collaborator
Thanks! I'll leave it to @d-a-v to double-check that this only needs to be a flag and not a count (in which case the
Contributor
earlephilhower
added a commit
to earlephilhower/Arduino
that referenced
this pull request
May 20, 2019
Changes since 2.5.1 (to 2.5.2)

Core
----
* Add explicit Print::write(char) (esp8266#6101)

Build system
----
* Fix typo in elf2bin for QOUT binary generation (esp8266#6116)
* Support PIO Wl-T and Arduino -T linking properly (esp8266#6095)
* Allow *.cc files to be linked into flash by default (esp8266#6100)
* Use custom "ElfToBin" builder for PIO (esp8266#6091)
* Fail if generated JSON file cannot be read (esp8266#6076)
* Moved 'Dropping' print from stdout to stderr in drop_versions.py (esp8266#6071)
* Fix PIO issue when build environment contains spaces (esp8266#6119)

Libraries
----
* Remove deadlock when server is not acking our data (esp8266#6107)
* Bugfix for stuck in write method of WiFiClient and WiFiClientSecure until the remote peer closed connection (esp8266#6104)
* Re-add original SD FAT info access methods (esp8266#6092)
* Make FILE_WRITE append in SD.h wrapper (esp8266#6106)
* Drop X509 after connection, avoid hang on TLS broken (esp8266#6065)
Merged
earlephilhower
added a commit
that referenced
this pull request
May 20, 2019
Changes since 2.5.1 (to 2.5.2)

Core
----
* Add explicit Print::write(char) (#6101)

Build system
----
* Fix typo in elf2bin for QOUT binary generation (#6116)
* Support PIO Wl-T and Arduino -T linking properly (#6095)
* Allow *.cc files to be linked into flash by default (#6100)
* Use custom "ElfToBin" builder for PIO (#6091)
* Fail if generated JSON file cannot be read (#6076)
* Moved 'Dropping' print from stdout to stderr in drop_versions.py (#6071)
* Fix PIO issue when build environment contains spaces (#6119)

Libraries
----
* Remove deadlock when server is not acking our data (#6107)
* Bugfix for stuck in write method of WiFiClient and WiFiClientSecure until the remote peer closed connection (#6104)
* Re-add original SD FAT info access methods (#6092)
* Make FILE_WRITE append in SD.h wrapper (#6106)
* Drop X509 after connection, avoid hang on TLS broken (#6065)

For a couple of days I was troubleshooting unstable behavior in components built on top of WiFiClient and WiFiClientSecure, and I finally found the root cause. From time to time, a call to the write method got stuck until the remote peer closed the connection. The underlying bug appears to have been present in the code for quite a long time.
When the TCP send buffer is full, ClientContext::_write_from_source increments _send_waiting and switches context to NONOS using esp_yield. If something else calls esp_schedule (that is, not the _write_some_from_cb method of the same ClientContext instance), the loop in _write_from_source repeats while the send buffer is still full, and _send_waiting is incremented again (so from this moment on, _send_waiting > 1). After that, a successful ACK on the relevant connection never calls esp_schedule, because of the condition in _write_some_from_cb, where _send_waiting is decremented only if it is equal to 1.
One case in which something else can call esp_schedule is when there are two or more ClientContext instances (e.g. two client connections). An ACK on the other client context triggers esp_schedule and thus resumes the write on this client context while there is still no space in the TCP send buffer.
The simplest solution is to set _send_waiting to 1 instead of incrementing it. As _send_waiting is one byte, there is no point in changing it to bool.
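To make the failure mode concrete, here is a minimal sketch of the wait/resume handshake described above. It is a simplified, hypothetical model and not the actual ClientContext code: CounterModel, FlagModel, writer_waits, ack_resumes_writer and resumed_after_spurious_wakeup are names invented for illustration, and a returned `true` merely stands in for the call to esp_schedule.

```cpp
// Simplified, hypothetical model of the handshake; NOT the real ClientContext.

// Old behavior: _send_waiting is treated as a counter.
struct CounterModel {
    int send_waiting = 0;

    // Writer sees a full send buffer and is about to yield.
    void writer_waits() { ++send_waiting; }

    // ACK callback: resumes the writer only when the counter is exactly 1.
    bool ack_resumes_writer() {
        if (send_waiting == 1) {
            --send_waiting;
            return true;   // stands in for esp_schedule()
        }
        return false;      // writer stays suspended
    }
};

// Fixed behavior: _send_waiting is a plain flag.
struct FlagModel {
    bool send_waiting = false;

    void writer_waits() { send_waiting = true; }

    bool ack_resumes_writer() {
        if (send_waiting) {
            send_waiting = false;
            return true;   // stands in for esp_schedule()
        }
        return false;
    }
};

// Scenario: the writer waits, is resumed by an unrelated esp_schedule while
// the buffer is still full, waits again, and only then the real ACK arrives.
template <typename Model>
bool resumed_after_spurious_wakeup() {
    Model m;
    m.writer_waits();   // first wait on a full buffer
    m.writer_waits();   // spurious resume, buffer still full: wait again
    return m.ack_resumes_writer();
}
```

In this model, resumed_after_spurious_wakeup<CounterModel>() returns false (the counter has reached 2, so the ACK never resumes the writer), while resumed_after_spurious_wakeup<FlagModel>() returns true.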