Wait for a sibling HTTP/2 connection without blocking the caller thread by pavel-ptashyts · Pull Request #2227 · AsyncHttpClient/async-http-client · GitHub

Wait for a sibling HTTP/2 connection without blocking the caller thread #2227

Open
pavel-ptashyts wants to merge 2 commits into AsyncHttpClient:main from maygemdev:fix/h2-wait-no-caller-thread-spin

@pavel-ptashyts

Contributor

When the per-host connection cap is saturated and HTTP/2 is enabled, a request that fails to acquire a permit tries to multiplex onto a sibling connection that another request is establishing to the same origin (stream reuse needs no permit). Off the event loop, waitForHttp2Connection did this by polling the H2 registry with Thread.sleep(10) until connectTimeout (5s default) elapsed. That parked the caller thread (the synchronous part of execute()) for up to the full timeout and burned CPU on the repeated wake-ups. Under a bounded caller thread pool, a burst of over-cap requests to a new H2 origin could exhaust the pool.

Replace the busy-poll with an event-driven, non-blocking deferral: register a one-shot waiter keyed by the request's partition key and return the pending future immediately. registerHttp2Connection wakes matching waiters, which resume the send via sendRequestWithOpenChannel. A connectTimeout deadline on the Netty timer fails the request with the original permit exception if no connection arrives; the client-close path fails pending waiters (their request-timeout backstop is not scheduled yet). A once-only CAS makes the registration, deadline, and close paths mutually exclusive, and the waiter rechecks the registry after registering to close the poll-vs-register lost-wakeup race.
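
For readers who want the shape of the mechanism without opening the diff, the following is a minimal, standalone sketch of a one-shot waiter guarded by a CAS, with the recheck after registration. It is not the PR's code: the class `Http2WaiterSketch`, its methods, the `String` keys, and the `String` stand-in for a connection are all hypothetical, and the Netty timer is replaced by an explicit `expire` call.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: none of these names come from AsyncHttpClient.
final class Http2WaiterSketch {

    /** One-shot waiter: only the first path to win the CAS completes the future. */
    static final class Waiter {
        final CompletableFuture<String> future = new CompletableFuture<>();
        private final AtomicBoolean done = new AtomicBoolean();

        boolean wake(String connection) {
            if (!done.compareAndSet(false, true)) return false;
            future.complete(connection);
            return true;
        }

        boolean fail(Throwable cause) {
            if (!done.compareAndSet(false, true)) return false;
            future.completeExceptionally(cause);
            return true;
        }
    }

    // Partition key -> established connection (a String stands in for a channel).
    private final Map<String, String> registry = new ConcurrentHashMap<>();
    // Partition key -> waiters parked until a connection for that key is registered.
    private final Map<String, Queue<Waiter>> waiters = new ConcurrentHashMap<>();

    /** Registers a waiter and returns immediately; the caller thread is never parked. */
    Waiter awaitConnection(String key) {
        Waiter w = new Waiter();
        waiters.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(w);
        // Recheck after registering: a connection may have been registered between
        // the caller's earlier poll and the add above (the lost-wakeup window).
        String conn = registry.get(key);
        if (conn != null) {
            w.wake(conn); // no-op if registerConnection() already won the CAS
            remove(key, w);
        }
        return w;
    }

    /** Registration path: publish the connection, then wake waiters for that key only. */
    void registerConnection(String key, String connection) {
        registry.put(key, connection);
        Queue<Waiter> q = waiters.remove(key);
        if (q != null) {
            for (Waiter w : q) w.wake(connection);
        }
    }

    /** Deadline path: in the PR this would be driven by a timer, here it is called directly. */
    void expire(String key, Waiter w, Throwable permitFailure) {
        if (w.fail(permitFailure)) remove(key, w);
    }

    /** Close path: fail everything still pending. */
    void close() {
        for (Queue<Waiter> q : waiters.values()) {
            for (Waiter w : q) w.fail(new IllegalStateException("client closed"));
        }
        waiters.clear();
    }

    private void remove(String key, Waiter w) {
        Queue<Waiter> q = waiters.get(key);
        if (q != null) q.remove(w);
    }

    public static void main(String[] args) {
        Http2WaiterSketch sketch = new Http2WaiterSketch();
        Waiter w = sketch.awaitConnection("origin-a");
        w.future.thenAccept(c -> System.out.println("resumed on " + c));
        sketch.registerConnection("origin-a", "conn-a");
    }
}
```

Because every completion path goes through the same `compareAndSet`, a waiter that has been woken cannot later be failed by the deadline or by close, and vice versa. This is the mutual exclusion the description attributes to the once-only CAS.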

On the event loop (a redirect / 401 / 407 retry) the single immediate poll is kept and we give up if it misses — a wait there could self-deadlock, since the connection is being established on that same loop. The ROUND_ROBIN #2214 limitation is unchanged (per-IP keying) but no longer occupies the caller thread.
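
The branching described above can be sketched as a small decision function. This is an illustration only: `EventLoopGuardSketch`, `Outcome`, and the parameters are hypothetical names, the `inEventLoop` flag stands in for whatever check the PR actually performs (Netty exposes `EventExecutor.inEventLoop()` for this kind of test), and the sketch assumes the off-loop path also polls once before deferring.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from AsyncHttpClient.
final class EventLoopGuardSketch {

    enum Outcome { REUSED, GAVE_UP, DEFERRED }

    static Outcome acquire(boolean inEventLoop,
                           Supplier<Optional<String>> pollRegistry,
                           Runnable registerWaiter) {
        // Single immediate poll of the registry.
        Optional<String> connection = pollRegistry.get();
        if (connection.isPresent()) {
            return Outcome.REUSED;
        }
        if (inEventLoop) {
            // Never wait here: the sibling connection is being established on
            // this same loop, so waiting could block its own completion.
            return Outcome.GAVE_UP;
        }
        // Off the loop: defer without blocking the caller thread.
        registerWaiter.run();
        return Outcome.DEFERRED;
    }

    public static void main(String[] args) {
        System.out.println(acquire(true, () -> Optional.empty(), () -> { }));
        System.out.println(acquire(false, () -> Optional.empty(), () -> { }));
    }
}
```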

Adds ChannelManagerHttp2WaiterTest covering wake-on-registration, key isolation, waiter removal, and fail-on-close. Existing HTTP/2 regression tests (conformance, multiplexing, stream-orphan, streaming-body flow-control) pass unchanged.
