[dbsp] Don't rebalance during initial step. by ryzhyk · Pull Request #6013 · feldera/feldera · GitHub
Skip to content

[dbsp] Don't rebalance during initial step.#6013

Open
ryzhyk wants to merge 8 commits intomainfrom
init_step
Open

[dbsp] Don't rebalance during initial step.#6013
ryzhyk wants to merge 8 commits intomainfrom
init_step

Conversation

@ryzhyk
Copy link
Copy Markdown
Contributor

@ryzhyk ryzhyk commented Apr 8, 2026

The balancer used to trigger a rebalancing on a restart from a checkpoint. This was a bad idea in hindsight. Not only it slowed down failover, it also caused the initial step executed before the pipeline transitioned to the RUNNING state potentially take a very long time. During this time none of the monitoring features are available, making this a potentially very long and awkward silence.

This commit disables this behavior. It also introduces a new DBSP-level API that disables automatic rebalancing altogether. The plan is to expose this API externally, allowing the user to control when rebalancings are allowed to happen, e.g., a latency-sensitive workload may disable rebalancing after initial backfill. This will be combined with an API to request a rebalancing on demand.

For now this API is only used to disable rebalancing during the initial step, making sure that a regular (non-forced) rebalancing doesn't trigger accidentally at that point.

This commit additionally skips the initial step when the pipeline is bootstrapping. The reason is similar to the above: bootstrapping steps can be expensive, so we don't want to perform them while the pipeline is initializing. This does mean that the pipeline won't initialize output snapshots until bootstrapping completes, but that was already the case previously.


The above fix revealed a bug in the implementation of adaptive joins during bootstrapping. The PR includes bug fix commits.

Describe Manual Test Plan

Checklist

  • Unit tests added/updated
  • Integration tests added/updated
  • Documentation updated
  • Changelog updated

Breaking Changes?

Mark if you think the answer is yes for any of these components:

Describe Incompatible Changes

@ryzhyk ryzhyk requested a review from blp April 8, 2026 22:03
@ryzhyk ryzhyk added the DBSP core Related to the core DBSP library label Apr 8, 2026
Copy link
Copy Markdown

@mythical-fred mythical-fred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@blp blp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can not rebalancing on the initial step make that step take much longer?

@ryzhyk
Copy link
Copy Markdown
Contributor Author

ryzhyk commented Apr 9, 2026

@ryzhyk
Copy link
Copy Markdown
Contributor Author

ryzhyk commented Apr 9, 2026

PS. I need to add an integration test before marking this PR ready.

@blp
Copy link
Copy Markdown
Member

blp commented Apr 9, 2026

Can not rebalancing on the initial step make that step take much longer?

The initial step doesn't get any input and should normally complete instantaneously. We've identified two cases where this was not true: (1) rebalancing kicks in, which can be as bad as complete backfill in the worst case, (2) bootstrapping kicks -- less dramatic, but can still be bad.

Oh, I get it now, I think: without rebalancing, there's no work to do because there's no input; with rebalancing, the entire integral up to this point ends up being processed. Is that right?

@ryzhyk
Copy link
Copy Markdown
Contributor Author

ryzhyk commented Apr 9, 2026

Can not rebalancing on the initial step make that step take much longer?

The initial step doesn't get any input and should normally complete instantaneously. We've identified two cases where this was not true: (1) rebalancing kicks in, which can be as bad as complete backfill in the worst case, (2) bootstrapping kicks -- less dramatic, but can still be bad.

Oh, I get it now, I think: without rebalancing, there's no work to do because there's no input; with rebalancing, the entire integral up to this point ends up being processed. Is that right?

correct!

@ryzhyk ryzhyk marked this pull request as ready for review April 12, 2026 17:50
Copy link
Copy Markdown

@mythical-fred mythical-fred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

ryzhyk added 8 commits April 23, 2026 11:30
The balancer used to trigger a rebalancing on a restart from a checkpoint.
This was a bad idea in hindsight. Not only it slowed down failover, it also
caused the initial step executed before the pipeline transitioned to the
RUNNING state potentially take a very long time. During this time none of the
monitoring features are available, making this a very long an awkward silence.

This commit disables this behavior. It also introduces a new DBSP-level API
that disables automatic rebalancing altogether. The plan is to expose this API
externally, allowing the user to control when rebalancings are allowed to
happen, e.g., a latency-sensitive workload may disable rebalancing after
initial backfill. This will be combined with an API to request a rebalancing on
demand.

For now this API is only used to disable rebalancing during the initial step,
making sure that a regular (non-forced) rebalancing doesn't trigger
accidentally at that point.

This commit additionally skips the initial step when the pipeline is
bootstrapping. The reason is similar to the above: bootstrapping steps can be
expensive, so we don't want to perform them while the pipeline is initializing.
This does mean that the pipeline won't initialize output snapshots until
bootstrapping completes, but that was already the case previously.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Dedup a repetitive expression.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Improve debug output in replay tests.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Add a test that combines rebalancing, checkpointing, and bootstrapping.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
This fixes a bug in adaptive joins that can manifest during bootstrapping.

We did not correctly handle streams that did not participate in bootstrapping
during bootstrapping. Such streams cannot change their balancing policy until
bootstrapping completes; however we allowed the balancer to assign new policies
to such streams. These policies were ignored leading to incorrect output and
inconsistent internal state.

As part of this fix, we also distinguish streams that haven't been assigned a
policy yet instead of assigning a default policy (Shard) to them. Such default
policies can be inconsistent during bootstrapping. This also improved
debuggability as we can distinguish cases when the balancer when the balancer
hasn't assigned a policy yet.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
In replay mode, Z1Trace and AccumulateZ1Trace operators could continue
producing outputs after returning `true` from `is_flush_complete`, which
violates the transaction commit protocol.

This commit fixes this behavior. It also addresses a related issue where
strict operators (Z1) did not distinguish between `flush` being called on
the input vs output half of the operator.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
During bootstrapping, disabled operators do not participate in transactions
and we shouldn't call start_transaction on such operators. This could cause
a livelock in situations where some of the streams in a join cluster were
inactive. Calling start_transaction on such streams set their flush_state
to TransactionStarted. As a result RabalancingExchangeSender::ready_to_commit
always returned false for all other streams in the cluster.

With this commit, we no longer call `start_transaction` on disabled operator.
Additionally, we fix the `ready_to_commit` implementation to ignore disabled
operators when deciding whether to commit a transaction.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Make the test more complicated to trigger the bug fixed in the previous commit.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
@ryzhyk ryzhyk enabled auto-merge April 23, 2026 23:16
@ryzhyk ryzhyk added this pull request to the merge queue Apr 23, 2026
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still alive?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I often uncomment it for debugging

@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 23, 2026
@ryzhyk ryzhyk added this pull request to the merge queue Apr 23, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 24, 2026
@ryzhyk ryzhyk added this pull request to the merge queue Apr 24, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

DBSP core Related to the core DBSP library

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants