Added support for IBD by ABD-AZE · Pull Request #240 · braidpool/braidpool · GitHub
Skip to content

Added support for IBD #240

Merged
mcelrath merged 11 commits into
devfrom
feat/GetAllBeads
Dec 5, 2025
Merged

Added support for IBD #240
mcelrath merged 11 commits into
devfrom
feat/GetAllBeads

Conversation

@ABD-AZE

@ABD-AZE ABD-AZE commented Aug 11, 2025

Copy link
Copy Markdown

Additional changes made

  • Added helper functions (reset , get_beads_after) in braid/mod.rs .
  • Added tests for verifying correct serialisation and deserialisation of enum variants in bead/mod.rs .
  • Added extensive tests for get_beads_after() function .
  • Added better error handling by introducing BeadSyncError enum .
  • Complete IBD functionality following the flow as GetTips ---> GetBeads ----> GetData along with acknowledgement.
  • IBDHandler for managing the cache requirements corresponding to IBD .
  • Tested over the cases such as
    • A new node is spawned containing zero beads and undergoes IBD from a sync node .
    • A new node was spawned and after syncing but it has not mined and the sync node also didn't mined .
    • A new node after syncing mined beads and the peer node was also connected .
    • A new node after syncing mined beads and the peer got disconnected and reconnected to undergo IBD wrt new node which mined some beads during the time the sync node got disconnected .
    • Vice-versa of 4th just mining being done at the end of peer node instead of new node .

NOTE: Algorithm works only if Braid::cohorts vector is topologically sorted

I have read the extend function and it seems that the above condition is always true, please review this once again for any edge cases.

@ABD-AZE ABD-AZE requested review from Copilot and mcelrath and removed request for Copilot August 11, 2025 17:39

This comment was marked as outdated.

@mcelrath mcelrath requested a review from Copilot August 12, 2025 19:18

This comment was marked as outdated.

@mcelrath mcelrath requested a review from Copilot August 21, 2025 16:01

This comment was marked as outdated.

@mcelrath mcelrath requested a review from Copilot August 21, 2025 16:16

This comment was marked as outdated.

@mcelrath mcelrath left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is good, it just lacks tests. Here is a list from Gemini of missing tests:

  • No get_beads_after Logic Tests: The new Braid::get_beads_after() function in
    node/src/braid/mod.rs, which contains the core logic for this feature, has no unit
    tests. As I mentioned in the review, the logic of this function might be fragile,
    making the lack of tests a critical issue.
  • No Serialization Tests: The new BeadRequest::GetBeadsAfter and
    BeadResponse::GetBeadsAfter variants are not tested for serialization and
    deserialization in node/src/bead/tests.rs. This means there's no guarantee they can be
    correctly encoded and decoded for transmission over the network.
  • No BeadSyncError Serialization Tests: The new BeadSyncError enum itself lacks unit
    tests for its Encodable and Decodable implementations.
  • No End-to-End Test: There are no integration tests that simulate a full
    request-response cycle for the GetBeadsAfter flow between two peers. The test in
    node/src/behaviour/tests.rs only covers the older GetBeads flow.
  • No GenesisMismatch Test: The new logic for handling a GenesisMismatch error (where a
    peer receives this error and then requests the genesis beads) is not covered by any
    test.

This comment was marked as outdated.

@mcelrath mcelrath requested a review from Copilot August 25, 2025 18:48

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements support for GetBeadsAfter request-response functionality to enable efficient incremental synchronization between nodes in the braid pool network.

  • Added GetBeadsAfter request/response variants to enable requesting beads after a specific set of tip hashes
  • Introduced BeadSyncError enum for better error handling with structured error types
  • Implemented get_beads_after() algorithm in the Braid that relies on topologically sorted cohorts

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
node/src/main.rs Handles new GetBeadsAfter requests and responses in the main event loop
node/src/braid/mod.rs Adds get_beads_after() method and reset() helper function to Braid
node/src/braid/tests.rs Comprehensive tests for the new get_beads_after() functionality
node/src/behaviour/mod.rs Updates API to use Vec instead of HashSet and new BeadSyncError type
node/src/behaviour/tests.rs Updates tests to use new BeadSyncError enum
node/src/bead/mod.rs Adds GetBeadsAfter request/response variants and BeadSyncError serialization
node/src/bead/tests.rs Tests for serialization/deserialization of new enum variants

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment thread node/src/braid/mod.rs
Comment thread node/src/main.rs Outdated
} else {
swarm.behaviour_mut().respond_with_error(
channel,
BeadSyncError::Other(String::from(

Copilot AI Aug 25, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace hardcoded error message with a more specific BeadSyncError variant. The TODO comment indicates this should be addressed.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed

Comment thread node/src/braid/mod.rs
Comment thread node/src/bead/mod.rs Outdated
@Sansh2356 Sansh2356 mentioned this pull request Oct 14, 2025
37 tasks
@ABD-AZE ABD-AZE force-pushed the feat/GetAllBeads branch 3 times, most recently from 9c907f9 to c3450e5 Compare October 31, 2025 00:22
@Sansh2356 Sansh2356 mentioned this pull request Nov 12, 2025

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 32 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

Comment thread node/src/main.rs
Comment thread node/src/main.rs Outdated
Comment thread node/src/braid/mod.rs
Comment thread node/src/main.rs Outdated
Comment thread node/src/main.rs
Comment thread node/src/main.rs
Comment thread node/src/bead/mod.rs
Comment thread node/src/braid/mod.rs
Comment thread node/src/behaviour/mod.rs
Comment thread node/src/main.rs Outdated
- Copies/reimplements some macros from rust-bitcoin (that are not
      exported)
    - Allows an "enum" variant that can include both integer and
      data-carrying variants with explicit discriminant assignment
      (because discriminants are protocol "magic numbers" and would be
      needed for any alternative implementation)
    - Automatically implements consensus_encode/decode including Vec
      arguments
    - Update BeadRequest, BeadResponse, BeadSyncError to use
      braidpool_protocol!
    - Remove DRY-violating BeadRequestType
    - Add rustdoc comments to all these messages
    - Refuse to respond to most messages if ibd_spinlock is held

condition for no sync peer in connection mapping

removing unwraps and typos in comments

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 10 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

Comment thread node/src/main.rs
Comment thread node/src/main.rs
Comment thread node/src/main.rs
Comment thread node/src/ibd_manager/mod.rs
Comment thread node/src/ibd_manager/mod.rs
Comment thread node/src/main.rs
Comment thread node/src/main.rs Outdated
Comment thread node/src/macros.rs
Comment thread node/src/braid/mod.rs
Comment thread node/src/behaviour/mod.rs
@Sansh2356 Sansh2356 marked this pull request as draft November 30, 2025 20:31
@mcelrath

Copy link
Copy Markdown
Collaborator

Currently, this PR is not working to download any beads, because:

  1. The boot node never exits IBD
  2. Peer nodes don't serve what beads they have over IBD when they don't think they're synced.

I verified that a node in IBD correctly refuses Stratum connections.

To fix this:

  1. When in IBD, a node should fire SwarmCommand::InitiateIBD and attempt to initiate IBD with any new incoming peer.
  2. If we failed to find an IBD peer, and we have peers, we need to retry SwarmCommand::InitiateIBD on a timer.
  3. Allow IBD to/from any peer, regardless of whether we're synced. If our beads are different than our peers, both peers need to IBD them in both directions.

The dynamics of the Braid are that it should never fork, and we never discard valid beads. If we had a fork, both sides of the fork need to share their beads. Then someone names both sides as parents and ties them up into a big cohort. The side with lower hashrate on the fork won't get paid, but this is expected behavior. But we have to evaluate that by actually getting those beads, computing the cohorts, and evaluating their work. We make payout decisions on beads we didn't receive so shouldn't be using any kind of timing decisions on whether we should ask for beads. We should always ask for beads.

I think we should remove the ibd_or_not boolean flag and the spinlock. With the above changes, it doesn't matter to peers whether we think we're synced and have decided to mine, because we'll serve beads to anyone. So we can remove the BeadSyncError::PeerSyncing error code I added...

So I propose:

  1. Remove the P2P ibd_spinlock checks and simplify IBDManager:
    • Remove variables tied to the spinlock timestamp_mapping, incoming_bead_mapping
    • Remove logic in IBDManager about when to exit IBD and ibd_or_not (move to Stratum)
    • Remove UpdateTimestampMapping, FetchTimestamp, FetchAllTimestamps, UpdateIncomingBeadMapping, GetIncomingBeadRetryCount, AbortWaitHandle.
    • Remove related logic in main.rs lines 479-573, 1006-1041, 1241-1301
  2. Remove BeadSyncError::PeerSyncing and every place it's returned, serve the request instead.
  3. Move the ibd_or_not flag to stratum.rs, rename it to ready_to_mine or something.
    • Requires Braid::orphanage_occupancy(10*LATENCY_ALPHA) < 0.5 or something like that.
  4. Keep timestamps of when beads enter and exit Braid::orphans. I added an orphan_index: HashSet<BeadHash> to the Braid anyway in Braid refactor: Close the loop #299 because I needed to look up orphans by hash. I'll expand this to:
    • pub orphanage: HashMap<BeadHash, (Bead, time::Instant)> actual orphan store with O(1) lookup by hash.
    • pub orphanage_history: VecDeque<(time::Instant, time::Instant)> tracks entry/exit of orphans in the orphanage
    • pub fn orphanage_occupancy(window: time::Duration) ->f64 returns the fractional occupancy of the orphanage over window. (includes beads currently in the orphanage)
    • I'll probably trigger pruning orphanage_history in extend(). Or we can set up a timer.
  5. Move all the IBDManager fields into PeerInfo in peer_manager/mod.rs. Tracking what tips a peer has is a fundamentally peer-to-peer interaction, and I think belongs here. Add sync_batch_offset, cached_tips, and pending_beadhashes to PeerInfo and add the IBDManager methods to PeerInfo -- all these methods are with respect to a single peer anyway.

I will implement (4) as part of #299.

@Sansh2356 Sansh2356 closed this Dec 1, 2025
@mcelrath mcelrath mentioned this pull request Dec 5, 2025
@Sansh2356 Sansh2356 reopened this Dec 5, 2025
@Sansh2356 Sansh2356 marked this pull request as ready for review December 5, 2025 01:58
@mcelrath mcelrath requested review from Sansh2356 and Copilot and removed request for Sansh2356 December 5, 2025 17:35

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 14 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread node/src/ibd_manager/mod.rs
Comment thread node/src/main.rs
Comment thread node/src/stratum.rs
Comment thread node/src/braid/mod.rs Outdated
Comment thread node/src/braid/mod.rs
Comment thread node/src/stratum.rs
Comment thread node/src/db/db_handlers.rs
Comment thread node/src/braid/mod.rs
Comment thread node/src/behaviour/mod.rs
Comment thread node/src/main.rs
continue;
}
};
let _val = _ibd_bridge_rx.await.unwrap();

Copilot AI Dec 5, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling in line 1148 uses unwrap() on _ibd_bridge_rx.await, which can panic if the sender is dropped. This is inconsistent with the error handling pattern used elsewhere in the code (e.g., lines 1143-1146).

Recommendation: Handle the potential error properly:

let _val = match _ibd_bridge_rx.await {
    Ok(val) => val,
    Err(error) => {
        error!(error=?error, "Failed to receive offset initialization response");
        continue;
    }
};
Suggested change
let _val = _ibd_bridge_rx.await.unwrap();
let _val = match _ibd_bridge_rx.await {
Ok(val) => val,
Err(error) => {
error!(error=?error, "Failed to receive offset initialization response");
continue;
}
};

Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
) -> anyhow::Result<(String, String, String)> {
let mut parent_set: HashMap<usize, HashSet<usize>> = HashMap::new();

for (idx, b) in beads.iter().enumerate() {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incredibly expensive, and in the hot path while the braid lock is being held.

After 299 you can just use braid.parents which has the data you need in O(1).

So I'm going to accept this, but we need to merge #240 then #299 then come back and fix this function.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed will replace it accordingly .

Comment thread node/src/lib.rs Outdated

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will panic if the bead is not added. Move this block of code up into the AddBeadStatus::BeadAdded match block.

Changing verbosity of logs

refactor: moving condition in lib.rs to if branch

@mcelrath mcelrath left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Merging!

@mcelrath mcelrath merged commit ae5cde0 into dev Dec 5, 2025
14 checks passed
@mcelrath mcelrath deleted the feat/GetAllBeads branch December 5, 2025 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants