Our IBD PR #240 got a bit long and it took me a long time to see a better way, but here's a proposal.
The dynamics of the Braid are that it should never fork, and we never discard valid beads. If we did have a fork, both sides of the fork would need to share their beads. Then someone names both sides as parents and ties them up into a big cohort. The side with lower hashrate on the fork won't get paid, but this is expected behavior. But we have to evaluate that by actually getting those beads, computing the cohorts, and evaluating their work. We can't make payout decisions on beads we didn't receive, so we shouldn't be using any kind of timing-based decision about whether to ask for beads. We should always ask for beads.
I think we should remove the ibd_or_not boolean flag and the spinlock. With the above changes, it doesn't matter to peers whether we think we're synced and have decided to mine, because we'll serve beads to anyone. So we can remove the BeadSyncError::PeerSyncing error code I added...
So I propose:
1. Remove the P2P ibd_spinlock checks and simplify IBDManager:
   - Remove the variables tied to the spinlock: timestamp_mapping, incoming_bead_mapping.
   - Remove the logic in IBDManager about when to exit IBD, and ibd_or_not (move to Stratum).
   - Remove UpdateTimestampMapping, FetchTimestamp, FetchAllTimestamps, UpdateIncomingBeadMapping, GetIncomingBeadRetryCount, AbortWaitHandle.
   - Remove related logic in main.rs lines 479-573, 1006-1041, 1241-1301.
2. Remove BeadSyncError::PeerSyncing and every place it's returned; serve the request instead.
3. Move the ibd_or_not flag to stratum.rs, rename it to ready_to_mine or something.
   - Requires Braid::orphanage_occupancy(10*LATENCY_ALPHA) < 0.5 or something like that.
4. Keep timestamps of when beads enter and exit Braid::orphans. I added an orphan_index: HashSet to the Braid anyway in "Braid refactor: Close the loop" (#299) because I needed to look up orphans by hash. I'll expand this to:
   - pub orphanage: HashMap<BeadHash, (Bead, time::Instant)>: the actual orphan store, with O(1) lookup by hash.
   - pub orphanage_history: VecDeque<(time::Instant, time::Instant)>: tracks entry/exit of orphans in the orphanage.
   - pub fn orphanage_occupancy(window: time::Duration) -> f64: returns the fractional occupancy of the orphanage over window (includes beads currently in the orphanage).
   - I'll probably trigger pruning orphanage_history in extend(). Or we can set up a timer.
5. Move all the IBDManager fields into PeerInfo in peer_manager/mod.rs. Tracking what tips a peer has is a fundamentally peer-to-peer interaction, and I think it belongs here. Add sync_batch_offset, cached_tips, and pending_beadhashes to PeerInfo and add the IBDManager methods to PeerInfo; all these methods are with respect to a single peer anyway.
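To make the PeerInfo move concrete, here is a rough sketch of what the per-peer sync state could look like. Only the three field names come from the proposal. The field types, the BeadHash placeholder, the meaning given to sync_batch_offset, and the method names (update_tips, mark_requested, mark_received, has_pending) are assumptions for illustration, not the existing IBDManager API.

```rust
use std::collections::HashSet;

/// Placeholder for the real bead hash type (assumption: a 32-byte hash).
pub type BeadHash = [u8; 32];

/// Hypothetical per-peer sync state, folding the former IBDManager
/// fields into PeerInfo. Field types are guesses.
#[derive(Debug, Default)]
pub struct PeerInfo {
    /// Assumed meaning: how many bead hashes we have requested from
    /// this peer so far, i.e. where the next batch starts.
    pub sync_batch_offset: usize,
    /// Tips this peer last advertised to us.
    pub cached_tips: HashSet<BeadHash>,
    /// Bead hashes requested from this peer and not yet received.
    pub pending_beadhashes: HashSet<BeadHash>,
}

impl PeerInfo {
    /// Replace the cached tips with the peer's latest advertisement.
    pub fn update_tips(&mut self, tips: impl IntoIterator<Item = BeadHash>) {
        self.cached_tips = tips.into_iter().collect();
    }

    /// Record a batch of requested hashes and advance the batch offset.
    pub fn mark_requested(&mut self, batch: &[BeadHash]) {
        self.pending_beadhashes.extend(batch.iter().copied());
        self.sync_batch_offset += batch.len();
    }

    /// Mark a bead as received; returns true if it was pending.
    pub fn mark_received(&mut self, hash: &BeadHash) -> bool {
        self.pending_beadhashes.remove(hash)
    }

    /// True while this peer still owes us requested beads.
    pub fn has_pending(&self) -> bool {
        !self.pending_beadhashes.is_empty()
    }
}
```

Since every method only touches one peer's state, nothing here needs a global "are we in IBD" flag.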
This should substantially simplify the IBD code, which was largely based on an "Am I synced" decision, and that decision is inappropriate in an asynchronous network. In an asynchronous network you're never synced, but also always synced: there's always something in flight you don't know about. Bitcoin intentionally slowed this down to 10 minutes and created a synchronous blockchain data structure. We need to be faster, and the more-correct question is "are we seeing everything, and can we connect everything we see in newly broadcast beads?" That is the purpose of the orphanage concept, and occupancy thereof. It's a measure of "am I missing broadcasts". And "seeing all broadcasts" is the "I am synced" signal.
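A minimal sketch of the orphanage bookkeeping and the occupancy signal might look like the following. It assumes "fractional occupancy" means the fraction of the window during which at least one orphan was present (the union of all stays, clipped to the window); that definition is my reading, not something fixed by the proposal. Bead and BeadHash are placeholders, LATENCY_ALPHA is assumed to be a Duration, and the explicit now parameter plus the helper names (add_orphan, resolve_orphan, orphanage_occupancy_at, prune_orphanage_history) are hypothetical, added so the logic can be tested deterministically.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

pub type BeadHash = [u8; 32]; // placeholder hash type

#[derive(Debug, Clone, Default)]
pub struct Bead; // placeholder for the real bead

#[derive(Default)]
pub struct Braid {
    /// Orphans currently waiting for parents, with their entry time.
    pub orphanage: HashMap<BeadHash, (Bead, Instant)>,
    /// (entry, exit) times of orphans that have since left.
    pub orphanage_history: VecDeque<(Instant, Instant)>,
}

impl Braid {
    pub fn add_orphan(&mut self, hash: BeadHash, bead: Bead, now: Instant) {
        // Keep the first-seen timestamp if the orphan is already known.
        self.orphanage.entry(hash).or_insert((bead, now));
    }

    pub fn resolve_orphan(&mut self, hash: &BeadHash, now: Instant) -> Option<Bead> {
        let (bead, entered) = self.orphanage.remove(hash)?;
        self.orphanage_history.push_back((entered, now));
        Some(bead)
    }

    /// Fraction of the last `window` during which the orphanage was non-empty.
    pub fn orphanage_occupancy_at(&self, now: Instant, window: Duration) -> f64 {
        if window.is_zero() {
            return 0.0;
        }
        // Express each stay as an interval of "ages" relative to `now`,
        // clipped to the window. Current orphans count as exiting at `now`.
        let mut spans: Vec<(Duration, Duration)> = self
            .orphanage_history
            .iter()
            .copied()
            .chain(self.orphanage.values().map(|(_, entered)| (*entered, now)))
            .filter_map(|(entered, exited)| {
                let lo = now.saturating_duration_since(exited);
                let hi = now.saturating_duration_since(entered).min(window);
                if lo < hi { Some((lo, hi)) } else { None }
            })
            .collect();
        spans.sort();
        // Merge overlapping intervals and sum the covered time.
        let mut covered = Duration::ZERO;
        let mut cur: Option<(Duration, Duration)> = None;
        for (lo, hi) in spans {
            match cur {
                Some((clo, chi)) if lo <= chi => cur = Some((clo, chi.max(hi))),
                Some((clo, chi)) => {
                    covered += chi - clo;
                    cur = Some((lo, hi));
                }
                None => cur = Some((lo, hi)),
            }
        }
        if let Some((clo, chi)) = cur {
            covered += chi - clo;
        }
        covered.as_secs_f64() / window.as_secs_f64()
    }

    /// Drop history entries that exited before the window began.
    pub fn prune_orphanage_history(&mut self, now: Instant, window: Duration) {
        while let Some(&(_, exited)) = self.orphanage_history.front() {
            if now.saturating_duration_since(exited) > window {
                self.orphanage_history.pop_front();
            } else {
                break;
            }
        }
    }
}

/// Hypothetical Stratum-side gate replacing ibd_or_not.
pub fn ready_to_mine(braid: &Braid, now: Instant, latency_alpha: Duration) -> bool {
    braid.orphanage_occupancy_at(now, latency_alpha * 10) < 0.5
}
```

Other definitions of occupancy are possible, for example weighting by the number of orphans present; the union-of-intervals version keeps the result in [0, 1], so a threshold like 0.5 has a direct meaning.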
After discussing with @Sansh2356, he wants to add the following minor features in the corresponding PR as well:
- Move prepare_bead_tuple_data() out of main and into the DBHandler.
- Consolidate the DB lock around main.rs:475-478: only hold the lock through extend and then examine what beads were added to the braid by looking at braid_data.beads.len() and copying them out to pass to the DBHandler, releasing the lock earlier.
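The lock consolidation could be sketched roughly as below. This is an illustration under assumptions: it uses std::sync::Mutex (the real code may use a different lock type), Bead and Braid are simplified stand-ins, and extend_and_collect is a hypothetical helper name. The point is only the shape: take the lock, note beads.len(), call extend, copy out whatever was appended, and drop the guard before any DB work.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Clone, PartialEq)]
pub struct Bead(pub u64); // placeholder for the real bead

#[derive(Default)]
pub struct Braid {
    pub beads: Vec<Bead>,
}

impl Braid {
    /// Stand-in for the real extend(): appends the bead unless it is
    /// already present. The real method may also pull in orphans, which
    /// is why more than one bead can appear after a single call.
    pub fn extend(&mut self, bead: Bead) -> bool {
        if self.beads.contains(&bead) {
            return false;
        }
        self.beads.push(bead);
        true
    }
}

/// Hold the lock only through extend(), then copy out the newly added
/// beads and release the lock before handing them to the DB layer.
pub fn extend_and_collect(braid_data: &Arc<Mutex<Braid>>, bead: Bead) -> Vec<Bead> {
    let mut braid = braid_data.lock().expect("braid lock poisoned");
    let before = braid.beads.len();
    braid.extend(bead);
    let added = braid.beads[before..].to_vec();
    drop(braid); // release the lock here, before any DB writes
    added
}
```

The caller would then pass the returned Vec to the DBHandler without holding the braid lock, so slow DB writes no longer block other tasks that need the braid.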
I will implement (4), the orphanage timestamps and orphanage_occupancy(), as part of #299.