Fix deferred index performance and contention failures by emmanuel-keller · Pull Request #7274 · surrealdb/surrealdb · GitHub

Fix deferred index performance and contention failures#7274

Open
emmanuel-keller wants to merge 5 commits into releases/2.x from
emmanuel/fix-defer-index-performance

Conversation

Collaborator

@emmanuel-keller emmanuel-keller commented Apr 17, 2026

Thank you for submitting this pull request. We really appreciate you spending the time to work on SurrealDB. 🚀 🎉

Before proceeding, please add any of the following labels that apply:

  • breaking-change if this PR includes a breaking change. This label is not needed for bug fixes, which change output in line with a user's original expectations.
  • needs-documentation if documentation is needed to explain the change made in the PR.
  • Modifies env vars or commands if any changes are made to environment variables or command flags.

What is the motivation?

A user reported two issues with deferred SEARCH index building:

  1. Exponential slowdown at scale — indexing speed per 1,000 records degrades as the table grows, making large indexes impractical. The rest of the database also slows down (likely from disk pressure).

  2. Contention failures / hanging — when records are modified during indexing, the deferred daemon either fails with "Failed to commit transaction due to a read or write conflict" or hangs indefinitely with unprocessed pending records.

Root causes identified

Slowdown: A new FtIndex was created and torn down for every single document during batch indexing (both initial build and appending phases). Each cycle loaded 4 BTree structures (DocIds, DocLengths, Postings, Terms) from storage and serialized them all back — with cost growing as the index grew. Additionally, set_status() was called per-record, acquiring status.write() + queue.read() RwLocks ~250 times per batch.

Contention: The deferred daemon was spawned before initial_build_complete was set to true. During this race window, maybe_consume would still write !ip keys (seeing stale false via Ordering::Relaxed) while the daemon was deleting the same !ip keys — causing write-write conflicts in RocksDB's optimistic transactions. The catch! macro then propagated the TxRetryable error through index_appending_loop to the daemon loop, which permanently killed the index with an Error status instead of retrying.

What does this change do?

Batch FtIndex work across records

  • Add IndexOperation::create_ft_index() / compute_search_with_ft() so an FtIndex can be reused across multiple documents within a batch.
  • In index_initial_batch and index_appending_range, create the FtIndex once before the record loop and call finish() once after, instead of per-document. This removes the repeated load/serialize of the DocIds, DocLengths, Postings and Terms BTrees that was causing the exponential slowdown.
  • Move set_status() calls from per-record to per-batch, eliminating ~250 status.write() + queue.read() RwLock acquisitions per batch.
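The per-batch restructuring above can be sketched as follows. This is a hedged illustration, not SurrealDB's actual API: `FtIndex::open`, `index_record`, and the load counter are hypothetical stand-ins for the real `create_ft_index()` / `compute_search_with_ft()` flow, used only to show why hoisting the open/finish cycle out of the record loop removes the per-document load/serialize cost.

```rust
// Hypothetical sketch: the expensive index handle is opened once per
// batch instead of once per record. Names mirror the PR description,
// not the real SurrealDB types.
struct FtIndex {
    loads: u32, // counts how many times the BTree state was loaded
}

impl FtIndex {
    fn open() -> Self {
        // In the real code this loads the DocIds, DocLengths, Postings
        // and Terms BTrees from storage -- the costly part.
        FtIndex { loads: 1 }
    }
    fn index_record(&mut self, _doc: &str) {}
    fn finish(self) -> u32 {
        // In the real code this serializes the BTrees back to storage.
        self.loads
    }
}

// Old behaviour: open + finish per document => one full
// load/serialize cycle for every record in the batch.
fn index_batch_before(docs: &[&str]) -> u32 {
    docs.iter()
        .map(|d| {
            let mut ft = FtIndex::open();
            ft.index_record(d);
            ft.finish()
        })
        .sum()
}

// New behaviour: one open and one finish() for the whole batch.
fn index_batch_after(docs: &[&str]) -> u32 {
    let mut ft = FtIndex::open();
    for d in docs {
        ft.index_record(d);
    }
    ft.finish()
}

fn main() {
    let docs = ["a", "b", "c", "d"];
    assert_eq!(index_batch_before(&docs), 4); // one cycle per record
    assert_eq!(index_batch_after(&docs), 1); // one cycle per batch
}
```

Because each load/serialize cycle grows with the index size, collapsing N cycles into one is what turns the superlinear slowdown back into roughly linear batch cost.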

Fix the !ip key race between the daemon and maybe_consume

  • Set deferred_daemon_running and drop InitialBuildGuard (which sets initial_build_complete) before spawning the daemon, matching the already-correct pattern in start_deferred_index. This closes the window where the daemon and maybe_consume could both write to the same !ip key.
  • Upgrade the initial_build_complete load in maybe_consume from Ordering::Relaxed to Ordering::Acquire, pairing with the Release store in InitialBuildGuard::drop.
  • Remove the now-redundant deferred_daemon_running store inside spawn_deferred_daemon (the flag is already set by the callers before spawning).
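The Acquire/Release pairing can be shown with a minimal sketch. The flag and comments are illustrative stand-ins for `initial_build_complete`, `InitialBuildGuard::drop`, and `maybe_consume`; the point is that the Release store must happen before the daemon thread is spawned, and the Acquire load in the consumer then observes it.

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

fn main() {
    let initial_build_complete = Arc::new(AtomicBool::new(false));

    // "InitialBuildGuard::drop": publish completion with Release BEFORE
    // the daemon is spawned, closing the window where both sides could
    // write the same !ip keys.
    initial_build_complete.store(true, Ordering::Release);

    let flag = Arc::clone(&initial_build_complete);
    let daemon = thread::spawn(move || {
        // "maybe_consume": the Acquire load pairs with the Release store
        // above, so all writes made before the store are visible here.
        flag.load(Ordering::Acquire)
    });

    assert!(daemon.join().unwrap());
}
```

With `Ordering::Relaxed`, the spawned daemon could in principle observe a stale `false` even after the store, which is exactly the race window the PR closes.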

Retry transient conflicts instead of killing the index

  • Replace the catch! in index_appending_loop with explicit error handling that retries TxRetryable (cancel tx, sleep 100 ms, re-read keys, retry batch) instead of propagating the error up to the daemon.
  • The daemon loop also treats TxRetryable as retryable rather than fatal, as a safety net.
  • Cancel the transaction on TxRetryable commit failure for proper cleanup.
  • On TxRetryable retry, snapshot and restore updates_count around each batch attempt so the same !ib keys being re-scanned and re-processed after a cancelled transaction don't double-count the updated field reported in BuildingStatus::Indexing / BuildingStatus::Ready.
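The retry shape replacing the `catch!` macro might look like the sketch below. The `Error` enum, closure-based batch, and loop structure are assumptions for illustration; the real code also cancels the transaction, sleeps ~100 ms, and re-reads the pending keys before retrying.

```rust
// Hedged sketch of retrying transient conflicts instead of
// propagating them up to the daemon loop.
#[allow(dead_code)]
#[derive(Debug)]
enum Error {
    TxRetryable,
    Fatal(String),
}

fn index_appending_loop<F>(mut attempt_batch: F) -> Result<(), Error>
where
    F: FnMut() -> Result<(), Error>,
{
    loop {
        match attempt_batch() {
            Ok(()) => return Ok(()),
            Err(Error::TxRetryable) => {
                // Real code: cancel the tx, sleep 100 ms, re-read the
                // pending !ib keys, then retry the batch.
                continue;
            }
            // Only genuinely fatal errors still reach the daemon,
            // which previously killed the index on ANY error.
            Err(e) => return Err(e),
        }
    }
}

fn main() {
    let mut conflicts = 2; // simulate two transient write-write conflicts
    let result = index_appending_loop(|| {
        if conflicts > 0 {
            conflicts -= 1;
            Err(Error::TxRetryable)
        } else {
            Ok(())
        }
    });
    assert!(result.is_ok());
    assert_eq!(conflicts, 0); // both conflicts were absorbed by retries
}
```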

Propagate FtIndex finish errors on abort

  • In the abort branch of index_appending_range, propagate ft.finish() errors with ? (matching index_initial_batch) so the caller cancels the transaction. The earlier best-effort (log-and-continue) behaviour could persist the staged !ip/!ib deletions for already-processed records without the corresponding FtIndex BTree state, permanently dequeuing those records with missing search data.
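The behavioural difference between the two abort styles can be sketched with hypothetical helpers (`finish`, `abort_best_effort`, `abort_propagating` are illustrative names, not the real functions). The caller commits on `Ok` and cancels on `Err`, so swallowing the `finish()` error is what let inconsistent state be committed.

```rust
#[derive(Debug)]
struct FinishError;

// Simulate a failing FtIndex::finish() during an abort.
fn finish() -> Result<(), FinishError> {
    Err(FinishError)
}

// Old best-effort behaviour: log the failure and return Ok, so the
// caller commits the staged !ip/!ib deletions anyway.
fn abort_best_effort() -> Result<(), FinishError> {
    if let Err(e) = finish() {
        eprintln!("finish failed during abort: {e:?}"); // swallowed
    }
    Ok(())
}

// Fixed behaviour: propagate with `?` so the caller cancels the
// transaction, matching index_initial_batch.
fn abort_propagating() -> Result<(), FinishError> {
    finish()?;
    Ok(())
}

fn main() {
    assert!(abort_best_effort().is_ok()); // caller commits: records dequeued without index data
    assert!(abort_propagating().is_err()); // caller cancels: queue state stays consistent
}
```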

What is your testing strategy?

  • All 6 existing deferindex integration tests pass (including deferred_index_survives_restart, insert_parallel_full_text, multi_index_concurrent_test_create_update_delete).
  • All 40 idx::ft:: unit tests pass.
  • Clippy clean, no warnings.
  • User's reproduction test (defer-test/defer-indexing-performance.ts) can be used for end-to-end validation at scale.

Security Considerations

No security implications — these are performance and correctness fixes to background index building. No changes to authentication, authorization, data isolation, user input handling, or error reporting.

Is this related to any issues?

  • No related issues

Have you read the Contributing Guidelines?

Previously, a new FtIndex was created and torn down for every single
document during batch indexing (both initial build and appending phases).
Each cycle loaded 4 BTree structures from storage and serialized them
back, with cost growing as the index grew -- causing exponential slowdown
at scale.

Now the FtIndex is created once per batch and reused across all records,
with a single finish() call at the end. Status updates are also reduced
from per-record to per-batch, eliminating ~250 RwLock acquisitions per
batch.
…dexing

The deferred daemon was spawned before initial_build_complete was set to
true. During this window, maybe_consume would still write !ip keys while
the daemon was deleting them, causing write-write conflicts on the same
key in RocksDB's optimistic transactions.

Fix the race by setting deferred_daemon_running and dropping the
InitialBuildGuard (which sets initial_build_complete) BEFORE spawning
the daemon. Also upgrade the Relaxed load in maybe_consume to Acquire
to properly synchronize with the Release store.

Additionally, replace the catch! macro in index_appending_loop with
explicit error handling that retries on TxRetryable instead of
propagating the error. The daemon loop also retries transient errors
as a safety net, rather than permanently killing the index.
@emmanuel-keller emmanuel-keller changed the title Batch FtIndex across records during deferred index building Fix deferred index performance and contention failures Apr 17, 2026
Remove redundant deferred_daemon_running store inside spawn_deferred_daemon
(already set by callers before spawning). Cancel transaction on TxRetryable
commit failure for proper cleanup. Make FtIndex finish best-effort on abort
in index_appending_range to avoid masking the abort.
@emmanuel-keller emmanuel-keller marked this pull request as ready for review April 17, 2026 19:32
@emmanuel-keller emmanuel-keller requested review from a team as code owners April 17, 2026 19:32

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d3ec9b4dc0


Comment thread crates/core/src/kvs/index.rs Outdated

@claude claude Bot left a comment


This PR addresses real correctness and performance issues, but it's non-trivial — touching atomic memory ordering, transaction retry semantics, and a race condition in deferred index initialization. Worth a human look before merging.


### Overview

The PR modifies two files in the core indexing subsystem. Changes span three commits: (1) batching FtIndex creation across records within a batch instead of per-document, (2) fixing a race between deferred daemon spawn and initial_build_complete flag visibility, and (3) cleanup/retry handling.

### Security Risks

No security-sensitive code is touched — no auth, crypto, or permissions. Risk is limited to correctness of background index building.

### Level of Scrutiny

This warrants careful human review. The changes involve subtle atomic ordering decisions (upgrading Ordering::Relaxed to Acquire/Release at key synchronization points), transaction retry loops with manual cancel/sleep/retry semantics, and a race-condition fix that depends on correct ordering of flag stores relative to daemon spawning. The bug reports also flag an asymmetry between index_initial_batch and index_appending_range in how ft.finish() errors are handled on abort, and a potential (narrow but real) scenario where a failed ft.finish() on abort could allow partial data to be committed without the corresponding FtIndex state.

### Other Factors

The PR is well-documented with clear commit messages and a thorough description. All existing integration and unit tests are reported passing. The bugs found by the hunting system are real issues — the abort-path inconsistency between the two batch functions is a genuine correctness concern that should be addressed before merge.

Comment thread crates/core/src/kvs/index.rs
Comment thread crates/core/src/kvs/index.rs
Comment thread crates/core/src/kvs/index.rs
Previously the abort branch called ft.finish() best-effort (logging a
warning and returning Ok on failure). Because the caller commits the
transaction on Ok, a failed finish could persist staged !ip/!ib
deletions for already-processed records without the corresponding
FtIndex BTree state, permanently dequeuing those records with missing
search data.

Propagate the error with `?` so the caller cancels the transaction,
matching the existing behaviour in index_initial_batch.
When a batch failed with TxRetryable (either from index_appending_range
itself or from tx.commit()), the transaction was cancelled but
updates_count kept the increments applied during the failed attempt.
Because tx.cancel() also rolled back the tx.del(ib) calls, the same
!ib keys were re-read and re-processed on the next iteration, so
updates_count was incremented a second time for the same records,
inflating the updated field reported in BuildingStatus::Indexing and
BuildingStatus::Ready.

Snapshot updates_count before each batch attempt and restore it in
both TxRetryable arms. Index data and queue cleanup were unaffected;
this is a reporting-only fix.
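The snapshot/restore described in this commit can be sketched with a minimal, hypothetical `process_batch` (the name and the simulated-conflict flag are illustrative, not the real API). The first attempt increments the counter and then fails with a retryable conflict; restoring the snapshot prevents the retried records from being counted twice.

```rust
// Reporting-only sketch: snapshot updates_count before each attempt
// and restore it when the attempt fails with a retryable conflict.
fn process_batch(updates_count: &mut u32, conflict: bool) -> Result<(), &'static str> {
    *updates_count += 5; // increments applied during the attempt
    if conflict { Err("TxRetryable") } else { Ok(()) }
}

fn main() {
    let mut updates_count = 0u32;
    let mut conflict = true; // first attempt hits a write-write conflict
    loop {
        let snapshot = updates_count; // taken before each batch attempt
        match process_batch(&mut updates_count, conflict) {
            Ok(()) => break,
            Err("TxRetryable") => {
                // tx.cancel() rolled back the tx.del(ib) calls, so the
                // same records will be re-read and re-processed; restore
                // the counter so they aren't counted twice.
                updates_count = snapshot;
                conflict = false;
            }
            Err(_) => unreachable!(),
        }
    }
    assert_eq!(updates_count, 5); // not 10: the failed attempt is undone
}
```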
