perf: eliminate double JSON serialization in cache key hot path #7345
Open
ppraneth wants to merge 5 commits into tensorzero:main from
Conversation
Contributor
All contributors have signed the CLA ✍️ ✅

Author
I have read the Contributor License Agreement (CLA) and hereby sign the CLA.
Author

Previously, `ModelProviderRequest::get_cache_key()` serialized the request twice: once via `serde_json::to_value` to produce a `serde_json::Value`, then again via `.to_string()` to get a JSON string for hashing. Between the two steps it also had to allocate the intermediate `Value`, find and remove the `inference_id` key from it, and then allocate the final `String` buffer.

This PR replaces that with a single streaming serialization pass directly into the `blake3::Hasher` (which implements `std::io::Write`), using `serde_json::to_writer`. To exclude `inference_id` from the hash without an intermediate allocation, it is now marked `#[serde(skip)]` on `ModelInferenceRequest`.

What changed
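The single-pass idea can be sketched with std-only types: any hasher reachable through `std::io::Write` can consume serialized bytes as they are produced, so no intermediate `String` is needed. Below, `DefaultHasher` stands in for `blake3::Hasher`, and the JSON bytes come from `write!` rather than serde — a minimal sketch of the pattern, not the PR's actual code:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::{self, Write};

// Stand-in for `blake3::Hasher`: an adapter that feeds every byte
// written through `io::Write` straight into the hash state.
struct HashWriter(DefaultHasher);

impl Write for HashWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        Hasher::write(&mut self.0, buf); // absorb bytes, no buffering
        Ok(buf.len())
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

fn main() {
    // One pass: stream the serialized form directly into the hasher.
    let mut hw = HashWriter(DefaultHasher::new());
    write!(hw, "{{\"model\":\"gpt-4\",\"max_tokens\":{}}}", 100).unwrap();
    let streamed = hw.0.finish();

    // Two passes (the old shape): build the String, then hash it.
    let json = format!("{{\"model\":\"gpt-4\",\"max_tokens\":{}}}", 100);
    let mut h = DefaultHasher::new();
    Hasher::write(&mut h, json.as_bytes());
    let buffered = h.finish();

    // Same digest either way; the streaming version skips the String.
    assert_eq!(streamed, buffered);
    println!("digests match");
}
```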
tensorzero-inference-types

- Added `#[serde(skip)]` to the `inference_id` field on `ModelInferenceRequest`. The field is only serialized for cache key purposes and is intentionally excluded from the hash, so skipping it during serialization is correct and safe.

tensorzero-core/src/cache.rs
- Replaced the `to_value` → `remove` → `to_string` sequence with a single streaming `serde_json::to_writer` call into the hasher.
- Added a Criterion benchmark, `benches/cache_key.rs`, for regression tracking.

Benchmark results (release mode, same machine)
✅ All 88 cache unit tests pass.