patch_generate: only calculate binary diffs if requested #3922
When generating diffs for binary files, we load and decompress the blobs in order to generate the actual diff, which can be very costly. We cannot avoid this when we are called with the `GIT_DIFF_SHOW_BINARY` flag, but we do not have to load the blobs when this flag is not set, as the caller is expected to have no interest in the actual content of binary files.

Fix the issue by only generating a binary diff when the caller is actually interested in the diff. As libgit2 uses heuristics to determine that a blob contains binary data by inspecting its size without loading it from the ODB, this avoids loading and decompressing those blobs altogether when diffing in a repository with binary files.
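
The idea described above can be sketched in isolation. The following is a minimal, self-contained model and not libgit2's actual code: `sketch_blob`, `sketch_load`, `sketch_diff_binary` and `SKETCH_SHOW_BINARY` are hypothetical names that stand in for the blob, the expensive ODB read, the patch generation step and the `GIT_DIFF_SHOW_BINARY` flag. It only illustrates that content is loaded when, and only when, the flag is set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag standing in for GIT_DIFF_SHOW_BINARY. */
#define SKETCH_SHOW_BINARY (1u << 0)

/* Hypothetical stand-in for a blob stored in the object database. */
typedef struct {
    const char *path;
    size_t size; /* assumed to be known without reading the content */
    int loads;   /* counts how often the content was "loaded" */
} sketch_blob;

/* Simulates the expensive step: reading and decompressing the content. */
static void sketch_load(sketch_blob *blob)
{
    assert(blob != NULL);
    blob->loads++;
}

/*
 * Writes a description of the binary change into `out`.
 * Returns 1 if a full binary diff was generated, 0 if only a notice was.
 */
static int sketch_diff_binary(sketch_blob *old_blob, sketch_blob *new_blob,
                              unsigned int flags, char *out, size_t outlen)
{
    if (!(flags & SKETCH_SHOW_BINARY)) {
        /* Caller did not ask for binary content: never touch the blobs. */
        snprintf(out, outlen, "Binary files a/%s and b/%s differ",
                 old_blob->path, new_blob->path);
        return 0;
    }

    /* Only now pay the cost of loading both sides. */
    sketch_load(old_blob);
    sketch_load(new_blob);
    snprintf(out, outlen, "GIT binary patch (%zu -> %zu bytes)",
             old_blob->size, new_blob->size);
    return 1;
}
```

Calling `sketch_diff_binary` with `flags` set to `0` leaves both `loads` counters untouched, while passing `SKETCH_SHOW_BINARY` increments each of them once.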

We should probably be passing in the minimal information there. The comment for the type doesn't say what fields you can expect to have filled, so it looks like we should define what gets filled and then fill that. Presumably we'd want to pass in the path, object ids and mode so you can display the message that it changed if that's all you're after.

I think I would prefer to pass a The This is a good fix, I'm going to merge it and add Thanks @pks-t!

After poking around with this a bit more, I realized that While looking at this I realized that we didn't have any way to parse a patch that had