gh-101282: Apply BOLT optimisations to libpython for shared builds #104709
Merged
erlend-aasland merged 4 commits into python:main on May 22, 2023
corona10 (Member) requested changes on May 21, 2023 and left a comment:
Please update the docs for `BOLT_INSTRUMENT_FLAGS` and `BOLT_APPLY_FLAGS`: https://docs.python.org/3.12/using/configure.html#performance-options
Pull request description:
(This change is a quick and dirty way to merge some of the build system improvements I'm proposing in gh-101093 before the 3.12 feature freeze. I had wanted to widen the scope and fix some longstanding deficiencies in the build system around profile-guided builds, but I'm getting soft resistance to the reviews so close to the freeze deadline, and it is obvious that we need a simpler solution to hit the 3.12 deadline. While this change is quick and dirty, it attempts to not make things worse.)
Before this change, we only applied BOLT to the main python binary. After this change, we also apply BOLT to libpython when a shared build is configured. In shared library builds, most of the C code is in libpython, so it is critical to apply BOLT to libpython to realize BOLT's benefits.
This change also reworks how BOLT instrumentation is applied. It effectively removes the readelf-based logic added in gh-101525 and replaces it with a mechanism that saves a copy of the pre-BOLT binary and restores that copy when necessary. This allows us to perform BOLT optimizations without having to manually delete the output binary to force a new BOLT run.
We also add a new make target for purging BOLT files and hook it up to `clean` so BOLT state is purged when appropriate. `.gitignore` rules have been added to ignore files related to BOLT.
Before and after this refactor, `make` will no-op after a previous run. Both versions should also share common make DAG deficiencies, where targets fail to trigger as often as they need to or can trigger prematurely in certain scenarios. For example, after this change you may need to `rm profile-bolt-stamp` to force a BOLT run, because there aren't appropriate non-phony targets for BOLT's make target to depend on. Fixing this is a non-trivial amount of work that will likely have to wait until the 3.13 window.
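The save-and-restore mechanism described above might look roughly like the following. This is a minimal sketch, not the code from this PR: the function name `restore_or_save_prebolt`, the `.prebolt` suffix, and the placeholder files are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: the ".prebolt" suffix and the function name are illustrative
# assumptions, not necessarily what the merged Makefile uses.

# Make sure "$1" is the pristine, pre-BOLT binary and that a backup of it exists.
restore_or_save_prebolt() {
    bin="$1"
    prebolt="${bin}.prebolt"
    if [ -e "${prebolt}" ]; then
        # A previous run already rewrote the binary: put the original back.
        cp "${prebolt}" "${bin}"
    else
        # First run: keep a copy of the untouched binary.
        cp "${bin}" "${prebolt}"
    fi
}

# Demo with placeholder text files standing in for real binaries.
workdir="$(mktemp -d)"
printf 'original\n' > "${workdir}/libpython.so"
restore_or_save_prebolt "${workdir}/libpython.so"   # first run: saves the copy
printf 'bolted\n' > "${workdir}/libpython.so"       # stand-in for BOLT rewriting the file
restore_or_save_prebolt "${workdir}/libpython.so"   # second run: restores the original
cat "${workdir}/libpython.so"
```

Because the untouched binary is always recoverable from the saved copy, a new instrumentation run can start from a known state without anyone deleting the output binary by hand.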
To make it easier to iterate on custom BOLT settings, the flags to pass to instrumentation and application are now defined in configure and can be overridden by passing `BOLT_INSTRUMENT_FLAGS` and `BOLT_APPLY_FLAGS`.
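As a hedged illustration of overriding the two variables, the snippet below only prints a configure command instead of running it. The `llvm-bolt` options used as values are example choices, not the defaults set by this PR, and should be checked against `llvm-bolt --help` for the installed version; passing the variables as `VAR=value` arguments to `./configure` is likewise an assumption about how the override is meant to be supplied.

```shell
#!/bin/sh
# Example values only (assumptions): verify each option against your llvm-bolt.
BOLT_INSTRUMENT_FLAGS="-instrument -instrumentation-file-append-pid"
BOLT_APPLY_FLAGS="-update-debug-sections -reorder-blocks=ext-tsp -reorder-functions=hfsort+"

# Build the command as a string and print it rather than executing it,
# so the snippet is safe to run outside a CPython checkout.
configure_cmd="./configure --enable-shared --enable-bolt BOLT_INSTRUMENT_FLAGS=\"${BOLT_INSTRUMENT_FLAGS}\" BOLT_APPLY_FLAGS=\"${BOLT_APPLY_FLAGS}\""
echo "${configure_cmd}"
```

Keeping the flag sets in variables makes it easy to try a different combination and re-run configure without editing the build files.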
Author (Contributor):
Done in latest push.
erlend-aasland (Contributor) left a comment:
I cleaned up the docs and AC code; hope you don't mind.
I left some questions. Regarding BOLT technical stuff, I lean on Dong-hee's review.
Contributor:
We appreciate it if you don't force-push. (This is also mentioned in the devguide.)
erlend-aasland (Contributor) approved these changes on May 22, 2023.