chore: Enrol can clear bios jobs #1979
Open
stevekeay wants to merge 4 commits into
Author
The plan was to do that ONLY on explicit request by a command line flag. known_good_state doesn't do any harm, but it adds a lot of time to a process which is already very slow. If we find ourselves needing to do it a lot then we can make it the default.
cardoe (Contributor) reviewed May 6, 2026 and left a comment:
It's not clear to me why the agent inspection is run here a second time rather than at the other location?
We're seeing obscure errors from Ironic like:
failed step {'interface': 'raid', 'step': 'delete_configuration',
'abortable': False, 'priority': 0}: Unable to connect to
/redfish/v1/TaskService/Tasks/JID_768614980495. Error: Timeout waiting
for task monitor /redfish/v1/TaskService/Tasks/JID_768614980495 (timeout
= 500)
To clear this up, we are completing each operation with a separate
reboot.
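
The "one operation, one reboot" approach described in this commit message could be sketched roughly as follows. This is only an illustration, not the code in this PR: `run_clean` and `reboot` are hypothetical callables standing in for whatever actually drives the node, and they are not Ironic APIs. The step dictionaries mirror the shape seen in the error above.

```python
from typing import Callable, Iterable, List

# A step looks like the dict in the Ironic error, for example:
# {"interface": "raid", "step": "delete_configuration"}
Step = dict


def run_each_step_with_reboot(
    steps: Iterable[Step],
    run_clean: Callable[[List[Step]], None],  # hypothetical: runs one batch of steps
    reboot: Callable[[], None],               # hypothetical: reboots and waits
) -> int:
    """Run every step as its own operation, rebooting after each one.

    Returns the number of steps that were run.
    """
    completed = 0
    for step in steps:
        # Submit a single step at a time instead of stacking several together.
        run_clean([step])
        # Reboot so the change is committed before the next step starts.
        reboot()
        completed += 1
    return completed
```

The trade-off is the one stated in the PR description: every extra reboot makes the process slower, in exchange for a better chance of each step succeeding first time.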

(1) It's quite easy to create "conflicting" jobs: a stale job that was left behind blocks all progress, because the new job can't be created until the old one is deleted.
This change simply clears all jobs on startup. I know there is a hook that ships with Ironic, which would normally be preferable to adding code here, but running it here makes it run at the right time, when we need it.
We also add an option to call the Ironic "known_good_state", which reboots the iDRAC, just to make sure. Unfortunately this is broken in our Ironic because it relies on the "ping" binary (like we are still in 2006), which is not present in our container. It still resets the iDRAC but then errors out, forcing the user to unset maintenance mode, wait a while, and try again.
(2) If we are updating BIOS settings, we take a second reboot to ensure the BIOS settings change is committed before we take on anything else. This takes a very long time, but stacking the BIOS update alongside other updates seems to result in failures. This change makes the process slower, but more likely to work first time.
(3) Disable some more PXE devices. These can be left enabled by prior processes and confuse the boot process output. We're not using PXE, so disable it, period.
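
For readers who want a picture of item (1), here is a minimal sketch of the clear-on-startup idea. It assumes a hypothetical `JobQueue` wrapper with `list_job_ids()` and `delete_job()`; neither is a real Ironic or iDRAC interface, and the actual change in this PR may look quite different. `InMemoryQueue` exists only so the sketch can be exercised without hardware.

```python
from typing import List, Protocol


class JobQueue(Protocol):
    """Hypothetical view of a BMC job queue; not a real library interface."""

    def list_job_ids(self) -> List[str]: ...

    def delete_job(self, job_id: str) -> None: ...


def clear_all_jobs(queue: JobQueue) -> List[str]:
    """Delete every job currently in the queue and return the removed IDs.

    Stale jobs are not inspected or filtered: everything is cleared, so a
    leftover job cannot block the creation of a new one.
    """
    removed = []
    for job_id in queue.list_job_ids():
        queue.delete_job(job_id)
        removed.append(job_id)
    return removed


class InMemoryQueue:
    """Stand-in for a real BMC, used only to try out the sketch."""

    def __init__(self, job_ids: List[str]) -> None:
        self._jobs = list(job_ids)

    def list_job_ids(self) -> List[str]:
        # Return a copy so callers can delete while iterating.
        return list(self._jobs)

    def delete_job(self, job_id: str) -> None:
        self._jobs.remove(job_id)
```

Calling `clear_all_jobs()` on an already empty queue is a no-op, so it is safe to run unconditionally at the start of enrolment.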