action #18144
closed [tools] restart ipmi management controller before every ipmi job
100%
Description
IPMI cards are unreliable, but seem to be quite reliable if you restart them before running a test.
Resetting the card with "ipmitool mc reset hot" before each job would be a sensible way to keep them reliable.
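A minimal sketch of what such a pre-job reset could look like, assuming the reset is issued over the network with ipmitool's lanplus interface. The function names, host, and credentials are hypothetical and not taken from os-autoinst. ipmitool documents `warm` and `cold` as the modes for `mc reset`, so the sketch accepts only those rather than the "hot" spelling used above.

```python
import subprocess
from typing import List


def build_mc_reset_command(host: str, user: str, password: str,
                           mode: str = "cold") -> List[str]:
    """Build the ipmitool argv that resets the management controller.

    Hypothetical helper: only the ipmitool flags themselves are real.
    """
    if mode not in ("warm", "cold"):
        raise ValueError(f"unsupported reset mode: {mode!r}")
    return [
        "ipmitool", "-I", "lanplus",
        "-H", host, "-U", user, "-P", password,
        "mc", "reset", mode,
    ]


def reset_mc(host: str, user: str, password: str,
             mode: str = "cold", timeout: float = 30.0) -> bool:
    """Run the reset and report whether ipmitool exited successfully."""
    cmd = build_mc_reset_command(host, user, password, mode)
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # ipmitool is not installed, or the card did not answer in time.
        return False
    return result.returncode == 0
```

A job runner would call `reset_mc(...)` before starting the test and would then need to wait for the card to come back before opening a session. How long that takes depends on the hardware and is not covered here.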
Updated by okurz over 7 years ago
- Related to action #17816: [functional][u][ipmi] Error: Unable to establish IPMI v2 / RMCP+ session added
Updated by okurz over 7 years ago
- Related to action #13914: [qe-core][functional][ipmi] wait_serial does not get expected output because ipmi console connection is closed added
Updated by RBrownSUSE over 7 years ago
- Assignee set to RBrownSUSE
- Priority changed from Normal to Urgent
- Target version set to Milestone 7
Evaluated as important for Milestone 7
Updated by RBrownSUSE over 7 years ago
Still investigating. https://github.com/os-autoinst/os-autoinst/pull/767 should let os-autoinst give us more helpful info.
Updated by okurz over 7 years ago
the PR should be in. Is https://openqa.suse.de/tests/901944#step/textinfo/3 related to that maybe?
Updated by RBrownSUSE over 7 years ago
> the PR should be in
No, who are you to say what should or shouldn't be in?
> Is https://openqa.suse.de/tests/901944#step/textinfo/3 related to that maybe?
No, but thank you for your opinion
Updated by okurz over 7 years ago
RBrownSUSE wrote:
> > the PR should be in
> No, who are you to say what should or shouldn't be in?
"should be in" as in: "Hm, depending on what I heard when the installation was last deployed the content of the pull request is probably already effective on the corresponding workers."
Updated by SLindoMansilla over 7 years ago
It also happens here: https://openqa.suse.de/tests/910032#step/textinfo/3
Updated by coolo over 7 years ago
Could you guys please stop abusing this issue? I already removed okurz's tag - see #6
Updated by RBrownSUSE over 7 years ago
- Status changed from New to Resolved
Serial disconnect issues resolved by https://github.com/os-autoinst/os-autoinst/pull/777
Serial keep alive workaround removed as no longer beneficial with auto reconnect
No iKVM issues have been reported since the regular nightly restart of the card was set up, so there is no evidence that the whole management controller needs to be restarted on every job