action #18826
action #18144: [tools] restart ipmi management controller before every ipmi job
[tools] Investigate serial over lan disconnects for ipmi
70%
Description
Found while working on issue #18144.
The IPMI console sometimes reports timeouts or excessive errors and then drops its serial over LAN (SOL) connection.
The current attempt to resolve this is to enable serial keepalive, deployed to openqaw2 on 26 Apr at 15:45.
If this doesn't work, there are other keepalive options. Failing those, the rather expensive fallback is to monitor whether the SOL session is live and reconnect from within the backend.
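As a rough illustration of the "monitor and reconnect" fallback, the sketch below restarts a SOL session whenever it ends with an error. It is not the backend's actual code. It assumes the console is driven through `ipmitool` over the `lanplus` interface, and the helper names (`sol_command`, `run_sol_session`, `keep_sol_alive`), the retry limit, and the delay are all hypothetical.

```python
import subprocess
import time
from typing import Callable, List


def sol_command(host: str, user: str, password: str) -> List[str]:
    """Build an ipmitool command line that opens a SOL session.

    Assumption: the BMC is reachable over the lanplus interface.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host,
            "-U", user, "-P", password, "sol", "activate"]


def run_sol_session(cmd: List[str]) -> int:
    """Run one SOL session until it ends and return its exit status."""
    return subprocess.call(cmd)


def keep_sol_alive(start_session: Callable[[], int],
                   max_reconnects: int = 5,
                   delay: float = 2.0,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Restart the session each time it ends with a non-zero status.

    Returns the number of reconnects performed once a session exits
    cleanly; raises RuntimeError when max_reconnects is exhausted.
    """
    reconnects = 0
    while True:
        status = start_session()
        if status == 0:
            # Clean exit: the caller closed the console on purpose.
            return reconnects
        if reconnects >= max_reconnects:
            raise RuntimeError(
                f"SOL session failed {reconnects + 1} times, giving up")
        reconnects += 1
        # Back off briefly so the management controller can recover.
        sleep(delay)
```

A caller would pass something like `lambda: run_sol_session(sol_command(host, user, password))` as `start_session`. Taking the session starter as a callable keeps the retry loop testable without real hardware.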
History
#1
Updated by coolo about 6 years ago
The keepalive option seems to be harmful: there has not been a single pass in https://openqa.suse.de/tests/latest?arch=x86_64&machine=64bit-ipmi&distri=sle&flavor=Server-DVD&version=12-SP3&test=gnome#previous
#2
Updated by coolo about 6 years ago
The virt tests are okayish, though.
#3
Updated by RBrownSUSE about 6 years ago
Yes, noted. Investigating.
#4
Updated by RBrownSUSE about 6 years ago
- Status changed from New to In Progress
#5
Updated by RBrownSUSE about 6 years ago
- Status changed from In Progress to Resolved
Serial disconnect issues are resolved by https://github.com/os-autoinst/os-autoinst/pull/777
The serial keepalive workaround has been removed, as it is no longer beneficial with auto reconnect in place.
No iKVM issues have been reported since the regular nightly restart of the card, so there is no evidence that the whole management controller needs to be restarted on every job.