action #116473
open
Add OSD PowerPC workers to automatic recovery we already have for ARM workers
Added by mkittler about 2 years ago.
Updated over 1 year ago.
Description
These workers often fail in a similar way to the ARM workers¹, and at this point we should not need to recover them manually so frequently.
Suggestions
- Note that for these workers a power cycle does not always work but power reset seems to work always. So maybe that detail needs to be adjusted for PowerPC workers.
- I suppose all PowerPC workers controllable via IPMI should be considered (see workerconf.sls in salt pillars).
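A rough sketch of how the power action could be chosen per architecture, in case it helps the discussion. This is not the existing recovery code: the architecture names, the mapping, the function name and the host name are all assumptions made for illustration. It also assumes the ARM recovery currently uses a power cycle, which the ticket only implies. The only real pieces are the ipmitool subcommands chassis power cycle and chassis power reset and the -E option, which reads the password from the IPMI_PASSWORD environment variable.

```python
from typing import List

# Hypothetical mapping from worker architecture to IPMI power action.
# Assumption: ARM workers keep using "cycle"; PowerPC workers use "reset",
# since the ticket reports that a power reset seems to work more reliably there.
POWER_ACTION_BY_ARCH = {
    "aarch64": "cycle",
    "ppc64le": "reset",
}


def build_recovery_command(ipmi_host: str, ipmi_user: str, arch: str) -> List[str]:
    """Build (but do not run) an ipmitool command to recover a worker.

    The password is not passed on the command line; "-E" makes ipmitool
    read it from the IPMI_PASSWORD environment variable.
    """
    # Fall back to "cycle" for architectures not listed in the mapping.
    action = POWER_ACTION_BY_ARCH.get(arch, "cycle")
    return [
        "ipmitool", "-I", "lanplus",
        "-H", ipmi_host,
        "-U", ipmi_user,
        "-E",
        "chassis", "power", action,
    ]


if __name__ == "__main__":
    # Hypothetical host and user, for illustration only.
    print(" ".join(build_recovery_command("power-worker-ipmi.example", "admin", "ppc64le")))
```

Keeping the action in a lookup table means a PowerPC-specific adjustment would not need a separate code path, only another entry. The list of hosts would still have to come from workerconf.sls.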
¹ They just randomly crash and logs just end without further clues, e.g. #114565#note-40. In addition, they sometimes also get stuck at boot.
- Target version set to future
Can you provide a little bit more context regarding "failing similarly"? Didn't you also have a bug report and there were suggestions regarding kdump and such?
Can you provide a little bit more context regarding "failing similarly"?
There's not much to say about it. They just randomly crash and the journal doesn't give one any clues; it just ends at some point. In addition, they sometimes also get stuck at boot.
Didn't you also have a bug report and there were suggestions regarding kdump and such?
Yes. I can link the relevant progress ticket for additional context. However, I'm not sure whether we can fix this problem anytime soon.
- Description updated (diff)
- Related to action #114565: recover qa-power8-4+qa-power8-5 size:M added
- Description updated (diff)