action #58346
closed
o3 openqaworker1 and openqaworker4 are completely down on 2019-10-18
Updated by okurz about 5 years ago
- Priority changed from Urgent to Normal
Checked responsiveness of both hosts over IPMI SOL but got no output. Power status is on. Power cycled both machines, and both are up again. Side effect: the only x86_64 workers that were up were imagetester:1 and :2, and they did not seem to be very stable: https://openqa.opensuse.org/tests/1059689#next_previous shows two "random" failures in a row.
Updated by okurz about 5 years ago
- Due date set to 2019-11-03
- Status changed from In Progress to Feedback
- Priority changed from Normal to Low
I will check if this happens again to see what I can do about debugging. I could apply the same monitor+reboot check as done for aarch64.o.o.
openqaworker1 and openqaworker4 were down on 2019-10-19, and possibly once more in the past few days.
Oct 19 03:30:37 openqaworker4 systemd-journald[777]: Journal stopped
-- Reboot --
Oct 20 09:21:15 openqaworker4 kernel: microcode: microcode updated early to revision 0x43, date = 2019-03-01
The boot entry is from after a forced power cycle. I suspect a recent kernel upgrade.
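To put a number on the outage, the gap between the last journal entry and the first boot message can be computed from the two log lines above. This is only a rough sketch: the helper names are made up for illustration, the year is passed in explicitly because the journal's short timestamps omit it, and the result is the time without any journal output, not necessarily the exact time the host was hung.

```python
from datetime import datetime, timedelta


def parse_journal_timestamp(line: str, year: int) -> datetime:
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a short-format journal line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def gap_between(last_before: str, first_after: str, year: int) -> timedelta:
    """Return the time between two journal lines of the same year."""
    return parse_journal_timestamp(first_after, year) - parse_journal_timestamp(
        last_before, year
    )


stopped = "Oct 19 03:30:37 openqaworker4 systemd-journald[777]: Journal stopped"
booted = (
    "Oct 20 09:21:15 openqaworker4 kernel: microcode: microcode updated early "
    "to revision 0x43, date = 2019-03-01"
)

gap = gap_between(stopped, booted, 2019)
print(gap)  # 1 day, 5:50:38
```

So openqaworker4 logged nothing for almost 30 hours before the forced power cycle brought it back.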
Updated by okurz about 5 years ago
- Has duplicate action #58403: openqaworker1 and w4 are repeatedly down added
Updated by okurz about 5 years ago
- Due date deleted (2019-11-03)
- Status changed from Feedback to Resolved
Added recovery to okurz's crontab on lord.arch, same as for aarch64.o.o. Let's see if these trigger at all and how often.
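The actual crontab entry is not recorded in this ticket. As a rough idea of what such a monitor+reboot check could look like, here is a minimal sketch, assuming the workers answer ICMP ping when healthy and that their BMCs accept `ipmitool` over `lanplus`. The function names, the number of attempts, and any BMC host names or credentials are hypothetical and not taken from the real setup.

```python
import subprocess
from typing import Callable, Iterable, List


def is_reachable(host: str, timeout_s: int = 5) -> bool:
    """Return True if a single ICMP ping to the host succeeds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def power_cycle(bmc_host: str, user: str, password: str) -> None:
    """Power cycle a machine through its BMC (assumes ipmitool is installed)."""
    subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-P", password,
         "chassis", "power", "cycle"],
        check=True,
    )


def recover_if_down(
    hosts: Iterable[str],
    reachable: Callable[[str], bool],
    cycle: Callable[[str], None],
    attempts: int = 3,
) -> List[str]:
    """Power cycle every host that fails all attempts; return the cycled hosts."""
    cycled = []
    for host in hosts:
        # A single successful check is enough to consider the host alive.
        if any(reachable(host) for _ in range(attempts)):
            continue
        cycle(host)
        cycled.append(host)
    return cycled
```

Run from cron, `recover_if_down` would be called with `is_reachable` and a small wrapper that maps each worker name to its BMC address and credentials before calling `power_cycle`. Keeping the check and the power action as parameters makes it possible to dry-run the logic without touching real machines, and the returned list can be logged to see whether the recovery triggers at all and how often.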