action #122575
closed: [alert] jenkins host up alert
Description
Observation
https://monitor.qa.suse.de/d/GDjenkins/dashboard-for-jenkins?editPanel=65105&tab=alert&orgId=1 shows the host as not up since about https://monitor.qa.suse.de/d/GDjenkins/dashboard-for-jenkins?editPanel=65105&tab=alert&orgId=1&from=1672539711104&to=1672540596193, which falls in our usual weekly maintenance reboot window early on Sunday morning.
Updated by okurz almost 2 years ago
- Status changed from New to In Progress
- Assignee set to okurz
Updated by okurz almost 2 years ago
I looked into the serial console output using virt-manager and found the instance in recovery mode. After logging in I could not find anything wrong, and a normal boot succeeded just fine, so jenkins.qa.suse.de was back up after manual recovery. I am now running a reboot stability loop.
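The comment above does not say which checks were run after logging in. As a rough sketch of what one could look at on a systemd host that came up in recovery mode: the function names `describe_state` and `check_boot` are made up for illustration, and `journalctl -b -1` assumes a persistent journal, which may not be configured on this machine.

```shell
#!/bin/sh
# Translate the output of `systemctl is-system-running` into a short verdict.
describe_state() {
    case $1 in
        running)     echo "healthy" ;;
        degraded)    echo "booted, but some units failed" ;;
        maintenance) echo "rescue or emergency mode active" ;;
        *)           echo "other state: $1" ;;
    esac
}

# Hypothetical helper: print the overall state, failed units and the
# error-level messages of the previous boot (needs a persistent journal).
check_boot() {
    state=$(systemctl is-system-running 2>/dev/null)
    echo "system state: $(describe_state "$state")"
    systemctl --failed --no-legend
    journalctl -b -1 -p err --no-pager | tail -n 50
}
```

Running `check_boot` as root on the affected host would print the verdict followed by any failed units and recent errors. If nothing shows up, as was apparently the case here, a repeated reboot test is a reasonable next step.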
Updated by okurz almost 2 years ago
- Status changed from In Progress to Resolved
jenkins.qa.suse.de passed the reboot loop test:
$ for run in {01..30}; do for host in jenkins.qa.suse.de; do echo -n "run: $run, $host: ping .. " && timeout -k 5 600 sh -c "until ping -c30 $host >/dev/null; do :; done" && echo -n "ok, ssh .. " && timeout -k 5 600 sh -c "until nc -z -w 1 $host 22; do :; done" && echo -n "ok, uptime/reboot: " && ssh $host "uptime && sudo reboot" && sleep 120 || break; done || break; done
run: 01, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:32:16 up 1:16, 0 users, load average: 0.06, 0.33, 0.23
Connection to jenkins.qa.suse.de closed by remote host.
run: 02, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:33:01 up 0:00, 0 users, load average: 1.10, 0.24, 0.08
Connection to jenkins.qa.suse.de closed by remote host.
run: 03, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:33:41 up 0:00, 0 users, load average: 1.33, 0.29, 0.10
Connection to jenkins.qa.suse.de closed by remote host.
run: 04, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:34:29 up 0:00, 0 users, load average: 1.90, 0.45, 0.15
Connection to jenkins.qa.suse.de closed by remote host.
run: 05, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:35:11 up 0:00, 0 users, load average: 0.88, 0.20, 0.06
Connection to jenkins.qa.suse.de closed by remote host.
run: 06, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:35:54 up 0:00, 0 users, load average: 1.10, 0.24, 0.08
Connection to jenkins.qa.suse.de closed by remote host.
run: 07, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:36:37 up 0:00, 0 users, load average: 0.91, 0.20, 0.06
Connection to jenkins.qa.suse.de closed by remote host.
run: 08, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:37:18 up 0:00, 0 users, load average: 1.10, 0.24, 0.08
Connection to jenkins.qa.suse.de closed by remote host.
run: 09, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:38:00 up 0:00, 0 users, load average: 1.52, 0.34, 0.11
Connection to jenkins.qa.suse.de closed by remote host.
run: 10, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:38:48 up 0:00, 0 users, load average: 0.95, 0.22, 0.07
Connection to jenkins.qa.suse.de closed by remote host.
run: 11, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:39:35 up 0:00, 0 users, load average: 1.02, 0.24, 0.08
run: 12, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:42:05 up 0:02, 0 users, load average: 0.47, 0.39, 0.16
Connection to jenkins.qa.suse.de closed by remote host.
run: 13, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:42:45 up 0:00, 0 users, load average: 1.33, 0.29, 0.10
Connection to jenkins.qa.suse.de closed by remote host.
run: 14, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:43:27 up 0:00, 0 users, load average: 1.36, 0.31, 0.10
Connection to jenkins.qa.suse.de closed by remote host.
run: 15, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:44:14 up 0:00, 0 users, load average: 0.95, 0.22, 0.07
Connection to jenkins.qa.suse.de closed by remote host.
run: 16, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:44:55 up 0:00, 0 users, load average: 0.44, 0.10, 0.03
Connection to jenkins.qa.suse.de closed by remote host.
run: 17, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:45:36 up 0:00, 0 users, load average: 1.47, 0.33, 0.11
Connection to jenkins.qa.suse.de closed by remote host.
run: 18, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:46:19 up 0:00, 0 users, load average: 0.91, 0.20, 0.06
Connection to jenkins.qa.suse.de closed by remote host.
run: 19, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:47:03 up 0:00, 0 users, load average: 2.14, 0.49, 0.16
Connection to jenkins.qa.suse.de closed by remote host.
run: 20, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:47:50 up 0:00, 0 users, load average: 1.43, 0.32, 0.11
Connection to jenkins.qa.suse.de closed by remote host.
run: 21, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:48:35 up 0:00, 0 users, load average: 1.25, 0.28, 0.09
Connection to jenkins.qa.suse.de closed by remote host.
run: 22, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:49:19 up 0:00, 0 users, load average: 1.97, 0.45, 0.15
Connection to jenkins.qa.suse.de closed by remote host.
run: 23, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:50:01 up 0:00, 0 users, load average: 1.34, 0.31, 0.10
Connection to jenkins.qa.suse.de closed by remote host.
run: 24, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:50:46 up 0:00, 0 users, load average: 1.08, 0.26, 0.08
Connection to jenkins.qa.suse.de closed by remote host.
run: 25, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:51:32 up 0:00, 0 users, load average: 1.02, 0.24, 0.08
Connection to jenkins.qa.suse.de closed by remote host.
run: 26, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:52:14 up 0:00, 0 users, load average: 0.96, 0.21, 0.07
Connection to jenkins.qa.suse.de closed by remote host.
run: 27, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:53:00 up 0:00, 0 users, load average: 0.75, 0.18, 0.06
Connection to jenkins.qa.suse.de closed by remote host.
run: 28, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:53:42 up 0:00, 0 users, load average: 1.18, 0.26, 0.09
Connection to jenkins.qa.suse.de closed by remote host.
run: 29, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:54:25 up 0:00, 0 users, load average: 1.82, 0.41, 0.13
Connection to jenkins.qa.suse.de closed by remote host.
run: 30, jenkins.qa.suse.de: ping .. ok, ssh .. ok, uptime/reboot: 11:55:13 up 0:00, 0 users, load average: 1.08, 0.26, 0.08
Connection to jenkins.qa.suse.de closed by remote host.
So the problem cannot be reproduced. https://jenkins.qa.suse.de is up to date and looks clean: I applied updates within the Jenkins system and handled all warnings and error messages shown in the web UI. The package installation state is also up to date and clean. All good now.
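For reuse, the reboot loop from the one-liner above could be split into functions roughly as follows. This is only a sketch, not the command that was actually run: `wait_for` and `reboot_loop` are names invented here, and the sketch deliberately separates the uptime query from the reboot call, on the assumption that `ssh` may exit non-zero when the reboot closes the connection.

```shell
#!/bin/sh
# Retry a command until it succeeds, giving up after <limit> seconds.
wait_for() {
    limit=$1; shift
    timeout -k 5 "$limit" sh -c 'until "$@" >/dev/null 2>&1; do sleep 1; done' sh "$@"
}

# Reboot <host> <runs> times (default 30) and stop at the first run in
# which the host does not answer ping, ssh port 22 or the uptime query.
reboot_loop() {
    host=$1
    runs=${2:-30}
    run=1
    while [ "$run" -le "$runs" ]; do
        printf 'run: %02d, %s: ping .. ' "$run" "$host"
        wait_for 600 ping -c30 "$host" || return 1
        printf 'ok, ssh .. '
        wait_for 600 nc -z -w 1 "$host" 22 || return 1
        printf 'ok, uptime/reboot: '
        ssh "$host" uptime || return 1
        # The connection is usually cut by the reboot itself, so the
        # exit status of this call is ignored on purpose.
        ssh "$host" 'sudo reboot' || true
        sleep 120
        run=$((run + 1))
    done
}
```

Sourcing the file and calling `reboot_loop jenkins.qa.suse.de` would then run 30 iterations. The fixed `sleep 120` is kept from the original command, and unlike there it runs after every reboot, so a full pass takes noticeably longer than the log above.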