action #105621
[Alerting] Failed systemd services alert (closed)
Added by okurz almost 3 years ago. Updated almost 3 years ago.
Description
Observation
https://stats.openqa-monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services currently shows:
2022-01-27 06:23:00 openqaworker-arm-1 openqa-worker-cacheservice-minion
2022-01-26 15:19:01 openqa systemd-journal-flush
Updated by okurz almost 3 years ago
- Copied from action #105618: [Alerting] CPU Load alert size:S added
Updated by mkittler almost 3 years ago
- Status changed from New to Feedback
The cache service logs don't show what the problem was, just that the main process exited:
Jan 27 05:17:23 openqaworker-arm-1 openqa-worker-cacheservice-minion[25950]: [25950] [i] [#50] Downloading "SLE-15-SP3-Full-aarch64-GM-Media1.iso" from "http://openqa.suse.de/tests/8039274/asset/iso/SLE-15-SP3-Full-aarch64-GM-Media1.is>
Jan 27 05:17:29 openqaworker-arm-1 openqa-worker-cacheservice-minion[26636]: [26636] [i] [#51] Downloading: "SLES-15-SP3-aarch64-mru-install-minimal-with-addons-Build20220127-1-Server-DVD-Updates-aarch64-virtio-uefi-vars.qcow2"
Jan 27 05:17:29 openqaworker-arm-1 openqa-worker-cacheservice-minion[26636]: [26636] [i] [#51] Cache size of "/var/lib/openqa/cache" is 49 GiB, with limit 50 GiB
Jan 27 05:17:29 openqaworker-arm-1 openqa-worker-cacheservice-minion[26636]: [26636] [i] [#51] Downloading "SLES-15-SP3-aarch64-mru-install-minimal-with-addons-Build20220127-1-Server-DVD-Updates-aarch64-virtio-uefi-vars.qcow2" from "ht>
Jan 27 05:17:39 openqaworker-arm-1 openqa-worker-cacheservice-minion[26677]: [26677] [i] [#52] Sync: "rsync://openqa.suse.de/tests" to "/var/lib/openqa/cache/openqa.suse.de"
Jan 27 05:17:39 openqaworker-arm-1 openqa-worker-cacheservice-minion[26677]: [26677] [i] [#52] Calling: rsync -avHP --timeout 1800 rsync://openqa.suse.de/tests/ --delete /var/lib/openqa/cache/openqa.suse.de/tests/
Jan 27 05:22:55 openqaworker-arm-1 systemd[1]: Stopping OpenQA Worker Cache Service Minion...
Jan 27 05:22:59 openqaworker-arm-1 openqa-worker-cacheservice-minion[2378]: [2378] [i] Worker 2378 stopped
Jan 27 05:23:00 openqaworker-arm-1 systemd[1]: openqa-worker-cacheservice-minion.service: Main process exited, code=exited, status=192/n/a
Jan 27 05:23:00 openqaworker-arm-1 systemd[1]: openqa-worker-cacheservice-minion.service: Failed with result 'exit-code'.
Jan 27 05:23:00 openqaworker-arm-1 systemd[1]: Stopped OpenQA Worker Cache Service Minion.
Jan 27 05:23:00 openqaworker-arm-1 systemd[1]: Started OpenQA Worker Cache Service Minion.
Jan 27 05:29:45 openqaworker-arm-1 openqa-worker-cacheservice-minion[28656]: [28656] [i] Cache size of "/var/lib/openqa/cache" is 49 GiB, with limit 50 GiB
Jan 27 05:29:45 openqaworker-arm-1 openqa-worker-cacheservice-minion[28656]: [28656] [i] Resetting all leftover locks after restart
Jan 27 05:29:45 openqaworker-arm-1 openqa-worker-cacheservice-minion[28656]: [28656] [i] Worker 28656 started
Jan 27 05:29:45 openqaworker-arm-1 openqa-worker-cacheservice-minion[37710]: [37710] [i] [#53] Downloading: "SLES-15-SP3-aarch64-mru-install-minimal-with-addons-Build20220127-1-Server-DVD-Updates-aarch64-virtio.qcow2"
Jan 27 05:29:50 openqaworker-arm-1 openqa-worker-cacheservice-minion[37749]: [37749] [i] [#54] Downloading: "SLE-15-SP3-Full-aarch64-GM-Media1.iso"
Jan 27 05:30:01 openqaworker-arm-1 openqa-worker-cacheservice-minion[37804]: [37804] [i] [#55] Downloading: "SLES-15-SP3-aarch64-mru-install-minimal-with-addons-Build20220127-1-Server-DVD-Updates-aarch64-virtio-uefi-vars.qcow2"
Jan 27 05:30:11 openqaworker-arm-1 openqa-worker-cacheservice-minion[37894]: [37894] [i] [#56] Sync: "rsync://openqa.suse.de/tests" to "/var/lib/openqa/cache/openqa.suse.de"
Jan 27 05:30:11 openqaworker-arm-1 openqa-worker-cacheservice-minion[37894]: [37894] [i] [#56] Calling: rsync -avHP --timeout 1800 rsync://openqa.suse.de/tests/ --delete /var/lib/openqa/cache/openqa.suse.de/tests/
Jan 27 05:34:23 openqaworker-arm-1 openqa-worker-cacheservice-minion[44451]: [44451] [i] [#57] Downloading: "SLES-12-SP5-aarch64-mru-install-minimal-with-addons-Build20220127-1-Server-DVD-Updates-aarch64-virtio.qcow2"
Jan 27 05:34:43 openqaworker-arm-1 openqa-worker-cacheservice-minion[44451]: [44451] [i] [#57] Cache size of "/var/lib/openqa/cache" is 49 GiB, with limit 50 GiB
Jan 27 05:34:43 openqaworker-arm-1 openqa-worker-cacheservice-minion[44451]: [44451] [i] [#57] Downloading "SLES-12-SP5-aarch64-mru-install-minimal-with-addons-Build20220127-1-Server-DVD-Updates-aarch64-virtio.qcow2" from "http://openq>
Jan 27 05:34:48 openqaworker-arm-1 openqa-worker-cacheservice-minion[44538]: [44538] [i] [#58] Downloading: "SLE-12-SP5-Server-DVD-aarch64-GM-DVD1.iso"
Jan 27 05:35:45 openqaworker-arm-1 openqa-worker-cacheservice-minion[44538]: [44538] [i] [#58] Cache size of "/var/lib/openqa/cache" is 47 GiB, with limit 50 GiB
Jan 27 05:35:45 openqaworker-arm-1 openqa-worker-cacheservice-minion[44538]: [44538] [i] [#58] Downloading "SLE-12-SP5-Server-DVD-aarch64-GM-DVD1.iso" from "http://openqa.suse.de/tests/8039492/asset/iso/SLE-12-SP5-Server-DVD-aarch64-GM>
Jan 27 05:35:49 openqaworker-arm-1 openqa-worker-cacheservice-minion[44821]: [44821] [i] [#59] Downloading: "SLES-12-SP5-aarch64-mru-install-minimal-with-addons-Build20220127-1-Server-DVD-Updates-aarch64-virtio-uefi-vars.qcow2"
Jan 27 05:35:50 openqaworker-arm-1 openqa-worker-cacheservice-minion[44821]: [44821] [i] [#59] Cache size of "/var/lib/openqa/cache" is 46 GiB, with limit 50 GiB
Jan 27 05:35:50 openqaworker-arm-1 openqa-worker-cacheservice-minion[44821]: [44821] [i] [#59] Downloading "SLES-12-SP5-aarch64-mru-install-minimal-with-addons-Build20220127-1-Server-DVD-Updates-aarch64-virtio-uefi-vars.qcow2" from "ht>
Jan 27 05:35:55 openqaworker-arm-1 openqa-worker-cacheservice-minion[44832]: [44832] [i] [#60] Sync: "rsync://openqa.suse.de/tests" to "/var/lib/openqa/cache/openqa.suse.de"
Jan 27 05:35:55 openqaworker-arm-1 openqa-worker-cacheservice-minion[44832]: [44832] [i] [#60] Calling: rsync -avHP --timeout 1800 rsync://openqa.suse.de/tests/ --delete /var/lib/openqa/cache/openqa.suse.de/tests/
Jan 27 05:47:56 openqaworker-arm-1 openqa-worker-cacheservice-minion[48557]: [48557] [i] [#61] Downloading: "SLE-15-SP4-Online-aarch64-Build88.4-Media1.iso.sha256"
Jan 27 05:47:57 openqaworker-arm-1 openqa-worker-cacheservice-minion[48557]: [48557] [i] [#61] Cache size of "/var/lib/openqa/cache" is 46 GiB, with limit 50 GiB
Jan 27 05:47:57 openqaworker-arm-1 openqa-worker-cacheservice-minion[48557]: [48557] [i] [#61] Downloading "SLE-15-SP4-Online-aarch64-Build88.4-Media1.iso.sha256" from "http://openqa.suse.de/tests/8039541/asset/other/SLE-15-SP4-Online->
Jan 27 05:48:02 openqaworker-arm-1 openqa-worker-cacheservice-minion[48599]: [48599] [i] [#62] Downloading: "SLES-15-SP4-aarch64-Build88.4@aarch64-gnome.qcow2"
Jan 27 05:48:31 openqaworker-arm-1 openqa-worker-cacheservice-minion[48599]: [48599] [i] [#62] Cache size of "/var/lib/openqa/cache" is 46 GiB, with limit 50 GiB
Jan 27 05:48:31 openqaworker-arm-1 openqa-worker-cacheservice-minion[48599]: [48599] [i] [#62] Downloading "SLES-15-SP4-aarch64-Build88.4@aarch64-gnome.qcow2" from "http://openqa.suse.de/tests/8039541/asset/hdd/SLES-15-SP4-aarch64-Buil>
Jan 27 05:48:32 openqaworker-arm-1 openqa-worker-cacheservice-minion[48679]: [48679] [i] [#63] Downloading: "SLE-15-SP4-Online-aarch64-Build88.4-Media1.iso"
Jan 27 05:48:39 openqaworker-arm-1 openqa-worker-cacheservice-minion[48679]: [48679] [i] [#63] Cache size of "/var/lib/openqa/cache" is 48 GiB, with limit 50 GiB
Jan 27 05:48:39 openqaworker-arm-1 openqa-worker-cacheservice-minion[48679]: [48679] [i] [#63] Downloading "SLE-15-SP4-Online-aarch64-Build88.4-Media1.iso" from "http://openqa.suse.de/tests/8039541/asset/iso/SLE-15-SP4-Online-aarch64-B>
Jan 27 05:48:43 openqaworker-arm-1 openqa-worker-cacheservice-minion[48707]: [48707] [i] [#64] Downloading: "SLES-15-SP4-aarch64-Build88.4@aarch64-gnome-uefi-vars.qcow2"
Jan 27 05:48:53 openqaworker-arm-1 openqa-worker-cacheservice-minion[48759]: [48759] [i] [#65] Sync: "rsync://openqa.suse.de/tests" to "/var/lib/openqa/cache/openqa.suse.de"
Jan 27 05:48:53 openqaworker-arm-1 openqa-worker-cacheservice-minion[48759]: [48759] [i] [#65] Calling: rsync -avHP --timeout 1800 rsync://openqa.suse.de/tests/ --delete /var/lib/openqa/cache/openqa.suse.de/tests/
Jan 27 05:52:36 openqaworker-arm-1 openqa-worker-cacheservice-minion[1457]: [1457] [i] [#66] Downloading: "SLE-15-SP4-Online-aarch64-Build43.1-Media1.iso.sha256"
Jan 27 05:52:36 openqaworker-arm-1 openqa-worker-cacheservice-minion[1457]: [1457] [i] [#66] Cache size of "/var/lib/openqa/cache" is 48 GiB, with limit 50 GiB
Jan 27 05:52:36 openqaworker-arm-1 openqa-worker-cacheservice-minion[1457]: [1457] [i] [#66] Downloading "SLE-15-SP4-Online-aarch64-Build43.1-Media1.iso.sha256" from "http://openqa.suse.de/tests/8036768/asset/other/SLE-15-SP4-Online-aa>
Jan 27 05:52:43 openqaworker-arm-1 openqa-worker-cacheservice-minion[1470]: [1470] [i] [#67] Downloading: "SLE-15-SP4-Online-aarch64-Build43.1-Media1.iso.sha256"
Jan 27 05:52:46 openqaworker-arm-1 openqa-worker-cacheservice-minion[1481]: [1481] [i] [#68] Downloading: "sle-15-SP4-aarch64-43.1-textmode@aarch64.qcow2"
Jan 27 05:52:49 openqaworker-arm-1 openqa-worker-cacheservice-minion[1484]: [1484] [i] [#69] Downloading: "sle-15-SP4-aarch64-43.1-textmode@aarch64.qcow2"
-- Reboot --
Jan 27 07:23:36 openqaworker-arm-1 systemd[1]: Started OpenQA Worker Cache Service Minion.
Jan 27 07:23:46 openqaworker-arm-1 openqa-worker-cacheservice-minion[2323]: [2323] [i] Creating cache directory tree for "/var/lib/openqa/cache"
Jan 27 07:23:46 openqaworker-arm-1 systemd[1]: Stopping OpenQA Worker Cache Service Minion...
Jan 27 07:23:46 openqaworker-arm-1 systemd[1]: openqa-worker-cacheservice-minion.service: Succeeded.
Jan 27 07:23:46 openqaworker-arm-1 systemd[1]: Stopped OpenQA Worker Cache Service Minion.
Jan 27 07:23:46 openqaworker-arm-1 systemd[1]: Started OpenQA Worker Cache Service Minion.
Jan 27 07:26:17 openqaworker-arm-1 openqa-worker-cacheservice-minion[2500]: [2500] [i] Cache size of "/var/lib/openqa/cache" is 0 Byte, with limit 50 GiB
Jan 27 07:26:17 openqaworker-arm-1 openqa-worker-cacheservice-minion[2500]: [2500] [i] Resetting all leftover locks after restart
Jan 27 07:26:17 openqaworker-arm-1 openqa-worker-cacheservice-minion[2500]: [2500] [i] Worker 2500 started
There are no coredumps present. The machine had not been up long before the failure and rebooted shortly afterwards (likely after a crash), so the relevant logs before and after the failure are short. Still, I couldn't find any clues in them.
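For reference, the one line in the excerpt above that records the failure is the systemd "Main process exited" message. A minimal sketch for pulling such lines out of a longer journal dump follows; the helper name `extract_main_exits` is made up for illustration and is not part of openQA or systemd, and the pattern only assumes the message format seen in the log above.

```shell
# Hypothetical helper: read journal text on stdin and print
# "<unit> <code> <status>" for every "Main process exited" line.
extract_main_exits() {
  sed -n 's/.*systemd\[[0-9]*]: \([^ ]*\)\.service: Main process exited, code=\([^,]*\), status=\(.*\)$/\1 \2 \3/p'
}

# Example using the failure line from the excerpt above.
printf '%s\n' 'Jan 27 05:23:00 openqaworker-arm-1 systemd[1]: openqa-worker-cacheservice-minion.service: Main process exited, code=exited, status=192/n/a' \
  | extract_main_exits
# prints: openqa-worker-cacheservice-minion exited 192/n/a
```

On the affected host the same filter could be fed from `journalctl -u openqa-worker-cacheservice-minion`, and `coredumpctl list` is one way to check whether any coredumps were recorded.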
The logs for the journal service on the web UI host aren't much more helpful:
martchus@openqa:~> sudo journalctl -fu systemd-journal-flush
-- Logs begin at Sun 2022-01-23 02:46:40 CET. --
Jan 24 12:44:22 openqa systemd[1]: Stopped Flush Journal to Persistent Storage.
Jan 24 12:44:23 openqa systemd[1]: Starting Flush Journal to Persistent Storage...
Jan 24 12:45:05 openqa systemd[1]: Finished Flush Journal to Persistent Storage.
Jan 24 13:08:06 openqa systemd[1]: Stopping Flush Journal to Persistent Storage...
Jan 24 13:08:06 openqa systemd[1]: systemd-journal-flush.service: Succeeded.
Jan 24 13:08:06 openqa systemd[1]: Stopped Flush Journal to Persistent Storage.
Jan 24 13:08:06 openqa systemd[1]: Starting Flush Journal to Persistent Storage...
Jan 24 13:08:07 openqa systemd[1]: Finished Flush Journal to Persistent Storage.
Jan 26 15:19:02 openqa systemd[1]: Starting Flush Journal to Persistent Storage...
Jan 26 15:19:02 openqa systemd[1]: Finished Flush Journal to Persistent Storage.
The journal on OSD is generally working, so whatever the problem was, it hasn't had much impact.
Updated by mkittler almost 3 years ago
- Status changed from Feedback to Resolved
I'm resolving this due to lack of information and the low impact. These are actually two distinct issues; if either of them happens more often, we can still try to investigate further.
Note that the journal service failure might have something to do with our recent manual tampering with the journal service when reacting to the file system alert.