action #175836

closed

[alert][FIRING:1] (Broken workers alert Salt dZ025mf4z) due to osd reboot, many broken workers size:S

Added by okurz about 1 month ago. Updated about 1 month ago.

Status: Resolved
Priority: High
Category: Regressions/Crashes
Start date: 2025-01-20
Due date:
% Done: 0%
Estimated time:

Description

Observation

https://monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&viewPanel=panel-96&from=2025-01-15T02:00:38.153Z&to=2025-01-15T20:10:14.812Z&var-host_disks=$__all
and
https://monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&viewPanel=panel-96&from=2025-01-18T20:48:41.457Z&to=2025-01-19T14:58:18.116Z&var-host_disks=$__all

Both occurrences happened during the weekly maintenance window, in which OSD is allowed to reboot. Planned automatic reboots should not trigger alerts, and a reboot should not lead to about 350 workers being reported as broken for an hour.
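
One way to picture the expectation that planned reboots stay quiet is a check that suppresses the alert while a known maintenance window is active. The following is only a rough sketch: the window (Sundays 03:00 to 05:00 UTC), the threshold, and the function names are hypothetical and are not taken from the OSD setup or its alerting configuration.

```python
from datetime import datetime, time, timezone

# Hypothetical window: Sundays 03:00-05:00 UTC. The real OSD reboot
# schedule is not stated in this ticket.
MAINTENANCE_WEEKDAY = 6  # datetime.weekday(): Monday is 0, Sunday is 6
MAINTENANCE_START = time(3, 0)
MAINTENANCE_END = time(5, 0)


def in_maintenance_window(ts: datetime) -> bool:
    """Return True if the timezone-aware ts falls inside the assumed window."""
    ts = ts.astimezone(timezone.utc)
    return (
        ts.weekday() == MAINTENANCE_WEEKDAY
        and MAINTENANCE_START <= ts.time() < MAINTENANCE_END
    )


def should_alert(broken_workers: int, threshold: int, ts: datetime) -> bool:
    """Alert only when the threshold is exceeded outside the assumed window."""
    return broken_workers > threshold and not in_maintenance_window(ts)
```

With these assumed values, 350 broken workers on a Sunday at 04:00 UTC would not raise an alert, while the same count on a Wednesday would.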

Suggestions

  • Silence and reference #162296
  • Block on #162296
  • Remove silence and ensure we are back to good during weekly OSD reboots

Related issues 1 (0 open, 1 closed)

Blocked by openQA Project (public) - action #162296: openQA workers crash with Linux 6.4 after upgrade openSUSE Leap 15.6 size:S - Resolved - dheidler - 2024-06-14
