action #182021
coordination #161414 (closed): [epic] Improved salt based infrastructure management
[alert] web UI: Too many Minion job failures alert
Status: Resolved
Priority: Urgent
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2025-01-23
Due date:
% Done: 0%
Estimated time:
Tags:
Description
Observation
Date: Thu, 08 May 2025
https://monitor.qa.suse.de/alerting/grafana/liA25iB4k/view?orgId=1
Looking at https://openqa.suse.de/minion/jobs?state=failed, most failed jobs seem to be obs_rsync jobs.
---
args:
- project: SUSE:SLFO:1.1:Staging:S
  url: https://api.suse.de/build/SUSE:SLFO:1.1:Staging:S/_result?package=000product
attempts: 1
children: []
created: 2025-05-06T11:51:52.133879Z
delayed: 2025-05-06T11:51:52.133879Z
expires: ~
finished: 2025-05-06T11:51:52.605078Z
id: 15461168
lax: 0
notes:
  gru_id: 40709522
  project_lock: 1
parents: []
priority: 100
queue: default
result:
  code: 256
  message: read_files.sh failed for SUSE:SLFO:1.1:Staging:S in enviroment SUSE:SLFO:1.1:Staging:S
retried: ~
retries: 0
started: 2025-05-06T11:51:52.141177Z
state: failed
task: obs_rsync_run
time: 2025-05-08T10:10:45.251805Z
worker: 1993
However, more recent failures look like this:
---
args:
- project: SUSE:SLFO:1.1:Staging:L
  url: https://api.suse.de/build/SUSE:SLFO:1.1:Staging:L/_result?package=000product
attempts: 1
children: []
created: 2025-05-08T06:29:49.087313Z
delayed: 2025-05-08T06:29:49.087313Z
expires: ~
finished: 2025-05-08T06:29:54.475994Z
id: 15500317
lax: 0
notes:
  gru_id: 40744546
  project_lock: 1
parents: []
priority: 100
queue: default
result:
  code: 256
  message: read_files.sh failed for SUSE:SLFO:1.1:Staging:L in enviroment SUSE:SLFO:1.1:Staging:L
retried: ~
retries: 0
started: 2025-05-08T06:29:49.089502Z
state: failed
task: obs_rsync_run
time: 2025-05-08T10:08:47.877942Z
worker: 1995
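To see which projects account for most of the failures, records like the two above could be tallied by task and project. The following is only a minimal sketch: it assumes the job records have already been loaded into Python dictionaries with the same keys as the dumps above (how they are exported from Minion is not covered here), and the helper name `tally_failures` is hypothetical. The message pattern is copied from the job results above, including the "enviroment" spelling.

```python
import re
from collections import Counter

# Pattern taken from the "message" field of the failed jobs above.
# The spelling "enviroment" is intentional: it matches the actual message.
MESSAGE_RE = re.compile(
    r"^read_files\.sh failed for (?P<project>\S+) in enviroment (?P<env>\S+)$"
)


def tally_failures(jobs):
    """Count failed jobs per (task, project).

    Hypothetical helper: messages that do not match the known pattern are
    counted under their full text so that separate issues remain visible.
    """
    counts = Counter()
    for job in jobs:
        if job.get("state") != "failed":
            continue
        result = job.get("result")
        message = result.get("message", "") if isinstance(result, dict) else str(result)
        match = MESSAGE_RE.match(message)
        key = match.group("project") if match else message
        counts[(job.get("task"), key)] += 1
    return counts


# The two records from this ticket, abridged to the fields used here.
jobs = [
    {
        "id": 15461168,
        "task": "obs_rsync_run",
        "state": "failed",
        "result": {
            "code": 256,
            "message": "read_files.sh failed for SUSE:SLFO:1.1:Staging:S "
                       "in enviroment SUSE:SLFO:1.1:Staging:S",
        },
    },
    {
        "id": 15500317,
        "task": "obs_rsync_run",
        "state": "failed",
        "result": {
            "code": 256,
            "message": "read_files.sh failed for SUSE:SLFO:1.1:Staging:L "
                       "in enviroment SUSE:SLFO:1.1:Staging:L",
        },
    },
]

for (task, key), count in tally_failures(jobs).most_common():
    print(f"{count:4d}  {task}  {key}")
```

Grouping by project rather than by raw message keeps one row per staging project, while any message in a different format still shows up as its own row and can be split off into a separate ticket.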
Suggestions
- Confirm whether this is a (temporary) network connectivity issue or a case of deleted repos that are still being picked up by OBS sync
- Look into ignoring the related failures or adjusting the repo configs
- Or ask nicely whether one of the maintainers would care to fix the config
- Also check older failed Minion jobs; consider filing new tickets if there are separate issues there
- Look at the schedules configured in openqa-trigger-from-ibs-plugin
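The first suggestion, telling a connectivity problem apart from a deleted repo, could be approached by probing the `url` from a failed job's arguments and classifying the outcome. This is a rough sketch under several assumptions: that a missing project shows up as HTTP 404, that api.suse.de is reachable from where the script runs, and that authentication may be required (in which case the probe only reports an auth failure). The functions `classify` and `probe` are hypothetical names and are not part of openQA.

```python
import urllib.error
import urllib.request


def classify(status=None, error=None):
    """Map a probe outcome to a coarse category (assumed mapping, not verified)."""
    if error is not None:
        return "connectivity"   # no HTTP response at all: DNS, timeout, refused
    if status == 404:
        return "missing"        # assumption: project or package no longer exists
    if status in (401, 403):
        return "auth"           # credentials needed; nothing can be concluded
    if status is not None and 200 <= status < 300:
        return "ok"
    return "other"


def probe(url, timeout=10):
    """Request the URL once and classify the result."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify(status=response.status)
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return classify(status=exc.code)
    except OSError as exc:
        # URLError and socket timeouts are OSError subclasses.
        return classify(error=exc)
```

A call such as `probe("https://api.suse.de/build/SUSE:SLFO:1.1:Staging:L/_result?package=000product")` would then return one of the categories above. A repeated "missing" result would point towards a stale repo configuration, while intermittent "connectivity" results would point towards a temporary network issue.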
Updated by gpathak 27 days ago
- Copied from action #176013: [alert] web UI: Too many Minion job failures alert size:S added
Updated by tinita 27 days ago
- Related to action #179038: Gracious handling of longer remote git clones outages size:S added
Updated by tinita 27 days ago
- Related to action #180962: Many minion failures related to obs_rsync_run due to SLFO submissions size:S added