action #92110
closed
Several Job age (scheduled) alerts on Sunday
Description
Grafana osd-admins@suse.de [OK] Job age (scheduled) (max) alert
[OK] Job age (scheduled) (max) alert Jobs not scheduled for 4 days (345600s). Possible reasons: * There are no online workers for selected scheduled jobs, misconfiguration on the side of tests likely See https://progress.opensuse.org/issues/73174#note-2 for an explanation of the selection of th…
Grafana osd-admins@suse.de [OK] Job age (scheduled) (median) alert
[OK] Job age (scheduled) (median) alert Check for overall decrease of "time to start". Possible reasons for regression: * Not enough resources * Too many tests scheduled due to misconfiguration 2020-11-27: Alert limit set to 259200s = 3d, see https://progress.opensuse.org/issues/73174#note-2 a…
Grafana osd-admins@suse.de [Alerting] Job age (scheduled) (max) alert
[Alerting] Job age (scheduled) (max) alert Jobs not scheduled for 4 days (345600s). Possible reasons: * There are no online workers for selected scheduled jobs, misconfiguration on the side of tests likely See https://progress.opensuse.org/issues/73174#note-2 for an explanation of the selection…
Grafana osd-admins@suse.de [Alerting] Job age (scheduled) (median) alert
[Alerting] Job age (scheduled) (median) alert Check for overall decrease of "time to start". Possible reasons for regression: * Not enough resources * Too many tests scheduled due to misconfiguration 2020-11-27: Alert limit set to 259200s = 3d, see https://progress.opensuse.org/issues/73174#no…
Grafana osd-admins@suse.de [OK] Job age (scheduled) (median) alert
[OK] Job age (scheduled) (median) alert Check for overall decrease of "time to start". Possible reasons for regression: * Not enough resources * Too many tests scheduled due to misconfiguration 2020-11-27: Alert limit set to 259200s = 3d, see https://progress.opensuse.org/issues/73174#note-2 a…
Grafana osd-admins@suse.de [Alerting] Job age (scheduled) (median) alert
[Alerting] Job age (scheduled) (median) alert Check for overall decrease of "time to start". Possible reasons for regression: * Not enough resources * Too many tests scheduled due to misconfiguration 2020-11-27: Alert limit set to 259200s = 3d, see https://progress.opensuse.org/issues/73174#no…
Updated by okurz over 3 years ago
- Project changed from openQA Project (public) to openQA Infrastructure (public)
- Due date set to 2021-05-18
- Status changed from New to Feedback
- Assignee set to mkittler
- Priority changed from Normal to High
- Target version set to Ready
@mkittler you already did something for these cases, related to pc_gce, was it?
Updated by mkittler over 3 years ago
I did nothing besides informing users. I suppose some of the tests did get cancelled. However, there are still a few 7-day-old scheduled jobs. Apparently not enough to trigger the job age alerts, though.
Updated by okurz over 3 years ago
- Status changed from Feedback to New
- Assignee deleted (mkittler)
ok, unassigning you again then.
Updated by okurz over 3 years ago
- Status changed from New to In Progress
- Assignee set to okurz
Updated by okurz over 3 years ago
- Status changed from In Progress to Resolved
https://openqa.suse.de/tests/ currently shows 4 jobs that have been scheduled for 8 days, e.g. https://openqa.suse.de/tests/5918513 with worker class "WORKER_CLASS=qemu_x86_64,pc_gce", which is no longer fulfilled by any worker instance after my MR https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/308 (which I worked on for #91458). I checked the job templates in the job group https://openqa.suse.de/admin/job_templates/275 and it looks like someone has fixed that problem in the meantime. So this explains the alerts. I cancelled all four remaining jobs, pointed to the current ticket in the openQA jobs and pinged @jlausuch in https://chat.suse.de/channel/testing . Currently no alert is failing.
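For illustration, a minimal Python sketch of the mismatch described above. It assumes that a job can only be picked up by a worker that offers every class listed in the job's WORKER_CLASS. The function name `unsatisfiable_jobs`, the worker names, their class sets and the second job ID are hypothetical. Only job 5918513 and its worker class string come from this ticket, and none of this is openQA's actual scheduler code.

```python
def unsatisfiable_jobs(jobs, workers):
    """Return IDs of jobs whose WORKER_CLASS no worker can fulfil.

    jobs:    dict mapping job ID -> comma-separated WORKER_CLASS string
    workers: dict mapping worker name -> set of classes that worker offers
    Assumption: a worker must offer *all* classes a job asks for.
    """
    stuck = []
    for job_id, worker_class in jobs.items():
        # Split "qemu_x86_64,pc_gce" into {"qemu_x86_64", "pc_gce"}
        required = {c.strip() for c in worker_class.split(",") if c.strip()}
        # The job is schedulable if any worker offers a superset of the required classes
        if not any(required <= offered for offered in workers.values()):
            stuck.append(job_id)
    return stuck


# Hypothetical snapshot: no worker offers "pc_gce" anymore
jobs = {
    5918513: "qemu_x86_64,pc_gce",  # worker class taken from this ticket
    1: "qemu_x86_64",               # hypothetical job that can still run
}
workers = {
    "worker1:1": {"qemu_x86_64", "tap"},  # hypothetical worker instances
    "worker2:1": {"qemu_x86_64"},
}

print(unsatisfiable_jobs(jobs, workers))  # [5918513]
```

Jobs reported by such a check would stay in the scheduled state indefinitely and eventually push the maximum job age past the alert limit, which matches what happened here.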