action #63451
closed
coordination #102915: [saga][epic] Automated classification of failures
coordination #166655: [epic] openqa-label-known-issues
Improve openqa-monitor-incompletes and openqa-label-known-issues to not report about incompletes with clone
Added by okurz almost 5 years ago.
Updated 2 months ago.
Category: Feature requests
Description
Observation
On Thursday, 13 February 2020 22.35.33 CET Grafana wrote:
[Alerting] New incompletes alert
Metric name: New incompletes
Value: 27.000
Keep in mind: I bumped the alert threshold so that the alert only fires if 25 new incompletes occur within one reporting period, which is just 10 seconds (!). I checked 2 of the many jobs and found the same reason in both:
"Reason: associated worker re-connected but abandoned the job"
The good thing is that they have all been automatically cloned.
Suggestions
I guess we need just one more change to the scripts in https://github.com/os-autoinst/scripts/ :
- ignore incompletes that have a clone
Who wants to give it a shot? :)
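A minimal sketch of the suggested filter, not the actual change to the scripts (which are shell scripts): it assumes each job is available as a dictionary shaped like an openQA job JSON object with "id", "result" and "clone_id" keys, where "clone_id" is empty when the job has no clone. The function name and the sample data are hypothetical.

```python
from typing import Any, Dict, Iterable, List

Job = Dict[str, Any]


def incompletes_without_clone(jobs: Iterable[Job]) -> List[Job]:
    """Return only the incomplete jobs that have not been cloned.

    Assumption: a job that was cloned carries a non-empty "clone_id";
    a missing key or a None value means there is no clone.
    """
    return [
        job
        for job in jobs
        if job.get("result") == "incomplete" and job.get("clone_id") is None
    ]


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real openQA instance.
    sample = [
        {"id": 101, "result": "incomplete", "clone_id": 201},   # already cloned
        {"id": 102, "result": "incomplete", "clone_id": None},  # needs attention
        {"id": 103, "result": "passed", "clone_id": None},      # not an incomplete
    ]
    print([job["id"] for job in incompletes_without_clone(sample)])
```

With such a filter in front of the reporting step, only incompletes that nobody (human or automation) has retriggered yet would count towards the alert.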
- Target version set to Ready
- Status changed from New to Workable
- Related to action #69178: workaround for #64776 using https://github.com/os-autoinst/scripts/blob/master/openqa-label-known-issues added
- Subject changed from Improve openqa-monitor-incompletes and openqa-label-known-issues to not report about incompletes with clone / no complain about no logs when there is "reason" to Improve openqa-monitor-incompletes and openqa-label-known-issues to not report about incompletes with clone
- Description updated (diff)
- Status changed from Workable to Rejected
- Assignee set to okurz
I am thinking about the following use case: a human reviewer grumpily retriggers an incomplete every day, so there is a clone in this case, but that person never reports the issue, so we would not catch these problems. Rather than skipping the incompletes in openqa-monitor-incompletes, we could skip trying to clone the incompletes in openqa-label-known-issues, but the API call is a no-op anyway, so let's just not care.
- Status changed from Rejected to Resolved
- Assignee changed from okurz to livdywan
Actually, cdywan and I discussed this and only then realized what the proper use case is, so the honor goes to him.
- Parent task set to #166655