action #162323
open
no alert about multi-machine test failures 2024-06-14+
Start date: 2024-06-15
Due date:
% Done: 0%
Estimated time:
Tags:
Description
Observation
https://monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?orgId=1&viewPanel=24&from=1718145766820&to=1718474617885 shows a significantly (too) high number of failed+parallel_failed jobs, but no alert was triggered. We should make our alerts trigger in such situations.
Updated by okurz about 1 month ago
- Copied from action #162320: multi-machine test failures 2024-06-14+, auto_review:"ping with packet size 100 failed.*can be GRE tunnel setup issue":retry added
Updated by okurz about 1 month ago
- Status changed from New to Rejected
- Assignee set to okurz
- Priority changed from High to Normal
Never mind. I guess with the failing https://gitlab.suse.de/openqa/scripts-ci/-/pipelines we are still informed well enough?
Updated by okurz 30 days ago
- Tags set to infra, monitoring, alert, multi-machine
- Status changed from Rejected to New
- Assignee deleted (okurz)
No, scripts-ci tests cannot uncover all problems, as they might not run certain worker combinations. I think an alert in Grafana would be helpful and workable for us.
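To illustrate the kind of condition such an alert could check, here is a minimal Python sketch. It is not the Grafana alert rule itself, which would be defined on the panel's query. The function names, the input shape (job counts keyed by result) and the threshold value are all assumptions made for illustration; the ticket does not state a threshold.

```python
from typing import Mapping

# Hypothetical threshold: the ticket does not specify one.
FAILED_RATIO_THRESHOLD = 0.3


def failed_ratio(counts: Mapping[str, int]) -> float:
    """Return the share of jobs whose result is failed or parallel_failed."""
    total = sum(counts.values())
    if total == 0:
        # No jobs in the window: nothing to alert on.
        return 0.0
    bad = counts.get("failed", 0) + counts.get("parallel_failed", 0)
    return bad / total


def should_alert(counts: Mapping[str, int],
                 threshold: float = FAILED_RATIO_THRESHOLD) -> bool:
    """True when the combined failure share exceeds the threshold."""
    return failed_ratio(counts) > threshold
```

Summing failed and parallel_failed mirrors the observation in the description, where both results together were too high; a real rule would likely also need an evaluation window so that short spikes do not trigger it.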