action #162323

open

no alert about multi-machine test failures 2024-06-14+

Added by okurz about 1 month ago. Updated 27 days ago.

Status: New
Priority: Normal
Assignee: -
Category: Regressions/Crashes
Target version:
Start date: 2024-06-15
Due date:
% Done: 0%
Estimated time:

Description

Observation

https://monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?orgId=1&viewPanel=24&from=1718145766820&to=1718474617885 shows a significantly (too) high number of failed+parallel_failed jobs, but no alert was triggered. We should make our alerts trigger in such situations.
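
A minimal sketch of the kind of check such an alert could encode, assuming the alert is based on the share of failed+parallel_failed jobs in a time window. The threshold value and the function names are hypothetical; the ticket does not specify them, and a real Grafana alert rule would be defined on the dashboard query rather than in Python.

```python
from collections import Counter
from typing import Iterable

# Hypothetical threshold; the ticket does not state a concrete value.
FAILED_RATIO_THRESHOLD = 0.3

# Job results that the observation above counts as problematic.
ALERTING_RESULTS = ("failed", "parallel_failed")


def failed_ratio(results: Iterable[str]) -> float:
    """Return the share of jobs whose result is failed or parallel_failed."""
    counts = Counter(results)
    total = sum(counts.values())
    if total == 0:
        # No jobs in the window: treat as nothing to alert on.
        return 0.0
    bad = sum(counts[result] for result in ALERTING_RESULTS)
    return bad / total


def should_alert(results: Iterable[str],
                 threshold: float = FAILED_RATIO_THRESHOLD) -> bool:
    """True when the failed+parallel_failed share exceeds the threshold."""
    return failed_ratio(results) > threshold
```

For example, a window with two passed, one failed and one parallel_failed job yields a ratio of 0.5, which would exceed the assumed threshold.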


Related issues 1 (0 open, 1 closed)

Copied from openQA Project - action #162320: multi-machine test failures 2024-06-14+, auto_review:"ping with packet size 100 failed.*can be GRE tunnel setup issue":retry (Resolved, okurz, 2024-06-15)

Actions #1

Updated by okurz about 1 month ago

  • Copied from action #162320: multi-machine test failures 2024-06-14+, auto_review:"ping with packet size 100 failed.*can be GRE tunnel setup issue":retry added
Actions #2

Updated by okurz about 1 month ago

  • Status changed from New to Rejected
  • Assignee set to okurz
  • Priority changed from High to Normal

Never mind. I guess that with the failing https://gitlab.suse.de/openqa/scripts-ci/-/pipelines we are still sufficiently informed?

Actions #3

Updated by okurz 30 days ago

  • Tags set to infra, monitoring, alert, multi-machine
  • Status changed from Rejected to New
  • Assignee deleted (okurz)

No, the scripts-ci tests cannot uncover all problems, as they might not run certain worker combinations. I think an alert in Grafana would be helpful and workable for us.

Actions #4

Updated by okurz 27 days ago

  • Target version changed from Ready to Tools - Next