action #174586

open

Incomplete jobs (not restarted) of last 24h alert Salt

Added by jbaier_cz about 6 hours ago. Updated about 1 hour ago.

Status: In Progress
Priority: Urgent
Assignee:
Category: -
Start date: 2024-12-19
Due date:
% Done: 0%
Estimated time:

Description

Observation

Values
B0=365 
Labels
alertname   Incomplete jobs (not restarted) of last 24h alert
grafana_folder  Salt
rule_uid    cXo2cmBVk

https://monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?orgId=1&viewPanel=panel-17&from=2024-12-17T07:35:39.219Z&to=2024-12-19T07:38:58.850Z

The spike could be due to the CDN and SCC problems; see https://suse.slack.com/archives/C02AYV7UJSD/p1734451708127589 for context. The alert is OK right now, but there are still a number of incomplete jobs in the panel, which might be worth investigating.


Related issues 1 (0 open, 1 closed)

Copied from openQA Infrastructure (public) - action #154345: Incomplete jobs (not restarted) of last 24h alert Salt | Resolved | mkittler

Actions #1

Updated by jbaier_cz about 6 hours ago

  • Copied from action #154345: Incomplete jobs (not restarted) of last 24h alert Salt added
Actions #2

Updated by okurz about 5 hours ago

  • Tags changed from reactive work, alert to reactive work, alert, infra
  • Priority changed from High to Urgent
Actions #3

Updated by gpathak about 5 hours ago

  • Assignee set to gpathak
Actions #4

Updated by gpathak about 1 hour ago · Edited

  • Status changed from New to In Progress

It seems the issue related to SCC and the CDN is resolved, and the tests are passing.

In particular, this job always fails: https://openqa.suse.de/tests/16248796#line-66
I am not sure why the YAML schedule is missing or how we can provide the schedule file.
Tests related to the bare-metal worker are also failing with the error "Could not retrieve required variable SUT_IP": https://openqa.suse.de/tests/16251337#line-117
