action #174586

closed

Incomplete jobs (not restarted) of last 24h alert Salt

Added by jbaier_cz 2 months ago. Updated 2 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Start date: 2024-12-19
Due date: 2025-01-03
% Done: 0%
Estimated time:

Description

Observation

Values
B0=365 
Labels
alertname   Incomplete jobs (not restarted) of last 24h alert
grafana_folder  Salt
rule_uid    cXo2cmBVk

https://monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?orgId=1&viewPanel=panel-17&from=2024-12-17T07:35:39.219Z&to=2024-12-19T07:38:58.850Z

The spike could be due to the CDN and SCC problems; see https://suse.slack.com/archives/C02AYV7UJSD/p1734451708127589 for context. The alert is OK right now, but the panel still shows a number of incomplete jobs, which might be worth investigating.


Related issues 2 (0 open, 2 closed)

Copied from openQA Infrastructure (public) - action #154345: Incomplete jobs (not restarted) of last 24h alert Salt (Resolved, mkittler)

Copied to openQA Infrastructure (public) - action #175473: OpenQA Jobs test - Incomplete jobs (not restarted) of last 24h alert Salt (Resolved, okurz, 2024-12-19)

Actions #1

Updated by jbaier_cz 2 months ago

  • Copied from action #154345: Incomplete jobs (not restarted) of last 24h alert Salt added
Actions #2

Updated by okurz 2 months ago

  • Tags changed from reactive work, alert to reactive work, alert, infra
  • Priority changed from High to Urgent
Actions #3

Updated by gpathak 2 months ago

  • Assignee set to gpathak
Actions #4

Updated by gpathak 2 months ago · Edited

  • Status changed from New to In Progress

It seems the issue related to the SCC CDN is resolved.
The tests are passing:

One job in particular always fails: https://openqa.suse.de/tests/16248796#line-66. I am not sure why the YAML schedule is missing or how we can provide the schedule file.
Tests related to the baremetal worker are also failing with the error "Could not retrieve required variable SUT_IP": https://openqa.suse.de/tests/16251337#line-117

Actions #5

Updated by openqa_review 2 months ago

  • Due date set to 2025-01-03

Setting due date based on mean cycle time of SUSE QE Tools.

Actions #6

Updated by gpathak 2 months ago

  • Status changed from In Progress to Resolved
Actions #7

Updated by gpathak about 1 month ago

  • Copied to action #175473: OpenQA Jobs test - Incomplete jobs (not restarted) of last 24h alert Salt added