action #179317

closed

coordination #161414: [epic] Improved salt based infrastructure management

[osd] openQA does not pick up most scheduled jobs, sluggish start, multiple failures on 2025-03-21

Added by waynechen55 14 days ago. Updated 10 days ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Start date: 2025-03-21
Due date:
% Done: 0%
Estimated time:
Description

Observation

Many test runs are scheduled while many ipmi workers sit idle, yet these idle workers do not pick up the scheduled jobs.


Steps to reproduce

  • Schedule test runs that require ipmi workers
  • Wait for them to be picked up by idle workers

Impact

No scheduled test run can start.

Problem

Something appears to be wrong with the workers.

Suggestions

  • Check the workers' states
  • Check the worker machines
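
The first suggestion could be approached roughly like this: given a list of workers and a list of scheduled jobs, report which worker classes have jobs waiting even though idle workers of that class exist. This is only a minimal sketch. The dictionary keys `status` and `worker_class` and both function names are assumptions made for illustration and are not taken from the openQA API.

```python
from collections import Counter


def idle_workers_by_class(workers):
    """Count idle workers per worker class.

    Each worker is a dict with hypothetical keys:
      "status"       -- e.g. "idle" or "running"
      "worker_class" -- comma-separated class names
    """
    counts = Counter()
    for worker in workers:
        if worker.get("status") != "idle":
            continue
        for cls in worker.get("worker_class", "").split(","):
            cls = cls.strip()
            if cls:
                counts[cls] += 1
    return counts


def starved_classes(workers, scheduled_jobs):
    """Return {class: (waiting_jobs, idle_workers)} for every class that
    has scheduled jobs although idle workers of that class exist."""
    idle = idle_workers_by_class(workers)
    waiting = Counter(job.get("worker_class", "") for job in scheduled_jobs)
    return {
        cls: (waiting[cls], idle[cls])
        for cls in waiting
        if idle[cls] > 0
    }


if __name__ == "__main__":
    # Made-up sample data, not taken from the affected instance.
    workers = [
        {"status": "idle", "worker_class": "ipmi,64bit-ipmi"},
        {"status": "running", "worker_class": "ipmi"},
    ]
    jobs = [
        {"worker_class": "ipmi"},
        {"worker_class": "ipmi"},
        {"worker_class": "qemu_x86_64"},
    ]
    print(starved_classes(workers, jobs))
```

A non-empty result would match the observation in this ticket: jobs are waiting for a class that has idle workers, which points at scheduling or worker connectivity rather than a lack of capacity.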

Workaround

n/a


Files

scheduled_jobs.png (60.5 KB), waynechen55, 2025-03-21 03:11
idle_ipmi_workers.png (119 KB), waynechen55, 2025-03-21 03:11
37_jobs_running.png (31.2 KB), waynechen55, 2025-03-21 06:40

Related issues 1 (1 open, 0 closed)

Related to openQA Infrastructure (public) - action #176175: [alert] Grafana failed to start due to corrupted config file (Blocked, okurz, 2025-01-26)
