action #105804

closed

Job age (scheduled) (median) alert size:S

Added by tinita about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2022-02-01
Due date:
% Done: 0%
Estimated time:

Description

Scheduled jobs are piling up:
https://stats.openqa-monitor.qa.suse.de/d/7W06NBWGk/job-age?from=now-7d&to=now

The oldest jobs all seem to be ppc64le + ppc64le-2g
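A rough sketch of how the median and maximum age of scheduled jobs could be broken down per architecture to confirm such an observation. The field names (`arch`, `state`, `t_created`), the helper `job_age_by_arch`, and the sample jobs are illustrative assumptions, not data taken from the dashboard or the openQA database.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import median


def job_age_by_arch(jobs, now):
    """Return {arch: (median_age_seconds, max_age_seconds)} for scheduled jobs.

    `jobs` is a list of dicts with hypothetical keys: arch, state, t_created.
    """
    ages = defaultdict(list)
    for job in jobs:
        # Only jobs still waiting for a worker count towards "job age (scheduled)".
        if job["state"] != "scheduled":
            continue
        ages[job["arch"]].append((now - job["t_created"]).total_seconds())
    return {arch: (median(values), max(values)) for arch, values in ages.items()}


# Made-up sample data, for illustration only.
now = datetime(2022, 2, 1, 12, 0, 0)
jobs = [
    {"arch": "ppc64le", "state": "scheduled", "t_created": now - timedelta(hours=30)},
    {"arch": "ppc64le", "state": "scheduled", "t_created": now - timedelta(hours=20)},
    {"arch": "x86_64", "state": "scheduled", "t_created": now - timedelta(hours=1)},
    {"arch": "x86_64", "state": "running", "t_created": now - timedelta(hours=5)},
]
stats = job_age_by_arch(jobs, now)
for arch, (med, oldest) in sorted(stats.items()):
    print(f"{arch}: median {med / 3600:.1f} h, max {oldest / 3600:.1f} h")
```

In this sketch an architecture whose median is close to its maximum would indicate that most of its jobs are stuck, rather than a single outlier.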

Acceptance criteria

  • AC1: Alerts for Job age (scheduled) (max) and Job age (scheduled) (median) are active
  • AC2: All worker slots for ppc64le are active

Related issues 2 (1 open, 1 closed)

  • Related to openQA Infrastructure - coordination #102882: [epic] All OSD PPC64LE workers except malbec appear to have horribly broken cache service (Resolved, kraih, 2022-02-10)
  • Related to openQA Infrastructure - action #135008: Max job age graphs use mean aggregation when max would make more sense (New, 2023-09-01)

Actions #1

Updated by okurz about 2 years ago

  • Related to coordination #102882: [epic] All OSD PPC64LE workers except malbec appear to have horribly broken cache service added
Actions #2

Updated by okurz about 2 years ago

  • Priority changed from High to Urgent

It is no surprise that ppc jobs are piling up, given #102882. I suggest pausing the alert and tracking this ticket as blocked until #102882 is solved.

Actions #3

Updated by mkittler about 2 years ago

  • Status changed from New to Blocked
  • Assignee set to mkittler
  • Priority changed from Urgent to High

Since I've already checked the performance of the power workers today, I'll track this as blocked. I'm also lowering the prio because we have had this issue for quite a while now and the figures from the performance test don't suggest it has gotten worse. Considering the alert turned off again after 2 hours, I assume the impact is still not that high.

Actions #4

Updated by tinita about 2 years ago

Although this ticket is assigned to mkittler, okurz asked me directly to pause the alerts, so I have now set both Job age (scheduled) (max) and Job age (scheduled) (median) to paused.
Please remember to resume them when resolving; I will also try to remember.

Actions #5

Updated by tinita about 2 years ago

  • Description updated (diff)
Actions #6

Updated by mkittler about 2 years ago

Due to #102882#note-65 I dared to resume the alerts.

Actions #7

Updated by livdywan about 2 years ago

  • Subject changed from Job age (scheduled) (median) alert to Job age (scheduled) (median) alert size:S
  • Description updated (diff)
Actions #8

Updated by mkittler about 2 years ago

  • Status changed from Blocked to Resolved

It looks good so far so I'm resolving the ticket.

Actions #9

Updated by okurz 8 months ago

  • Related to action #135008: Max job age graphs use mean aggregation when max would make more sense added