action #162521

open

coordination #162524: [epic] Optimized o3 infrastructure

Reconsider the global job limit on o3, try higher than 170

Added by okurz 28 days ago. Updated 28 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Feature requests
Target version:
Start date:
2024-06-19
Due date:
% Done:

0%

Estimated time:

Description

Motivation

In #151807-10 the global job limit on o3 was set to 170. The previous limit wasn't mentioned, so I don't know what it was, but I assume it was much higher. 170 is rather low, especially considering that we have so many worker instances available. #151807 saw multiple changes, so I assume we can actually use a much higher job limit again.

Acceptance criteria

  • AC1: The global job limit on o3 is significantly higher than 170 or blocking improvement tasks are planned

Suggestions

  • Understand why the limit of 170 jobs was originally chosen
  • Carefully increase the job limit and monitor over at least 10 days
  • Try to find a hard upper limit and select a job limit below that with a sane buffer
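The last two suggestions can be sketched as a small calculation. This is only an illustration: the function names, the 20% buffer, the step size of 30 and the hard upper limit of 300 are hypothetical placeholders, not measured values from o3.

```python
def pick_job_limit(hard_upper_limit: int, buffer_percent: int = 20) -> int:
    """Return a job limit below the observed hard upper limit.

    buffer_percent is an assumed safety margin, not a recommended value.
    """
    if hard_upper_limit <= 0:
        raise ValueError("hard_upper_limit must be positive")
    if not 0 <= buffer_percent < 100:
        raise ValueError("buffer_percent must be in [0, 100)")
    # Integer arithmetic avoids floating-point rounding surprises.
    return hard_upper_limit * (100 - buffer_percent) // 100


def ramp_steps(current: int, target: int, step: int) -> list[int]:
    """List the limits to try in order, ending exactly at the target.

    Each value would be kept for a monitoring period (for example the
    10 days suggested above) before moving on to the next one.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    steps = list(range(current + step, target, step))
    if target > current:
        steps.append(target)
    return steps


if __name__ == "__main__":
    # Hypothetical numbers: 300 stands in for a hard limit found by testing.
    target = pick_job_limit(300, buffer_percent=20)
    print(target, ramp_steps(170, target, step=30))
```

Increasing in several steps rather than jumping to the target makes it easier to attribute a regression to a specific limit value.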

Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure - action #151807: [alert] o3 zabbix: Problem: /var/lib/snapshot-changes: Disk space is critically low (used > 94%) size:M | Resolved | tinita | 2023-11-30 | 2023-12-16

Related to openQA Infrastructure - action #138545: Munin - minion hook failed - opensuse.org :: openqa.opensuse.org size:S | Resolved | tinita | 2023-11-28
