action #166802: Recover worker37, worker38, worker39 size:S - openQA Infrastructure (public) - openSUSE Project Management Tool

openQA Project (public) - coordination #110833: [saga][epic] Scale up: openQA can handle a schedule of 100k jobs with 1k worker instances

openQA Project (public) - coordination #139010: [epic] Long OSD ppc64le job queue

Recover worker37, worker38, worker39 size:S

Added by okurz 7 months ago. Updated 6 months ago.

Status: Blocked

Priority: Normal

Assignee: okurz

Category: Feature requests

Target version: QA (public) - future

Tags: osd, infra, prg2, nue2

Description

Motivation

After #139103 we should ensure that all machines in the PRG2 oQA infra that are still offline are up and running again.

Acceptance criteria

AC1: All of w37-w39 run OSD production jobs
AC2: Non-x86_64, non-qemu jobs are still executed and not starved out by too many x86_64 worker slots

Suggestions

Take care to apply the workarounds from #157975-12 to prevent accidental distribution upgrades
Read what was done in #139103, then bring w37-w39 back into production
Monitor the impact on the age of qemu_ppc64le jobs as well as on other non-x86_64, non-qemu jobs
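
The monitoring suggestion could be approximated with a small check like the one below. This is only a sketch under stated assumptions, not how OSD monitoring is actually implemented: the record keys `state`, `worker_class` and `created`, the function names, and the one-day threshold are all hypothetical and chosen for illustration. It reports, per worker class, the age of the oldest job that is still scheduled, and flags classes whose oldest job exceeds the threshold.

```python
from datetime import datetime, timedelta, timezone


def oldest_scheduled_age(jobs, now):
    """Map each worker class to the age of its oldest still-scheduled job.

    `jobs` is a list of dicts with the hypothetical keys
    "state", "worker_class" and "created" (a timezone-aware datetime).
    """
    ages = {}
    for job in jobs:
        if job["state"] != "scheduled":
            continue  # only jobs still waiting for a worker count as queued
        age = now - job["created"]
        worker_class = job["worker_class"]
        if worker_class not in ages or age > ages[worker_class]:
            ages[worker_class] = age
    return ages


def starved_classes(jobs, now, limit=timedelta(days=1)):
    """Return the sorted worker classes whose oldest scheduled job is older than `limit`.

    The one-day default is an arbitrary illustrative threshold.
    """
    ages = oldest_scheduled_age(jobs, now)
    return sorted(wc for wc, age in ages.items() if age > limit)


if __name__ == "__main__":
    now = datetime(2024, 9, 20, tzinfo=timezone.utc)
    sample = [
        {"state": "scheduled", "worker_class": "qemu_ppc64le",
         "created": now - timedelta(days=3)},
        {"state": "scheduled", "worker_class": "qemu_x86_64",
         "created": now - timedelta(hours=1)},
    ]
    print(starved_classes(sample, now))
```

Comparing the result before and after bringing w37-w39 back would give a rough indication of whether AC2 still holds.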

Related issues 7 (2 open — 5 closed)

Related to openQA Infrastructure (public) - action #157726: osd-deployment | Failed pipeline for master (worker3[6-9].oqa.prg2.suse.org) [Resolved, okurz, 2024-03-18]

Related to openQA Infrastructure (public) - action #167057: Run more standard, qemu OSD openQA jobs in CC-compliant PRG2 and none in NUE2 size:S [Resolved, okurz, 2024-09-19]

Related to openQA Project (public) - action #134924: Websocket server overloaded, affected worker slots shown as "broken" with graceful disconnect in workers table [New, 2023-08-31]

Related to openQA Infrastructure (public) - action #167081: test fails in support_server/setup on osd worker37 size:S [Resolved, mkittler, 2024-09-19, 2024-10-09]

Related to openQA Project (public) - action #157690: Simple global limit of registered/online workers size:M [Resolved, mkittler, 2024-03-21]

Related to openQA Project (public) - action #168178: Limit connected online workers based on websocket+scheduler load size:M [Workable, mkittler]

Copied from openQA Infrastructure (public) - action #139103: Long OSD ppc64le job queue - Decrease number of x86_64 worker slots on osd to give ppc64le jobs a better chance to be assigned jobs size:M [Resolved, okurz, 2023-11-04]
