action #134132

Updated by livdywan 9 months ago

## Motivation 
 Currently we have multiple openQA OSD bare-metal test machines in NUE2 FC Basement. They are still controlled by openQA workers running in NUE1. To rely less on NUE1 in preparation for a further move, and to reduce unnecessary cross-site network transfers, the controlling openQA worker should also be in NUE2 FC Basement. This has the additional benefit that if a network outage affects NUE2, the corresponding openQA worker is affected by the same condition and would not try to execute openQA jobs on unavailable machines. 

 ## Acceptance criteria 
 * **AC1:** All OSD openQA bare-metal machines in NUE2 are controlled by an openQA worker in NUE2 
 * **AC2:** All OSD openQA bare-metal machines in NUE2 are still able to run openQA jobs as before 

 ## Suggestions 
 * "controlled by" could mean that IPMI is used on a worker to execute tests on the bare-metal machine 
 * Identify relevant machines in https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls , e.g. every bare-metal test machine in .qe.nue2.suse.org like sonic https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls#L1207 
 * Move those entries to a suitable openQA worker within NUE2, e.g. openqaworker1, see https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls#L1486 
 * Verify that the corresponding openQA jobs still work fine 
 * Ensure that the location settings in the worker classes match the controlling openQA worker (not the bare-metal target host) 
 * Inform affected users 
 * Finish before the currently controlling machines go offline, e.g. grenache-1 as part of #132140 
 * Consider using multiple controlling hosts to avoid several bare-metal workers being down at once
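
 The identification and redistribution steps above could be sketched roughly as follows. This is only an illustration under assumptions: the dictionary stands in for a parsed `workerconf.sls`, the key names (`workers`, `WORKER_CLASS`, `IPMI_HOSTNAME`) and all target host names are assumed rather than taken from the real pillar file, and both helper functions are hypothetical. The real file should be checked before relying on any of it.

```python
from itertools import cycle

# Hypothetical stand-in for a parsed workerconf.sls. Key names and the
# target host names are assumptions made for this sketch only.
WORKERCONF = {
    "grenache-1": {
        "workers": {
            1: {"WORKER_CLASS": "64bit-ipmi", "IPMI_HOSTNAME": "sonic-sp.qe.nue2.suse.org"},
            2: {"WORKER_CLASS": "64bit-ipmi", "IPMI_HOSTNAME": "example-sp.qe.nue1.suse.org"},
            3: {"WORKER_CLASS": "64bit-ipmi", "IPMI_HOSTNAME": "other-sp.qe.nue2.suse.org"},
        }
    },
    "openqaworker1": {"workers": {}},
}

NUE2_SUFFIX = ".qe.nue2.suse.org"


def find_instances_to_move(workerconf, nue2_controllers, suffix=NUE2_SUFFIX):
    """List (controller, instance, target) for NUE2 targets controlled from outside NUE2."""
    found = []
    for controller, conf in sorted(workerconf.items()):
        if controller in nue2_controllers:
            continue  # already controlled from within NUE2
        for instance, settings in sorted(conf.get("workers", {}).items()):
            target = settings.get("IPMI_HOSTNAME", "")
            if target.endswith(suffix):
                found.append((controller, instance, target))
    return found


def spread_over_controllers(instances, nue2_controllers):
    """Assign each target round-robin to one of the NUE2 controlling hosts."""
    hosts = cycle(sorted(nue2_controllers))
    return [(target, next(hosts)) for _, _, target in instances]


if __name__ == "__main__":
    to_move = find_instances_to_move(WORKERCONF, {"openqaworker1"})
    for target, new_host in spread_over_controllers(to_move, {"openqaworker1"}):
        print(f"{target} -> {new_host}")
```

 Passing more than one host name as `nue2_controllers` spreads the targets across them, so that a single controlling host going down does not take every bare-metal worker with it.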
