action #157753
Updated by okurz 8 months ago
## Motivation
In #132614 openqaworker-arm-1 was moved to FC Basement so that we have one hot-redundant aarch64 OSD machine outside of PRG2. For that to be set up we also need to accommodate the automatic recovery feature.
## Acceptance criteria
* **AC1:** The automatic recovery of openqaworker-arm-1 on crashes works
* **AC2:** openqaworker-arm-1 runs OSD production jobs in a stable way
## Suggestions
* Read the notes in #133748 regarding PDU auto-control
* Find out on https://wiki.suse.net/index.php/SUSE-Quality_Assurance/Labs how the new PDU can be used
* Integrate the new PDU in https://gitlab.suse.de/openqa/grafana-webhook-actions
* After openqaworker-arm-1 is fully back, including recovery, remove the silences in https://monitor.qa.suse.de/alerting/silences
* Remove the "Mute All times" in https://monitor.qa.suse.de/alerting/routes for `__contacts__ =~ .*"Trigger reboot of openqaworker-arm-1".*`
## Rollback actions
* Bring openqaworker-arm-1 back into production: https://progress.opensuse.org/projects/openqav3/wiki/#Bring-back-machines-into-salt-controlled-production