action #103692
closed
Only upgrade o3 workers if package checks are good, same as for o3 webui size:M
Description
Motivation
As seen in #103422, we could have prevented problems within the o3 infrastructure by blocking o3 worker upgrades whenever the package checks had already reproduced those problems. As we do the o3 worker upgrade with transactional-update, we could look into executing checks similar to the ones in https://github.com/os-autoinst/openQA/blob/master/script/openqa-auto-update#L27
Acceptance criteria
- AC1: o3 workers do not automatically upgrade overnight if package checks in https://build.opensuse.org/project/show/devel:openQA fail in the corresponding repository version
- AC2: o3 workers still regularly upgrade overnight
Suggestions
- Create a systemd override for "transactional-update.service" that executes the package check from https://github.com/os-autoinst/openQA/blob/master/script/openqa-auto-update#L27 before the actual upgrade and aborts if any check fails
- Consider refactoring the "line" into a cleaner snippet
- Add -S to the curl command
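A minimal sketch of what such a package check could look like, assuming the build results are available as XML in the style of the OBS build result API (`/build/<project>/_result`). The function name `check_results` and the choice of status codes treated as bad are assumptions for illustration; this is not the line from openqa-auto-update, and fetching the results is deliberately left out.

```shell
#!/bin/sh
# Hypothetical helper, not the actual openQA script.
# Reads OBS "_result"-style XML on stdin.
# Returns 0 if no package is in a bad state, 1 if at least one package is
# failed/unresolvable/broken, 2 if there was no input at all (e.g. the
# download failed), so that an empty response is never treated as "good".
check_results() {
    results=$(cat)
    [ -n "$results" ] || return 2
    if printf '%s\n' "$results" |
        grep -Eq '<status [^>]*code="(failed|unresolvable|broken)"'; then
        return 1
    fi
    return 0
}
```

Treating empty input as an error matters here because a silent download failure would otherwise look like a clean repository, which is also the reason for the suggested -S on curl.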
Updated by okurz about 3 years ago
- Copied from action #103422: [sporadic] os-autoinst: 13-osutils.t:167 Failed test 'Exit code appear in log' in GHA size:M added
Updated by livdywan about 3 years ago
- Subject changed from Only upgrade o3 workers if package checks are good, same as for o3 webui to Only upgrade o3 workers if package checks are good, same as for o3 webui size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by mkittler about 3 years ago
- Status changed from Workable to In Progress
PR for extracting the check into a separate script: https://github.com/os-autoinst/openQA/pull/4398
Updated by mkittler about 3 years ago
Once the PR is merged and deployed, I'll add an override for transactional-update.service:
[Service]
ExecCondition=/usr/share/openqa/script/openqa-check-devel-repo
This way the update won't be executed if the devel repo isn't in a good state but we still don't end up with a failing unit.
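The reason the unit does not end up failing is how systemd interprets the exit code of an ExecCondition= command. The following sketch spells out that mapping as documented in systemd.service(5); the function name `exec_condition_outcome` is made up for illustration and is not part of systemd or openQA.

```shell
#!/bin/sh
# Hypothetical helper that mirrors how systemd treats the exit code of an
# ExecCondition= command:
#   0       -> the unit's remaining commands are run
#   1..254  -> the remaining commands are skipped, unit is NOT marked failed
#   255     -> the unit is considered failed
# (Abnormal termination, e.g. by a signal or timeout, also counts as failed
# but cannot be expressed as a plain exit code here.)
exec_condition_outcome() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo run
    elif [ "$code" -ge 1 ] && [ "$code" -le 254 ]; then
        echo skip
    else
        echo fail
    fi
}
```

So a check script that exits with 1 on a broken devel repo leads to a skipped update, while a script that exits with 255 would make the unit fail.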
Updated by okurz about 3 years ago
Wow, didn't know about ExecCondition, that's great!
Updated by openqa_review about 3 years ago
- Due date set to 2021-12-25
Setting due date based on mean cycle time of SUSE QE Tools
Updated by mkittler about 3 years ago
Tested on openqaworker7. It works in both cases (devel:openQA is good or broken). I've tested the broken case by putting an exit 1 in the script to see whether systemd behaves as expected. The script itself should be fine (tested locally and it is basically just what we already do for OSD). I'm currently installing it on the remaining o3 workers.
Updated by mkittler about 3 years ago
- Status changed from In Progress to Feedback
Installed openQA-auto-update on all workers and added the override via:
mkdir -p /etc/systemd/system/transactional-update.service.d && echo -e "[Service]\nExecCondition=/usr/share/openqa/script/openqa-check-devel-repo" > /etc/systemd/system/transactional-update.service.d/override.conf
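For repeating this on several workers, the same drop-in could be written by a small function. This is only a sketch: the function name `install_override` and its optional root-directory argument are assumptions added so the result can be inspected in a scratch directory first, and `printf` is used instead of `echo -e` because the latter behaves differently across shells.

```shell
#!/bin/sh
# Hypothetical helper: writes the drop-in for transactional-update.service
# below the given root directory (empty root = the real filesystem).
install_override() {
    root=${1:-}
    dir="$root/etc/systemd/system/transactional-update.service.d"
    mkdir -p "$dir" || return 1
    printf '%s\n' \
        '[Service]' \
        'ExecCondition=/usr/share/openqa/script/openqa-check-devel-repo' \
        > "$dir/override.conf"
}
```

On a worker this would be called as root without an argument. Running `systemctl daemon-reload` afterwards makes systemd pick up the new drop-in, and `systemctl cat transactional-update.service` shows the effective unit including the override.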
Updated by mkittler about 3 years ago
- Status changed from Feedback to Resolved
Looks like the o3 workers are still rebooted in the good case. I think my artificial tests for the bad case are good enough. So I'm resolving the issue.