action #103692

Only upgrade o3 workers if package checks are good, same as for o3 webui size:M

Added by okurz about 2 months ago. Updated about 1 month ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Target version:
Start date:
Due date: 2021-12-25
% Done: 0%
Estimated time:
Difficulty:

Description

Motivation

As seen in #103422, we could have prevented problems within the o3 infrastructure by blocking o3 worker upgrades whenever the package checks had already reproduced the problem. Since we upgrade the o3 workers with transactional-update, we could look into running checks similar to the ones in https://github.com/os-autoinst/openQA/blob/master/script/openqa-auto-update#L27

Acceptance criteria

Suggestions


Related issues

Copied from openQA Project - action #103422: [sporadic] os-autoinst: 13-osutils.t:167 Failed test 'Exit code appear in log' in GHA size:M, Resolved, 2021-11-30

History

#1 Updated by okurz about 2 months ago

  • Copied from action #103422: [sporadic] os-autoinst: 13-osutils.t:167 Failed test 'Exit code appear in log' in GHA size:M added

#2 Updated by cdywan about 2 months ago

  • Subject changed from Only upgrade o3 workers if package checks are good, same as for o3 webui to Only upgrade o3 workers if package checks are good, same as for o3 webui size:M
  • Description updated (diff)
  • Status changed from New to Workable

#3 Updated by mkittler about 2 months ago

  • Assignee set to mkittler

#4 Updated by mkittler about 2 months ago

  • Status changed from Workable to In Progress

PR for extracting the check into a separate script: https://github.com/os-autoinst/openQA/pull/4398

#5 Updated by mkittler about 2 months ago

Once the PR is merged and deployed, I'll add an override for transactional-update.service:

[Service]
ExecCondition=/usr/share/openqa/script/openqa-check-devel-repo

This way the update won't run if the devel repo isn't in a good state, but we still don't end up with a failing unit.
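
For illustration, here is a minimal sketch of how systemd treats the exit status of an ExecCondition= command, which is why a failing check skips the update instead of failing the unit. The helper name classify_exit is hypothetical and not part of openQA; the mapping follows the systemd.service documentation (0 runs the unit, 1 through 254 skips it without failure, 255 or an abnormal exit marks it failed).

```shell
#!/bin/sh
# Map an ExecCondition= exit status to what systemd does with the unit.
# classify_exit is a hypothetical helper for illustration only.
classify_exit() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "run"       # condition passed, the ExecStart= commands run
    elif [ "$code" -ge 1 ] && [ "$code" -le 254 ]; then
        echo "skipped"   # remaining commands are skipped, unit is not marked failed
    else
        echo "failed"    # 255 (or an abnormal exit) marks the unit as failed
    fi
}

# Stand-in for a check script that reports a broken devel repo with "exit 1".
status=0
( exit 1 ) || status=$?
classify_exit "$status"
```

So a check script that exits with 1 for a broken repo leads to a skipped update, whereas a script that exits with 255 or is killed by a signal would leave the unit in a failed state.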

#6 Updated by okurz about 2 months ago

Wow, didn't know about ExecCondition, that's great!

#7 Updated by openqa_review about 2 months ago

  • Due date set to 2021-12-25

Setting due date based on mean cycle time of SUSE QE Tools

#8 Updated by mkittler about 1 month ago

Tested on openqaworker7. It works in both cases (devel:openQA is good or broken). I've tested the broken case by putting an exit 1 in the script to see whether systemd behaves as expected. The script itself should be fine (tested locally and it is basically just what we already do for OSD). I'm currently installing it on the remaining o3 workers.

#9 Updated by mkittler about 1 month ago

  • Status changed from In Progress to Feedback

Installed openQA-auto-update on all workers and added the override via mkdir -p /etc/systemd/system/transactional-update.service.d && echo -e "[Service]\nExecCondition=/usr/share/openqa/script/openqa-check-devel-repo" > /etc/systemd/system/transactional-update.service.d/override.conf.
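
The same override can be sketched as a small script that is safe to dry-run. This is only an illustration under assumptions: ROOT is a hypothetical prefix variable that is not part of the setup above, and when it is unset the drop-in is written below a scratch directory instead of the real /etc. It uses printf rather than echo -e, since printf handles \n portably.

```shell
#!/bin/sh
# ROOT is a hypothetical prefix for dry runs; when unset, a scratch directory is used.
# Setting ROOT to an empty string would target the real /etc.
ROOT="${ROOT-$(mktemp -d)}"
dropin_dir="$ROOT/etc/systemd/system/transactional-update.service.d"
dropin="$dropin_dir/override.conf"

mkdir -p "$dropin_dir"
# Write the two-line drop-in: the section header and the condition command.
printf '[Service]\nExecCondition=%s\n' \
    /usr/share/openqa/script/openqa-check-devel-repo > "$dropin"

# Show the result for a quick visual check.
cat "$dropin"
```

On a real worker, systemctl daemon-reload makes systemd pick up the new drop-in, and systemctl cat transactional-update.service shows the unit together with the override.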

#10 Updated by mkittler about 1 month ago

  • Status changed from Feedback to Resolved

Looks like the o3 workers are still rebooted in the good case. I think my artificial tests for the bad case are good enough. So I'm resolving the issue.
