action #103554
closed
o3 s390x worker instances 102+103 down whereas 101+104 are up
0%
Description
Observation
As discussed in https://matrix.to/#/!ilXMcHXPOjTZeauZcg:libera.chat/$0ET_hLkIi4dP9a-MiTNhBOfvpht1_cQWLGasoZgobjI I found that openqaworker1_container:102 and openqaworker1_container:103 were offline whereas :101+:104 are up.
Acceptance criteria
- AC1: All four s390x VMs linux144-linux147 can be used within o3
- AC2: The corresponding openQA worker instances keep running persistently, including across nightly worker host reboots
Updated by okurz almost 3 years ago
- Description updated (diff)
So nsinger wasn't sure what the expected state is. I guess openqaworker1_container:101..104 should be online, right?
nsinger or someone else might have disabled the services on purpose and no longer remembers, so a simple systemctl start openqaworker1_container102 should be the next step. That did not work because the service is autogenerated by podman and references a btrfs container file hash that is no longer valid.
I now did for i in 102 103; do podman container rm openqaworker1_container_$i; done
and then followed https://progress.opensuse.org/projects/openqav3/wiki/#o3-s390-workers. Now I see both additional instances up in https://openqa.opensuse.org/admin/workers and ready to take jobs. Let's await automatic reboots.
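The cleanup step above can be sketched as a dry run before touching the host. This is only a sketch under assumptions: the container names are assumed to follow the pattern openqaworker1_container_<instance>, and RUN, run_or_print and recreate_plan are hypothetical helpers that are not part of the wiki instructions. Recreating the containers afterwards still follows the wiki section linked above.

```shell
#!/bin/sh
# Sketch only: the container name pattern is an assumption based on this
# ticket; RUN, run_or_print and recreate_plan are hypothetical helpers.
RUN="${RUN:-0}"   # set RUN=1 to execute the commands instead of printing them

run_or_print() {
    if [ "$RUN" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

recreate_plan() {
    for i in "$@"; do
        # Remove the stale container so it can be recreated per the wiki
        run_or_print podman container rm "openqaworker1_container_$i"
    done
    # Show the remaining containers, including stopped ones
    run_or_print podman ps --all --format '{{.Names}} {{.Status}}'
}

recreate_plan 102 103
```

With RUN left unset the script only prints the commands, which makes it easy to review the container names before running with RUN=1 on the worker host.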
Updated by okurz almost 3 years ago
- Due date set to 2021-12-20
- Status changed from New to Feedback
Updated by okurz almost 3 years ago
- Status changed from Feedback to Resolved
openqaworker1 rebooted multiple times and was changed, but all four worker instances are alive and happy (I assume so; at least they did not complain).