action #103554

o3 s390x worker instances 102+103 down whereas 101+104 are up

Added by okurz about 2 months ago. Updated about 2 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
Start date:
2021-12-06
Due date:
2021-12-20
% Done:

0%

Estimated time:

Description

Observation

As discussed in https://matrix.to/#/!ilXMcHXPOjTZeauZcg:libera.chat/$0ET_hLkIi4dP9a-MiTNhBOfvpht1_cQWLGasoZgobjI I found that openqaworker1_container:102 and openqaworker1_container:103 were offline whereas :101 and :104 were up.

Acceptance criteria

  • AC1: All four s390x VMs linux144-linux147 can be used within o3
  • AC2: The corresponding openQA worker instances keep running persistently, even across nightly worker host reboots

History

#1 Updated by okurz about 2 months ago

  • Description updated (diff)

So nsinger wasn't sure what the expected state is. I guess openqaworker1_container:101..104 should be online, right?
nsinger or someone else might have disabled the services on purpose but does not remember, so a simple `systemctl start openqaworker1_container102` should be the next step. That did not work because the service is auto-generated by podman and references a btrfs container file hash that is no longer valid.
I now ran `for i in 102 103; do podman container rm openqaworker1_container_$i; done` and then followed https://progress.opensuse.org/projects/openqav3/wiki/#o3-s390-workers. Now I see both additional instances up in https://openqa.opensuse.org/admin/workers and ready to take jobs. Let's await the automatic reboots.
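The recovery steps above can be sketched as a small shell snippet. This is a hypothetical illustration, not the exact commands run on the host: the `openqaworker1_container_1xx` names follow the pattern used in this ticket, and the final recreation step is site-specific (see the o3 wiki page linked above). `DRY_RUN=1` (the default here) only records the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of recovering stale podman-generated worker container services.
# Assumption: containers/units are named openqaworker1_container_1xx as in
# this ticket. Set DRY_RUN=0 to actually execute on the worker host (as root).
DRY_RUN="${DRY_RUN:-1}"
CMDS=""
run() {
  CMDS="$CMDS+ $*
"
  [ "$DRY_RUN" = 1 ] || "$@"
}

for i in 102 103; do
  # Stop the broken generated unit (it references a stale container)
  run systemctl stop "openqaworker1_container_$i"
  # Remove the stale container so it can be recreated from scratch
  run podman container rm "openqaworker1_container_$i"
done
# Afterwards recreate the containers following
# https://progress.opensuse.org/projects/openqav3/wiki/#o3-s390-workers
printf '%s' "$CMDS"
```

With the default dry run this prints the command sequence so it can be reviewed before running it for real.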

#2 Updated by okurz about 2 months ago

  • Due date set to 2021-12-20
  • Status changed from New to Feedback

#3 Updated by okurz about 2 months ago

  • Status changed from Feedback to Resolved

openqaworker1 has rebooted multiple times since, and all four worker instances are still alive and happy (I assume; at least they did not complain).
