action #77209
closed
workers on o3 machine rebel provide no "WORKER_HOSTNAME" value anymore but it shows up in journal of worker service
Description
Observation
For example https://openqa.opensuse.org/admin/workers/384 shows an empty field for "WORKER_HOSTNAME".
Last good: https://openqa.opensuse.org/tests/1464503 from 2020-11-08 22:56Z
First bad: https://openqa.opensuse.org/tests/1464540 from 2020-11-09 04:40Z
What certainly looks related is that https://github.com/os-autoinst/openQA/pull/3520 was merged just days ago. However, according to grep 'openQA-worker' /var/log/zypp/history, that was already deployed at 2020-11-08 01:25:01, which is before the last good job, so I see it as unlikely that the issue has been caused directly by that revert.
Acceptance criteria
- AC1: The s390x test scenarios are not incompleting anymore due to "no WORKER_CLASS defined" on rebel
Suggestions
- Compare with other workers
- Check what could cause this, e.g. in /etc/openqa/workers.ini
- Also look into #77014
Updated by okurz about 4 years ago
- Related to action #77014: openQA webui entry "Assigned worker" shows ip instead of names as formerly added
Updated by okurz about 4 years ago
- Related to action #69328: [o3][s390x] Early fail on s390x workers: connection refused added
Updated by SLindoMansilla about 4 years ago
- Blocks action #77116: test fails in bootloader_s390 - ftp installation media directory repo is too long for using in parmfile - linux144, linux145, linux146, linux147 (rebel) added
Updated by SLindoMansilla about 4 years ago
I noticed something weird: WORKER_HOSTNAME was under the webui section, so the setting was not applied to the global section.
But even after fixing that and restarting the workers, only worker 3 receives the WORKER_HOSTNAME setting.
- https://openqa.opensuse.org/admin/workers/181
- https://openqa.opensuse.org/admin/workers/384
- https://openqa.opensuse.org/admin/workers/383
- https://openqa.opensuse.org/admin/workers/385
This may be related to the fact that yesterday only linux146 (rebel:3) could access the FTP repo. It could be that the other z/VM guests still need tweaking. I am going to continue investigating.
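The misplaced setting described in this comment can be spotted mechanically. Below is a minimal Python sketch, assuming workers.ini parses as a standard INI file. The sample content, the section name http://openqa.example.org and the address 192.0.2.10 are made up for illustration and are not rebel's real configuration; sections_with_key is a hypothetical helper name.

```python
import configparser

# Hypothetical sample, NOT the real rebel:/etc/openqa/workers.ini.
SAMPLE_INI = """
[global]
HOST = http://openqa.example.org
WORKER_CLASS = s390x-zVM-vswitch-l2

[http://openqa.example.org]
WORKER_HOSTNAME = 192.0.2.10
"""


def sections_with_key(ini_text, key):
    """Return the names of all sections that define the given key."""
    parser = configparser.ConfigParser(interpolation=None)
    # Keep option names case-sensitive (the default lower-cases them).
    parser.optionxform = str
    parser.read_string(ini_text)
    return [name for name in parser.sections() if key in parser[name]]


if __name__ == "__main__":
    found = sections_with_key(SAMPLE_INI, "WORKER_HOSTNAME")
    if "global" not in found:
        print("WORKER_HOSTNAME is not in [global]; found in:", found)
```

Running something like this against the real file would only show where the key is defined; whether a given section is the right place for it still needs a human look.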
Updated by okurz about 4 years ago
- Assignee deleted (SLindoMansilla)
ok, also Ihno pointed out that the IPv4 address for the SUT is reused among the multiple instances. The machine "s390x" in https://openqa.opensuse.org/admin/machines had:
S390_NETWORK_PARAMS=OSAHWAddr= OSAMedium=eth InstNetDev=osa OSAInterface=qdio HostIP=192.168.112.10/24 Gateway=192.168.112.254 Nameserver=192.168.112.100 Domain=opensuse.org PortNo=0 Layer2=1 ReadChannel=0.0.0800 WriteChannel=0.0.0801 DataChannel=0.0.0802 Hostname=192.168.112.10
I have changed that now to
S390_NETWORK_PARAMS=OSAHWAddr= OSAMedium=eth InstNetDev=osa OSAInterface=qdio HostIP=192.168.112.@S390_HOST@/24 Hostname=@S390_HOST@ Gateway=192.168.112.254 Nameserver=192.168.112.100 Domain=opensuse.org PortNo=0 Layer2=1 ReadChannel=0.0.0800 WriteChannel=0.0.0801 DataChannel=0.0.0802
similar to what we had on osd.
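To illustrate the idea behind the @S390_HOST@ placeholder: presumably each worker instance supplies its own S390_HOST value, so that every SUT ends up with a distinct address instead of the shared 192.168.112.10. The following is a rough Python sketch of such a substitution, not the code that openQA or the tests actually use. The helper name expand_placeholders and the value 144 are assumptions, the latter chosen to match the addresses mentioned in this ticket.

```python
import re


def expand_placeholders(template, settings):
    """Replace every @NAME@ in template with settings[NAME].

    Unknown placeholders are left untouched so that a missing
    setting stays visible in the result.
    """
    def substitute(match):
        name = match.group(1)
        return str(settings.get(name, match.group(0)))

    return re.sub(r"@([A-Z0-9_]+)@", substitute, template)


if __name__ == "__main__":
    # Shortened excerpt of the machine setting quoted above.
    params = "HostIP=192.168.112.@S390_HOST@/24 Hostname=@S390_HOST@"
    # Assumed per-instance value for one worker instance.
    print(expand_placeholders(params, {"S390_HOST": "144"}))
```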
On o3 I have added the corresponding entries in o3:/etc/hosts:
# s390x SUT IPs, see rebel:/etc/openqa/workers.ini
192.168.112.144 s390linux144
192.168.112.145 s390linux145
192.168.112.146 s390linux146
192.168.112.147 s390linux147
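Since the four entries follow one pattern, they could also be generated rather than typed by hand. A small Python sketch, assuming the subnet 192.168.112 and the name prefix s390linux as seen in the entries above; hosts_entries is a hypothetical helper, not an existing tool:

```python
def hosts_entries(host_numbers, subnet="192.168.112", prefix="s390linux"):
    """Build /etc/hosts lines of the form '<subnet>.<n> <prefix><n>'."""
    return [f"{subnet}.{n} {prefix}{n}" for n in host_numbers]


if __name__ == "__main__":
    # Host numbers taken from the entries listed in this comment.
    for line in hosts_entries(range(144, 148)):
        print(line)
```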
Updated by okurz about 4 years ago
- Status changed from Workable to In Progress
- Assignee set to SLindoMansilla
Updated by SLindoMansilla about 4 years ago
I have changed the machine name to s390x-zVM-vswitch-l2 in O3 (like in OSD) to distinguish it from vswitch-l3, kvm-sle12*, kvm-sle15*:
- Job group template updated
- workers.ini updated
Updated by okurz about 4 years ago
- Subject changed from workers on o3 machine provide no "WORKER_HOSTNAME" value anymore but it shows up in journal of worker service to workers on o3 machine rebel provide no "WORKER_HOSTNAME" value anymore but it shows up in journal of worker service
Updated by SLindoMansilla about 4 years ago
- Status changed from In Progress to Resolved
It looks like my change in workers.ini actually fixed it. But the WORKER_HOSTNAME setting on the worker page is not updated until the next job is run. Good to know :)
I have verified that the missing WORKER_HOSTNAME is no longer an issue.
Updated by okurz about 4 years ago
SLindoMansilla wrote:
But, the WORKER_HOSTNAME setting in the worker page is not updated until next job is run.
wow! That was the missing part for me :D Thanks a lot.
Updated by livdywan about 4 years ago
Do we know if #3520 was deployed before you made the .ini file changes? Should we revert the revert?
Updated by SLindoMansilla about 4 years ago
cdywan wrote:
Do we know if #3520 was deployed before you made the .ini file changes? Should we revert the revert?
In any case, it is now definitely deployed, and it is working. The problem was in the workers.ini. I don't think that this PR had anything to do with it.