power8 fails to execute jobs successfully, no kvm, but also no sshd auto_review:"(?s)power8.*no kvm-img/qemu-img found":retry
Brought up by maxlin in https://matrix.to/#/!ilXMcHXPOjTZeauZcg:libera.chat/$KnzYkYddVbHWHofrb5F3EEl1haJ_imSCFatVCfL1Jp8
Looks like qemu was removed by an upgrade gone wrong. If you or anyone else can log in, I suggest looking into the zypper log to see what has gone wrong and installing the missing packages. Also can't log in over ssh.
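A minimal sketch of how the zypper log could be searched for removed packages. It assumes the pipe-separated layout of /var/log/zypp/history (date|action|name|version|arch|...); the function name removed_packages is made up for illustration and is not part of any existing tooling.

```shell
# Hypothetical helper: print the names of packages recorded as removed
# in a zypp history file (default path is an assumption about the host).
removed_packages() {
    history_file=${1:-/var/log/zypp/history}
    awk -F'|' '
        /^#/ { next }                   # skip comment lines
        { gsub(/[ \t]+/, "", $2) }      # the action field may be padded, e.g. "remove "
        $2 == "remove" { print $3 }     # third field is the package name
    ' "$history_file"
}
```

Calling "removed_packages | sort -u" on the affected host would list candidates to reinstall.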
Steps to reproduce
Find jobs referencing this ticket with the help of
- Subject changed from power8 fails to execute jobs successfully, no kvm, but also no sshd to power8 fails to execute jobs successfully, no kvm, but also no sshd auto_review:"(?s)power8.*no kvm-img/qemu-img found":retry
- Description updated (diff)
Reinstalled openssh-server, enabled the service. Could login over ssh again. Installed qemu-tools. Added auto-review regex.
zypper has a problem with the ppc repo-oss:
# zypper se -t pattern
Problem retrieving files from 'repo-oss'.
Download (curl) error for 'http://download.opensuse.org/ports/ppc/distribution/leap/15.3/repo/oss/repodata/repomd.xml':
Error code: HTTP response: 425
Error message: The requested URL returned error: 425 Too Early
I suspect that the "--force-resolution" option in our openQA upgrade scripting is just too dangerous and we should remove it.
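Independent of that option, a check after each upgrade that essential commands are still present could catch this kind of breakage early. The following is only a sketch: the function name missing_commands and the command list in the example call are assumptions, not part of the openQA upgrade scripting.

```shell
# Hypothetical helper: print every command from the argument list
# that cannot be found in PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example call with the tools that were missing on power8 (run as root so
# that sbin directories are in PATH):
# missing_commands qemu-img sshd rsync
```

Empty output means all listed commands were found.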
After triggering a reboot the system did not boot and was stuck in petitboot. Well, there was no kernel so that is understandable. I did
for i in dev sys proc ; do mount -o bind /$i /var/petitboot/mnt/dev/sda3/$i; done
ln -sf /etc/resolv.conf /var/petitboot/mnt/dev/sda3/etc
chroot /var/petitboot/mnt/dev/sda3
Then manually loaded files for the "repo-oss" with something like
curl https://download.opensuse.org/ports/ppc/distribution/leap/15.3/repo/oss/repodata/repomd.xml > /var/cache/zypp/raw/repo-oss/repodata/repomd.xml
and another one that "zypper -n in wget" mentioned, with a hashsum included.
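When placing such files into the cache by hand, the hashsum in the file name can be used to check the download. A minimal sketch, assuming the file name starts with the SHA-256 checksum of the content followed by a dash (for example "<checksum>-primary.xml.gz"); the function name verify_hashed_name is hypothetical, and the checksum type should be confirmed against repomd.xml.

```shell
# Hypothetical helper: succeed if the checksum prefix of the file name
# matches the SHA-256 checksum of the file content.
verify_hashed_name() {
    file=$1
    base=${file##*/}           # strip the directory part
    expected=${base%%-*}       # everything before the first dash
    actual=$(sha256sum "$file" | cut -d' ' -f1)
    [ "$expected" = "$actual" ]
}
```

Usage would be something like "verify_hashed_name /path/to/downloaded-file || echo mismatch".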
"zypper --no-refresh -n in kernel-default" worked. Also did "zypper --no-refresh -n in -t pattern kvm_server base" and triggered a reboot.
Now the system booted up to the point of showing a getty login prompt on IPMI SOL, but wicked seems not to be installed, so I configured the network manually:
ip link set dev eth4 up
ip addr add 192.168.112.2/24 dev eth4
ip route add default via 192.168.112.254
echo -e 'search openqanet.opensuse.org
nameserver 192.168.112.100
' >> /etc/resolv.conf
After that installed wicked and reinstalled os-autoinst and openQA to ensure all deps are there. Then openQA jobs immediately started to be picked up. But I rebooted the machine anyway to check if it is stable. Also did
failed_since=2022-05-25 worker=power8 ./openqa-advanced-retrigger-jobs
Now monitoring jobs.
Found more jobs that still failed trying to sync stuff. rsync was there but a dependency was missing. Did "zypper -n in --force rsync", which reinstalled the dependencies and fixed this. Now jobs are fine, e.g. https://openqa.opensuse.org/tests/2395231#
Also created https://github.com/os-autoinst/openQA/pull/4678 and merged it, so the same problem shouldn't happen again.
But the problem with HTTP response 425 still happens when I do "zypper ref" on power8. Asking around:
hi, on the machine power8.openqa.opensuse.org I am seeing
Download (curl) error for 'http://download.opensuse.org/ports/ppc/distribution/leap/15.3/repo/oss/repodata/repomd.xml':
Error code: HTTP response: 425
Error message: The requested URL returned error: 425 Too Early
when calling "zypper ref". curl of the mentioned file itself looks fine locally for me as well as on the host, but zypper can't read it. Removing the repo with "zypper rr" and adding it back does not help, neither does switching to https. Does anyone have an idea?
- Asked in https://matrix.to/#/!ilXMcHXPOjTZeauZcg:libera.chat/$_iXxEczzPNCLQyg4sJjgMGUgmMfG-stWExmriEXiDkQ?via=libera.chat&via=matrix.org&via=m4u.asia
EDIT: Answer by anikitin in https://suse.slack.com/archives/C028VS8TM2B/p1653917247360799?thread_ts=1653916397.654209&cid=C028VS8TM2B, nothing conclusive yet.
- Description updated (diff)
- Due date deleted (
- Status changed from In Progress to Resolved
Clarified with anikitin. He confirmed that the observed behaviour is a bug in the mirror infrastructure code. He applied a workaround and will look into a proper fix eventually. I can confirm that "zypper ref" works fine now.