action #152461

closed

[core][tools] test fails in various s390x-kvm tests with "s390x-kvm[\S\s]*(command 'zypper -n in[^\n]*timed out|sh install_k3s.sh[^\n]*failed)"

Added by okurz 11 months ago. Updated 11 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Bugs in existing tests
Target version:
Start date: 2023-12-12
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

Various openQA tests fail on s390x-kvm. For example, the scenario sle-15-SP4-Server-DVD-Updates-s390x-mau-extratests-phub@s390x-kvm fails in python_scientific with

command 'zypper -n in python3 python3-numpy python3-scipy | cat; ( exit ${PIPESTATUS[0]} )' timed out

https://openqa.suse.de/tests/13047998#step/kubectl/14 is particularly interesting because the timeout already occurs while refreshing the repositories.

Reproducible

Fails since Build 20231207-1
2023-12-08

Expected result

Last good: 20231206-1 (or more recent)

Problem

https://openqa.suse.de/tests/12996343#investigation shows

diff_to_last_good   

…
-   "BUILD" : "20231206-1",
+   "BUILD" : "20231207-1",
-   "HDD_1" : "SLES-15-SP4-s390x-mru-install-minimal-with-addons-Build20231206-1-Server-DVD-Updates-s390x-kvm.qcow2",
+   "HDD_1" : "SLES-15-SP4-s390x-mru-install-minimal-with-addons-Build20231207-1-Server-DVD-Updates-s390x-kvm.qcow2",
+   "LEGACY_TEST_ISSUES" : "31724",
+   "LEGACY_TEST_REPOS" : "http://download.suse.de/ibs/SUSE:/Maintenance:/31724/SUSE_Updates_SLE-Module-Legacy_15-SP4_s390x/",
…
-   "NEEDLES_GIT_HASH" : "52c3ca3c09d640ded913f02d4c71178552c5bc07",
+   "NEEDLES_GIT_HASH" : "78d2d6be3d0d587e9e5f0cc1f3edc7cbd312d523",
…
-   "REPOHASH" : "24e73cca5581cc19eedea1937953f3d0",
+   "REPOHASH" : "bef082847e1ac3bde609f8f317c8b877",
…
-   "SUT_IP" : "s390kvm090.oqa.prg2.suse.org",
+   "SUT_IP" : "s390kvm080.oqa.prg2.suse.org",
-   "TEST_GIT_HASH" : "2139f98b566ccda1c6da21fa079f6e074f8c6b7c",
+   "TEST_GIT_HASH" : "a76c65d6134fd10718e5285588d998e10964c46c",
-   "UEFI_PFLASH_VARS" : "SLES-15-SP4-s390x-mru-install-minimal-with-addons-Build20231206-1-Server-DVD-Updates-s390x-kvm-uefi-vars.qcow2",
+   "UEFI_PFLASH_VARS" : "SLES-15-SP4-s390x-mru-install-minimal-with-addons-Build20231207-1-Server-DVD-Updates-s390x-kvm-uefi-vars.qcow2",
-   "VIRSH_GUEST" : "10.145.10.90",
-   "VIRSH_HOSTNAME" : "s390zl13.oqa.prg2.suse.org",
+   "VIRSH_GUEST" : "10.145.10.80",
+   "VIRSH_HOSTNAME" : "s390zl12.oqa.prg2.suse.org",
-   "VIRSH_MAC" : "52:54:00:9d:1a:57",
+   "VIRSH_MAC" : "52:54:00:77:af:b4",
-   "WORKER_CLASS" : "s390-kvm,s390-kvm-sle12-mm,s390zl13,s390kvm090,prg,prg2,worker33,cpu-x86_64,cpu-x86_64-v2,cpu-x86_64-v3",
-   "WORKER_HOSTNAME" : "worker33.oqa.prg2.suse.org",
-   "WORKER_ID" : 2647,
+   "WORKER_CLASS" : "s390-kvm,s390-kvm-sle12-mm,s390zl12,s390kvm080,prg,prg2,worker31,cpu-x86_64,cpu-x86_64-v2,cpu-x86_64-v3",
+   "WORKER_HOSTNAME" : "worker31.oqa.prg2.suse.org",
+   "WORKER_ID" : 2567,

last_good   12986251
needles_diff_stat   

 autoyast-system-login-console-minimal-20231017.json |  15 +++++++++++++++
 autoyast-system-login-console-minimal-20231017.png  | Bin 0 -> 71874 bytes
 2 files changed, 15 insertions(+)

needles_log 

+ aa361be42 Adding a new needle for poo 133997.

test_diff_stat  

 data/publiccloud/terraform/azure.tf           | 43 +++---------------
 data/publiccloud/terraform/azure_nfstest.tf   | 44 +++----------------
 data/publiccloud/terraform/gce.tf             |  7 +--
 lib/publiccloud/provider.pm                   |  2 +
 lib/qesapdeployment.pm                        | 63 +++++++++++++++------------
 schedule/security/alp/container_selinux.yaml  |  5 +++
 schedule/security/alp/fde_misc.yaml           |  6 +++
 schedule/security/alp/fips_crypt_core.yaml    | 54 +++++++++++++++++++++++
 schedule/security/fips_crypt_core.yaml        | 46 ++++---------------
 schedule/security/selinux.yaml                |  5 +++

test_log    

+ baa7c2038 Remove subscription_id from az related qesap API
+ d1ce35810 openldap_configuration: only disbale nscd if installed
+ fd8310c6a Update BCI openQA test runs to new version scheme
+ 711fd1271 gdb: disable debuginfod url during test
+ 54ad70f0c Refactor fips/fips_setup.pm

+ ca6eb7aef Use suseconnect_scc on SLEM
+ 1cbd94785 Public Cloud: Reuse resources in Google and Azure

So there are some changes in os-autoinst-distri-opensuse, including some schedule changes. In the diff of settings I see that "LEGACY_TEST_REPOS" appears in the "first bad" job but was not present in the last good one. Maybe related?
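To make the comparison of settings less error-prone, the diff_to_last_good lines can be sorted into keys that are new, dropped, or merely changed. The following is only a minimal sketch: the `classify` helper and the `DIFF` excerpt are illustrative and not part of openQA, and the line format is assumed to match the excerpt quoted above.

```python
import re

# Excerpt of the diff_to_last_good lines quoted in this ticket.
DIFF = '''\
-   "BUILD" : "20231206-1",
+   "BUILD" : "20231207-1",
+   "LEGACY_TEST_ISSUES" : "31724",
+   "LEGACY_TEST_REPOS" : "http://download.suse.de/ibs/SUSE:/Maintenance:/31724/SUSE_Updates_SLE-Module-Legacy_15-SP4_s390x/",
-   "SUT_IP" : "s390kvm090.oqa.prg2.suse.org",
+   "SUT_IP" : "s390kvm080.oqa.prg2.suse.org",
'''

# A diff line is assumed to look like: sign, spaces, quoted key, colon.
LINE = re.compile(r'^([+-])\s+"([^"]+)"\s*:')


def classify(diff_text):
    """Group setting keys by whether they were added, removed, or changed."""
    removed, added = set(), set()
    for line in diff_text.splitlines():
        match = LINE.match(line)
        if match:
            target = added if match.group(1) == "+" else removed
            target.add(match.group(2))
    return {
        "new": sorted(added - removed),        # only in the first bad job
        "dropped": sorted(removed - added),    # only in the last good job
        "changed": sorted(added & removed),    # present in both, value differs
    }


result = classify(DIFF)
print(result)
```

Keys listed under "new" are the ones worth a closer look first, because a value change such as BUILD or SUT_IP is expected between any two runs.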

Impact

It seems that many, if not all, s390x-kvm SLE maintenance tests are impacted.

Further details

Always latest result in this scenario: latest


Related issues 3 (0 open, 3 closed)

Related to Containers and images - action #152368: [docker] failing because it's trying to install podman instead of docker (Rejected, 2023-12-11)
Related to openQA Project - action #152389: significant increase in MM-test failure ratio 2023-12-11: test fails in multipath_iscsi and other multi-machine scenarios due to MTU size auto_review:"ping with packet size 1350 failed, problems with MTU" size:M (Resolved, mkittler, 2023-12-11)
Related to Containers and images - action #152452: Slow network on s390x (Resolved, okurz, 2023-12-12)

Actions #1

Updated by okurz 11 months ago

  • Subject changed from test fails in various s390x-kvm tests with "command 'zypper -n in.*(python3-numpy|ltp-stable).*timed out" to test fails in various s390x-kvm tests with "s390x-kvm[\S\s]*command 'zypper -n in[^\n]*timed out"
  • Description updated (diff)
Actions #2

Updated by okurz 11 months ago

  • Subject changed from test fails in various s390x-kvm tests with "s390x-kvm[\S\s]*command 'zypper -n in[^\n]*timed out" to [core] test fails in various s390x-kvm tests with "s390x-kvm[\S\s]*command 'zypper -n in[^\n]*timed out"
Actions #3

Updated by okurz 11 months ago

  • Related to action #152368: [docker] failing because it's trying to install podman instead of docker added
Actions #5

Updated by okurz 11 months ago

  • Description updated (diff)
Actions #6

Updated by okurz 11 months ago

  • Subject changed from [core] test fails in various s390x-kvm tests with "s390x-kvm[\S\s]*command 'zypper -n in[^\n]*timed out" to [core] test fails in various s390x-kvm tests with "s390x-kvm[\S\s]*(command 'zypper -n in[^\n]*timed out|sh install_k3s.sh[^\n]*failed)"
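The regular expression in the new subject can be checked against sample text before relying on it. The sketch below assumes a plain regex search over text that contains both the machine name and the failure message; how openQA's auto-review actually feeds text to the pattern is not shown in this ticket. The `k3s_failure` and `other_machine` strings are hypothetical samples, while `zypper_failure` is assembled from the scenario and message quoted in the description.

```python
import re

# Pattern copied from the ticket subject.
AUTO_REVIEW = re.compile(
    r"s390x-kvm[\S\s]*(command 'zypper -n in[^\n]*timed out"
    r"|sh install_k3s.sh[^\n]*failed)"
)

# Scenario name and failure message as quoted in the description.
zypper_failure = (
    "sle-15-SP4-Server-DVD-Updates-s390x-mau-extratests-phub@s390x-kvm\n"
    "python_scientific\n"
    "command 'zypper -n in python3 python3-numpy python3-scipy | cat; "
    "( exit ${PIPESTATUS[0]} )' timed out\n"
)
# Hypothetical samples, not taken from real job logs.
k3s_failure = "job on s390x-kvm\nsh install_k3s.sh failed\n"
other_machine = "job on another machine\ncommand 'zypper -n in python3' timed out\n"


def matches(text):
    """Return True if the auto-review pattern is found anywhere in text."""
    return AUTO_REVIEW.search(text) is not None


for name, text in [
    ("zypper_failure", zypper_failure),
    ("k3s_failure", k3s_failure),
    ("other_machine", other_machine),
]:
    print(name, matches(text))
```

The `[\S\s]*` part lets the match span line breaks between the machine name and the failure message, while `[^\n]*` keeps each alternative on a single line. Text without "s390x-kvm" does not match, even if it contains the same zypper timeout.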
Actions #7

Updated by okurz 11 months ago

  • Related to action #152389: significant increase in MM-test failure ratio 2023-12-11: test fails in multipath_iscsi and other multi-machine scenarios due to MTU size auto_review:"ping with packet size 1350 failed, problems with MTU" size:M added
Actions #8

Updated by okurz 11 months ago

This might also be related to #152389

Actions #9

Updated by okurz 11 months ago

I suggest to try out a revert of https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1061/diffs applied manually to see if that has any relation.

Actions #10

Updated by okurz 11 months ago

Actions #11

Updated by livdywan 11 months ago

okurz wrote in #note-9:

I suggest to try out a revert of https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1061/diffs applied manually to see if that has any relation.

After discussing this in the Tools team Jitsi call, we suspect a slow network, so this change may be irrelevant here.

Actions #12

Updated by okurz 11 months ago

Given the comments from pdostal in #152452 regarding slow network, I tend to rule out a strong connection to MTU settings. I suggest debugging the routing within those hosts and checking the performance of connectivity.

Actions #13

Updated by okurz 11 months ago

  • Status changed from New to In Progress
  • Assignee set to okurz

From one of those s390 VMs

# cat /etc/resolv.conf 
### /etc/resolv.conf is a symlink to /run/netconfig/resolv.conf
### autogenerated by netconfig!
#
# Before you change this file manually, consider to define the
# static DNS configuration using the following variables in the
# /etc/sysconfig/network/config file:
#     NETCONFIG_DNS_STATIC_SEARCHLIST
#     NETCONFIG_DNS_STATIC_SERVERS
#     NETCONFIG_DNS_FORWARDER
# or disable DNS configuration updates via netconfig by setting:
#     NETCONFIG_DNS_POLICY=''
#
# See also the netconfig(8) manual page and other documentation.
#
### Call "netconfig update -f" to force adjusting of /etc/resolv.conf.
search suse.de oqa.prg2.suse.org oqa.suse.de
nameserver 10.160.0.1
nameserver 10.144.53.53
nameserver 10.144.53.54

So that's still the old DNS server, which is no longer available. I will update the references in the test code.
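A quick way to spot stale entries like this is to extract the nameservers from resolv.conf and probe each one. The sketch below is an assumption-laden illustration, not the check used here: `parse_nameservers`, `tcp_probe` and `unreachable` are hypothetical helpers, and the probe assumes the DNS server accepts TCP connections on port 53, which not every server does.

```python
import socket

# Non-comment lines of the resolv.conf quoted above.
RESOLV_CONF = """\
search suse.de oqa.prg2.suse.org oqa.suse.de
nameserver 10.160.0.1
nameserver 10.144.53.53
nameserver 10.144.53.54
"""


def parse_nameservers(text):
    """Return the nameserver addresses in the order they are listed."""
    servers = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":
            continue  # skip blank lines and comments
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def tcp_probe(ip, timeout=2.0):
    """Assumption: a reachable DNS server accepts TCP connections on port 53."""
    try:
        with socket.create_connection((ip, 53), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable(servers, probe=tcp_probe):
    """Return the servers for which the probe fails."""
    return [ip for ip in servers if not probe(ip)]


servers = parse_nameservers(RESOLV_CONF)
print(servers)
```

On an affected VM, `unreachable(servers)` would list the entries that do not answer. Because the resolver tries the nameservers in the listed order, an unreachable first entry delays every lookup, which would fit the zypper timeouts seen here.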

Actions #14

Updated by okurz 11 months ago · Edited

  • Subject changed from [core] test fails in various s390x-kvm tests with "s390x-kvm[\S\s]*(command 'zypper -n in[^\n]*timed out|sh install_k3s.sh[^\n]*failed)" to [core][tools] test fails in various s390x-kvm tests with "s390x-kvm[\S\s]*(command 'zypper -n in[^\n]*timed out|sh install_k3s.sh[^\n]*failed)"
  • Target version set to Ready
Actions #16

Updated by okurz 11 months ago

  • Priority changed from Urgent to Normal

Urgent issue resolved, just waiting for feedback on the MRs for cleanup

Actions #17

Updated by okurz 11 months ago

  • Status changed from Feedback to Resolved

MRs merged. All done now.
