action #122539
Updated by okurz almost 2 years ago
## Observation

openQA test in scenario sle-15-SP5-Online-s390x-xfstests_xfs-generic@s390x-kvm-sle15 fails in [generate_report](http://openqa.suse.de/tests/10218783/modules/generate_report/steps/4)

All xfstests runs on sle-15-SP5 s390x fail with this issue.

In this specific case the failed curl connection attempt came from (read out of vars.json):

- `"SUT_IP" : "s390kvm082.suse.de",`
- `"VIRSH_GUEST" : "10.161.145.82",`
- `"VIRSH_HOSTNAME" : "s390zp18.suse.de",`

## Reproducible

Fails since (at least) Build [40.1](http://openqa.suse.de/tests/9918151#step/generate_report/4)

## Expected result

Last good: build 38.1 http://openqa.suse.de/tests/9886322#step/generate_report/2

## Further details

At first, I thought this was the same issue being debugged in [#120261](https://progress.opensuse.org/issues/120261), but after that fix (https://github.com/os-autoinst/openQA/pull/4935/files) was merged, our s390x runs still fail. Looking into the details, I don't know why these tests still use worker2.oqa.suse.de as the download host. The last good run used an IP address rather than an FQDN. We may need some help from the tools team.

okurz ran `time curl -O http://worker2.oqa.suse.de:20343/rfhqRYw7W_g045X2/files/status.log`, which reproduces the problem quite explicitly:

```
# time curl -O http://worker2.oqa.suse.de:20343/rfhqRYw7W_g045X2/files/status.log
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:02:10 --:--:--     0
curl: (7) Failed to connect to worker2.oqa.suse.de port 20343: Connection timed out

real	2m11.316s
```

So very likely the firewall for the .oqa.suse.de zone just drops packets from 10.161.0.0.

Link to [latest](https://openqa.suse.de/tests/latest?arch=s390x&distri=sle&flavor=Online&machine=s390x-kvm-sle15&test=xfstests_xfs-generic&version=15-SP5)

## Suggestions

1. Ask SUSE-IT network admins to REJECT packets instead of DROP so that we get clearer results
2. Ask SUSE-IT network admins to *not* block this traffic, which we need for tests
3. The default connect timeout for curl appears to resolve to 2m10s (see above), which is above our default timeouts for `script_run` etc. Find a combination where curl has a chance to provide a proper error earlier
4. Consider using `upload_logs` in this specific example, although this does not completely help: `upload_logs` uses a default timeout of 90s, which is higher than the `script_run` default of 30s but still below the curl default of about 2m10s. Maybe we add the parameter `--connect-timeout 20` to curl or bump the timeout for `upload_logs`
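
To illustrate why suggestions 1 and 3 matter, here is a minimal sketch of a TCP connect probe with an explicit connect timeout. It is not part of openQA or os-autoinst; the helper name `probe` and the 20s default (mirroring the proposed `--connect-timeout 20`) are assumptions for illustration. With a firewall that REJECTs, the probe should come back quickly as "refused"; with one that DROPs, it only returns once the chosen timeout expires.

```python
import socket
import time


def probe(host, port, connect_timeout=20.0):
    """Try one TCP connect and return (outcome, elapsed_seconds).

    outcome is "open", "refused" (what an active REJECT typically looks like
    to the client), "timeout" (what a silent DROP looks like), or
    "error: ..." for anything else such as a DNS failure.
    """
    start = time.monotonic()
    try:
        # create_connection applies the timeout to the connect attempt
        with socket.create_connection((host, port), timeout=connect_timeout):
            outcome = "open"
    except ConnectionRefusedError:
        outcome = "refused"
    except socket.timeout:
        outcome = "timeout"
    except OSError as err:
        outcome = f"error: {err}"
    return outcome, time.monotonic() - start


if __name__ == "__main__":
    # Host and port taken from the curl reproduction in this ticket
    outcome, elapsed = probe("worker2.oqa.suse.de", 20343)
    print(f"{outcome} after {elapsed:.1f}s")
```

Keeping the connect timeout below the 30s `script_run` default means the probe (or an equivalent curl call) can report its own error before the surrounding test API call gives up.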