coordination #122650
Updated by okurz almost 2 years ago
## Observation

openQA test in scenario sle-15-SP5-Online-s390x-xfstests_xfs-generic@s390x-kvm-sle15 fails in
[generate_report](http://openqa.suse.de/tests/10218783/modules/generate_report/steps/4)

All xfstests runs on sle-15-SP5 s390x fail with this issue. In this specific case the failed curl connection attempt came from the machine described by these values (read from vars.json): `"SUT_IP" : "s390kvm082.suse.de"`, `"VIRSH_GUEST" : "10.161.145.82"`, `"VIRSH_HOSTNAME" : "s390zp18.suse.de"`.

At first I thought this was the same issue being debugged in #120261, but after that fix (https://github.com/os-autoinst/openQA/pull/4935/files) was merged our tests on s390x still fail. Looking into the details, I don't know why these tests still use worker2.oqa.suse.de as the download host. The last good run used an IP address, not an FQDN. This may need some help from the tools team.

okurz ran `time curl -O http://worker2.oqa.suse.de:20343/rfhqRYw7W_g045X2/files/status.log` which reproduces the problem quite explicitly:

```
# time curl -O http://worker2.oqa.suse.de:20343/rfhqRYw7W_g045X2/files/status.log
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:02:10 --:--:--     0curl: (7) Failed to connect to worker2.oqa.suse.de port 20343: Connection timed out

real    2m11.316s
```

so very likely the firewall for the .oqa.suse.de zone just drops packets from 10.161.0.0

## Reproducible

Fails since (at least) Build [40.1](http://openqa.suse.de/tests/9918151#step/generate_report/4)

## Expected result

Last good: Build [38.1](http://openqa.suse.de/tests/9886322#step/generate_report/2)

## Suggestions

1. Ask SUSE-IT network admins to REJECT packets instead of DROP so that we get clearer results #122653
2. Ask SUSE-IT network admins to *not* block this traffic which we need for tests #122656
3. It looks like the default connect timeout for curl resolves to 2m10s (see above), which is above our default timeouts for `script_run` etc., so find a combination where curl has a chance to report a proper error earlier.
4. Consider using `upload_logs` in this specific example, though this does not completely help: `upload_logs` uses a default timeout of 90s, which is higher than the 30s default for `script_run` but still below the 2m10s that curl takes by default. Maybe add the parameter `--connect-timeout 20` to curl or bump the timeout for `upload_logs` #122659
5. Ensure the original problem is fixed #122539

## Further details

Link to [latest](https://openqa.suse.de/tests/latest?arch=s390x&distri=sle&flavor=Online&machine=s390x-kvm-sle15&test=xfstests_xfs-generic&version=15-SP5)
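
To illustrate suggestions 3 and 4, here is a minimal sketch of how the timeouts could be ordered so that curl gives up before the surrounding test API call does. The helper name `curl_cmd_for` and its argument order are hypothetical and not part of openQA or os-autoinst. The 20s default comes from the `--connect-timeout 20` idea above, and the harness timeout is whatever the caller uses (30s for `script_run`, 90s for `upload_logs` according to this ticket).

```shell
#!/bin/sh
# Hypothetical helper (not an openQA API): print a curl command whose
# connect timeout is strictly below the timeout of the calling harness,
# so curl's own "Failed to connect" error appears before the harness gives up.
#   $1 = URL to download
#   $2 = harness timeout in seconds (e.g. 30 for script_run, 90 for upload_logs)
#   $3 = curl connect timeout in seconds (optional, defaults to 20)
curl_cmd_for() {
    url=$1
    harness_timeout=$2
    connect_timeout=${3:-20}

    # Refuse combinations where the harness would time out first,
    # which is the situation observed in this ticket (30s vs. 2m10s).
    if [ "$connect_timeout" -ge "$harness_timeout" ]; then
        echo "connect timeout ${connect_timeout}s is not below harness timeout ${harness_timeout}s" >&2
        return 1
    fi

    # --connect-timeout only limits the connection phase, not the transfer.
    echo "curl --connect-timeout $connect_timeout -O $url"
}
```

Called with the URL from the observation and a harness timeout of 30, the helper prints the curl command with `--connect-timeout 20`. Called with a connect timeout of 130 and a harness timeout of 30, it returns 1 and prints a message on stderr, mirroring the mismatch described above. Note that `--connect-timeout` bounds only the connection phase, so a slow transfer would still need `--max-time` or a larger harness timeout.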