openQA Project - coordination #122650: [epic] Fix firewall block and improve error reporting when test fails in curl log upload
Ask SUSE-IT network admins to *not* block this traffic which we need for tests regarding s390x within SUSE network size:M
Tests like #122539 need access from s390x to .oqa.suse.de.
- AC1: Tests like those mentioned in #122539 are shown to succeed in uploading logs from s390x kvm instances back to machines within the .oqa.suse.de. domain
- Ask Lazaros Haleplidis in chat and wait for response -> https://suse.slack.com/archives/C0488BZNA5S/p1672748720536449?thread_ts=1670498352.238729&cid=C0488BZNA5S
- If not successful within a reasonable time, open an SD ticket and wait for that
#1 Updated by okurz 3 months ago
- Copied from action #122653: Ask SUSE-IT network admins to REJECT packets instead of DROP so that we get more clear results size:S added
#2 Updated by okurz 3 months ago
- Related to action #122608: exit code of shell command not received by script_run added
#3 Updated by okurz 3 months ago
- Due date set to 2023-01-13
- Status changed from New to Feedback
- Assignee set to okurz
Lazaros Haleplidis answered in https://suse.slack.com/archives/C0488BZNA5S/p1672846756759229?thread_ts=1670498352.238729&cid=C0488BZNA5S
(Lazaros Haleplidis) Regarding: 1) I will let you know when ready 2) I have performed some tests to configure it as requested (with the SUMA team, who volunteered for the same request) but those were unsuccessful. […] regarding (1) could we narrow down the protocols and ports? (also, how should I name the 10.161.144.0/20 network?)
(Oliver Kurz) Well, as long as nobody besides your team can see the port definitions I recommend to allow the complete traffic. IMHO the IP range as source should be defined enough. Would this be ok?
(Lazaros Haleplidis) the port definitions will be reviewed by CyberSecurity when I am done on whether they are not specific enough (i.e. allow all ports and protocols, especially on incoming). how should I name the 10.161.144.0/20 network?
(Oliver Kurz) s390x
(Lazaros Haleplidis) you know that those will need to be migrated as well but the /20 are from other teams as well, not all your zVMs. and would need access to the whole QE network ??? on all ports??
(Oliver Kurz) not the QE network, openQA. I don't know the ports and I suggest to allow all as long as openQA test reviewers do not have access to the information what is allowed and what is blocked but if you insist then open TCP port 20k-30k or something.
(Lazaros Haleplidis) 20k-30k is set please verify
(Oliver Kurz) https://openqa.suse.de/tests/10273060#live is currently running. This is a heavy, long-running test that will take like 8h (!) so expect final results tomorrow
#4 Updated by okurz 3 months ago
The command to test should be
timeout 2 curl -sS -O http://worker2.oqa.suse.de:20343/rfhqRYw7W_g045X2/files/status.log 2>&1 | grep -q "Connection refused" && echo "ok" && ssh email@example.com "timeout 2 curl -sS -O http://worker2.oqa.suse.de:20343/rfhqRYw7W_g045X2/files/status.log" 2>&1 | grep -q "Connection refused" && echo "ok" || echo "still blocked"
which should return "ok" twice once the traffic is no longer blocked.
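The check above can also be wrapped in a small function so that it is easier to repeat for other ports. This is only a sketch: the function name check_port and the CURL override variable are hypothetical helpers introduced here, not part of openQA or of the original command. Like the one-liner, it treats "Connection refused" as proof that packets pass the firewall and reports anything else, including a successful download, as "still blocked".

```sh
#!/bin/sh
# Sketch only. CURL is a hypothetical override hook so that the function can
# be exercised without network access; it defaults to the real curl binary.
CURL="${CURL:-curl}"

# check_port URL
# Prints "ok" if the connection attempt is actively refused (the packets get
# through, but nothing listens on that port), otherwise "still blocked".
check_port() {
    # Capture stdout and stderr; ignore curl's non-zero exit status because
    # only the error text is evaluated here.
    out=$("$CURL" -sS --max-time 2 -o /dev/null "$1" 2>&1) || true
    case "$out" in
        *"Connection refused"*) echo "ok" ;;
        *)                      echo "still blocked" ;;
    esac
}
```

Usage, with the URL from the command above: check_port http://worker2.oqa.suse.de:20343/rfhqRYw7W_g045X2/files/status.log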
#6 Updated by okurz 2 months ago
- Due date changed from 2023-01-13 to 2023-01-24
#7 Updated by okurz 2 months ago
- Due date deleted (2023-01-24)
- Status changed from Feedback to Blocked
There was no sufficient response so I created https://sd.suse.com/servicedesk/customer/portal/1/SD-109594
#9 Updated by okurz about 2 months ago
- Status changed from Blocked to Resolved
Worked with lhaleplidis to unblock TCP traffic in the port range 20k+, and curl seems to show this working, so this part is done. But checking again right now with
curl --max-time 2 http://worker2.oqa.suse.de:20999/foo
I get:
curl: (7) Failed to connect to worker2.oqa.suse.de port 20999: Connection timed out
where I would expect "Connection refused". Why is that?
https://sd.suse.com/servicedesk/customer/portal/1/SD-109594 was closed. I asked to reopen in the ticket and also mentioned the issue in https://suse.slack.com/archives/C0488BZNA5S/p1674739578626049
But then crosschecking with other machines in QA it looks as expected:
$ ssh qamaster.qa.suse.de "curl -sS --max-time 2 http://worker2.oqa.suse.de:20999/foo"
curl: (7) Failed to connect to worker2.oqa.suse.de port 20999 after 2 ms: Connection refused
So, anyway, looking at the history of jobs on that worker instance I find e.g. https://openqa.suse.de/tests/10380166 and many other examples of jobs that are running just fine. I am sure the log uploading works in general, so I am considering the ticket resolved.
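The two curl results in this comment differ only in the error text, so a tiny classifier makes the comparison between source hosts explicit. This is a sketch, and the function name classify_connect_error is hypothetical. The interpretation in the comments is the usual one, not something confirmed by the network admins: "refused" generally means the connection attempt was answered, while a timeout is consistent with packets being silently dropped somewhere on the path, which is the DROP versus REJECT distinction raised in #122653.

```sh
#!/bin/sh
# classify_connect_error MESSAGE
# Maps a curl error message to one of: refused, timeout, other.
classify_connect_error() {
    case "$1" in
        # The attempt was answered: packets reach the target (or a firewall
        # that rejects instead of dropping).
        *"Connection refused"*) echo "refused" ;;
        # No answer within the time limit: consistent with dropped packets.
        *"timed out"*)          echo "timeout" ;;
        *)                      echo "other" ;;
    esac
}

# The two messages quoted in this comment:
classify_connect_error "curl: (7) Failed to connect to worker2.oqa.suse.de port 20999: Connection timed out"
# prints: timeout
classify_connect_error "curl: (7) Failed to connect to worker2.oqa.suse.de port 20999 after 2 ms: Connection refused"
# prints: refused
```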