action #170473
closed
k2.qe.suse.de not reachable from mania:2 size:S
Description
Observation
The PostgREST API endpoint at k2.qe.suse.de is not reachable from mania:2, see https://openqa.suse.de/tests/16036779#step/bci_version_check/36
k2.qe.suse.de is a database for collecting various BCI container stats. We need this connection to push container stats while testing. This used to work in the past and still works for all architectures except ppc64.
mania was recently announced to be part of the wireguard-tunnel project for getting CC-compliant zone-cc connectivity to those workers.
Suggestions
- Look into what was done for #170338 but don't block on it
- This might need specific firewall rules, e.g. either for specific target host or port
- There is already https://sd.suse.com/servicedesk/customer/portal/1/SD-174357 but we don't have access -> just ask @ph03nix to give us feedback based on progress in the SD ticket. We don't need to and probably don't want to participate in the ticket itself
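To tell a firewall or routing block apart from an application-level problem, a plain TCP connect to the target host and port can help. A minimal sketch, assuming Python 3 is available on the worker; the function name `tcp_reachable` and the injectable `connect` parameter are made up for illustration, and port 8080 is only assumed because it is the port used by the curl checks in this ticket:

```python
import socket


def tcp_reachable(host, port, timeout=5.0, connect=socket.create_connection):
    """Return True if a TCP connection to (host, port) can be opened.

    `connect` is injectable only so the function can be exercised without
    network access; by default it is socket.create_connection.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers refused connections, timeouts and DNS resolution errors,
        # which are all subclasses of OSError.
        return False
    conn.close()
    return True
```

Running `tcp_reachable("k2.qe.suse.de", 8080)` on mania and on a worker that is known to work should then show whether the difference is already visible at TCP level.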
Further details
Updated by nicksinger 2 months ago
mania: https://racktables.suse.de/index.php?page=object&tab=default&object_id=9588 -> fc basement
k2.qe.suse.de: https://racktables.suse.de/index.php?page=object&tab=default&object_id=19927 -> morla cluster, most likely prg2
the assessment that this could be connected to our (partial) wg setup came from me in https://suse.slack.com/archives/C02CANHLANP/p1732797379802919 - it might be totally unrelated but seems very plausible
Updated by jbaier_cz 2 months ago
This likely needs the same type of care as in https://progress.opensuse.org/issues/170338#note-5
Updated by jbaier_cz 2 months ago
- Related to action #170338: No monitoring data from OSD since 2024-11-25 1449Z size:M added
Updated by dzedro 2 months ago
- Related to action #170407: [qe-core][sle15sp7][xen]test fails in bootloader_svirt, nfsmount `openqa.suse.de:/var/lib/openqa/share/factory/hdd/fixed 6427781120 4264771584 2163009536 67% /var/lib/openqa/share/factory/hdd/fixed` seems gone added
Updated by dzedro 2 months ago · Edited
Also unreal* svirt workers via sapworker1.qe.nue2.suse.org can't reach the OSD NFS, e.g. in https://openqa.suse.de/tests/16032904#step/bootloader_svirt/15
Updated by okurz 2 months ago
- Due date set to 2024-12-18
- Status changed from In Progress to Feedback
https://suse.slack.com/archives/C02CANHLANP/p1733323013781639
@Felix Niederwanger hi, in https://progress.opensuse.org/issues/170473 you referenced https://sd.suse.com/servicedesk/customer/portal/1/SD-174357 but we don't have access. Should we proceed to investigate why mania can't reach k2.qe.suse.de or track the ticket and you share with "OSD Admins"?
Updated by okurz 2 months ago
- Due date deleted (2024-12-18)
- Status changed from Feedback to Blocked
fniederwanger shared the ticket with us. It's already in progress so we can just block on https://sd.suse.com/servicedesk/customer/portal/1/SD-174357
Updated by ph03nix about 2 months ago
- Related to action #173542: [BCI] Re-Enable PowerKVM BCI test runs added
Updated by ph03nix about 2 months ago
I can see now that the workers themselves can reach the host in question, however the openQA tests are still failing:
ph03nix@diesel:~> curl -qLf http://k2.qe.suse.de:8080/size >/dev/null && echo "OK"
...
OK
ph03nix@mania:~> curl -qLf http://k2.qe.suse.de:8080/size >/dev/null && echo "OK"
...
OK
ph03nix@petrol:~> curl -qLf http://k2.qe.suse.de:8080/size >/dev/null && echo "OK"
...
OK
Failures fresh from this morning:
- https://openqa.suse.de/tests/16155931#step/bci_version_check/36
- https://openqa.suse.de/tests/16155873#step/bci_version_check/36
- https://openqa.suse.de/tests/16155835#step/bci_version_check/36
- https://openqa.suse.de/tests/16155800#step/bci_version_check/36
- https://openqa.suse.de/tests/16155759#step/bci_version_check/36
- https://openqa.suse.de/tests/16155758#step/bci_version_check/36
- https://openqa.suse.de/tests/16155729#step/bci_version_check/36
- https://openqa.suse.de/tests/16155569#step/bci_version_check/36
- https://openqa.suse.de/tests/16155684#step/bci_version_check/36
- https://openqa.suse.de/tests/16155647#step/bci_version_check/36
- https://openqa.suse.de/tests/16155646#step/bci_version_check/36
- https://openqa.suse.de/tests/16155615#step/bci_version_check/36
Hypothesis: The openQA jobs are not routed over the wireguard tunnel. Asked for help in #eng-testing.
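One way to check this hypothesis on a worker is to ask the kernel which interface it would use for the target address. A rough sketch, assuming Python 3 and the iproute2 `ip` tool on the worker; the helper names are made up for illustration, and the interface names in the comments (`wg0`, `eth0`) are assumptions, not taken from the actual setup:

```python
import subprocess


def route_device(route_output):
    """Return the interface name that follows 'dev' in `ip route get` output.

    Returns None if the output contains no 'dev' token.
    """
    tokens = route_output.split()
    for index, token in enumerate(tokens[:-1]):
        if token == "dev":
            return tokens[index + 1]
    return None


def outgoing_device(address):
    """Ask the kernel which interface a packet to `address` would leave on.

    `address` must be an IP address, so a hostname has to be resolved first.
    """
    result = subprocess.run(
        ["ip", "-o", "route", "get", address],
        capture_output=True,
        text=True,
        check=True,
    )
    return route_device(result.stdout)


# Illustrative only: a result such as "wg0" would suggest the tunnel is used,
# while "eth0" would suggest the traffic bypasses it.
```

This only covers traffic originating on the worker itself. For traffic forwarded from a job VM, the lookup would probably need the `from ADDRESS iif DEVICE` form of `ip route get`, with the VM address and bridge name filled in.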
Updated by okurz about 2 months ago
- Status changed from Blocked to Workable
- Assignee deleted (okurz)
https://sd.suse.com/servicedesk/customer/portal/1/SD-174357 was resolved with "We can confirm that the hosts are reachable from the machines themselves. Remaining issues are due to routing problems in the machines themselves. The firewall rules work however, ticket resolved.". Unassigning.