action #94940
closed
multiple network related problems, gitlab CI pipelines not working, workers not reachable, proxySCC not reachable
Description
Observation
I see many reports, by you and also in many other channels, about network-related problems. The only reference to someone "working on this problem" AFAIK is https://chat.suse.de/channel/suse-it-ama?msg=WA9YfB7CumBiPfupu but there has been no real update from anyone within EngInfra on who is working on it, what to do, which workarounds to apply, or any estimates. Does anyone have a proper information source that we can follow?
Updated by okurz over 3 years ago
- Related to action #94919: All arm workers down 2021-06-30 , NUE SRV2 Rack A8 was switched off by EngInfra size:S added
Updated by okurz over 3 years ago
If you like, you can use the ticket as a label in failed openQA jobs as well. I hope someone can provide better insight, as so far I don't know what we should do about it. For now this ticket is just for tracking, not for solving anything.
Updated by okurz over 3 years ago
- Status changed from Feedback to Blocked
- Priority changed from Immediate to Urgent
gitlab CI pipelines seem to be able to access the network again. I assume proxySCC also works again. All osd workers are reachable with the exception of openqaworker-arm-1/2/3, see #94919; this ticket is blocked on that.
Wrote a message in https://chat.suse.de/channel/testing?msg=FmSxdsSvkxgEzcLgA as an update:
All OSD aarch64 workers are still offline and unreachable since 2021-06-30. There has been no update by EngInfra in the ticket https://infra.nue.suse.com/SelfService/Display.html?id=191475 since yesterday midday, so I can't tell you more than that.
Updated by okurz over 3 years ago
- Due date set to 2021-07-07
- Status changed from Blocked to Feedback
All aarch64 workers online and working on jobs again. asmorodskyi wrote an email that there might be more network related problems. Crosschecking before closing.
Updated by asmorodskyi over 3 years ago
okurz wrote:
All aarch64 workers online and working on jobs again. asmorodskyi wrote an email that there might be more network related problems. Crosschecking before closing.
I can confirm that all problems which I mentioned in my emails are gone now. We also increased some timeouts on our side:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/12848
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/12847
Updated by okurz over 3 years ago
- Status changed from Feedback to Resolved
Thank you. All good then.