action #163745
[tools] tests on worker31 time out on yast2 firewall services add zone=EXT service=service:target
Status: Closed
Description
Observation
openQA test in scenario sle-15-SP4-Server-DVD-HA-Incidents-x86_64-qam_ha_priorityfencing_supportserver@64bit fails in setup
See also https://openqa.suse.de/tests/14884549#step/setup/43
Test suite description
The base test suite is used for job templates defined in YAML documents. It has no settings of its own.
Reproducible
Fails since (at least) Build :32329:nftables (current job)
Expected result
Last good: :34692:apache2 (or more recent)
Further details
Always latest result in this scenario: latest
Mitigations
- Following "Take machines out of salt-controlled production" (verification sketch below):
sudo salt-key -y -d worker31.oqa.prg2.suse.org
sudo systemctl disable --now telegraf $(systemctl list-units | grep openqa-worker-auto-restart | cut -d . -f 1 | xargs)
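A quick way to verify the mitigation took effect (a sketch, not part of the original mitigation; assumes access to the salt master and to worker31):
# on the salt master: the key should no longer be listed as accepted
sudo salt-key -L | grep worker31 || echo "worker31 key removed"
# on worker31: no openqa-worker units should remain active
systemctl list-units --state=active | grep openqa-worker || echo "no active openqa-worker units"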
Updated by slo-gin 4 months ago
This ticket was set to High priority but was not updated within the SLO period. Please consider picking it up or setting it to the next lower priority.
Updated by livdywan 4 months ago
- Related to action #162293: SMART errors on bootup of worker31, worker32 and worker34 size:M added
Updated by okurz 4 months ago
- Tags set to infra, worker31
- Project changed from openQA Tests (public) to openQA Infrastructure (public)
- Subject changed from tests on worker31 time out on yast2 firewall services add zone=EXT service=service:target to [tools] tests on worker31 time out on yast2 firewall services add zone=EXT service=service:target
- Category changed from Bugs in existing tests to Regressions/Crashes
- Target version set to Ready
Updated by okurz 4 months ago
- Status changed from New to Resolved
- Assignee set to okurz
https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Server-DVD-HA-Incidents&machine=64bit&test=qam_ha_priorityfencing_supportserver&version=15-SP4#next_previous shows more than 100 jobs passing, so we assume that worker31 should not even have been in production at the time of the failure due to #162293. Maybe the machine was partially up and destroying jobs unnoticed, leading to the reported error. Anyway, we are good by now.
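For reference, the same check can also be scripted against the openQA API instead of the web UI; a sketch, assuming openqa-cli and jq are installed and the query parameters mirror the scenario above:
# list the latest jobs of the scenario and summarize their results by outcome
openqa-cli api --host https://openqa.suse.de jobs \
  arch=x86_64 distri=sle flavor=Server-DVD-HA-Incidents machine=64bit \
  test=qam_ha_priorityfencing_supportserver version=15-SP4 latest=1 \
  | jq '[.jobs[].result] | group_by(.) | map({result: .[0], count: length})'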