action #33700
closed
[slenkins][qam] tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite
100%
Description
Observation
openQA test in scenario sle-12-SP3-Server-DVD-Updates-x86_64-slenkins-twopence-tcpd-control@64bit fails in
slenkins_control
coolo: thehejik: https://openqa.suse.de/tests/1566506#step/2_tcpdmatch/1 - this looks like a problem with the openvswitch network. it started 2 weeks ago - but is not consistent. can you throw theories at the problem please? :)
thehejik: coolo: yes, vsvecova already reported, maybe it has something to do with https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4537 and mkravec told me that we shouldn't set fqdn hostname by hostnamectl but just hostname without domain so we need to investigate
coolo: thehejik: checking the salt commits - we did the openvswitch config 9 days before the problem started
thehejik: coolo: hopefully it's not openvswitch related this time
coolo: thehejik: as our DNS setup was fixed, we should just revert these hacks
mkravec: coolo: I will do it
Reproducible
Fails since (at least) Build 20180323-1
Expected result
Last good: 20180321-3 (or more recent)
Further details
Always latest result in this scenario: latest
Updated by mkravec about 7 years ago
DNS workaround disabled: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4688
We have been seeing a similar random issue (etcd does not start for some reason) in CaaSP lately.
Updated by pcervinka about 7 years ago
Maybe a similar failure in kdc-init? https://openqa.suse.de/tests/1601875
Updated by okurz almost 7 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: slenkins-twopence-krb5-control
https://openqa.suse.de/tests/1640966
Updated by okurz almost 7 years ago
- Subject changed from tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite to [slenkins] tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite
@thehejik do you plan to work on this yourself or what are your expectations?
Updated by okurz almost 7 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: slenkins-twopence-tcpd-control
https://openqa.suse.de/tests/1685527
Updated by coolo almost 7 years ago
- Subject changed from [slenkins] tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite to [slenkins][qam] tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite
Updated by okurz almost 7 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: slenkins-twopence-krb5-control
https://openqa.suse.de/tests/1734779
Updated by okurz almost 7 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: slenkins-twopence-tcpd-control
https://openqa.suse.de/tests/1764603
Updated by coolo almost 7 years ago
- Assignee set to thehejik
- Priority changed from Normal to High
Ludwig helped me understand what is going on. The normal flow is:
- server node boots
- server node disables wicked
- server node sets hostname as 'server'
- support server starts a named
- support server creates 'dns' lock
- server node restarts network
- server node queries dns as hostname 'server'
- support server will resolve the Server IP as 'server' from then on
But in the failing case the server node boots when the support server has
already set up named (a classic race), so the dns server resolves the
Server IP as 'susetest' and the tests fail.
-> The fix discussed was to create a barrier within the support server that
waits until all slenkins nodes have disabled their network, and to start the
dhcp/dns server only after that.
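To illustrate the race and the barrier idea described above, here is a minimal toy simulation in Python. It is only a sketch and not the actual openQA/slenkins code: the class SupportServer, the functions run_racy and run_with_barrier, and the IP addresses in the usage example are all hypothetical, and the rule that the first name registered for an IP sticks is an assumption made to mirror the 'susetest' behaviour reported in this ticket.

```python
import threading

DEFAULT_HOSTNAME = "susetest"  # name a freshly booted node carries, per the ticket


class SupportServer:
    """Toy stand-in for the support server's dhcp/dns service (hypothetical)."""

    def __init__(self):
        self.dns_started = threading.Event()
        self.records = {}  # ip -> hostname
        self._lock = threading.Lock()

    def start_dns(self):
        self.dns_started.set()

    def register(self, ip, hostname):
        # Assumption: the first name seen for an IP sticks.
        with self._lock:
            self.records.setdefault(ip, hostname)


def run_racy(nodes):
    """Failing case: named is already up while nodes still carry the default name."""
    server = SupportServer()
    server.start_dns()
    for ip in nodes:
        server.register(ip, DEFAULT_HOSTNAME)  # network still up, default name leaks
    for ip, name in nodes.items():
        server.register(ip, name)  # too late, the first name already stuck
    return server.records


def node(server, ip, wanted_name, ready_barrier):
    hostname = DEFAULT_HOSTNAME  # state right after boot
    hostname = wanted_name       # network disabled, hostname set
    ready_barrier.wait()         # report "network disabled" to the support server
    server.dns_started.wait()    # equivalent of waiting for the 'dns' lock
    server.register(ip, hostname)  # network restart, first query to dhcp/dns


def run_with_barrier(nodes):
    """Fixed case: dhcp/dns starts only after every node has reached the barrier."""
    server = SupportServer()
    barrier = threading.Barrier(len(nodes) + 1)  # all nodes plus the support server
    threads = [
        threading.Thread(target=node, args=(server, ip, name, barrier))
        for ip, name in nodes.items()
    ]
    for t in threads:
        t.start()
    barrier.wait()      # support server blocks until all nodes are ready
    server.start_dns()
    for t in threads:
        t.join()
    return server.records
```

With a made-up mapping such as {"10.0.2.101": "server", "10.0.2.102": "client"}, run_racy leaves every IP resolving to 'susetest', while run_with_barrier returns the intended names regardless of the order in which the node threads are scheduled.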
Updated by thehejik almost 7 years ago
Possible fix was created within https://progress.opensuse.org/issues/37258
Updated by thehejik almost 7 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
The issue seems to be fixed by https://progress.opensuse.org/issues/37258, please reopen in case the problem occurs again.
https://openqa.suse.de/tests/overview?distri=sle&version=12-SP3&build=20180629-1&groupid=108