[slenkins][qam] tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite
openQA test in scenario sle-12-SP3-Server-DVD-Updates-x86_64-slenkins-twopence-tcpd-control@64bit fails in 2_tcpdmatch
coolo: thehejik: https://openqa.suse.de/tests/1566506#step/2_tcpdmatch/1 - this looks like a problem with the openvswitch network. it started 2 weeks ago - but is not consistent. can you throw theories at the problem please? :)
thehejik: coolo: yes, vsvecova already reported, maybe it has something to do with https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4537 and mkravec told me that we shouldn't set fqdn hostname by hostnamectl but just hostname without domain so we need to investigate
coolo: thehejik: checking the salt commits - we did the openvswitch config 9 days before the problem started
thehejik: coolo: hopefully its not openvswitch related this time
coolo: thehejik: as our DNS setup was fixed, we should just revert these hacks
mkravec: coolo: I will do it
Fails since (at least) Build 20180323-1
Last good: 20180321-3 (or more recent)
Always latest result in this scenario: latest
#1 Updated by mkravec over 3 years ago
DNS workaround disabled: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4688
We have had a similar random issue (etcd does not start for some reason) in CaaSP lately.
#4 Updated by okurz over 3 years ago
- Subject changed from tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite to [slenkins] tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite
thehejik, do you plan to work on this yourself, or what are your expectations?
#9 Updated by coolo over 3 years ago
- Assignee set to thehejik
- Priority changed from Normal to High
Ludwig helped me understand what is going on. The normal flow is:
- server node boots
- server node disables wicked
- server node sets hostname as 'server'
- support server starts a named
- support server creates 'dns' lock
- server node restarts network
- server node queries dns as hostname 'server'
- support server will resolve the Server IP as 'server' from then on
But what happens in the failing case is that
the server node boots when the support server has already set up named (a classic race)
and then the dns server will resolve the Server IP as 'susetest' and the tests
-> The fix discussed was to create a barrier within the support server that marks
all slenkins nodes to have disabled their network and only after that start the
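
The race and the barrier idea described above can be sketched roughly as follows. This is only an illustrative simulation in Python using threads, not the actual openQA test code: the function `simulate`, the toy `dns` dictionary, and the node names are hypothetical, and the unlucky ordering in the no-barrier case is forced on purpose so the outcome is reproducible.

```python
import threading

def simulate(nodes, use_barrier):
    """Toy model: each node boots as 'susetest' and later renames itself;
    the support server snapshots the hostnames when it starts its DNS."""
    hostnames = {node: "susetest" for node in nodes}  # default hostname at boot
    dns = {}                                          # stand-in for named's records
    barrier = threading.Barrier(len(nodes) + 1)       # all nodes + support server
    named_started = threading.Event()

    def node_job(node):
        if not use_barrier:
            # Force the failing order: the node is slow and the DNS
            # server is already up before the hostname gets set.
            named_started.wait(timeout=5)
        hostnames[node] = node                        # node sets its real hostname
        if use_barrier:
            barrier.wait(timeout=5)                   # report "I am ready"

    def support_server_job():
        if use_barrier:
            barrier.wait(timeout=5)                   # wait until every node is ready
        dns.update(hostnames)                         # DNS picks up current hostnames
        named_started.set()

    threads = [threading.Thread(target=node_job, args=(n,)) for n in nodes]
    threads.append(threading.Thread(target=support_server_job))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dns

if __name__ == "__main__":
    print(simulate(["server", "client"], use_barrier=False))
    print(simulate(["server", "client"], use_barrier=True))
```

Without the barrier the toy DNS records every node as 'susetest'; with the barrier the support server only takes its snapshot after all nodes have checked in, so each node is recorded under its own name.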
#11 Updated by thehejik about 3 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
The issue seems to be fixed by https://progress.opensuse.org/issues/37258; please reopen if the problem occurs again.