action #33700

closed

[slenkins][qam] tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite

Added by thehejik about 6 years ago. Updated over 5 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Bugs in existing tests
Target version: -
Start date: 2018-03-23
Due date:
% Done: 100%
Estimated time:
Difficulty:

Description

Observation

openQA test in scenario sle-12-SP3-Server-DVD-Updates-x86_64-slenkins-twopence-tcpd-control@64bit fails in
slenkins_control

coolo: thehejik: https://openqa.suse.de/tests/1566506#step/2_tcpdmatch/1 - this looks like a problem with the openvswitch network. it started 2 weeks ago - but is not consistent. can you throw theories at the problem please? :)
thehejik: coolo: yes, vsvecova already reported it. Maybe it has something to do with https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4537. mkravec also told me that we shouldn't set an FQDN hostname with hostnamectl, just the hostname without the domain, so we need to investigate
coolo: thehejik: checking the salt commits - we did the openvswitch config 9 days before the problem started
thehejik: coolo: hopefully it's not openvswitch related this time
coolo: thehejik: as our DNS setup was fixed, we should just revert these hacks
mkravec: coolo: I will do it

Reproducible

Fails since (at least) Build 20180323-1

Expected result

Last good: 20180321-3 (or more recent)

Further details

Always latest result in this scenario: latest

Actions #1

Updated by mkravec about 6 years ago

DNS workaround disabled: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4688

We have been seeing a similar random issue in CaaSP lately (etcd does not start for some reason).

Actions #2

Updated by pcervinka almost 6 years ago

Maybe this is a similar failure in kdc-init: https://openqa.suse.de/tests/1601875 ?

Actions #3

Updated by okurz almost 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: slenkins-twopence-krb5-control
https://openqa.suse.de/tests/1640966

Actions #4

Updated by okurz almost 6 years ago

  • Subject changed from tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite to [slenkins] tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite

@thehejik do you plan to work on this yourself or what are your expectations?

Actions #5

Updated by okurz almost 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: slenkins-twopence-tcpd-control
https://openqa.suse.de/tests/1685527

Actions #6

Updated by coolo almost 6 years ago

  • Subject changed from [slenkins] tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite to [slenkins][qam] tcpd test fails in 2_tcpdmatch - hostnamectl/dns issue in slenkins tcpd testsuite

Actions #7

Updated by okurz almost 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: slenkins-twopence-krb5-control
https://openqa.suse.de/tests/1734779

Actions #8

Updated by okurz almost 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: slenkins-twopence-tcpd-control
https://openqa.suse.de/tests/1764603

Actions #9

Updated by coolo almost 6 years ago

  • Assignee set to thehejik
  • Priority changed from Normal to High

Ludwig helped me understand what is going on. The normal flow is:

  • server node boots
  • server node disables wicked
  • server node sets hostname as 'server'
  • support server starts a named
  • support server creates 'dns' lock
  • server node restarts network
  • server node queries dns as hostname 'server'
  • support server will resolve the Server IP as 'server' from then on

But in the failing case the server node boots when the support server has
already set up named (a classic race). The DNS server then resolves the
Server IP as 'susetest' and the tests fail.
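
To make the ordering problem concrete, here is a minimal sketch that replays the steps above as a list of events. It is only a model, not openQA code: the event names are hypothetical, and it assumes that the first name the DNS server learns for the node's IP sticks for the rest of the run.

```python
DEFAULT_HOSTNAME = "susetest"  # name seen in the failing case


def resolved_name(events):
    """Replay ordered events and return the name DNS ends up with.

    Simplifying assumption (not confirmed by the ticket): the first name
    registered for the node's IP is never replaced afterwards.
    """
    hostname = DEFAULT_HOSTNAME
    network_up = False
    dns_running = False
    dns_record = None

    for event in events:
        if event == "node_boot":
            network_up = True
        elif event == "node_disable_wicked":
            network_up = False
        elif event == "node_set_hostname":
            hostname = "server"
        elif event == "support_start_named":
            dns_running = True
        elif event == "node_restart_network":
            network_up = True
        else:
            raise ValueError(f"unknown event: {event}")
        # The node is registered as soon as it is online while named runs.
        if network_up and dns_running and dns_record is None:
            dns_record = hostname
    return dns_record


NORMAL_FLOW = [
    "node_boot",
    "node_disable_wicked",
    "node_set_hostname",
    "support_start_named",
    "node_restart_network",
]

# Same steps, but named is already up when the node boots.
RACY_FLOW = ["support_start_named"] + [
    e for e in NORMAL_FLOW if e != "support_start_named"
]

print(resolved_name(NORMAL_FLOW))
print(resolved_name(RACY_FLOW))
```

In this model the normal flow registers the node only after the hostname was changed, while the racy flow registers it at boot with the default name.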

-> The fix discussed was to create a barrier within the support server that
all slenkins nodes reach once they have disabled their network, and to start
the dhcp/dns server only after that.
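
The barrier idea can be sketched with plain Python threads. This is only an illustration of the synchronization pattern, not the openQA implementation: the function names, node names, and log messages are hypothetical.

```python
import threading


def run(node_names):
    """Simulate nodes and a support server synchronized by one barrier."""
    # One party per node plus the support server itself.
    network_down = threading.Barrier(len(node_names) + 1)
    log = []
    log_lock = threading.Lock()

    def record(entry):
        with log_lock:
            log.append(entry)

    def node(name):
        record(f"{name}: network disabled, hostname set")
        # Report "my network is down" and wait for everyone else.
        network_down.wait(timeout=10)

    def support_server():
        # Block until every node has reported in.
        network_down.wait(timeout=10)
        record("support: dhcp/dns started")

    threads = [threading.Thread(target=support_server)]
    threads += [threading.Thread(target=node, args=(n,)) for n in node_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for line in run(["server", "client"]):
        print(line)
```

Because the support server passes the barrier only after every node has arrived, its "dhcp/dns started" entry is always the last one in the log, regardless of how the threads are scheduled.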

Actions #10

Updated by thehejik almost 6 years ago

A possible fix was created in https://progress.opensuse.org/issues/37258

Actions #11

Updated by thehejik over 5 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100