action #29583

closed

[opensuse][functional] extra_tests_in_textmode@64bit Fix salt test

Added by JERiveraMoya over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Bugs in existing tests
Start date: 2017-12-19
Due date: 2018-01-16
% Done: 0%
Estimated time:
Difficulty:

Description

see https://openqa.opensuse.org/tests/546400#step/salt/25

Need a fix for this test.

Related bug: https://bugzilla.opensuse.org/show_bug.cgi?id=1069711

Note: this ticket has been recreated due to accidental deletion of https://progress.opensuse.org/issues/28723


Related issues 4 (0 open, 4 closed)

Related to openQA Tests - action #28964: [sle][functional][medium] test fails in salt - No salt-master in SLES/SLED | Resolved | okurz | 2017-12-06 | 2017-12-19
Related to openQA Tests - action #29247: [sle][functional][medium] test fails in salt - no return received - needs investigation | Rejected | 2017-12-11
Copied from openQA Tests - action #29447: [opensuse][functional][medium] Enhance salt test (no sleep, no fail on bsc#1069711) | Resolved | JERiveraMoya | 2018-02-13
Copied to openQA Tests - action #30165: [opensuse][functional][hard][bsc#1069711] extra_tests_in_textmode@64bit master is not able to ping minions (investigate) | Resolved | JERiveraMoya | 2017-12-19 | 2018-02-27

Actions #1

Updated by JERiveraMoya over 6 years ago

  • Copied from action #29447: [opensuse][functional][medium] Enhance salt test (no sleep, no fail on bsc#1069711) added
Actions #2

Updated by JERiveraMoya over 6 years ago

  • Status changed from New to In Progress

Started working on this. I am trying to reproduce it locally, but it does not happen at the moment. The error is probably related to this line in the salt logs: salt-minion[6810]: [ERROR ] The Salt Master has cached the public key for this node, this salt minion will wait for 10 seconds before attempting to re-authenticate
It looks like there are errors when the minion service starts, and we are not waiting for the service to start properly. Still investigating.

Actions #3

Updated by JERiveraMoya over 6 years ago

I need to test the following hypotheses in OSD about the two issues that we see:
(1) When the minion starts, it takes time to send its public key, even if the service has started properly. We accept all keys immediately, so the key is not yet on the master (network latency perhaps?). After googling it, I cannot say it is a bug, because when this process is automated, scheduled jobs are used for these tasks.
(2) When pinging the minions, we might also set a longer timeout with salt '*' test.ping -t 10 (the default is 5).
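
A minimal sketch of hypothesis (1), in Python rather than the Perl used by the actual test module: wait until the minion key shows up as unaccepted on the master before accepting keys. The function names, the 30 second default and the polling interval are hypothetical. The sketch also assumes that `salt-key -l un` prints each pending minion id on its own line.

```python
import subprocess
import time


def key_is_pending(minion_id, listing):
    """Return True if minion_id appears on a line of its own in the listing."""
    return minion_id in (line.strip() for line in listing.splitlines())


def list_unaccepted():
    """Return the text printed by `salt-key -l un` (unaccepted keys)."""
    result = subprocess.run(
        ["salt-key", "-l", "un"], capture_output=True, text=True, check=True
    )
    return result.stdout


def wait_for_pending_key(minion_id, list_keys=list_unaccepted,
                         timeout_s=30.0, interval_s=1.0):
    """Poll the key listing until minion_id is pending or the timeout expires.

    list_keys is injectable so the loop can be exercised without Salt.
    The timeout and interval defaults are assumptions, not values from the test.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if key_is_pending(minion_id, list_keys()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Once `wait_for_pending_key` returns True, the keys could be accepted with `salt-key --accept-all --yes` and the minions pinged with `salt '*' test.ping -t 10` as in hypothesis (2).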

Actions #4

Updated by JERiveraMoya over 6 years ago

Salt was failing in several scenarios, and with PR https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4079 we have tried to introduce a delay in the client-server communication.
Once it is merged, if OSD shows successful results, try to improve this test by using library functions and searching for the specific file (the public key stored on the master) instead of sleeping.
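
The idea of looking for the key file instead of sleeping could be sketched as follows. This is an illustration in Python, not the code from the PR. It assumes the master uses the default Salt layout, where keys awaiting acceptance are stored under /etc/salt/pki/master/minions_pre/. The function name and the timeout values are hypothetical.

```python
import time
from pathlib import Path

# Default directory for pending (not yet accepted) minion keys on a Salt
# master. Assumption: the system under test does not override pki_dir.
PENDING_KEYS_DIR = Path("/etc/salt/pki/master/minions_pre")


def wait_for_key_file(minion_id, keys_dir=PENDING_KEYS_DIR,
                      timeout_s=30.0, interval_s=0.5):
    """Wait until the minion's public key file exists in keys_dir.

    Returns True as soon as the file is present, False if timeout_s
    elapses first. Unlike a fixed sleep, this returns early when the key
    arrives quickly and still bounds the wait when it never arrives.
    """
    key_file = Path(keys_dir) / minion_id
    deadline = time.monotonic() + timeout_s
    while not key_file.is_file():
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
    return True
```

A caller would accept keys only after `wait_for_key_file("<minion id>")` returns True, and fail with a clear message otherwise.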

Actions #5

Updated by JERiveraMoya over 6 years ago

  • Status changed from In Progress to Feedback

Verification runs are not available in OSD.

Actions #6

Updated by SLindoMansilla over 6 years ago

  • Related to action #28964: [sle][functional][medium] test fails in salt - No salt-master in SLES/SLED added
Actions #8

Updated by JERiveraMoya over 6 years ago

  • Status changed from Feedback to In Progress

Increasing the timeout. Some errors went away with the previous PR. See the new one: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4098

Actions #9

Updated by JERiveraMoya over 6 years ago

  • Due date changed from 2017-12-19 to 2018-01-02
Actions #10

Updated by JERiveraMoya over 6 years ago

After restarting, these tests (not reviewed yet in the current build) passed with the new timeout:
https://openqa.suse.de/tests/1334312
https://openqa.suse.de/tests/1334313

Bug updated to reflect these findings.

Actions #11

Updated by JERiveraMoya over 6 years ago

  • Description updated (diff)
Actions #12

Updated by JERiveraMoya over 6 years ago

  • Status changed from In Progress to Feedback

A new build, which is not yet available, is needed to verify that all failures went away.

Actions #13

Updated by jorauch over 6 years ago

  • Related to action #29247: [sle][functional][medium] test fails in salt - no return received - needs investigation added
Actions #14

Updated by JERiveraMoya over 6 years ago

30s worked fine for listing the keys. Now only some errors are visible when pinging the node, but that is not acceptable for a master and minion on the same machine. Bug updated with this info.

Actions #15

Updated by okurz over 6 years ago

  • Due date changed from 2018-01-02 to 2018-01-16
  • Status changed from Feedback to In Progress
  • Target version changed from Milestone 12 to Milestone 13

Please add a workaround with a record_soft_failure for both timeouts, each referencing the same bug, so that we do not fail the test but also do not silently introduce an overly long sleep time.
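
One way to read this request, sketched in Python: try each step with the short timeout first, and only if that fails record a soft failure that names the bug and retry with the longer timeout. The real test is written in Perl and would use the os-autoinst record_soft_failure function. Here `record_soft_failure` is a plain callback standing in for it, and the helper name and timeout values are hypothetical.

```python
def run_with_soft_failure(action, short_timeout_s, long_timeout_s,
                          bug_ref, record_soft_failure):
    """Run action with a short timeout, escalating visibly on failure.

    action(timeout_s) must return True on success and False otherwise.
    If the short attempt fails, a soft failure referencing bug_ref is
    recorded before retrying with the long timeout, so the longer wait
    never goes unnoticed in the test results.
    """
    if action(short_timeout_s):
        return True
    record_soft_failure(
        f"{bug_ref}: no success within {short_timeout_s}s, "
        f"retrying with {long_timeout_s}s"
    )
    return action(long_timeout_s)


# Usage idea: both steps reference the same bug, as requested above.
# list_keys_step and ping_step are hypothetical callables taking a timeout.
#   run_with_soft_failure(list_keys_step, 5, 30, "bsc#1069711", notes.append)
#   run_with_soft_failure(ping_step, 5, 30, "bsc#1069711", notes.append)
```

If the step still fails with the long timeout, the helper returns False and the caller can fail the test as before.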

Actions #17

Updated by JERiveraMoya over 6 years ago

  • Status changed from In Progress to Blocked
Actions #18

Updated by JERiveraMoya over 6 years ago

  • Status changed from Blocked to Feedback
Actions #19

Updated by JERiveraMoya over 6 years ago

  • Status changed from Feedback to Resolved

Workaround successful for tests previously failing: ooo#1366125#22 ooo#1366126#22

Actions #20

Updated by riafarov over 6 years ago

  • Status changed from Resolved to In Progress

@JERiveraMoya, could you please take a look at this failure? https://openqa.suse.de/tests/1370487#step/salt/22 It looks like it is related to your changes, but I'm not sure what exactly the issue is there.

Actions #21

Updated by JERiveraMoya over 6 years ago

We can improve the workaround (https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/4169) and test this particular job in production, but nothing more; it is a product bug.

Actions #22

Updated by riafarov over 6 years ago

Do we have a bug reference? For the latest build, this test also failed in other scenarios. Let's discuss it on IRC.

Actions #23

Updated by riafarov over 6 years ago

  • Status changed from In Progress to Resolved

As per discussion, it is the same issue and the same bug, so resolving. We will consider a soft failure in case there is no fix any time soon.

Actions #24

Updated by JERiveraMoya over 6 years ago

  • Copied to action #30165: [opensuse][functional][hard][bsc#1069711] extra_tests_in_textmode@64bit master is not able to ping minions (investigate) added