action #69346: flaky/unstable os-autoinst test "22-svirt.t" - openQA Project (public) - openSUSE Project Management Tool

action #69346

closed

flaky/unstable os-autoinst test "22-svirt.t"

Added by okurz almost 5 years ago. Updated over 4 years ago.

Status:

Resolved

Priority:

Low

Assignee:

okurz

Category:

Regressions/Crashes

Target version:

Ready

Start date:

2020-07-25

Description

Observation¶

https://travis-ci.org/github/os-autoinst/os-autoinst/builds/711450994#L1527 on master:

    #   Failed test 'Ensure run_ssh_cmd(keep_open => 0) uses a new SSH connection'
    #   at ./22-svirt.t line 192.
    #          got: '6'
    #     expected: '5'
    # Looks like you failed 1 test of 36.
#   Failed test 'SSH usage in svirt'
#   at ./22-svirt.t line 212.
# Looks like you failed 1 test of 8.

Acceptance criteria¶

AC1: The test is stable locally and on Travis CI

Suggestions¶

Run the test locally and see whether the problem reproduces (it takes only a second to execute):

for i in {1..100}; do prove -I. t/22-svirt.t || break; done

Apply a fix that is either verifiable locally or validated only by the Travis CI results
Optional: if the problem cannot be reproduced locally, we could also exclude the test from the Travis CI runs
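
The loop above stops at the first failure. To estimate how often the test fails, a variant that keeps going and counts failures may be more useful. The following is only a sketch: the helper name `count_failures` is made up for illustration, and it discards the command output, so a failing run would need to be repeated by hand to see the details.

```shell
# count_failures RUNS CMD [ARGS...]
# Hypothetical helper: runs CMD the given number of times and prints
# how many of those runs exited with a non-zero status.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Output is discarded; only the exit status is counted.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Possible usage from an os-autoinst checkout (not executed here):
#   count_failures 100 prove -I. t/22-svirt.t
```

A result of 0 after many runs would match the later observation in this ticket that the failure does not reproduce locally.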

Updated by okurz almost 5 years ago

Description updated (diff)
Difficulty set to easy

Updated by okurz almost 5 years ago

Description updated (diff)

Updated by mkittler almost 5 years ago

Assignee set to mkittler

I could not reproduce the issue locally in 1000 runs. If it is a race condition, it seems to be strongly biased toward one outcome. Maybe a timeout is simply set too low.

Updated by mkittler almost 5 years ago

Assignee deleted (~~mkittler~~)

When reading the code, the only thing I noticed was that the parameters for actual/expected are swapped: https://github.com/os-autoinst/os-autoinst/pull/1494

So there is actually one connection too few. This can be provoked by passing keep_open => 1 instead of => 0. I don't see any race conditions or timeouts in the code, so I'm not sure how to fix this, especially since it cannot be reproduced locally.
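
To illustrate why the swap matters: Test::More's `is` takes the observed value first and the expected value second, and the "got"/"expected" labels in the failure output simply follow that argument order. The shell sketch below mimics this with a made-up stand-in function `is_eq` (it is not the real Perl API), using the numbers from the log under the reading that 5 connections were observed and 6 were wanted.

```shell
# is_eq GOT EXPECTED NAME
# Hypothetical stand-in for Test::More's is(); prints a TAP-like
# diagnostic whose labels follow the argument order.
is_eq() {
    if [ "$1" = "$2" ]; then
        echo "ok - $3"
    else
        echo "not ok - $3"
        echo "#          got: '$1'"
        echo "#     expected: '$2'"
        return 1
    fi
}

actual=5    # connections observed, assuming the corrected reading of the log
expected=6  # connections the test wanted

# Swapped arguments: the diagnostic claims got '6', expected '5'.
is_eq "$expected" "$actual" 'swapped arguments' || true

# Correct order: the diagnostic reports got '5', expected '6'.
is_eq "$actual" "$expected" 'correct order' || true
```

With the arguments in the right order the failure reads as one connection too few, which is the conclusion drawn above.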

Updated by okurz almost 5 years ago

Status changed from Workable to Resolved
Assignee set to okurz

Although this happened multiple times over the last week or so, I have not seen it again since. Glad you found at least something to fix :) I guess we can call it "Resolved" then because work has been done, even though no actual "fix" was applied. I will know where to find the ticket if I see the failure again. Thanks.

Updated by okurz over 4 years ago

Status changed from Resolved to Workable
Assignee changed from okurz to cfconrad
Priority changed from High to Low
Target version changed from Ready to future

Just saw that again in https://travis-ci.org/github/os-autoinst/os-autoinst/builds/743373314#L1294 so it can apparently still happen.

@cfconrad you introduced the code in https://github.com/os-autoinst/os-autoinst/commit/cc0d5e79ba88f5638e38f7b2b3c2f0842450148d#diff-ae45f3a3e9cf2344d12a6e8860efe913511c68e154cccbf82c587e70afb7298bR184

Maybe you can help to fix that?

Updated by cfconrad over 4 years ago

Status changed from Workable to Feedback
Assignee changed from cfconrad to okurz

Can we give this change a try? https://github.com/os-autoinst/os-autoinst/pull/1568

@okurz I guess you have better ways to monitor whether the test still fails occasionally, so I am reassigning this to you. Feel free to assign it back to me if the change doesn't solve the problem.

Updated by okurz over 4 years ago

Target version changed from future to Ready

Sure, can do. Your PR is still open and Martchus has already commented on it with just minor remarks. I assume you will still follow up on them. Then I am happy to monitor how it behaves.

Updated by okurz over 4 years ago

Status changed from Feedback to Resolved

So the tests in the PR as well as on master were fine. As the original problem did not happen often anyway, I will just set this to "Resolved". We can hopefully find our way back to this ticket in case we see the test module failing again :)
