action #95721

closed

[Sporadic] containers: tests fail with "Test died: no candidate needle with tag(s) 'inst-console' matched" size:M

Added by ilausuch over 2 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Regressions/Crashes
Target version:
Start date:
2021-07-20
Due date:
2021-08-19
% Done:

0%

Estimated time:

Description

Motivation

Occasionally, some of the openQA container tests fail with the error "Test died: no candidate needle with tag(s) 'inst-console' matched".
See these examples:

https://openqa.opensuse.org/tests/1847722#step/openqa_webui/3
https://openqa.opensuse.org/tests/1848261#step/openqa_webui/3

Acceptance Criteria

  • AC 1: The tests pass without this error over a sufficiently large number of runs (check the current frequency of the problem)

Suggestion

  • Try to reproduce the problem locally
  • In the worst case increase the timeout
Actions #1

Updated by ilausuch over 2 years ago

Comparing a bad run with a good one at the same point (https://openqa.opensuse.org/tests/1847721#step/openqa_webui/1), I found that the good one shows a console view instead of a graphical view.

Actions #2

Updated by okurz over 2 years ago

  • Priority changed from Normal to High
  • Target version set to Ready
Actions #3

Updated by mkittler over 2 years ago

The test switches to tty3 and waits for it by looking for the needle inst-console. After the assert_screen fails, tty3 shows up. So was the SUT just too slow (slower than the timeout of 30 seconds)?

Actions #4

Updated by ilausuch over 2 years ago

SUT: system under test

Actions #5

Updated by ilausuch over 2 years ago

  • Subject changed from [Sporadic] containers: tests fail with "Test died: no candidate needle with tag(s) 'inst-console' matched" to [Sporadic] containers: tests fail with "Test died: no candidate needle with tag(s) 'inst-console' matched" size:M
  • Description updated (diff)
Actions #6

Updated by ilausuch over 2 years ago

  • Description updated (diff)
Actions #7

Updated by ilausuch over 2 years ago

  • Status changed from New to Workable
Actions #8

Updated by ilausuch over 2 years ago

I found that so far (within one month) we have had 8 occurrences, more or less one per day.

Actions #9

Updated by ilausuch over 2 years ago

  • Assignee set to ilausuch
Actions #10

Updated by ilausuch over 2 years ago

  • Status changed from Workable to In Progress

The problem I see here is that send_key occasionally does not work. send_key is used to switch to the console: https://github.com/os-autoinst/os-autoinst-distri-openQA/blob/cd7288c4f14bb11ad24155f0a9777c29b2d563c8/tests/install/openqa_webui.pm#L77

[2021-07-19T16:39:25.588 CEST] [debug] no match: 29.0s, best candidate: gnome-desktop-20190509 (0.00)
[2021-07-19T16:39:26.617 CEST] [debug] >>> testapi::_handle_found_needle: found openqa-boot-menu-Tumbleweed-20190329, similarity 1.00 @ 65/11
[2021-07-19T16:39:26.618 CEST] [debug] /tests/install/boot.pm:6 called utils::wait_for_desktop -> lib/utils.pm:20 called testapi::send_key
[2021-07-19T16:39:26.618 CEST] [debug] <<< testapi::send_key(key="ret", wait_screen_change=0, do_wait=0)
[2021-07-19T16:39:26.888 CEST] [debug] /tests/install/boot.pm:6 called utils::wait_for_desktop -> lib/utils.pm:23 called testapi::assert_screen
[2021-07-19T16:39:26.888 CEST] [debug] <<< testapi::assert_screen(mustmatch="openqa-desktop", timeout=500)
[2021-07-19T16:39:27.490 CEST] [debug] no match: 499.4s, best candidate: gnome-desktop-20190509 (0.00)
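
If a key press is occasionally lost, one possible mitigation (not necessarily the one adopted in this ticket) is to repeat the key press until the expected needle matches. The following is a minimal Python sketch of that idea. The callables send_key and check_screen are hypothetical stand-ins for the os-autoinst testapi functions of the same name, and the key name "ctrl-alt-f3", the three attempts and the 30-second timeout per attempt are assumptions for illustration.

```python
from typing import Callable


def switch_console_with_retry(
    send_key: Callable[[str], None],
    check_screen: Callable[[str, float], bool],
    key: str = "ctrl-alt-f3",          # assumed key for switching to tty3
    needle: str = "inst-console",
    attempts: int = 3,                 # assumed retry budget
    timeout_per_attempt: float = 30.0, # assumed per-attempt timeout in seconds
) -> int:
    """Press `key` until `needle` matches.

    Returns the number of the attempt that succeeded (starting at 1) and
    raises RuntimeError if the needle never matched.
    """
    for attempt in range(1, attempts + 1):
        send_key(key)
        # check_screen is expected to return True as soon as the needle
        # matches, or False once the timeout has elapsed.
        if check_screen(needle, timeout_per_attempt):
            return attempt
    raise RuntimeError(
        f"no candidate needle with tag(s) '{needle}' matched "
        f"after {attempts} attempts"
    )
```

Passing the two functions in as parameters keeps the sketch runnable without an openQA worker. In a real test module the testapi functions would be called directly.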

There are other old cases
https://progress.opensuse.org/issues/72898
https://progress.opensuse.org/issues/88436
https://progress.opensuse.org/issues/89197

I am preparing a solution based on the investigation.

Actions #11

Updated by openqa_review over 2 years ago

  • Due date set to 2021-08-19

Setting due date based on mean cycle time of SUSE QE Tools

Actions #13

Updated by ilausuch over 2 years ago

I want to prove that this solution is good enough by launching 40 tests.
If they are all green, we can assume that the solution is OK (though without full guarantees). On the other hand, if one fails at this point, we can say that 1 minute is not enough to switch to the terminal screen, so the solution is not good and we have to find another one.
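
To make the "no full guarantees" caveat concrete: assuming independent runs with a per-run failure probability p, the chance that all n runs pass even though the problem still exists is (1 - p)^n. The sketch below computes that value and the number of runs needed to see at least one failure with a given confidence. The 5% failure rate in the usage comment is purely illustrative and was not measured in this ticket.

```python
import math


def prob_all_pass(p_fail: float, runs: int) -> float:
    """Probability that `runs` independent runs all pass although each
    run fails with probability `p_fail`."""
    return (1.0 - p_fail) ** runs


def runs_needed(p_fail: float, confidence: float) -> int:
    """Smallest number of runs n such that at least one failure shows up
    with probability >= `confidence`, i.e. 1 - (1 - p_fail)**n >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


# Illustrative only: with an assumed 5% per-run failure rate,
# prob_all_pass(0.05, 40) is roughly 0.13, so 40 green runs would still
# leave a noticeable chance that the problem is merely hidden.
```

The rarer the failure, the more runs are needed before an all-green batch says much about the fix.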

Actions #14

Updated by ilausuch over 2 years ago

I launched these tests
1884704, 1884706 - 1884725, 1884733 - 1884753 (excluding 1884705)

Actions #15

Updated by ilausuch over 2 years ago

All tests passed this part of the test.
A few failed in other steps, e.g. https://openqa.opensuse.org/tests/1884716
This can be considered sufficient proof.

Actions #16

Updated by okurz over 2 years ago

  • Status changed from In Progress to Resolved

Thanks. With this information we can regard the initial problem as something temporary that we do not want to fix right now. Although I assume the very same could come back and then we should look into extending the initial timeout.
