action #95721

[Sporadic] containers: tests fail with "Test died: no candidate needle with tag(s) 'inst-console' matched" size:M

Added by ilausuch 2 months ago. Updated 27 days ago.

Status: Resolved
Priority: High
Assignee:
Category: Concrete Bugs
Target version:
Start date: 2021-07-20
Due date: 2021-08-19
% Done: 0%
Estimated time:
Difficulty:

Description

Motivation

Occasionally, some of the openQA container tests fail with the error "Test died: no candidate needle with tag(s) 'inst-console' matched".
See these examples:

https://openqa.opensuse.org/tests/1847722#step/openqa_webui/3
https://openqa.opensuse.org/tests/1848261#step/openqa_webui/3

Acceptance Criteria

  • AC 1: The tests pass without this error over a sufficiently large number of runs (check the current frequency of the problem)

Suggestion

  • Try to reproduce the problem locally
  • In the worst case increase the timeout

History

#1 Updated by ilausuch 2 months ago

Comparing a bad run with a good one at the same step (https://openqa.opensuse.org/tests/1847721#step/openqa_webui/1), I see that the good run shows a console view instead of a graphical view.

#2 Updated by okurz 2 months ago

  • Priority changed from Normal to High
  • Target version set to Ready

#3 Updated by mkittler 2 months ago

The test switches to tty3 and waits for it by looking for the needle inst-console. After the assert_screen fails, tty3 shows up. So was the SUT just too slow (slower than the 30-second timeout)?

#4 Updated by ilausuch 2 months ago

SUT: system under test

#5 Updated by ilausuch 2 months ago

  • Subject changed from [Sporadic] containers: tests fail with "Test died: no candidate needle with tag(s) 'inst-console' matched" to [Sporadic] containers: tests fail with "Test died: no candidate needle with tag(s) 'inst-console' matched" size:M
  • Description updated (diff)

#6 Updated by ilausuch 2 months ago

  • Description updated (diff)

#7 Updated by ilausuch 2 months ago

  • Status changed from New to Workable

#8 Updated by ilausuch about 2 months ago

I found that so far (in one month) we have had 8 occurrences, more or less one per day.

#9 Updated by ilausuch about 2 months ago

  • Assignee set to ilausuch

#10 Updated by ilausuch about 2 months ago

  • Status changed from Workable to In Progress

The problem I can see here is that send_key occasionally does not work. send_key is used to switch to the console: https://github.com/os-autoinst/os-autoinst-distri-openQA/blob/cd7288c4f14bb11ad24155f0a9777c29b2d563c8/tests/install/openqa_webui.pm#L77

[2021-07-19T16:39:25.588 CEST] [debug] no match: 29.0s, best candidate: gnome-desktop-20190509 (0.00)
[2021-07-19T16:39:26.617 CEST] [debug] >>> testapi::_handle_found_needle: found openqa-boot-menu-Tumbleweed-20190329, similarity 1.00 @ 65/11
[2021-07-19T16:39:26.618 CEST] [debug] /tests/install/boot.pm:6 called utils::wait_for_desktop -> lib/utils.pm:20 called testapi::send_key
[2021-07-19T16:39:26.618 CEST] [debug] <<< testapi::send_key(key="ret", wait_screen_change=0, do_wait=0)
[2021-07-19T16:39:26.888 CEST] [debug] /tests/install/boot.pm:6 called utils::wait_for_desktop -> lib/utils.pm:23 called testapi::assert_screen
[2021-07-19T16:39:26.888 CEST] [debug] <<< testapi::assert_screen(mustmatch="openqa-desktop", timeout=500)
[2021-07-19T16:39:27.490 CEST] [debug] no match: 499.4s, best candidate: gnome-desktop-20190509 (0.00)

There are other, older cases:
https://progress.opensuse.org/issues/72898
https://progress.opensuse.org/issues/88436
https://progress.opensuse.org/issues/89197

I am preparing a solution based on this investigation.
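
For illustration only, here is a minimal sketch of one way to tolerate a dropped key press: resend the key and re-check the screen a few times before giving up. It is written in Python against a simulated SUT, not against the os-autoinst test API. The names switch_console_with_retry and FlakySut, the key name ctrl-alt-f3 and the attempt count are assumptions made for the example, and this is not necessarily the fix that was prepared for this ticket.

```python
def switch_console_with_retry(send_key, screen_matches, key="ctrl-alt-f3",
                              tag="inst-console", attempts=3, timeout=30):
    """Send `key` until the screen matches `tag`; return the attempt that worked.

    `send_key` and `screen_matches` are injected callables (hypothetical
    stand-ins for the real test API), so the logic can be tried without a SUT.
    """
    for attempt in range(1, attempts + 1):
        send_key(key)
        if screen_matches(tag, timeout):
            return attempt
    raise RuntimeError(
        f"no candidate needle with tag(s) '{tag}' matched after {attempts} attempts"
    )


class FlakySut:
    """Simulated SUT that silently drops the first `dropped` key presses."""

    def __init__(self, dropped):
        self.dropped = dropped
        self.sent = 0
        self.on_console = False

    def send_key(self, key):
        self.sent += 1
        # Only key presses after the dropped ones actually switch the console.
        if self.sent > self.dropped:
            self.on_console = True

    def screen_matches(self, tag, timeout):
        # The simulation ignores the timeout and reports the current state.
        return self.on_console


sut = FlakySut(dropped=1)
used = switch_console_with_retry(sut.send_key, sut.screen_matches)
print(f"console reached on attempt {used}")
```

With one dropped key press, the simulated switch succeeds on the second attempt. If every key press is dropped, the helper raises after the last attempt instead of waiting indefinitely.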

#11 Updated by openqa_review about 2 months ago

  • Due date set to 2021-08-19

Setting due date based on mean cycle time of SUSE QE Tools

#13 Updated by ilausuch 28 days ago

I want to prove that this solution is good enough by launching 40 tests.
If all of them are green, we can assume that the solution is OK (though without full guarantees). On the other hand, if one goes red at this point, we can say that 1 minute is enough to switch to the terminal screen and therefore the solution is not good, and we have to find another one.
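
As a rough sanity check on how much a batch of green runs can tell us, the sketch below computes the chance that a sporadic failure stays hidden in N independent runs. The 5% per-run failure rate is a made-up placeholder, not a value measured in this ticket, and prob_all_pass and runs_needed are hypothetical helper names.

```python
import math


def prob_all_pass(failure_rate, runs):
    """Probability that `runs` independent runs all pass despite the bug."""
    return (1.0 - failure_rate) ** runs


def runs_needed(failure_rate, confidence):
    """Smallest number of runs that shows at least one failure with the
    given probability, assuming independent runs and a fixed failure rate."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


ASSUMED_FAILURE_RATE = 0.05  # placeholder, not measured

hidden = prob_all_pass(ASSUMED_FAILURE_RATE, 40)
print(f"chance that 40 runs are all green although the bug is still there: {hidden:.1%}")
print("runs needed for 95% confidence:", runs_needed(ASSUMED_FAILURE_RATE, 0.95))
```

Under that assumed rate, 40 green runs are suggestive but not conclusive, which matches the "without full guarantees" caveat above. A lower real failure rate would require more runs for the same confidence.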

#14 Updated by ilausuch 28 days ago

I launched these tests:
1884704, 1884706 - 1884725, 1884733 - 1884753 (excluding 1884705)
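
To open or check these jobs in bulk, the ID list can be expanded with a small helper. This is a sketch that assumes the ranges are inclusive on both ends. expand_job_ids is a hypothetical name, and the URL pattern is the one used for the job links elsewhere in this ticket.

```python
def expand_job_ids(spec):
    """Expand a spec like '1, 3 - 5' into [1, 3, 4, 5] (ranges inclusive)."""
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(x) for x in part.split("-"))
            ids.extend(range(low, high + 1))
        else:
            ids.append(int(part))
    return ids


spec = "1884704, 1884706 - 1884725, 1884733 - 1884753"
ids = expand_job_ids(spec)
urls = [f"https://openqa.opensuse.org/tests/{job_id}" for job_id in ids]

print(len(ids), "jobs")
print(urls[0])
```

Read this way, the list expands to 42 job IDs, with 1884705 left out as stated.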

#15 Updated by ilausuch 28 days ago

All tests passed this part of the test.
A few failed in other steps, such as https://openqa.opensuse.org/tests/1884716
This can be considered sufficient proof, then.

#16 Updated by okurz 27 days ago

  • Status changed from In Progress to Resolved

Thanks. With this information we can regard the initial problem as something temporary that we do not want to fix right now. Although I assume the very same could come back and then we should look into extending the initial timeout.
